Multi-Versioning (Con't) - Page 7
September 27, 2001
What we would like to do is to run the end-of-day report that tells us how much money is in the bank. That is an extremely simple query:
And, of course, in this example the answer is obvious: $1250. However, what happens if we read row 1, and while we're reading rows 2 and 3, an Automated Teller Machine (ATM) generates transactions against this table, and moves $400 from account 123 to account 456? Our query counts $500 in row 4 and comes up with the answer of $1650, doesn't it? Well, of course, this is to be avoided, as it would be an error — at no time did this sum of money exist in the account balance column. It is the way in which Oracle avoids such occurrences, and how Oracle's methods differ from every other database, that you need to understand.
In practically every other database, if you wanted to get a 'consistent' and 'correct' answer to this query, you would either have to lock the whole table while the sum was calculated or you would have to lock the rows as you read them. This would prevent people from changing the answer as you are getting it. If you lock the table up-front, you'll get the answer that was in the database at the time the query began. If you lock the data as you read it (commonly referred to as a shared read lock, which prevents updates but not other readers from accessing the data), you'll get the answer that was in the database at the point the query finished. Both of these methods inhibit concurrency a great deal. The table lock would prevent any updates from taking place against the entire table for the duration of your query (for a table of four rows, this would only be a very short period — but for tables with hundred of thousands of rows, this could be several minutes). The 'lock as you go' method would prevent updates on data you have read and already processed and could actually cause deadlocks between your query and other updates.
Now, I said earlier that you would not be able to take full advantage of Oracle if you did not understand the concept of multi-versioning. Here is one reason why that is true. Oracle uses multi-versioning to get the answer, as it existed at the point in time the query began, and the query will take place without locking a single thing (while our account transfer transaction updates rows 1 and 4, these rows will be locked to other writers — but not locked to other readers, such as our SELECT SUM...query). In fact, Oracle doesn't have a 'shared read' lock common in other databases — it does not need it. Everything inhibiting concurrency that can be removed, has been removed.
So, how does Oracle get the correct, consistent answer ($1250) during a read without locking any data— in other words, without decreasing concurrency? The secret lies in the transactional mechanisms that Oracle uses. Whenever you modify data, Oracle creates entries in two different locations. One entry goes to the redo logs where Oracle stores enough information to redo or 'roll forward' the transaction. For an insert this would be the row inserted. For a delete, it is a message to delete the row in file X, block Y, row slot Z. And so on. The other entry is an undo entry, written to a rollback segment. If your transaction fails and needs to be undone, Oracle will read the 'before' image from the rollback segment and restore the data. In addition to using this rollback segment data to undo transactions, Oracle uses it to undo changes to blocks as it is reading them — to restore the block to the point in time your query began. This gives you the ability to read right through a lock and to get consistent, correct answers without locking any data yourself.
So, as far as our example is concerned, Oracle arrives at its answer as follows:
At time T6, Oracle is effectively 'reading through' the lock placed on row 4 by our transaction. This is how non-blocking reads are implemented — Oracle only looks to see if the data changed, it does not care if the data is currently locked (which implies that it has changed). It will simply retrieve the old value from the rollback segment and proceed onto the next block of data.
This is another clear demonstration of multi-versioning — there are multiple versions of the same piece of information, all at different points in time, available in the database. Oracle is able to make use of these 'snapshots' of data at different points in time to provide us with read-consistent queries and non- blocking queries.
This read-consistent view of data is always performed at the SQL statement level, the results of any single SQL statement are consistent with respect to the point in time they began. This quality is what makes a statement like the following insert a predictable set of data:
The result of the SELECT * FROM T is preordained when the query begins execution. The SELECT will not see any of the new data generated by the INSERT. Imagine if it did — this statement might be a never-ending loop. If, as the INSERT generated more rows in CUSTOMER, the SELECT could 'see' those newly inserted rows — the above piece of code would create some unknown number of rows. If the table T started out with 10 rows, we might end up with 20, 21, 23, or an infinite number of rows in T when we finished. It would be totally unpredictable. This consistent read is provided to all statements so that an INSERT such as the following is predicable as well:
The INSERT statement will with be provided a read-consistent view of T — it will not see the rows that it itself just inserted, it will only insert the rows that existed at the time the INSERT began. Many databases won't even permit recursive statements such as the above due to the fact that they cannot tell how many rows might actually be inserted.
So, if you are used to the way other databases work with respect to query consistency and concurrency, or you have never had to grapple with such concepts (no real database experience), you can now see how understanding how this works will be important to you. In order to maximize Oracle's potential, you need to understand these issues as they pertain to Oracle — not how they are implemented in other databases.