Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

Oracle

Posted Mar 14, 2007

Ordering Events in Oracle

By Steve Callan

One of Oracle’s most critical and important functions is how it keeps track of events. Even within a single processor, “simple” database system, keeping track of events is just as important as recording and maintaining the proper order of events in a distributed system. Recording the “when” in “when did event X take place?” can be problematic, so a process that is complete (in the mathematical sense) is essential. In terms of an algorithm, does an algorithm result in the same/consistent answer each and every time the same circumstances or conditions exist?

Let’s take an astrological event as an example to help frame what “when” means. Suppose you were born in 1970. Twenty years later you see a star go supernova. In one sense or frame of reference, the event occurred in 1990. In another frame of reference, the event, given that the star was 20 light-years away, took place the same year our notional observer was born. Depending on the context, the event time stamps of 1970 and 1990 are equally valid.

In an example closer to home, how would you keep track of when database events (both data manipulation and definition) take place? Within a subset of the database, using your schema and transactions as an example, keeping track of your ordered set of operations is simpler to maintain than when comparing or ordering your transactions with respect to someone else’s. Your ordered set is only partially ordered. Clearly, the ordering of events is essential for many reasons, perhaps the most important being for recovery. Did your commit take place before mine, or did mine occur before yours but was recorded later?

How do you order events?

If you were to describe the situation of recording a series of ordered events, perhaps a title of “Time, Clocks and the Ordering of Events in a Distributed System” would aptly describe the situation. One way to keep track of events is to record their timestamps, but this can introduce another factor, that of maintaining the accuracy and synchronization of distributed clocks. But what type of clock are we referring to? No doubt, the clock you have in mind is a physical clock.

What if the clock were logical instead of physical? This is exactly the point raised by Dr. Leslie Lamport in his 1978 paper titled “Time, Clocks and the Ordering of Events in a Distributed System.” In such a system, a clock condition can be described as:

For any events a, b, if a occurs before b, then C(a) < C(b).

The clock time of event a, however the clock is defined, is before that of event b. In a computing system with multiple processes, the clock condition is met when the two conditions stated below hold.

C1. If a and b are events in process Pi, and a comes before b, the Ci(a) < Ci(b).

C2. If a is the sending of a message by process Pi and b is the receipt of that message by process Pj, then Ci(a) < Cj(b).

Now that we have some conditions, the next item to consider is their implementation. One rule (applies chiefly to condition one) is that process Pi increments Ci between any two successive events. A second rule implements the sending of a timestamp along with a message. Upon receipt of a message from another process, the receiving process sets its time to be greater than or equal to its present value and greater than the timestamp of the message.

Dr. Lamport’s paper presents an algorithm which orders events totally. The context of “totally” is across the entire system, not just partially ordered within one process out of many. If our system of interest is a database with multiple users, all of whom are vying for the same resource, then the algorithm must satisfy three conditions.

  • A process which has been granted the resource must release it before it can be granted to another process.

  • Different requests for the resource must be granted in the order in which they are made.

  • If every process which is granted the resource eventually releases it, then every request is eventually granted.

The algorithm’s five steps can be summarized as follows.

  • To request a resource, a process sends a time stamped message to all other processes.

  • A receiving process places the “request resource” message in its own request queue and sends a time stamped acknowledgement back to the sender.

  • To release a resource, the owning process sends a similar time stamped message (“releases resource”) to all other processes.

  • Upon receipt of a release resource message, the “request resource” message is removed from the request queue.

  • A requesting process is granted the resource when its time stamped request is ordered before any other requests and when it has received an acknowledgement message time stamped later than the initial request.

The relevance of Lamport’s algorithm to Oracle

By now the time stamping of messages, use of clocks, and ordering of events must sound very much like “a stamp that defines a committed version of a database at a point in time.” In other words, the implementation of Dr. Lamport’s algorithm within Oracle is what we know as an SCN generation scheme. A search for “Lamport SCN generation” in Oracle’s documentation will show several references (primarily in RAC) to this scheme, but none of them really defines what the scheme is or how it works.

The astute reader may have noticed I did not say this is “the” scheme (as in the one and only scheme). Oracle uses several schemes in addition to the Lamport SCN generation scheme. Peek inside your alert log and you will see something along the lines of what is shown below.

Picked broadcast on commit scheme to generate SCNs
Picked latch-free SCN scheme 2
Picked latch-free SCN scheme 3

MetaLink Note 260304.1 defines what broadcast on commit is or does: “At every commit, the most recent local SCN will be broadcasted. Read consistency is fully guaranteed for any kind of workload, since every change in the local SCN is immediately propagated.” Aside from this description, there is a general lack of public information about how these schemes work.

In addition to the alert log, a hidden parameter will also reveal the current SCN generation scheme. Is the generation scheme modifiable? A query like the one below shows the answer.

SELECT X.KSPPINM NAME, 
DECODE(BITAND(KSPPIFLG/256, 1), 1, 'TRUE', 'FALSE') SESMOD,
DECODE( BITAND(KSPPIFLG/65536, 3), 1, 'IMMEDIATE', 2, 'DEFERRED', 3, 'IMMEDIATE', 'FALSE' ) SYSMOD,
KSPPDESC DESCRIPTION
FROM SYS.X_$KSPPI X
WHERE X.INST_ID = USERENV('INSTANCE')
AND TRANSLATE(KSPPINM,'_','#') LIKE '#%'
AND X.KSPPINM like '%scn_scheme%';
NAME         SESMOD  SYSMOD     DESCRIPTION
------------ ------- ---------- ------------
_scn_scheme  FALSE   FALSE      SCN scheme

This query results in no rows returned if using 10g Release 2 (the parameter is not present), which goes to the familiar bromide of don’t count on being able to use hidden parameters.

In Closing

The time stamping and ordering of events within Oracle is of prime importance and goes to the heart of how transaction control and serialization are even possible. The total ordering of disparate events, whether within a simple system or a distributed one, is not just essential, but mandatory. Take the date of Dr. Lamport’s paper into consideration and then recall when Oracle’s version 2 was released (no version 1, long marketing spin story documented elsewhere). The tie-in is not hard to miss. There was some very clever thought that went into the design of Oracle, and at the same time, some very clever adaptation of other people’s work, the algorithm stated in Dr. Lamport’s paper being one such example.

References:

Lamport, Leslie, “Time, Clocks and the Ordering of Events in a Distributed System.”
Communications of the ACM 21, 7   (July 1978), 558-565.  Reprinted in several collections, including Distributed Computing: Concepts and Implementations, McEntire et al., ed.  IEEE Press, 1984.

See www.lamport.org for other related articles.

» See All Articles by Columnist Steve Callan



Oracle Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
Oracle Forum
Topic By Replies Updated
Oracle Data Mining: Classification jan.hasller 0 July 5th, 07:19 AM
Find duplicates - Unique IDs Lava 5 July 2nd, 08:30 AM
no matching unique or primary key rcanter 1 April 25th, 12:32 PM
Update values of one table based on condition of values in other table using Trigger Gladiator 3 February 29th, 06:01 PM