Wait events in Oracle RAC
June 19, 2007
Oracle has been largely self-managing for a long time, but its capacity to tune itself is yet to be seen. When it comes to analyzing database problems, most Oracle DBAs still turn to the Dynamic Performance Views (the V$ views). Here we will look primarily at the Global Cache waits.
What is a Wait Event?
An event can be anything that Oracle has to perform on behalf of a set of instructions sent by the user interface, or on behalf of one of its own background processes. The tasks vary from reading information from the buffer cache, to reading and writing data to and from disk, to IPC (Inter Process Communication). The term wait is used because every time a user connects to your application, a resource is allocated to perform tasks on its behalf, and the session spends part of its time waiting for an action, sometimes from the user and at other times from the database. In each case the wait time is tracked and charged to the resource waited upon. For instance, when a block is no longer in memory it has to be read from disk, and the session has to wait for that block. That wait is typically recorded under the db file sequential read event.
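The idea of charging wait time to the resource waited upon can be illustrated with a small sketch. This is not how Oracle implements its wait interface; the class `WaitAccounting`, the function `read_block` and the dictionaries standing in for the buffer cache and the disk are all hypothetical names made up for this illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class WaitAccounting:
    """Toy per-event counters: number of waits and total time waited."""

    def __init__(self):
        self.total_waits = defaultdict(int)
        self.time_waited = defaultdict(float)  # seconds

    @contextmanager
    def wait(self, event):
        # Everything executed inside the "with" block is charged to `event`.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.total_waits[event] += 1
            self.time_waited[event] += time.perf_counter() - start


def read_block(block_id, buffer_cache, disk, stats):
    """Return a block, charging a wait only when it must come from disk."""
    if block_id in buffer_cache:
        return buffer_cache[block_id]  # found in memory, no wait recorded
    with stats.wait("db file sequential read"):
        data = disk[block_id]  # stand-in for a single-block disk read
    buffer_cache[block_id] = data
    return data
```

Calling `read_block` twice for the same block records one wait: the first call goes to the simulated disk, the second is served from the simulated cache.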
Global Buffer Cache: How is it different on a RAC?
Let's quickly see how a buffer cache works in the RAC environment. In a typical single node Oracle database there is only one instance and it has only one set of memory segments. Moreover, all OS related operations such as I/O, SQL statements and cache operations are routed via that single set of memory structures. On a simple, small scale application this works fine, but the whole ball game changes when we move on to clustering the database. Suddenly you have multiple instances sharing a single database. These instances run on separate hardware, each with its own OS and its own memory structures, so the buffer cache is split across the nodes.
A requesting node may find that the block it wants resides on an entirely different node! Processes local to that machine (the remote node) need to access these buffer caches, which together make up the Global Buffer Cache, for reading. The remote node's LMS process (historically Lock Manager Server, the process that serves Global Cache Service requests) will be accessing the global buffer cache. So the local buffer cache operations are not really local; they are spread globally across all the RAC nodes. Handling these requests is correspondingly complicated, and that is what makes the wait events in RAC so different from those of a typical single node Oracle server.
For instance, take this comparison. On a typical single node server, a process requests the block, pins the buffer and modifies the block. On a RAC the same thing appears to happen, and actually does, but the block may already have been modified on another node, which makes writing the modification to disk risky. To deal with this, Oracle maintains consistency throughout the RAC with lock mastering and resource affinity. The requesting node makes a request to the GCS (Global Cache Service) to gain access to the resource, which is currently mastered by the locking node (also called the master node).
In a typical RAC environment, the lock mastering is handled by the Global Resource Directory, which in turn is managed by the GES (Global Enqueue Service) and the GCS. The remastering of resources is based on resource affinity. This is good for performance as it localizes the mastering of a resource on the node that uses it. The more a resource is used by a particular instance, the greater the chance that it is dynamically remastered to that node. The parameter _LM_DYNAMIC_REMASTERING = TRUE enables this behavior; setting it to FALSE disables it. Try querying the STATUS column of the view V$BH to see the various states a buffer can be in:
FREE - unused
XCUR - exclusive current
SCUR - shared current
CR - consistent read
READ - being read from disk
MREC - in media recovery mode
IREC - in instance recovery mode
WRI - write clone mode
PI - past image
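As a small illustration, the states listed above can be tallied once the status values have been fetched. The sketch below assumes the values come from something like SELECT status FROM v$bh; the function `summarize` and the sample values in the usage note are made up for this example, and no database connection is involved.

```python
from collections import Counter

# Buffer states as listed above (code -> meaning).
BUFFER_STATES = {
    "FREE": "unused",
    "XCUR": "exclusive current",
    "SCUR": "shared current",
    "CR": "consistent read",
    "READ": "being read from disk",
    "MREC": "in media recovery mode",
    "IREC": "in instance recovery mode",
    "WRI": "write clone mode",
    "PI": "past image",
}


def summarize(statuses):
    """Count buffers per state and attach the human-readable meaning.

    `statuses` is any iterable of status strings; case and surrounding
    whitespace are normalized before the lookup.
    """
    counts = Counter(s.strip().upper() for s in statuses)
    return {
        state: (BUFFER_STATES.get(state, "unknown state"), n)
        for state, n in counts.items()
    }
```

For example, `summarize(["xcur", "xcur", "cr", "pi"])` groups the four made-up rows into two exclusive current buffers, one consistent read buffer and one past image.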
Global Cache Wait
Normally, when requesting a block, Oracle first checks its own local cache; should the block not be there, it asks the resource master for shared access to that block. If the block is in a remote node's buffer cache (note: buffer and block mean the same thing here, namely the data entity we wish to modify, normally referred to as a data block), then it is copied across via the backbone, the HSI (High Speed Interconnect). Tests of this kind are excellent to carry out on a typical ESX server Oracle RAC node, where the interconnect is as fast as the PCI speeds. The time required to get that block from the remote cache, however little it may be, is what gets recorded as the "global cache cr request" wait event.
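The lookup order described above (local cache first, then the resource master, then a copy from the remote cache) can be sketched as a toy model. It is a deliberate simplification: the `Instance` class, the `directory` dictionary standing in for the resource master, and the `get_block` function are all hypothetical, and real lock modes, past images and cache eviction are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One RAC node with its own local buffer cache (toy model)."""
    name: str
    buffer_cache: dict = field(default_factory=dict)


def get_block(block_id, local, directory, instances, disk, waits):
    """Return (block, source) following the lookup order in the text."""
    # 1. Check the local buffer cache first.
    if block_id in local.buffer_cache:
        return local.buffer_cache[block_id], "local cache"

    # 2. Ask the (simulated) resource master which instance holds the block.
    holder_name = directory.get(block_id)
    if holder_name is not None and holder_name != local.name:
        # 3. Copy the block from the remote cache over the interconnect;
        #    this is the step charged to "global cache cr request".
        waits["global cache cr request"] = waits.get("global cache cr request", 0) + 1
        block = instances[holder_name].buffer_cache[block_id]
        local.buffer_cache[block_id] = block
        return block, "remote cache"

    # 4. Nobody has the block cached: read it from disk and register it.
    block = disk[block_id]
    local.buffer_cache[block_id] = block
    directory[block_id] = local.name
    return block, "disk"
```

With two instances, a first read on node A comes from disk, a read of the same block on node B comes from A's cache and records one wait, and a second read on B is served locally.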
The time taken may differ depending on whether the buffer is held in shared or exclusive mode. If it is in shared mode, the remote node simply copies the buffer to the requesting node. If the number of blocks requested exceeds the _FAIRNESS_THRESHOLD value, the lock might be downgraded. If the buffer is in exclusive mode (XCUR), the Past Image has to be built and copied across to the requesting node's buffer cache.
In a typical scenario the requesting node will wait up to 100 cs (centiseconds, that is, 1 second) and then retry, either reading the same block from disk or waiting again for the remote buffer. If you are experiencing excessive waits, you might have a slow private interconnect.
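The wait-then-retry behavior can be sketched with a timed wait on a queue. This only illustrates the pattern and is not Oracle's internal logic: the queue stands in for the interconnect, and `wait_for_remote_block`, `cs_to_seconds` and the `max_retries` limit are assumptions introduced for the example.

```python
import queue

TIMEOUT_CS = 100  # the 100 cs mentioned above


def cs_to_seconds(cs):
    """Convert centiseconds to seconds (100 cs = 1 s)."""
    return cs / 100.0


def wait_for_remote_block(incoming, timeout_cs=TIMEOUT_CS, max_retries=3):
    """Wait for a block to arrive, retrying after each timeout.

    Returns (block, attempts). The block is None if every attempt timed
    out, at which point a caller could fall back to reading from disk.
    """
    for attempt in range(1, max_retries + 1):
        try:
            block = incoming.get(timeout=cs_to_seconds(timeout_cs))
            return block, attempt
        except queue.Empty:
            continue  # timed out; retry the wait
    return None, max_retries
```

If the block is already waiting in the queue the call returns on the first attempt; with an empty queue it gives up after `max_retries` timeouts.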
So you get the idea of why Oracle RAC needs the infrastructure it does (dual network cards, for instance) and how we can enhance our RAC's performance. This is barely the tip of the iceberg when it comes to performance tuning our RAC, but it plays a crucial role in helping us decide how to configure it, and understanding the internals helps even more. We took a brief look at the Global Cache Wait; in a future article, we will go into more detail when we benchmark our RAC on VMware by stress testing it.