Posted in an Oracle forum some time back was a question on the latch event:
latch: parallel query alloc buffer
The person posting the question noted that there's not much information, if any, to be found on this latch event. The question posed in that forum was, essentially: just WHAT is that latch event and why does it occur? To answer this it will be necessary to understand some of the ‘ins-and-outs’ of parallel processing according to Oracle.
Parallel query processing allocates slaves to do the bulk of the work; to pass intermediate results around, the parallel coordinator allocates message buffers. Depending on the system configuration and the load the parallel slaves must handle, a surprisingly large number of message buffers can be allocated. Two equations can be used to compute the number of buffers that will be required; both use the parallel degree value that is either calculated by Oracle or specified in a hint, represented here for simplicity by pd. For a non-RAC Oracle database the number of buffers required is calculated as:
pd(pd+1) * 3
RAC adds multiple instances, and with them a bit of complexity; as a result, the equation is modified to be:
pd(pd+1) * 4
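Both equations are easy to tabulate with a quick query. The following is a minimal sketch (the list of degrees is arbitrary, generated with CONNECT BY) that computes the expected buffer counts for single-instance and RAC configurations:

select pd,
       pd * (pd + 1) * 3 as non_rac_buffers,
       pd * (pd + 1) * 4 as rac_buffers
  from (select power(2, level + 1) as pd
          from dual
         connect by level <= 6)
 order by pd;

This produces the counts for degrees 4 through 128, the same values worked out by hand below.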
On non-RAC systems, various degrees of parallelism can be plugged into the first equation to estimate the number of buffers Oracle could allocate. Consider a query executed with a parallel degree set to 4; the first equation produces the following results:
4(4+1) * 3
becomes:
4(5) * 3
and simplifying:
20 * 3 = 60
Four parallel slaves can allocate 60 buffers. Now consider setting a parallel degree of 16. It’s important to know that this buffer calculation is not proportional; a degree of 16 is four times the original setting, but the result is far from four times the original result:
16(16+1) * 3
Evaluating the term in parentheses produces:
16(17) * 3
which reduces to:
272 * 3 = 816
A parallel execution using 16 slaves requires a total of 816 message buffers, almost 14 times the number required by the execution at a parallel degree of 4. Given the computing power in modern servers, it is not a ridiculous thought to set a parallel degree of 128. Performing that calculation produces the following result:
128(128+1) * 3
Continuing as in previous calculations:
128(129) * 3
The result is:
16512 * 3 = 49536
This is a considerable increase over the buffers allocated for a parallel degree 4 execution, and far larger than what would be expected if the result from the degree 4 execution were simply multiplied by 32. Having the “latch: parallel query alloc buffer” wait as one of the top waits isn’t a far-fetched idea after all. A determining factor affecting this wait is where the PX message pool (‘PX msg pool’) resides; it is usually found in either the large pool or the shared pool. The query provided below can be executed to learn where it is located in a given database configuration:
break on pool skip 1
column name format a32
select
pool,
name,
bytes
from
v$sgastat
where
pool = 'large pool'
or
name like '%PX%'
or
name like '%para%'
order by
pool, name;
An example of the results of this query follows:
BING @ quanghoo > break on pool skip 1
BING @ quanghoo > column name format a32
BING @ quanghoo >
BING @ quanghoo > select
2 pool,
3 name,
4 bytes
5 from
6 v$sgastat
7 where
8 pool = 'large pool'
9 or
10 name like '%PX%'
11 or
12 name like '%para%'
13 order by
14 pool, name;
POOL         NAME                                  BYTES
------------ -------------------------------- ----------
large pool   PX msg pool                          491520
             free memory                         7897088

shared pool  PX QC deq stats                        1696
             PX QC msg stats                        3080
             PX list of chunk lists                 1088
             PX msg pool struct                       56
             PX scan rate stats                     1080
             PX server deq stats                    1696
             PX server msg stats                    3080
             PX subheap                           156344

POOL         NAME                                  BYTES
------------ -------------------------------- ----------
shared pool  PX subheap desc                         256
             parallel kcbibr                          32
             parallel kcbibr dbwr bitv                16
             parallel_max_servers                  14720
             param hash values                     15912
             parameter blocks                      31824
             parameter handle                     137584
             parameter string values              291240
             parameter table block               1050648
             parameter text value                   9400
             parameter value memory                 1416
21 rows selected.
BING @ quanghoo >
From the example above it can be seen that the ‘PX msg pool’, for this database, is in the large pool. Versions of Oracle older than 11.2 (there are still some running) could be using the shared pool; that can cause the “latch: parallel query alloc buffer” wait to rack up exceptionally long wait times as a result of contention for the available resources.
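Whether this latch is actually a problem can be checked from the latch statistics themselves. The following is a simple sketch against v$latch; the LIKE predicate is used because the exact latch name can vary slightly between releases:

select name, gets, misses, sleeps
  from v$latch
 where name like '%parallel query alloc buffer%';

A steadily climbing SLEEPS count for this latch is a good indicator that parallel buffer allocation is a point of contention.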
For RAC configurations the equation changes only slightly, but that does not mean the calculated results change only slightly. Using the same parallel degree settings from the first round of calculations, the differences start small but soon increase dramatically:
4(4+1) * 4
reduces to:
4(5) * 4
and produces a final result of:
20 * 4 = 80
an increase of 33% simply because this is Real Application Clusters. Moving on to a parallel degree of 16 the equation generates:
16(16+1) * 4
resulting in:
16(17) * 4
giving a final value of:
272 * 4 = 1088
This is 272 more buffers as a result of a clustered database configuration.
Engineered systems, from Oracle and from other vendors, make it possible to have a parallel degree set to 128 or higher. Considering parallel execution at a degree of 128 the computed allocation becomes:
128(128+1) * 4
Evaluating the term in parentheses:
128(129) * 4
produces:
16512 * 4 = 66048
66,048 is a large number of message buffers. Even 10 years ago a parallel degree of 128 could have crippled a transactional database and possibly some smaller data warehouses or data marts with waits for the ‘parallel query alloc buffer’ latch. Depending on the overall resource utilization (disk, memory, swap) some smaller engineered systems could still fall victim to this wait.
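On a running system the actual buffer activity can be monitored rather than estimated. As a rough sketch, the parallel execution statistics can be queried; the statistic names shown here come from v$px_process_sysstat and may differ slightly between releases:

select statistic, value
  from v$px_process_sysstat
 where statistic like 'Buffers%';

Comparing the current and high-water-mark buffer figures against the computed estimates gives a sense of how close the system is coming to the calculated allocation.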
If parallel execution plans are rare in a database, this latch wait may be of little or no concern. If there is a reasonable chance that parallel execution will occur, then knowing that parallel buffer allocation can be an issue makes DBA life a bit less surprising and may shorten problem investigation and resolution times. All of which can lead to a bit less stress on the DBA.
# # #