Automatic Workload Repository (AWR) reports were a tremendous improvement over Statspack reports, primarily due to the depth and breadth of the sample data collected. A more detailed report, with extended metrics, provides a better look at what the database is doing during a given interval. Unfortunately this detail can create unusual information, especially in the SQL Elapsed Time section. For ‘normal’ queries (read that as ‘non-parallel’) the elapsed time is usually pretty accurate. Add in parallel query slaves and the overal elapsed time can grow to several times the actual clock time the query or statement consumed. Let’s set up an example, running in parallel, and see what the AWR report provides.
Let’s create a copy of EMP that has a non-0 degree of parallelism:
SQL> CREATE TABLE EMP
2 (EMPNO NUMBER(4) NOT NULL,
3 ENAME VARCHAR2(10),
4 JOB VARCHAR2(9),
5 MGR NUMBER(4),
6 HIREDATE DATE,
7 SAL NUMBER(7, 2),
8 COMM NUMBER(7, 2),
9 DEPTNO NUMBER(2))
10 parallel 4;
Table created.
SQL>
Now let’s make it a fairly large table:
SQL> begin
2 for i in 1..20 loop
3 insert into emp
4 select * from emp;
5 end loop;
6
7 commit;
8
9 end;
10 /
PL/SQL procedure successfully completed.
SQL> select count(*)
2 from emp;
COUNT(*)
----------
14680064
SQL>
Verifying that parallel is being used to process queries:
SQL> select *
2 from emp
3 where empno < 7400;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7369 SMITH CLERK 7902 17-DEC-80 800 20
7369 SMITH CLERK 7902 17-DEC-80 800 20
7369 SMITH CLERK 7902 17-DEC-80 800 20
7369 SMITH CLERK 7902 17-DEC-80 800 20
7369 SMITH CLERK 7902 17-DEC-80 800 20
...
7369 SMITH CLERK 7902 17-DEC-80 800 20
1048576 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2873591275
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1073K| 89M| 6543 (1)| 00:01:19 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM)| :TQ10000 | 1073K| 89M| 6543 (1)| 00:01:19 | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 1073K| 89M| 6543 (1)| 00:01:19 | Q1,00 | PCWC | |
|* 4 | TABLE ACCESS FULL| EMP | 1073K| 89M| 6543 (1)| 00:01:19 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("EMPNO"<7400)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
17 recursive calls
0 db block gets
173077 consistent gets
86341 physical reads
0 redo size
18036545 bytes sent via SQL>*Net to client
769454 bytes received via SQL>*Net from client
69907 SQL>*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1048576 rows processed
SQL>
Now generate an AWR report for the interval in question:
SQL Ordered by Elapsed Time
- Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code
- % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
- %Total – Elapsed Time as a percentage of Total DB time
- %CPU – CPU Time as a percentage of Elapsed Time
- %IO – User I/O Time as a percentage of Elapsed Time
- Captured SQL account for 98.0% of Total DB Time (s): 18
- Captured PL/SQL account for 0.6% of Total DB Time (s): 18
Elapsed Time (s) | Executions | Elapsed Time per Exec (s) | %Total | %CPU | %IO | SQL Id | SQL Module | SQL Text |
---|---|---|---|---|---|---|---|---|
10.36 | 1 | 10.36 | 58.70 | 19.57 | 73.06 | avrt3mgztynzh | SQL*Plus | select /*+ parallel(12) */ * f… |
4.96 | 1 | 4.96 | 28.12 | 96.15 | 1.89 | 572fbaj0fdw2b | sqlplus.exe | select output from table(dbms_… |
0.20 | 1 | 0.20 | 1.11 | 103.08 | 0.00 | bunssq950snhf | insert into wrh$_sga_target_ad… | |
0.15 | 16 | 0.01 | 0.87 | 111.50 | 0.08 | 3s1hh8cvfan6w | SQL*Plus | SELECT PLAN_TABLE_OUTPUT FROM … |
0.15 | 1 | 0.15 | 0.87 | 91.92 | 0.00 | dayq182sk41ks | insert into wrh$_memory_target… | |
0.15 | 1 | 0.15 | 0.86 | 92.16 | 0.00 | bm2pwrpcr8ru6 | select sga_size s, sga_size_fa… | |
0.14 | 16 | 0.01 | 0.82 | 108.24 | 0.08 | g3f3cw3zy5aat | SQL*Plus | SELECT PLAN_TABLE_OUTPUT FROM … |
0.13 | 16 | 0.01 | 0.72 | 122.57 | 0.00 | f0a1c90fu7jtd | SQL*Plus | SELECT /*+ opt_param(‘parallel… |
0.12 | 121 | 0.00 | 0.69 | 12.75 | 0.00 | 772s25v1y0x8k | select shared_pool_size_for_es… | |
0.09 | 1 | 0.09 | 0.52 | 101.63 | 0.00 | 1bx8mgs8by25x | sqlplus.exe | select coalesce( p1s.parameter… |
Notice that in this AWR report the elapsed time for our query is 10.36 seconds. The actual elapsed time, reported when timing is on, and the execution plan is:
Elapsed: 00:04:29.31
Execution Plan
----------------------------------------------------------
Plan hash value: 2873591275
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1073K| 89M| 2181 (1)| 00:00:05 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM)| :TQ10000 | 1073K| 89M| 2181 (1)| 00:00:05 | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 1073K| 89M| 2181 (1)| 00:00:05 | Q1,00 | PCWC | |
|* 4 | TABLE ACCESS FULL| EMP | 1073K| 89M| 2181 (1)| 00:00:05 | Q1,00 | PCWP | |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("EMPNO"<=7400)
So why the discrepancy? Why does AWR report a longer elapsed time than actually occurred? Oracle Support addresses this in MOS note 1956033.1 and reports that this behavior is not a bug. According to Oracle Support the elapsed time (taken from the V$SQL view) includes time for the Query Coordinator and the parallel query slaves operating on that query. In a non-parallel execution such aggregation produces results very close, if not exactly matching, the wall clock time. In this case the elapsed time computed for the AWR report is over twice as long as the actual execution time. This doesn’t mean that there were two parallel query slaves in use; the query was hinted to provide a parallelism of 12. Thus, the sum of the elapsed time for the Query Coordinator and the elapsed time for each of the 12 parallel query slaves adds up to 10.36 seconds. If we set the degree for the EMP table to 1 and run a non-hinted query to get a serial execution time the result is:
Elapsed: 00:02:04.08
Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1073K| 89M| 23573 (1)| 00:04:43 |
|* 1 | TABLE ACCESS FULL| EMP | 1073K| 89M| 23573 (1)| 00:04:43 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("EMPNO"<=7400)
And another AWR report shows:
SQL Ordered by Elapsed Time
- Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code
- % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
- %Total – Elapsed Time as a percentage of Total DB time
- %CPU – CPU Time as a percentage of Elapsed Time
- %IO – User I/O Time as a percentage of Elapsed Time
- Captured SQL account for 97.1% of Total DB Time (s): 3
- Captured PL/SQL account for 37.2% of Total DB Time (s): 3
Elapsed Time (s) | Executions | Elapsed Time per Exec (s) | %Total | %CPU | %IO | SQL Id | SQL Module | SQL Text |
---|---|---|---|---|---|---|---|---|
1.62 | 1 | 1.62 | 48.69 | 93.32 | 2.70 | 19duabgggdwaj | SQL*Plus | select * from emp where empno … |
1.07 | 0 | 32.07 | 65.73 | 18.91 | 1uk5m5qbzj1vt | SQL*Plus | BEGIN dbms_workload_repository… | |
0.22 | 1 | 0.22 | 6.75 | 104.07 | 0.00 | bunssq950snhf | insert into wrh$_sga_target_ad… | |
0.18 | 1 | 0.18 | 5.55 | 75.93 | 0.14 | 32b51bs6x2ux7 | SQL*Plus | EXPLAIN PLAN SET STATEMENT_ID=… |
0.17 | 2 | 0.08 | 4.98 | 47.02 | 14.20 | gpf6csqpdaymp | SQL*Plus | select * From dba_scheduler_jo… |
0.16 | 1 | 0.16 | 4.81 | 97.32 | 0.00 | dayq182sk41ks | insert into wrh$_memory_target… | |
0.16 | 1 | 0.16 | 4.77 | 98.25 | 0.00 | bm2pwrpcr8ru6 | select sga_size s, sga_size_fa… | |
0.10 | 1 | 0.10 | 2.88 | 0.00 | 74.78 | 6ajkhukk78nsr | begin prvt_hdm.auto_execute( :… | |
0.07 | 66 | 0.00 | 2.23 | 0.00 | 96.67 | 5h7w8ykwtb2xt | INSERT INTO SYS.WRI$_ADV_PARAM… | |
0.06 | 1 | 0.06 | 1.94 | 0.00 | 93.15 | f0dtmk7ap93vx | DBMS_SCHEDULER | DECLARE object_type varchar2(4… |
0.06 | 1 | 0.06 | 1.72 | 27.31 | 87.39 | 350myuyx0t1d6 | insert into wrh$_tablespace_st… | |
0.04 | 2 | 0.02 | 1.13 | 83.07 | 1.25 | 3s1hh8cvfan6w | SQL*Plus | SELECT PLAN_TABLE_OUTPUT FROM … |
0.04 | 2 | 0.02 | 1.08 | 87.09 | 1.31 | g3f3cw3zy5aat | SQL*Plus | SELECT PLAN_TABLE_OUTPUT FROM … |
0.04 | 34 | 0.00 | 1.06 | 0.00 | 0.00 | 772s25v1y0x8k | select shared_pool_size_for_es… |
In this report the elapsed time for the serial query (1.62 seconds) closely aligns with the reported elapsed time from SQL*Plus (2.04 seconds).
The question remains: how to get a fairly accurate elapsed time for queries executing in parallel? As demonstrated here using:
set timing on
is one way to get the elapsed time. Another, more accurate way is to generate a 10046 trace for the query in question. The trace file, and the formatted output from tkprof, will provide the correct elapsed time for the query:
select /*+ parallel(12) */ *
from emp
where empno <= 7400
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.03 0 1 0 0
Execute 1 0.00 0.18 0 15 0 0
Fetch 69907 2.23 2.36 0 0 0 1048576
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 69909 2.23 2.58 0 16 0 1048576
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 184
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1048576 1048576 1048576 PX COORDINATOR (cr=15 pr=0 pw=0 time=1086677 us)
0 0 0 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us cost=2181 size=93396936 card=1073528)
0 0 0 PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us cost=2181 size=93396936 card=1073528)
0 0 0 TABLE ACCESS FULL EMP (cr=0 pr=0 pw=0 time=0 us cost=2181 size=93396936 card=1073528)
The elapsed time reported by the 10046 trace is less than that reported by ‘set timing on’ and that’s probably due to the time spent sending output to the display. The actual database elapsed time was under 3 seconds, with the elapsed time reported by SQL*Plue as 4.29 seconds. This is an acceptable discrepancy given that displaying results to a terminal screen takes time. I wouldn’t expect the times to match as two different mechanisms are in use to generate elapsed time between the two tools. They are close enough to know that the AWR report, aggregating time from V$SQL, is reporting inflated figures because of the parallel execution.
Parallel query execution can save time in some cases by using a ‘divide and conquer’ approach to result set processing. It is unfortunate tha the V$SQL performance view contains inflated elapsed times for queries run in parallel. Since Oracle Support does not consider this a bug it’s not going to be changed any time soon. Knowing this makes it easier to understand elapsed times in AWR reports when parallelism is used, and now you have at least two ways to generate fairly accurate elapsed execution times for such queries. It makes it a bit more involved to tune queries using parallism but the effort is worth the time and trouble.