In a forum I frequent the following question was raised:
“We see some sqls that have been running for a long time in our prod DB, 11.2. when we check the v$session, i see null sql ids for them. The sql_address shows up as 00. I would like to know what these sqls are doing because they correspond to processes that are burning up the CPU and driving the usage to > 85%.”
An interesting question, indeed. What are those NULL sql_ids doing, and are they actually NULL? To find out I set up an experiment using both a long series of insert statements and a PL/SQL block fetching 100,000 rows at a time. The results were surprising; let’s look at the example code and see what happened when it ran. Here is the code:
set echo on serveroutput on size 1000000
create table emp (empid number not null,
empname varchar2(40),
deptno number);
set timing on
begin
for i in 1..5000000 loop
insert into emp
values (i, 'Smorthorper'||i, mod(i, 40)+1);
end loop;
commit;
end;
/
set timing off
exec dbms_stats.gather_schema_stats('BING')
set timing on
declare
type empcoltyp is table of emp%rowtype;
emp_c empcoltyp;
recct number:=0;
cursor get_emp_data is
select * from emp;
begin
open get_emp_data;
loop
fetch get_emp_data bulk collect into emp_c limit 100000;
exit when get_emp_data%notfound;
for i in 1..emp_c.count loop
recct:=recct+1;
end loop;
dbms_output.put_line('Total records processed: '||recct);
dbms_lock.sleep(3);
end loop;
end;
/
set timing off
It’s a fairly simple script, really. I used the bulk collect and sleep to create a long-running fetch cycle to see if that displayed the observed behavior. Now, let’s run the script and see how the sql_id column in V$SESSION behaved. I used two sessions, one running the script and one to check the sql_id values from V$SESSION Session 1 generated the following output for the long-running insert:
SQL> create table emp (empid number not null,
2 empname varchar2(40),
3 deptno number);
Table created.
SQL>
SQL> set timing on
SQL>
SQL> begin
2 for i in 1..5000000 loop
3 insert into emp
4 values (i, 'Smorthorper'||i, mod(i, 40)+1);
5 end loop;
6
7 commit;
8 end;
9 /
PL/SQL procedure successfully completed.
Elapsed: 00:02:43.27
SQL>
SQL> set timing off
Session 2 reported the following behavior for the sql_id column:
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c gc37k1290dm1c
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
782 8469 gc37k1290dm1c gc37k1290dm1c
SQL>
Notice that occasionally the sql_id becomes NULL then returns to its previous value, as reported in prev_sql_id. When long-running statements are being processed it’s possible that the sql_id column can go NULL, and the sql_address be reported as 00, even though the same statement that was started minutes or even hours ago is still running. In such cases the prev_sql_id holds the sql_id for the currently running SQL. This doesn’t hold true for PL/SQL blocks, even with a sleep added. Session 1 reports the PLSQL block and it’s apparent progress::
SQL> set timing on
SQL> declare
2 type empcoltyp is table of emp%rowtype;
3 emp_c empcoltyp;
4 recct number:=0;
5
6 cursor get_emp_data is
7 select * from emp;
8
9 begin
10 open get_emp_data;
11 loop
12 fetch get_emp_data bulk collect into emp_c limit 100000;
13 exit when get_emp_data%notfound;
14
15 for i in 1..emp_c.count loop
16 recct:=recct+1;
17 end loop;
18 dbms_output.put_line('Total records processed: '||recct);
19 dbms_lock.sleep(3);
20 end loop;
21 end;
22 /
Total records processed: 100000
Total records processed: 200000
Total records processed: 300000
Total records processed: 400000
Total records processed: 500000
Total records processed: 600000
Total records processed: 700000
Total records processed: 800000
Total records processed: 900000
Total records processed: 1000000
Total records processed: 1100000
Total records processed: 1200000
Total records processed: 1300000
Total records processed: 1400000
Total records processed: 1500000
Total records processed: 1600000
Total records processed: 1700000
Total records processed: 1800000
Total records processed: 1900000
Total records processed: 2000000
Total records processed: 2100000
Total records processed: 2200000
Total records processed: 2300000
Total records processed: 2400000
Total records processed: 2500000
Total records processed: 2600000
Total records processed: 2700000
Total records processed: 2800000
Total records processed: 2900000
Total records processed: 3000000
Total records processed: 3100000
Total records processed: 3200000
Total records processed: 3300000
Total records processed: 3400000
Total records processed: 3500000
Total records processed: 3600000
Total records processed: 3700000
Total records processed: 3800000
Total records processed: 3900000
Total records processed: 4000000
Total records processed: 4100000
Total records processed: 4200000
Total records processed: 4300000
Total records processed: 4400000
Total records processed: 4500000
Total records processed: 4600000
Total records processed: 4700000
Total records processed: 4800000
Total records processed: 4900000
Total records processed: 5000000
PL/SQL procedure successfully completed.
Elapsed: 00:02:34.16
SQL>
SQL> set timing off
Session 2 again is used to display the sql_id throughout the PL/SQL run; notice the sql_id does not go NULL at any time during the run [the output has been abbreviated to reduce the large number of repeated lines]:
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
717 6367 gjcn0r825u7wz 9babjv8yq8ru3
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
717 6367 gjcn0r825u7wz 9babjv8yq8ru3
...
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
717 6367 gjcn0r825u7wz 9babjv8yq8ru3
...
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
717 6367 gjcn0r825u7wz 9babjv8yq8ru3
...
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
717 6367 gjcn0r825u7wz 9babjv8yq8ru3
SQL> /
SID SERIAL# SQL_ID PREV_SQL_ID
----------- ----------- ------------- -------------
717 6367 gjcn0r825u7wz 9babjv8yq8ru3
SQL>
At no time did the sql_id display as NULL during the PL/SQL execution. I suspect it’s because the repeated fetch operations, limited to 1/50th of the total data, kept the cursor and its associated SQL statement, active the entire duration of the run.
It would appear (and I think I have a valid test case as proof) that during long-running SQL statements the sql_id can be reported as NULL even though the statement is still executing. In such cases the prev_sql_id usually provides the identifier for the currently running SQL. Of course not all NULL sql_ids are associated with long-running statements and V$SESSION_LONGOPS does not, by default, report every long-running statement or operation to hit a database. The ‘long-running’ criteria according to Oracle is any statement that runs for more than 6 seconds. A partial list of operations that are recorded follows:
•Table scan
•Index Fast Full Scan
•Hash join
•Sort/Merge
•Sort Output
•Rollback
•Gather Table Index Statistics
NESTED LOOP JOINS are not automatically recorded in V$SESSION_LONGOPS although HASH JOINS are. It is possible to circumvent this by using the DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS procedure to register your long-running operation with the view:
declare
type empcoltyp is table of emp%rowtype;
emp_c empcoltyp;
recct number:=0;
r_idx BINARY_INTEGER;
l_no BINARY_INTEGER;
ttlwrk number;
so_far number;
obj BINARY_INTEGER;
cursor get_emp_data is
select * from emp;
begin
r_idx := dbms_application_info.set_session_longops_nohint;
so_far := 0;
ttlwrk := 50;
select object_id
into obj
from user_objects
where object_name = 'EMP';
open get_emp_data;
loop
fetch get_emp_data bulk collect into emp_c limit 100000;
exit when get_emp_data%notfound;
while so_far < 50 loop
for i in 1..emp_c.count loop
recct:=recct+1;
end loop;
dbms_output.put_line('Total records processed: '||recct);
dbms_lock.sleep(3);
so_far := so_far + 1;
dbms_application_info.set_session_longops(r_idx, l_no, 'Bounded fetch ', obj, 0, so_far, ttlwrk, 'EMP', 'fetches');
end loop;
end loop;
end;
/
V$SESSION_LONGOPS now reports progress for the bulk collect fetch operations:
SQL> select message from v$session_longops where sofar <> totalwork;
MESSAGE
-----------------------------------------------------------------
Bounded fetch : EMP 113315: 1 out of 50 fetches done
SQL> /
MESSAGE
-----------------------------------------------------------------
Bounded fetch : EMP 113315: 2 out of 50 fetches done
SQL> /
MESSAGE
-----------------------------------------------------------------
Bounded fetch : EMP 113315: 3 out of 50 fetches done
SQL>
...
Oracle can exhibit ‘strange’ behavior with the sql_id values in V$SESSION, but just because the sql_id is NULL that doesn’t mean Oracle isn’t processing a long-running statement. It’s likely processing the same SQL reported in prev_sql_id so you can look to that column to find (possibly) active SQL. As a side note you can also get V$SESSION_LONGOPS to report on any operation you consider to be long using the example code as a guide. Obviously you can’t get V$SESSION_LONGOPS to report on ‘canned’ application operations if they don’t meet the reporting criteria (that would require the vendor to modify their code, not an easy task to accomplish). The idea to remember is that the sql_id for some long-running operations isn’t lost when the sql_id value goes NULL; it’s just been moved to the prev_sql_id column.