Posted December 14, 2017

Oracle's Bulk Collect and Limit Can Be A Problem

By David Fitzjarrell

For several years the FORALL statement, used in conjunction with the BULK COLLECT operation, has allowed PL/SQL developers to process data more efficiently by replacing the row-by-row fetch cycle with bulk loads into memory. Such a change can speed processing up considerably, but it can also throw a 'monkey wrench' into the works when the LIMIT doesn't evenly divide the row count. What makes it more frustrating is that the same code sometimes works and sometimes fails, all without a single character being modified. Let's look at why this can happen and what can be done to fix it.
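
A common form of that pattern is sketched below. The table names (src_table, dest_table) are purely illustrative and are not part of the example developed later in this article; note, too, that the exit condition shown here is the customary one, and it is exactly the piece that causes the trouble described below.

declare
    type src_tab_t is table of src_table%rowtype;
    v_rows src_tab_t;

    cursor c_src is
    select * from src_table;
begin
    open c_src;
    loop
        -- fetch up to 100 rows at a time instead of one row per fetch
        fetch c_src bulk collect into v_rows limit 100;
        exit when c_src%notfound;   -- the customary exit, and the very problem this article describes

        -- FORALL binds the whole batch of inserts to the SQL engine in one round trip
        forall i in 1..v_rows.count
            insert into dest_table values v_rows(i);
    end loop;
    close c_src;

    commit;
end;
/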

BULK COLLECT basically does two things: it fetches 'chunks' of data into one or more collections of records, and it keeps track, in a rudimentary way, of how much data remains by setting %NOTFOUND once a fetch can no longer return a full set of rows. When the row count in the result set isn't an even multiple of the chosen LIMIT, that final short fetch is a partial fetch: it still populates the collection, so the data is there at the time it is fetched, but it also sets %NOTFOUND for the cursor. The problem is that a loop exiting on %NOTFOUND exits before that last, partially filled collection is processed, so the rows from the partial fetch never get handled. Let's diagram (in a way) that behavior:

Fetch 1, limit of 12, result set count of 37, loop still runs:

'A'
'B'
'C'
'D'
'E'
'F'
'G'
'H'
'I'
'J'
'K'
'L'

25 rows remain.  Fetch 2, same conditions, loop still runs:

'M'
'N'
'O'
'P'
'Q'
'R'
'S'
'T'
'U'
'V'
'W'
'X'

13 rows remain.  Fetch 3, same conditions, loop still runs:

'Y'
'Z'
'AA'
'AB'
'AC'
'AD'
'AE'
'AF'
'AG'
'AH'
'AI'
'AJ'

1 row remains.  Fetch 4, same conditions, returns the final row but falls short of the LIMIT, so %NOTFOUND is set and the loop terminates before that last row can be processed:

'AK'
[%NOTFOUND is set -- loop exits, 'AK' is never processed]
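
One quick way to observe this behavior directly is to print the collection count and the cursor status after every fetch. The sketch below is illustrative only and assumes the 14-row EMP table used in the example that follows (remember SET SERVEROUTPUT ON in SQL*Plus to see the output):

declare
    type empcoltyp is table of emp%rowtype;
    emp_c empcoltyp;

    cursor get_emp_data is
    select * from emp;
begin
    open get_emp_data;
    loop
        fetch get_emp_data bulk collect into emp_c limit 9;
        -- report what the fetch returned and what the cursor status now says
        dbms_output.put_line('fetched '||emp_c.count||' rows, %NOTFOUND = '||
                             case when get_emp_data%notfound then 'TRUE' else 'FALSE' end);
        exit when emp_c.count = 0;
    end loop;
    close get_emp_data;
end;
/

With 14 rows and a LIMIT of 9 this should report a full fetch of 9 rows with %NOTFOUND = FALSE, then a partial fetch of 5 rows with %NOTFOUND = TRUE (the data is there, but the status already says 'done'), and finally an empty fetch of 0 rows.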

If that's still confusing, the full example below should clear things up. Let's take the ubiquitous EMP table, with 14 rows of data, and try to copy it to a table named EMP_TEST, an exact copy of EMP in definition but completely empty:


SQL> 
SQL> --
SQL> -- Create empty table just like EMP
SQL> --
SQL> -- We'll populate this with a FORALL loop
SQL> -- using BULK COLLECT and LIMIT
SQL> --
SQL> -- Well, we'll try to but the first attempt will fail
SQL> -- to completely populate the table
SQL> --
SQL> create table emp_test as select * From emp where 0=1;

Table created.

SQL> 

Let's now set up an example that will definitely fail to completely populate EMP_TEST; we'll exit the loop on the %NOTFOUND condition and use a LIMIT that will leave records behind:


SQL> 
SQL> --
SQL> -- Populate the table
SQL> --
SQL> -- Exit the loop in the usual way
SQL> --
SQL> -- Given the LIMIT value this won't get
SQL> -- all of the records inserted
SQL> --
SQL> declare
  2  	     type empcoltyp is table of emp%rowtype;
  3  	     emp_c empcoltyp;
  4  
  5  	     cursor get_emp_data is
  6  	     select * from emp;
  7  
  8  begin
  9  	     open get_emp_data;
 10  	     loop
 11  	     fetch get_emp_data bulk collect into emp_c limit 9;
 12  	     exit when get_emp_data%notfound;
 13  
 14  	     for i in 1..emp_c.count loop
 15  		     insert into emp_test (empno, ename, sal)
 16  		     values (emp_c(i).empno, emp_c(i).ename, emp_c(i).sal);
 17  	     end loop;
 18  
 19  	     end loop;
 20  
 21  	     commit;
 22  
 23  end;
 24  /

PL/SQL procedure successfully completed.

SQL> 
SQL> --
SQL> -- Verify that the data was inserted
SQL> --
SQL> -- The count from emp_test should match the count
SQL> -- from emp
SQL> --
SQL> -- It doesn't
SQL> --
SQL> select count(*) from emp;

  COUNT(*)
----------
        14

SQL> select count(*) from emp_test;

  COUNT(*)
----------
         9

SQL> 
SQL> --
SQL> -- Let's see what's missing
SQL> --
SQL> select * From emp where empno not in (select empno from emp_test);

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30

SQL> 
SQL> --
SQL> -- Seems that we are 5 rows short
SQL> --
SQL> -- Unfortunately NOTFOUND is set on the incomplete fetch
SQL> -- from emp (the fetch that returned less than LIMIT)
SQL> -- so the loop exited before processing the remaining data
SQL> --
SQL> -- Ooooops
SQL> --
SQL> 

Had there been 18 rows, or had the LIMIT been 7, the insert would have completed, but that would have provided a false sense of security, since the next BULK COLLECT/LIMIT process could fail whenever the row source isn't an even multiple of the LIMIT. The proper fix is to use the collection's COUNT attribute and exit when it is 0, indicating that all of the fetched records have been processed. That requires one more pass through the loop (a final, empty fetch), but it doesn't add any appreciable time to the execution. Rewriting the process to check the collection count, rather than %NOTFOUND, populates the EMP_TEST table completely:


SQL> 
SQL> --
SQL> -- Let's try this again
SQL> --
SQL> truncate table emp_test;

Table truncated.

SQL> 
SQL> --
SQL> -- Change the exit strategy to look for a 0 count
SQL> -- for the populated variable
SQL> --
SQL> -- This will get all of the rows from EMP into EMP_TEST
SQL> --
SQL> declare
  2  	     type empcoltyp is table of emp%rowtype;
  3  	     emp_c empcoltyp;
  4  
  5  	     cursor get_emp_data is
  6  	     select * from emp;
  7  
  8  begin
  9  	     open get_emp_data;
 10  	     loop
 11  	     fetch get_emp_data bulk collect into emp_c limit 9;
 12  	     exit when emp_c.count = 0;
 13  
 14  	     for i in 1..emp_c.count loop
 15  		     insert into emp_test (empno, ename, sal)
 16  		     values (emp_c(i).empno, emp_c(i).ename, emp_c(i).sal);
 17  	     end loop;
 18  
 19  	     end loop;
 20  
 21  	     commit;
 22  
 23  end;
 24  /

PL/SQL procedure successfully completed.

SQL> 
SQL> --
SQL> -- Check counts to ensure all data was inserted
SQL> --
SQL> select count(*) from emp;

  COUNT(*)
----------
        14

SQL> select count(*) from emp_test;

  COUNT(*)
----------
        14

SQL> 
SQL> --
SQL> -- Let's see what's missing
SQL> --
SQL> -- This time there won't be any rows returned
SQL> --
SQL> select * From emp where empno not in (select empno from emp_test);

no rows selected

SQL> 

This problem can be frustrating to track down without knowing how a LIMIT-based BULK COLLECT fetch sets the cursor status. Not all loops behave the same way: a plain cursor FOR loop fetches records one at a time, so when %NOTFOUND is set the loop really is at the end of the data. A BULK COLLECT/LIMIT fetch sets %NOTFOUND whenever it can't fill a complete batch, and that can happen on a partial fetch (where the fetch count is less than the LIMIT value). Exiting on %NOTFOUND therefore skips the rows returned by that partial fetch, leaving the result set incompletely processed. And since the behavior depends on both the LIMIT and the total number of rows in the result set, the same code can work just as often as it fails.
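
For completeness, the same count-based exit works just as well when the per-row insert loop is replaced with a true FORALL bulk insert; the sketch below is one way it might look, again using the EMP and EMP_TEST tables from the examples above:

declare
    type empcoltyp is table of emp%rowtype;
    emp_c empcoltyp;

    cursor get_emp_data is
    select * from emp;
begin
    open get_emp_data;
    loop
        fetch get_emp_data bulk collect into emp_c limit 9;
        exit when emp_c.count = 0;   -- exit on the empty fetch, not on %NOTFOUND

        -- bulk-bind the entire batch of inserts in one trip to the SQL engine
        forall i in 1..emp_c.count
            insert into emp_test values emp_c(i);
    end loop;
    close get_emp_data;

    commit;
end;
/

Because EMP_TEST was created as an exact copy of EMP, the whole record can be inserted with VALUES emp_c(i); the earlier examples inserted only the EMPNO, ENAME and SAL columns, so the two versions are not byte-for-byte identical.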

It's been said that knowledge can be a dangerous thing, but not in the database world since more knowledge means better understanding. Understanding how a BULK COLLECT fetch sets %NOTFOUND when a LIMIT is set can make a huge difference in how such a loop exit is written. And that knowledge can help ensure that records don't go mysteriously 'missing'.

See all articles by David Fitzjarrell


