Oracle: Choosing and Using the Right Code
November 17, 2006
Sometimes a programming background can make a task harder than it is when you lose sight of what SQL can do independently of PL/SQL. How often do you find yourself using PL/SQL to loop through records where an update for each record depends on some test or condition? That is certainly a valid approach, but not necessarily the best approach each and every time. Lets look at two cases where it is and is not. Our first example problem will be a table requiring one column to be updated depending on the value of another column.
Brute Force PL/SQL Problem Example
In simple terms, the problem is to update a table and set column Y to some value based on a value in column X. Column Xs values can be many. The PL/SQL brute force (meaning, were explicitly evaluating each and every record) approach could go something like whats shown below.
Declare Cursor c is Select rowid update_rowid, col_x From my_table; Begin For r in c loop If r.col_x = 'A' then Update my_table Set col_y = 1 Where rowid = r.update_rowid; Elsif r.col_x = 'B' then Update my_table Set col_y = 2 Where rowid = r.update_rowid; End if; End loop; End; /
There is a lot of unnecessary code in the above PL/SQL block. The UPDATE statements are repeated and if there were many variations, then there would be that many more lines of code to maintain. The concept of lines of code or LOC is mentioned frequently in software engineering texts, more often as KLOC for thousand lines of code. What is your companys cost per KLOC? Predictive models attempt to provide a good cost estimate of a development project based on the KLOC.
Do you really need to loop through each record to perform a test before updating my_table? Obviously not, and in fact, PL/SQL is not necessary at all in this case. Why not take advantage of the CASE option for an update? The PL/SQL block can be boiled down to a more elegant update using a CASE construct.
Update my_table Set col_y = Case When col_x = 'A' then 1 When col_x = 'B' then 2 End;
This is obviously much simpler to code and can be further reduced to a few lines of code. Of course, you should consider what happens when col_x is not A or B, that is, what else should happen? In addition to tests for specific values, what if the update condition is based on a range? Lets suppose the range brackets are 0-2, 3-20, and 21 or more (which get coded values of 1, 2 or 3). How would you incorporate the ranges in a CASE-type of update statement?
Note the switch between defaulting to 3 (on the left) to 2. Each does the same thing, but the code on the right is a bit simpler. If the range is 0 to 2, 3 to 20 and so on, you don't need to evaluate in that order (0-2, then 3-20, etc.). You can test or evaluate the range brackets in any order you want (e.g., start with 0-2, then have the next test be for greater than 20, and then what's left, which is 3-20). Although the WHENs are number to number comparisons, you are not locked into using the same datatypes. Oracle supports a searched and a simple CASE, and the simple CASE is what you may run across where CASE is being used to perform the same function as a DECODE. Overall, a pure SQL approach is the best way to perform this type of conditional update.
Putting ROWID to work
Coming back to the PL/SQL block, note the use of the ROWID pseudocolumn. ROWID is aliased to UPDATE_ROWID (or, going for the shorthand approach, UPD_ROWID). The ROWID pseudocolumn is specific to Oracle, whereas CASE is ANSI compliant. You could select the primary key from the table or make the cursors select statement include for update and then use where current of c to perform the update on the current record.
Next up is a situation where ROWID can be very useful when used for update purposes in a PL/SQL block. If you have to join two unrelated tables (e.g., add a survey PIN or code column to another table), you can zipper them together by joining on the ROWNUM and include the base tables ROWID (stored as a column) in the join table. The idea is that you join up the relevant columns between the two tables, and then do an update back on the main table using the stored ROWID. Remember, ROWID is guaranteed to be unique within a table, but not across tables.
Shown below is an example of the code to join uncorrelated data to another table. There are three tables involved: the source table for PINs, the main table (the one having PINs inserted/assigned to records), and a join table (used to correlate the main tables ROWID and a PIN).
1. Create the PIN table and populate it (CTAS, SQLLDR, etc.).
2. Create the join table (jc means join condition).
create table my_join_table as select * from (select rowid main_rowid, rownum jc_a from main_table <where optional conditions>) a, (select rowid pin_rowid, pin, rownum jc_b from pin_table) b where a.jc_a = b.jc_b;
If you need to keep track of assigned PINs, add a used column on one of two tables. The code above assumes were going to update the PIN table, not the join table.
alter table pin (or my_join) add (used char default 'n');
3. Perform the update(s).
set serveroutput on declare v_counter number := 0; --optional cursor c is select main_rowid, pin_rowid, pin from my_join_table; begin for r in c loop update main_table set pin = r.pin where rowid = r.main_rowid; --optional, if PIN usage needs to be tracked update pin_table set used = 'y' where rowid = r.pin_rowid; --optional counter/commit to check progress --query the pin table by used, count(*) v_counter := v_counter + 1; if mod(v_counter,10000)=0 then commit; end if; end loop; commit; dbms_output.put_line('Rows updated '||v_counter); end; /
Updating (or performing other DML) based on a records ROWID is going to be quite fast and can be done without the overhead of an index. Alternatively, if the main table were primary keyed, then the join table could have been created by selecting the PK value from the main table.
One tip you may want to incorporate in day-to-day ad hoc coding is to use a shorthand notation for the cursor and record names. For a simple block, c and r are sufficient to identify the cursor and record names. How often do you want to type my_cur and my_rec when you can accomplish the same thing with one-sixth the characters?
Take advantage of Oracles improvements in SQL, where improvements mean being more ANSI compliant. If something can be done in SQL, it can almost always be done within PL/SQL. You pay a price in performance for context switching (going back and forth between SQL and PL/SQL), so if you can perform an operation in SQL, stay there unless there is a specific need to switch over. On the other hand, if you are certain your application code is always going to be within Oracles realm, then take advantage of features such as the ROWID pseudocolumn.