CONNECTing With Your Inner Hierarchy: Hierarchical Query Enhancements
Oracle already provides elegant features for traversing and reporting on the contents of hierarchical data structures. Some of the more common hierarchical structures that I have encountered and needed to parse include:
- Corporate organization structures, especially the relationships of employees to their supervisors
- Part-subpart relationships, e.g., components made up of subcomponents that are themselves made up of subcomponents
- Geopolitical boundaries that describe relationships between countries, states, counties, townships, and cities
- Taxing structures that are usually intimately connected to geopolitical boundaries
Thankfully, Oracle offers the CONNECT BY operator for connecting together elements of a hierarchy in a query. I will utilize a common example with which we are all familiar -- a company's organization chart -- to illustrate:
-- A traditional hierarchical query (before Oracle 10g)
COL mgr_id FORMAT 99999 HEADING 'MgrID'
COL mgr_name FORMAT A24 HEADING 'Manager Name'
COL level FORMAT 999 HEADING 'Lvl'
COL emp_id FORMAT 99999 HEADING 'EmpID'
COL emp_name FORMAT A24 HEADING 'Employee Name'
,(M.last_name || ', ' || M.first_name) mgr_name
,E.last_name || ', ' || E.first_name emp_name
WHERE E.manager_id = M.employee_id
START WITH E.manager_id = 101
CONNECT BY PRIOR E.employee_id = E.manager_id
ORDER BY SIBLINGS e.manager_id, e.employee_id;
MgrID Manager Name Lvl EmpID Employee Name
------ ------------------------ ---- ------ ------------------------
101 Kochhar, Neena 1 108 Greenberg, Nancy
101 Kochhar, Neena 1 200 Whalen, Jennifer
101 Kochhar, Neena 1 203 Mavris, Susan
101 Kochhar, Neena 1 204 Baer, Hermann
101 Kochhar, Neena 1 205 Higgins, Shelley
108 Greenberg, Nancy 2 109 Faviet, Daniel
108 Greenberg, Nancy 2 110 Chen, John
108 Greenberg, Nancy 2 111 Sciarra, Ismael
108 Greenberg, Nancy 2 112 Urman, Jose Manuel
108 Greenberg, Nancy 2 113 Popp, Luis
205 Higgins, Shelley 2 206 Gietz, William
11 rows selected.
Note that I used the PRIOR operator of the CONNECT BY operator to tell Oracle to link up each employee to his or her manager based on the value contained in each row's MANAGER_ID column. In this case, it interrogates the expression that follows immediately after the PRIOR operator for the parent row of the current row. I also used the optional START WITH operator to direct Oracle to start querying the hierarchy at employee #101 (Neena Kochar) and to show only those employees that report up to that employee. Finally, I employed the LEVEL pseudo-column to show which level in the hierarchy each employee belongs to.
As powerful as these methods are, Oracle 10g has expanded upon them considerably:
Traversing To the Top Node in a Hierarchy. Oracle 10g's new CONNECT_BY_ROOT unary operator makes it easy to gather information directly from the "root" or top node in a hierarchy for any child row. In addition, another new function, SYS_CONNECT_BY_ROOT, traverses the hierarchy from the child up to and including the root node in the hierarchy. This function will then return a list of values separated by the character string chosen as a delimiter.
Listing 1.2 illustrates two queries: one whose root node starts at the top of the hierarchy, and one whose root node starts at a lower point in the hierarchy based on the value supplied to the START WITH operator.
Establishing Child vs. Parent Status. Oracle 10g also provides a new pseudo-column, CONNECT_BY_ISLEAF, that establishes whether a returned row itself has additional children in the hierarchy, or is at the top of the hierarchy. If a row has additional children, it will return a value of zero (0) for this pseudo-column; otherwise, if the row has no further descendant nodes, it will return a value of one (1). Listing 1.3 shows how to utilize this new pseudo-column to find all rows that have no children.
CONNECT_BY_ISCYCLE: Handling Recursion Within Hierarchies. Most hierarchies have a simple tree-like node structure that can be traversed from the "root" entry or entries down through the parent to child relationships. However, in some cases, it is possible that a hierarchical relationship may "loop back" upon itself. This may occur as the result of a data entry error, or it may even be intentional.
To illustrate, I will update the MANAGER_ID column value for a few employees so that a recursion is introduced. Note that I am making Employee #902 the parent of Employee #901, and further disrupting the hierarchy by making Employee #903 the parent of Employee #902:
SET manager_id = 902
WHERE employee_id = 901;
SET manager_id = 903
WHERE employee_id = 902;
Now when I attempt to run a query against the EMPLOYEES table, Oracle will detect the "loop-back" condition and return an error:
,(last_name || ', ' || first_name)
,CONNECT_BY_ROOT last_name "Manager"
CONNECT BY PRIOR employee_id = manager_id
ORDER BY SIBLINGS employee_id;
ORA-01436: CONNECT BY loop in user data
To detect this recursion, I can use the new CONNECT_BY_ISCYCLE pseudo-column. This introduces a powerful, elegant method for detecting unwanted recursion, and would be especially useful as part of an AFTER statement trigger for data validation of hierarchies. Note that CONNECT_BY_ISCYCLE can only be queried when the new NOCYCLE directive of the CONNECT BY operator is employed. When a loop-back condition is detected, CONNECT_BY_ISCYCLE will return a value of one (1); otherwise, it will return a zero (0). The query in Listing 1.4 returns the entries that are participating in the recursion.
New Nested Table Functions: COLLECT, CARDINALITY, and SET
Over the past year, I have spent a lot of my spare time investigating, experimenting with, and transforming existing PL/SQL code to use Oracle's implementation of PL/SQL collections. I have found that they are an excellent alternative to global temporary tables for temporarily storing and manipulating moderate amounts of data during complex processing (please see my prior article on creating table functions that return PL/SQL collections.)
Encouraged by my progress with collections, I have now progressed to some initial implementations of nested tables. I have found them extremely useful for storing multiple entries of complex data in a single column in a table. The good news is that Oracle 10g has expanded support for powerful capabilities by adding some new functions for querying, manipulating, converting, and translating nested tables:
COLLECT. This new function is designed to transform values from any column selected from a table in a SELECT statement to a collection in the form of a nested table. COLLECT must be used within a call to the CAST function; it accepts a column from any type of table as input, and then returns a nested table of the input type from the rows selected. If the COLLECT function is applied against a column that is itself a collection, then it will return a nested table of collections.
CARDINALITY. On the other hand, the CARDINALITY function simply returns the number of elements present in a table's nested table column. This is extremely useful when determining the amount of data stored in a nested table column, or for controlling a FOR loop. If the nested table column is empty, the CARDINALITY function simply returns a NULL value.
SET. Unfortunately, there is no DISTINCT operator that can be applied against nested table values to compress them down to only unique entries. However, the new SET function accepts the value(s) in an existing nested table as input and returns a set of non-duplicate values with the same TYPE as the nested table itself.
I will illustrate these three handy and powerful new features with a simple but realistic business scenario: the need to capture and query telephone numbers for every employee in a company. I will first create a traditional table named HR.TELEPHONE_NBRS to store this information, and then I will create a simple TYPE and a second table named HR.LOCATION_COMM_IDS to store each location's telephone numbers in a nested table column. Listing 1.5 shows the statements to create these new tables, create the new TYPE, and then load the tables with sample data.
I will then query the traditional table's column with COLLECT to return a collection, apply the CARDINALITY function to count the number of entries for each of the nested table columns in the second table, and engage the SET function to return only the unique phone number values stored in the nested table column. See Listing 1.6 for the SQL queries and sample output.
Oracle 10g has significantly improved hierarchical queries, expanded the utility of the MERGE statement for data maintenance, and upgraded the capabilities for querying nested tables. These new features and enhancements to Structured Query Language make an already-robust database query and data manipulation engine even more powerful and make a positive impact on any Oracle DBA's most mundane tasks.
The next article in this series will concentrate on the enhanced analytical reporting capabilities that Oracle 10g provides via its new partitioned outer join and data densification features.
References and Additional Reading
While there is no substitute for direct experience, reading the manual is not a bad idea, either. I have drawn upon the following Oracle 10g documentation for the deeper technical details of this article:
B10750-01 Oracle Database New Features Guide
B10759-01 SQL Reference
» See All Articles by Columnist Jim Czuprynski