Extending Hierarchical Data Modeling Demonstrated in SQL
June 17, 2010
The capability of extending the limits of combining multiple node hierarchical structures has not been fully explored. Michael M. David presents a solution to advanced structure combining that is simple to use, generic and freely extends the way hierarchical structures can be semantically combined to produce advanced new hierarchical data structure mashups that dynamically increase the value of the data.
Modeling a hierarchical structure is a fairly basic operation where the structure is modeled a hierarchical node at time, usually top to bottom, and building on the hierarchical semantics gradually a node at a time. However, the capability of extending the limits of combining multiple node hierarchical structures has not been fully explored. This is because the extended semantics of combining structures comprised of many nodes in new ways presents problems with determining the semantics and structures of what is considered hierarchically valid. This article will present a solution to advanced structure combining that is simple to use, generic, easily understandable to apply, and freely extends the way hierarchical structures can be semantically combined. This produces advanced new hierarchical data structure mashups that dynamically increase the value of the data.
Background on Combining Hierarchical Structures
The simplest and most semantically understood way to combine two hierarchical structures is to always link to the lower structure at its root from any node in its upper structure. Linking anywhere below the lower level structure root has not been definitively pinned down semantically and if permitted has usually been driven by the internal operation of the particular database. Figure 1 demonstrates one such solution linking to the Z node which is below the root of the lower level structure. This causes the Z node's upper pathway starting at the X node to be bent around the lower structure link point Z node making Z the root node of the lower structure. This changes the semantics and structure of the lower structure in order to combine it without producing hierarchical semantic problems as shown in Figure 1.
The changing of the lower level structure and its semantics shown in the derived structure in Figure 1 is not necessary. Combing data structures should not change the basic structure and semantics of the structures involved as was shown above. Combining structures should maintain their basic structures. Changing the structure and semantics should be left to transformation operations.
A Valid Semantics for Linking Below Root
In Figure 2, the lower level structure's root node X has semantic control over node Z even though the Z node is used as the lower structure's link point. This linking operation should not hierarchically elevate the Z node above the X node. The data link points between the structures being combined are used to match up data relationships between the structures being combined. This is actually a data filtering operation applied at the lower structure's Z node and will filter the entire lower structure hierarchically from that point outward in all path directions.
The lower level structure has its hierarchical structure established before the linking process and should be treated as a single fixed entity regardless of where it is linked to even if it is a logical structure. Since node Z is already affected and controlled by other nodes above it, the hierarchical data modeling link point is the root node of the lower structure as is shown in Figure 2 while the data relationship data filtering occurs between nodes C and Z. This consistently preserves the data semantics and structures.
SQL Testing of Linking Below Root
A previous article in this series, ANSI SQL Hierarchical Data Processing Basics, has described how hierarchical structures can be modeled and hierarchically processed using ANSI SQL. Using the SQL LEFT JOIN to perform this processing, two hierarchical structures are shown defined in Figure 3. Both of these view structures, ABC and XYZ, can be separately processed directly in ANSI SQL and they can also be dynamically combined interactively at runtime using a single LEFT JOIN as in SELECT * FROM ABC LEFT JOIN XYZ FROM C.c=Z.z. This will link the upper level ABC structure to the lower level XYZ structure using the C and Z nodes data relationship as was shown in Figure 2.
The flexible LEFT Outer Join syntax can be used to model logical relational data and physical XML data. Since the different hierarchically modeled data types are both processed using the same LEFT Outer Join syntax they can be heterogeneously and seamlessly processed together automatically. This can be seen below in Figure 4 where view ABC can represent relational data and view XYZ can represent XML hierarchical data or visa versa.
How it Works Automatically in SQL
The act of combining views in SQL is accomplished automatically by joining the view definitions together as shown in Figure 4. This demonstrates how this works internally by showing the SQL statement with its ABC and XYZ views expanded. The ABC and XYZ expanded views are in bold Italics. In order for linking below the root to work properly, the lower XYZ structure must be fully processed (materialized) before joining to the upper structure. When the lower level structure root is reached in the processing of the LEFT JOIN string, it can not be joined at this time to the already completed upper ABC structure because the lower level link point is located further into the incomplete constructed lower structure. The lower level XYZ structure needs to be fully materialized first. This is standard processing for SQL views.
A closer look at the expanded LEFT JOIN processing in Figure 4 demonstrates how this linking below the lower level root was accomplished automatically. The matching ON clause for joining both structures together occurs naturally at the end of the expanded LEFT JOIN string causing both structures to be fully materialized before joining the two structures together. This nicely allows heterogeneous logical and physical structures to operate the same correctly and seamlessly. This follows exactly the semantics described earlier for linking below the root.
A Detailed Look at LEFT Outer Join Processing
In the expanded structure definition box in Figure 4, notice that node X is preceded directly by the LEFT JOIN operation, which was dynamically introduced from the invoking join query shown. This produces the shown expanded syntax containing LEFT JOIN X LEFT JOIN Y ON. This double LEFT JOIN with delayed ON clause syntax is valid. It causes the nesting of the currently active relational working set now containing the completed ABC view with the incomplete introduced LEFT JOIN and the allocation of the new active working set containing the X root node of the XYZ view. This sets the stage for the complete processing of the XYZ view.
The completion of the XYZ view is indicated by the second ON clause of the two back to back ON clauses shown in Figure 4 as ON X.x=Z.z ON C.c=Z.z . This second ON clause un-nests the previous ABC view with its incomplete LEFT JOIN syntax and joins it with the now complete XYZ view using the currently available ending ON clause predicate, which specifies the data relationship link between the two views. Some believe this nesting is controlled by parentheses placed around the views to control the order of execution when working directly at this low level, but it is controlled by the syntax arrangement as shown. Parentheses can be used for external nesting clarity, but have no effect and are ignored.
This semantic process of linking below the root works in general for all types of hierarchical structures and it is not limited to SQL. This significantly increases the value of data by greatly increasing the number of ways hierarchical structures can be combined. This capability is also useful in supporting other hierarchical capabilities like hierarchical data structure transformations and look-ahead capabilities to be described in a future article. Having ANSI SQL automatically perform these hierarchical data structure mashups helps prove this new data modeling operation and capability works correctly. The SQL hierarchical examples in this article can be tested with XML using my company's online interactive ANSI SQL Transparent XML Hierarchical Processor Prototype at www.adatinc.com/demo.html.