Extending Hierarchical Data Modeling Demonstrated in SQL


The capability of extending the limits of combining multiple node hierarchical structures has not been fully explored. Michael M. David presents a solution to advanced structure combining that is simple to use, generic and freely extends the way hierarchical structures can be semantically combined to produce advanced new hierarchical data structure mashups that dynamically increase the value of the data.

Modeling a hierarchical structure is a fairly basic
operation where the structure is modeled a hierarchical node at time, usually
top to bottom, and building on the hierarchical semantics gradually a node at a
time. However, the capability of extending the limits of combining multiple node hierarchical
structures has not been fully explored. This is because the extended semantics
of combining structures comprised of many nodes in new ways presents problems with
determining the semantics and structures of what is considered hierarchically valid.
This article will present a solution to advanced structure combining that is
simple to use, generic, easily understandable to apply, and freely extends the
way hierarchical structures can be semantically combined. This produces advanced
new hierarchical data structure mashups that dynamically increase the value of
the data.

Background on Combining Hierarchical Structures

The simplest and most semantically understood way to combine
two hierarchical structures is to always link to the lower structure at its
root from any node in its upper structure. Linking anywhere below the lower
level structure root has not been definitively pinned down semantically and if
permitted has usually been driven by the internal operation of the particular
database. Figure 1 demonstrates one such solution linking to the Z node which
is below the root of the lower level structure. This causes the Z node’s upper
pathway starting at the X node to be bent around the lower structure link point
Z node making Z the root node of the lower structure. This changes the
semantics and structure of the lower structure in order to combine it without
producing hierarchical semantic problems as shown in Figure 1.

Linking below root

The changing of the lower level structure and its semantics
shown in the derived structure in Figure 1 is not necessary. Combing data
structures should not change the basic structure and semantics of the
structures involved as was shown above. Combining structures should maintain
their basic structures. Changing the structure and semantics should be left to
transformation operations.

A Valid Semantics for Linking Below Root

In Figure 2, the lower level structure’s root node X has semantic
control over node Z even though the Z node is used as the lower structure’s
link point. This linking operation should not hierarchically elevate the Z node
above the X node. The data link points between the structures being combined are
used to match up data relationships between the structures being combined. This
is actually a data filtering operation applied at the lower structure’s Z node
and will filter the entire lower structure hierarchically from that point
outward in all path directions.

The lower level structure has its hierarchical structure
established before the linking process and should be treated as a single fixed entity
regardless of where it is linked to even if it is a logical structure. Since
node Z is already affected and controlled by other nodes above it, the
hierarchical data modeling link point is the root node of the lower structure as
is shown in Figure 2 while the data relationship data filtering occurs between
nodes C and Z. This consistently preserves the data semantics and structures.

Linking below root diagram

This new data modeling capability can be demonstrated in ANSI SQL when SQL is performing
hierarchically and its operation follows the hierarchical data modeling
semantics of joining below the root as described above. This SQL capability adds
further to the validity for this methodology of joining below the root,
allowing flexibility in how the structures are related and their predictable
combined structure.

SQL Testing of Linking Below Root

A previous article in this series,
ANSI
SQL Hierarchical Data Processing Basics
, has described how
hierarchical structures can be modeled and hierarchically processed using ANSI
SQL. Using the SQL LEFT JOIN to perform this processing, two hierarchical
structures are shown defined in Figure 3. Both of these view structures, ABC
and XYZ, can be separately processed directly in ANSI SQL and they can also be
dynamically combined interactively at runtime using a single LEFT JOIN as in SELECT
* FROM ABC LEFT JOIN XYZ FROM C.c=Z.z
.
This will link the upper level
ABC structure to the lower level XYZ structure using the C and Z nodes data relationship
as was shown in Figure 2.

Separate structure definitions

The flexible LEFT Outer Join syntax can be used to model
logical relational data and physical XML data. Since the different hierarchically
modeled data types are both processed using the same LEFT Outer Join syntax
they can be heterogeneously and seamlessly processed together automatically.
This can be seen below in Figure 4 where view ABC can represent relational data
and view XYZ can represent XML hierarchical data or visa versa.

How it Works Automatically in SQL

The act of combining views in SQL is accomplished automatically
by joining the view definitions together as shown in Figure 4. This demonstrates
how this works internally by showing the SQL statement with its ABC and XYZ
views expanded. The ABC and XYZ expanded views are in bold Italics. In order
for linking below the root to work properly, the lower XYZ structure must be fully
processed (materialized) before joining to the upper structure. When the lower
level structure root is reached in the processing of the LEFT JOIN string, it
can not be joined at this time to the already completed upper ABC structure
because the lower level link point is located further into the incomplete
constructed lower structure. The lower level XYZ structure needs to be fully materialized
first. This is standard processing for SQL views.

A closer look at the expanded LEFT JOIN processing in Figure
4 demonstrates how this linking below the lower level root was accomplished
automatically. The matching ON clause for joining both structures together
occurs naturally at the end of the expanded LEFT JOIN string causing both
structures to be fully materialized before joining the two structures together.
This nicely allows heterogeneous logical and physical structures to operate the
same correctly and seamlessly. This follows exactly the semantics described
earlier for linking below the root.

Combined expanded structlure definition

A Detailed Look at LEFT Outer Join Processing

In the expanded structure definition box in Figure 4, notice
that node X is preceded directly by the LEFT JOIN operation, which was dynamically
introduced from the invoking join query shown. This produces the shown expanded
syntax containing LEFT JOIN X LEFT JOIN Y ON. This double LEFT
JOIN with delayed ON clause syntax is valid. It causes the nesting of the currently
active relational working set now containing the completed ABC view with the incomplete
introduced LEFT JOIN and the allocation of the new active working set containing
the X root node of the XYZ view. This sets the stage for the complete
processing of the XYZ view.

The completion of the XYZ view is indicated by the second ON
clause of the two back to back ON clauses shown in Figure 4 as ON X.x=Z.z
ON C.c=Z.z
. This second ON clause un-nests the previous ABC view with its
incomplete LEFT JOIN syntax and joins it with the now complete XYZ view using
the currently available ending ON clause predicate, which specifies the data
relationship link between the two views. Some believe this nesting is controlled
by parentheses placed around the views to control the order of execution when
working directly at this low level, but it is controlled by the syntax
arrangement as shown. Parentheses can be used for external nesting clarity, but
have no effect and are ignored.

Conclusion

This semantic process of linking below the root works in
general for all types of hierarchical structures and it is not limited to SQL. This
significantly increases the value of data by greatly increasing the number of
ways hierarchical structures can be combined. This capability is also useful in
supporting other hierarchical capabilities like hierarchical data structure
transformations and look-ahead capabilities to be described in a future
article. Having ANSI SQL automatically perform these hierarchical data
structure mashups helps prove this new data modeling operation and capability
works correctly. The SQL hierarchical examples in this article can be tested with
XML using my company’s online interactive ANSI SQL Transparent XML Hierarchical
Processor Prototype at www.adatinc.com/demo.html.

»


See All Articles by Columnist

Michael M. David

Michael M. David
Michael M. David is the founder of Advanced Data Access Technologies, Inc. Previously, he was the lead XML architect for NCR/Teradata, and served as their representative to the ANSI SQLX Group. Before that he was a staff scientist for Teradata and designed high level multi-featured SQL utilities. He has more than 25 years of experience researching and designing commercial nonprocedural heterogeneous database hierarchical query processing products using flat, relational and hierarchical data. From this experience, he authored the book Advanced ANSI SQL Data Modeling and Structure Processing, as well as numerous papers and articles on this subject.

Latest Articles