ANSI SQL Can Combine Advantages and Principles of Relational and Hierarchical Data Processing


Michael M. David describes how relational and hierarchical data processing can be seamlessly combined in a way that supports and preserves both hierarchical and relational processing advantages while avoiding their disadvantages.

Relational and hierarchical data processing are both
controlled by very principled structured data processing operations. However,
they are very different in operation and for this reason, they have different advantages
and disadvantages in data processing characteristics. These characteristics are
listed and described below. This article will describe how these two different
principled data processing disciplines can be seamlessly combined in a way that
supports and preserves both hierarchical and relational processing advantages while
avoiding their disadvantages.

Relational and Hierarchical Processing Advantages and Disadvantages

Relational and hierarchical processing is very different so
there is no surprise that what is good for one is often bad for the other. This
will be reflected directly below where these characteristics are listed and
explained. These good and bad characteristics reflect what is still assumed today
such as relational processing cannot support hierarchical processing, which is
not an accurate statement today and will be covered later in this article.

Relational Advantages

  • Data Independence
  • Can handle M-to-M data relationships
  • No duplication of stored data with normalization

Data Independence
means that data can be dynamically related through data relationships so that
the structures are not fixed and can be created dynamically on the fly. This
was the reason that relational databases supplanted hierarchical databases. This
data independence also allows Many-to-Many Data Relationships to be performed
easily today. This is a capability not supported in physical hierarchical databases.
Flat relational data with its Normalized Tables in relational databases and
its data independence capability does prevent data duplication in stored data.

Relational Disadvantages

  • Explosive data replications during processing
  • Joins are expensive
  • Only flat structures are available

While relational database systems can store data without
duplication as described in the paragraph above, relational join processing
with its explosive data replication can occur during its Cartesian product processing.
This is also a significant reason relational joins can be very expensive and
time consuming to process compared to other non relational methods. Flat
structures can also be very limiting on their own because of their lack of
structure and semantics.

Hierarchical Advantages

  • Useful hierarchical structure with inherent semantics
  • Easy and powerful semantic queries
  • Data value naturally grows nonlinearly

Hierarchical structures get their inherent powerful
processing capability from the semantics contained in their data structure enabling
more powerful semantic queries. This self-contained structure semantics allows
for powerful semantic queries enabling terse queries that are quick and easy to
specify. As hierarchical structures naturally grow larger over time, so does their
power with their ever-increasing inherent semantics. This allows the value of their
data and querying power to keep increasing naturally and nonlinearly as
described further below.

Hierarchical Disadvantages

  • Fixed inflexible structure
  • Inflexibility resulting in duplicated data
  • No M-to-M data structure support

While hierarchical structures offer the ability to model
structures, they are fixed making them inflexible. This inflexibility also
causes the problem of duplicate data because of the need to have different
physical structures comprised of the same data. This lack of data independence
means that M-to-M structures cannot be fully supported.

How Relational and Hierarchical Processing are Combined

The advent of the Left Outer Join in ANSI SQL-92 with its
hierarchical data preservation operation and its new ON clause replaces the
WHERE clause enabling specifying join criteria at each join point. This capability
enables full hierarchical processing naturally and inherently without ambiguity
problems. This also means that SQL’s navigationless processing also continues
to work normally with hierarchical structured data. The LEFT Outer Join syntax
and the new ON clause allow the precise data modeling of full multipath
hierarchical structures while its hierarchical semantics perform hierarchical
processing. This complete SQL hierarchical processing is handled directly by
the ANSI SQL processor.

In addition and unexpectedly, this natural hierarchical
processing from the LEFT Outer Join also inherently supports the Lowest Common
Ancestor (LCA) processing required to support powerful multipath queries. This allows
the unrestricted query to reference any number of multiple pathways without
concern of the data structure. This overlapping of relational and hierarchical
processing allows for the creation and seamless processing of logical hierarchical
structures, which possess the capabilities of both relational and hierarchical
capabilities. This means that ANSI SQL can now naturally and dynamically define
logical hierarchical structures, which are not fixed. Logical hierarchical
structures defined by ANSI SQL keep all of their hierarchical principles while taking
on relational principles too. This produces results that are both relationally
and hierarchically accurate.

How Relational Structures Support Hierarchical Processing

Relational logical hierarchical processing can support the
best of both worlds, but how do flat tabular structures contained in and
processed in relational rowsets maintain and support hierarchical processing? They
also have to naturally support multiple pathways, which can also be variable in
length. This is performed by the hierarchical semantics modeling the multipath
hierarchical structure in flat relational structures. The newer LEFT Outer Join
with its ON clause operates hierarchically on the working rowset allowing for
the specific hierarchical processing of fragment portions of the structure by using
the multiple separate ON clauses as it is being built. The null data filler naturally
produced by the LEFT Outer Join allows for variable length pathways in
relational rowsets.

Combined Relational and Hierarchical Advantages

The following list of combined relational and hierarchical processing
advantages below allows the user to pick and choose from a list of characteristics
to match their specific need. Each characteristic usually has a good and bad characteristic
depending on its use. Relational and hierarchical capabilities can be
intermixed because hierarchical processing is actually a subset of relational
processing. Intermixed relational and hierarchical operations include the
seamless integration of relational and hierarchical data; combining unrelated fixed
hierarchical structures using a relational association table; and the direct
SQL hierarchical joining of hierarchical structures. Logical hierarchical
structures can also take on most relational and hierarchical advantages, which can
now support previously mutually exclusive capabilities. An example is relational
data independence while utilizing inherent hierarchical semantics.

Relational: * Data Independence
* Can handle M-to-M data relationships
* No replication of stored data with normalization
Hierarchical: * Useful structure with inherent semantics
* Easy powerful semantic queries
* Data value naturally increases as structure grows

Combined Relational and Hierarchical Disadvantages

The above list of advantages allows avoiding items from the list
of disadvantages below. The relational disadvantages below are avoided by selecting
from the hierarchical advantages above. And the hierarchical disadvantages
below are avoided by using the relational advantages above.

Relational: * Cartesian product explosive data replications
* No inherent structure semantics
* Joins are expensive operations
Hierarchical: * Can’t model other structures
* No support for M-to-M data relationships
* Hierarchical duplicate data

Synergy Derived New Capabilities

The synergy from the combined relational and hierarchical
processing above also produces powerful new capabilities and advancements in hierarchical
data processing. Utilizing relational processing’s flexible dynamic querying along
with its basic hierarchical processing capability, hierarchical structures can
be dynamically and hierarchically mashedup. This processing still results in a valid
hierarchical structure to be query processed. By utilizing hierarchical
processing occurring naturally in relational rowsets, powerful new data
structure transformations and data structure virtualizations can also be
performed easily in SQL using the synergy of relational and hierarchical
processing.

Conclusion

ANSI SQL’s inherent hierarchical processing offers the
capability to freely select from relational and hierarchical advantages and
capabilities. This paper is meant as an overview for this SQL hierarchical
processing topic. For more information on this topic, see a list and
description of SQL hierarchical and relational capabilities performed in ANSI
SQL described in my previous article, The
Top 10 SQL Hierarchical Data Processing Capabilities
.

»


See All Articles by Columnist

Michael M. David

Michael M. David
Michael M. David
Michael M. David is the founder of Advanced Data Access Technologies, Inc. Previously, he was the lead XML architect for NCR/Teradata, and served as their representative to the ANSI SQLX Group. Before that he was a staff scientist for Teradata and designed high level multi-featured SQL utilities. He has more than 25 years of experience researching and designing commercial nonprocedural heterogeneous database hierarchical query processing products using flat, relational and hierarchical data. From this experience, he authored the book Advanced ANSI SQL Data Modeling and Structure Processing, as well as numerous papers and articles on this subject.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles