Introduction to Databases for the Web: Pt. 1 - Page 3
August 16, 1998
Besides being differentiated by function, databases can also be differentiated by how they model their data.
What is a data model?
Well, essentially a data model is a "description" of both a container for data and a methodology for storing and retrieving data from that container. Actually, there isn't really a data model "thing". Data models are abstractions, oftentimes mathematical algorithms and concepts. You cannot really touch a data model. Nevertheless, they are very useful. The analysis and design of data models have been the cornerstone of the evolution of databases. As models have advanced, so has database efficiency.
Before the 1980s, the two most commonly used database models were the hierarchical and network models. Let's take a quick look at these two models and then move on to the more current ones.
As its name implies, the Hierarchical Database Model defines hierarchically-arranged data.
Perhaps the most intuitive way to visualize this type of relationship is to picture an upside-down tree of data. In this tree, a single table acts as the "root" of the database from which other tables "branch" out.
You will be instantly familiar with this relationship because it is how window-based directory management systems (like Windows Explorer) present folders and files.
Relationships in such a system are thought of in terms of children and parents, such that a child may have only one parent but a parent can have multiple children. Parents and children are tied together by links called "pointers" (perhaps physical addresses inside the file system). A parent will have a list of pointers, one to each of its children.
This child/parent rule assures that data is systematically accessible. To get to a low-level table, you start at the root and work your way down through the tree until you reach your target. Of course, as you might imagine, one problem with this system is that the user must know how the tree is structured in order to find anything!
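The parent/child rule and the root-to-target walk described above can be sketched in a few lines of Python. This is only an illustration, not how any particular hierarchical database is implemented: the `Record` class, the `find` function, and the sample names ("School", "Databases 101", "Alice") are all hypothetical.

```python
class Record:
    """A node in a hierarchical store: at most one parent, many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent    # a child may have only one parent
        self.children = []      # the parent's list of "pointers" to its children
        if parent is not None:
            parent.children.append(self)


def find(root, path):
    """Start at the root and follow one child pointer per step in `path`."""
    node = root
    for step in path:
        matches = [child for child in node.children if child.name == step]
        if not matches:
            raise KeyError(f"no child named {step!r} under {node.name!r}")
        node = matches[0]
    return node


# Hypothetical sample data: a root, one course, one student.
root = Record("School")
course = Record("Databases 101", parent=root)
student = Record("Alice", parent=course)

# The caller must already know the shape of the tree to build this path.
target = find(root, ["Databases 101", "Alice"])
print(target.name, "is stored under", target.parent.name)
```

Notice that `find` is useless without the full path: that is the "user must know how the tree is structured" problem in miniature.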
The hierarchical model, however, is much more efficient than the flat-file model we discussed earlier because there is not as much need for redundant data. If a change in the data is necessary, the change might only need to be processed once. Consider the student flat-file database example from our discussion of what databases are:
As we mentioned before, this flat-file database would store an excessive amount of redundant data. If we implemented this in a hierarchical database model, we would get much less redundant data. Consider the following hierarchical database scheme:
However, as you can imagine, the hierarchical database model has some serious problems. For one, you cannot add a record to a child table until its parent record already exists in the parent table. This might be troublesome if, for example, you wanted to add a student who had not yet signed up for any courses.
Worse yet, the hierarchical database model still creates repetition of data within the database. You might imagine that in the database system shown above, there may be a higher level that includes multiple courses. In this case, there could be redundancy because students would be enrolled in several courses, and thus each "course tree" would hold its own copy of the same student information.
Redundancy would occur because hierarchical databases handle one-to-many relationships well but do not handle many-to-many relationships well. This is because a child may have only one parent. However, in many cases you will want the child to be related to more than one parent. For instance, the relationship between student and course is "many-to-many". Not only can a student take many courses, but a course may also be taken by many students. How would you model this relationship simply and efficiently using a hierarchical database? The answer is that you wouldn't.
Though this problem can be worked around by using multiple databases and creating logical links between children, the fix is kludgy.
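To make the redundancy concrete, here is a small Python sketch in which each course is a parent and student records are children stored under it. The course names, student names, and phone numbers are invented for illustration, and a dictionary of lists is only a stand-in for a real hierarchical store.

```python
# Each course is a parent. Because a child may have only one parent,
# a student taking two courses must be stored twice.
courses = {
    "Databases 101": [
        {"name": "Alice", "phone": "555-0100"},
    ],
    "Networks 201": [
        {"name": "Alice", "phone": "555-0100"},   # duplicate of the record above
        {"name": "Bob", "phone": "555-0101"},
    ],
}


def copies_of(student_name):
    """Return every stored copy of a student's record, across all course trees."""
    return [
        record
        for students in courses.values()
        for record in students
        if record["name"] == student_name
    ]


def update_phone(student_name, new_phone):
    """Change a phone number; every copy must be found and updated."""
    records = copies_of(student_name)
    for record in records:
        record["phone"] = new_phone
    return len(records)


print(len(copies_of("Alice")))            # 2 copies of one student
print(update_phone("Alice", "555-0199"))  # 2 records had to be touched
```

A single change to one student costs one update per course that student takes, and missing any copy leaves the database inconsistent.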
Faced with these serious problems, the computer brains of the world got together and came up with the network model.
In many ways, the Network Database model was designed to solve some of the more serious problems with the Hierarchical Database Model. Specifically, the Network model solves the problem of data redundancy by representing relationships in terms of sets rather than hierarchy. The model had its origins in the Conference on Data Systems Languages (CODASYL) which had created the Data Base Task Group to explore and design a method to replace the hierarchical model.
The network model is actually very similar to the hierarchical model. In fact, the hierarchical model is a subset of the network model. However, instead of using a single-parent tree hierarchy, the network model uses sets to provide a tree-like hierarchy, with the exception that child tables are allowed to have more than one parent. This allows the network model to support many-to-many relationships.
Visually, a Network Database looks like a hierarchical Database in that you can see it as a type of tree. However, in the case of a Network Database, the look is more like several trees which share branches. Thus, children can have multiple parents and parents can have multiple children.
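The "trees which share branches" picture can be sketched in Python by letting each record keep a list of parents as well as a list of children. This is a simplified illustration of the idea only, not the actual CODASYL set mechanism; the `Node` class, the `link` function, and the sample names are hypothetical.

```python
class Node:
    """A record that may have many parents as well as many children."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []


def link(parent, child):
    """Record the relationship in both directions."""
    parent.children.append(child)
    child.parents.append(parent)


# Hypothetical sample data: two courses and two students.
db101 = Node("Databases 101")
net201 = Node("Networks 201")
alice = Node("Alice")
bob = Node("Bob")

link(db101, alice)
link(net201, alice)   # the same Alice record, now with a second parent
link(net201, bob)

print([p.name for p in alice.parents])    # ['Databases 101', 'Networks 201']
print([c.name for c in net201.children])  # ['Alice', 'Bob']
```

Because both courses point at the same `alice` object, her details are stored once, and a change to them is visible from either course.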
Nevertheless, though it was a dramatic improvement, the network model was far from perfect. Most importantly, the model was difficult to implement and maintain. Most implementations of the network model were used by computer programmers rather than by end users. What was needed was a simple model that real end users could use to solve real problems.
Enter the Relational Database Model.