Normalizing an Existing Database
March 6, 2001
Last week a Swynk author posted an article on database normalization. To make the database analysis for this project more interesting David Faour has been asked to comment on the normalization of the database. Without getting into a threaded discussion it should provide some additional insight. This should allow you to get an opinion on the strengths and weaknesses of the design from an outside perspective. David's recent article on database normalization is a good introduction to this article.
The Normalization Process
This is the second article in the series on the process of redesigning a program. As mentioned in the previous article (Modifying an Existing Program (Overview)) the starting point in normalizing the current database was to analyze the existing program and database. Because the program is old and has been updated many times there is no programmer available to discuss design issues. The new database design will be based on user input, the existing program, and the existing database. The original user interface was built using visual FoxPro and consisted of 17 tabs. The tabs allowed the information to be organized based on common characteristics but gave a cluttered appearence to the user interface. In addition, in many cases information categories had to use more than one tab to fit all the relevant information. The interface updated a FoxPro database. The database is not first normal with the many repeating address fields in the multiple tables. In addition the tables do not contain data relavant to one area. In some cases the tables contain data from many categories.
The primary table in the new database is the firm table (tblFrim). This table contains the unique information for each firm. Although the address of the firm would normally be in the same table in this situation it became clear that each firm could potentially be linked to a number of different addresses. The addresses a firm could be linked to was based on the use of the address or role. The role table became the basis for organizing the addresses, contacts, phone numbers, email addresses and web addresses. A table was created for each of these categories (tblContact, tblAddresses,tblEmail,tblWeb,tblNumbers) and the tables were linked to the firm table (tblFirm) and the role table (tblRoleLookup). A diagram of the entire database depicts these relationships they are on the left side of the figure.
The next type of information were options and policies. There were three options tables; order, copyright and returns (tblOrder,tblCopyRight,tblReturns). The three tables allowed organized fields by category. Each of these three option tables had a one to one relationship with the firm table(tblfirm). A policy table exists for discounts, returns and copyrights (tblDiscountPolicies, tblReturnPolicies, tblCopyRightPolicies). Each table has a many to one relationship with the firm table (tblFirm). This many to one relationship dictated a separate table for this type of information, instead of adding the information to the appropriate options table. The added benefit is if a different policy needs to be added it will not affect table structure nor program structure.
Of the remaining tables tblOrderLanguage, tblOrderCode, tblOrderClassifiction are join tables that relate information from orders to tblLanguageType, tblCodeType, and tblClassificationType, respectively. The join tables allow the many to many relationships between the orders table the other three tables. The thlOrderISBNImprint has a many to one relationship with tblOrder and is used to hold ISBN information for each firm.
The last set of tables is going to be used by the program. The tblChangeTracking and tblModification tables hold information about when, where and who made modifications to the data. This can be used to resolve any questions about when data was updated and who updated the data.
Although the data is normalized it is not in third normal form. A third normal database is the ideal in some situations denormalization can be used to help optomize queries on the database. The following are some of the areas where the data is not normalized and the reasons for denormalization. The true test to whether denormalization makes sense is in the use and maintenance of not only the database but also the program the database supports.