Welcome to the second installment in the Database Buyer’s Guide series. The purpose of this guide is to help you whittle down the extensive and potentially overwhelming array of Database Management Software (DBMS) products and services to a short list for further consideration.
In Part one, we began by exploring the major database types and some important features to consider when choosing a DBMS. Features included Storage Mechanisms, Data Integrity, and Supported Languages, which encompass SLQ flavors as well as programming languages such as PL/TCL, PL/Perl, and PL/python.
In today’s installment, I’ll be presenting other additional important DBMS features that you would do well to include in your decision making process.
Supported Data Types
In the past few years, large object support has become increasingly popular. Examples include user-defined data types, geographic features, known as GIS (Geographic Information System), images, and extremely large or unlimited text fields. Besides being able to store such large and complex data types, databases need to offer ways to find and retrieve them. One way is called Content-Based Image Retrieval (CBIR), which is a class of techniques used for information retrieval from large image collections using one or more example images. Another technique uses indexed keyword searching.
Performance Monitoring and Tuning Capabilities
The larger and more complex your data schema is, the more vital performance monitoring and tuning capabilities become. For optimal query performance, single column, multi-column, unique, and primary key index support is a must. Analysis tools can be indispensable in getting maximum performance out of your database. I recently wrote a couple of articles on MySQL analysis tools: Identify Slow Queries using MySQL’s Analysis Tools and Using MySQL’s Analysis Tools to Improve Query Performance, for which I couldn’t resist subtitling “MySQL’s Explain Extended Explained”!
Database Security Features
A rash of well-publicized database hacking incidents has brought security to the forefront of issues facing the database community. The following features will help you ensure that your database systems are secure from unauthorized access: encryption, support for replication, ability to define user roles & groups, and fine-grained permissions granting. You also have to consider where your database will be physically located. What’s the point of spending thousands of dollars on security software when someone can just walk in and take a copy of your data?! When large amounts of money are involved, inside jobs are a far more common threat than you might think!
Heterogeneous vs. Cross-Platform Support
Not all databases run on every platform. Some of the bigger players like Oracle, DB2, and even some smaller ones like MySQL, sell databases for multiple operating systems. Others are tied in to a specific O/S (e.g., Windows, Linux). When evaluating cross-platform products, pay close attention to how each version of the software behaves as you move across DBMSes, including variations in workflow, navigation, and look-and-feel, as this will impact the acclimatization time for each new employee and existing team members’ learning curve for each new DBMS.
Collaboration
These days, it is not uncommon to have a lot of overlap between database developer and administrator roles throughout an application’s lifecycle. For example, at my shop, programmers often dynamically create SQL statements in Java to execute against the database from the application. This is just one instance where collaboration features can be of great benefit. Database tools may also facilitate collaboration and communication with management, QA, and partners, to help foster a more connected environment. Some common uses include being able to pinpoint and communicate problems to get management, development and QA up to speed. Report generation can also be invaluable for communicating with management and other team members. Having the ability to distribute ongoing schema changes between development and production personnel is another use for collaboration tools.
Some DBMSes even allow you to place projects under a central version control system with just a few mouse clicks. I find that the bigger your team and the more intricate your platform environments is, the more important version control becomes. My department has two concurrent networks, rated by data sensitivity, each with their own development, QA, and production support environments, making version control indispensable. Last but not least, when evaluating database tools, it’s important to consider how each tool will help you collaborate and communicate with internal and external stakeholders.
Data Visualization
Related to collaboration, having the ability to visualize data can be beneficial for both the primary developer/administrator and others who may require the information. People’s preference on how they like to analyze information can be a very personal and individual one. Some prefer a graphical approach; some want streams of raw data; while still others benefit from both. Take an inventory of your own team members’ preferences and choose your tools accordingly. As just one example, the past few years have seen the emergence of novel SQL tuning techniques, including Visual SQL Tuning (VST). VST is a way to visualize SQL statements graphically in order to quickly understand the major components of a query, its flow and best execution path. These types of graphical representations of the data can improve your SQL tuning and leverage the different ways in which your team members can analyze data.
Performance and Availability
DBAs and developers are under constant pressure to improve database and application performance. It makes sense; the faster users can find what their looking for, the more transactions can be made and the greater the resulting productivity. You may want to consider tools that can prevent, find and fix performance problems before they even occur. DBAs and developers who deal with the stress of production issues rely on 24×7 monitoring and alerts so they can catch problems early and fix them quickly before outages occur. As you can well appreciate, when a crisis occurs, it’s hard to overestimate the value of being able to quickly pinpoint and resolve the problem! At that critical moment, how much is it worth to you to be able to quickly scan a performance dashboard vs. having to navigate through dozens of screens?
Backups
When it comes to backups, open-source databases may not completely fulfill your needs. Some databases only provide a simple text dump of your database data and its schema. Other solutions also provide methods for doing a hot-database backup, or backing up your database without shutting it down. Many commercial backup tools, such as Vertias NetBackup or Tivoli TSM, have agents that provide online backups of commercial databases. Backups also include simple database recovery from database crashes and unexpected power failures. Some DBMSes, such as PostgreSQL, use a system called Write Ahead Logging to provide database consistency checking, while MySQL has database consistency checking only for InnoDB table types.
Data Migration
Many DBMS solutions have database migration utilities to help migrate data to and from other popular commercial databases. Other times, third party utilities are available as either open-source or commercial tools. Obviously, the more complex your schema, the more difficult the conversion will be, and some of these tools may not completely migrate everything perfectly, requiring long hours of manual labor.
Conclusion
I know that I presented a lot of features here today. While you get over the shock of informational overload, realize that the alternative is to cut corners on this crucial step and pay the price only after you’ve invested a considerable amount of time and money into a solution which won’t quite right after all! I can recall a number of situations where I was hired to repair damage caused by an improper initial assessment of data requirements. One of the most pervasive issues of the twenty first century Web 2.0 world is that of communication. Therefore, be especially cognizant of options that allow you to share data across platforms and DBMSes. Useful features in this regard include the ability to be able to import and export in different formats such as XML, JSON, and other network/platform agnostic standards.