It used to be that open source software was the niche of uber–geeks and programmers who weren’t afraid to get dirty in the code. And no wonder; it was a free-for-all whose only support was the project sites’ bug reporting pages and community groups.
A lot has changed since then. Just as Eclipse has revolutionized Java development, open source databases like MySQL and PostgreSQL have become a real threat to the big database makers’ bottom line in the small to medium sized company market.
This is an area where I have some real working knowledge, having been the primary MySQL contributor for Database Journal for the past several years. Take it from me, these products offer tremendous value!
You may want to compare these databases with those that are sold by the major RDBMS manufacturers, namely Oracle, IBM, and Microsoft, to which I dedicated one article each in this database buying guide series.
The complete range of DBMS types out there, including Desktop, Server, Web-enabled, Cloud DBaaS, and open source counterparts, were outlined in parts one and two, along with some of the more critical evaluation considerations. Finally, we looked at the top three products in the Desktop databases category in part three. If you’re a small business owner, you’ll definitely want to check those out for comparative purposes.
How Much Data Storage Do You Need?
One of the worst things you can do to a database is overload it with transaction and data volume that are beyond its capabilities. There have been many times that I worked with clients who had managed over time to stuff too many rows in their main tables for the database to handle. At that point, crashing had become a common occurrence. When that happens, you’re running the risk of data loss! Therefore, it might be nice to know when you storage requirements are beyond that of a medium-sized database.
While there are no absolute hard and fast rules, thanks to the numerous companies who use open source databases, we can make an approximation based on their experiences. Both MySQL and PostgreSQL have performed well with many hundreds of gigabytes of data. Oracle, DB2, and Microsoft enterprise RDBMSes, by comparison, can easily accommodate terabytes of data.
How Safe Is My Data?
Getting back to my clients for a moment, they were greatly concerned for their data, because even the slightest amount of data loss or corruption would be unacceptable. Most business owners feel the exact same way. Some have expressed concern that open source products may not be suitably robust to protect their important business data. In the case of MySQL and PostgreSQL, such fears are simply unfounded.
The attitude toward data reliability within the PostgreSQL community has not changed since the initial product launch over a decade ago. They have always placed data integrity as their highest priority. You can rest assured that data stored in a PostgreSQL system will be consistent and reliable, even if you accidentally shut off your server or pull the plug.
The MySQL story took a slightly different route.
There was an attitude some time ago that no critical data should ever be stored in MySQL. Mind you, there’s a chance that these rumors were perpetuated by PostgreSQL communities to some extent. However accurate these claims may or may not have been, MySQL releases of the last decade have been extremely reliable even in the most challenging of environments.
Perhaps a more valid concern is that of data integrity. That is to say, is the database doing something to the data that changes it somewhat? It would seem odd that a database would alter your data in any way, but it actually happens a lot. The usual culprit is poor up-front validation. In that regard, PostgreSQL tends to be quite stringent and will often refuse to accept invalid data. MySQL is a little more forgiving and flexible. As a result, you can wind up with inaccurate data being stored.
For instance, MySQL will allow you to enter a date of ‘2011-02-30’ and will convert it to ‘0000-00-00’, while PostgreSQL will reject it and display a “date/time field out of range” error message.
A Quick Feature Comparison
Standard SQL Compliance
MySQL uses SQL92 as its foundation and can run on several platforms. It supports both left and right outer joins using both ANSI and ODBC syntax. Subquery support was added in version 4.1 from and views were included with release 5. One feature that sets MySQL apart from many other databases is its ability to construct queries that join tables from different databases.
PostgreSQL can use most of the SQL92/99 syntax plus some additional object-oriented features. PostgreSQL also supports complex routines and rules as well as declarative SQL queries, subqueries, views, multi-user support, transactions, query optimization, inheritance, and arrays. However, it does not possess the ability to select data across different databases.
MySQL provides highly fine-grained access control. You can grant and revoke user and group rights based on factors such as user name, group membership, table name and client host name.
PostgreSQL has many of the same features, with roles, permission, authentication and authorization features. PostgreSQL can limit logins based on different criteria, including network segment or user ID.
MySQL includes a number of shared server-side C libraries. There is also a Perl DBI module that provides a generic interface for database access. Of the two databases, MySQL comes with more powerful admin tools, including mysqladmin, hot backup, and file corruption recovery. Its command-line tools are also highly useful and user friendly; you can view database and table structures using the describe and show commands.
PostgreSQL rules, triggers, and server-side functions can be written in several languages including C, pgsql, python, perl and tcl languages. PostgreSQL schemas allow users to create objects in separate namespaces, so that multiple people and/or applications can have tables with the same name.
Remember that, like most databases, MySQL and PostgreSQL are both database servers. As such, they only offer a command line interface to communicate with them. Some DBAs are perfectly okay with that, while others prefer to use a GUI. Many such tools are available for MySQL and PostgreSQL in both open source and commercial offerings. These tools come in the form of applications that run natively on your operating system and Web-based tools. Both look great as they are closely modeled after tools available to commercial databases.
I have been a longtime user of the HeidiSQL MySQL Windows client. It’s quite easy to use and offers many useful features:
Figure 1: HeidiSQL
You can download it directly from their website.
There is an exhaustive list on the PostgreSQL Community Wiki page.
Figure 2: SQuirreL SQL
MySQL fans often have claimed that their system is faster, particularly for read-only tasks, making it a superior choice for Web sites where most data is read. PostgreSQL advocates, in contrast, claim that their system holds up to big loads much better than MySQL. Studies have shown that MySQL is indeed faster than PostgreSQL when working with a small number of connections or with read-only data. However, several comparisons made over the last few years indicate that as more clients are added to the system, PostgreSQL handles the load better.
To help close this gap, some companies, such as Schooner, have come up with versions of MySQL with performance optimized database engines.