With the end of the year approaching, and things winding down for many of my colleagues (not me!), I thought I would have some fun and write an article entitled "PostgreSQL vs MySQL: Which is better?" I am expecting the article to be read by PostgreSQL fanatics, MySQL fanatics, newbies wanting an easy answer to the question, and those who know better, but still have voyeuristic tendencies for this sort of thing. Of course, I could have entitled the article "DB2 vs MySQL", or "SQL Server vs PostgreSQL", but that would have entangled me in a long-winded discussion of the merits of Open Source and proprietary software. There is way too much written on that topic, admittedly mostly rants, and if you have been around a bit, tried both, and still haven't made up your mind, I won't be able to help you. So, PostgreSQL vs MySQL it is.
I know it will all be in vain, but first, some disclaimers. I know perfectly well that 'best' is a loaded term, and what is best in one situation is not best in another. Therefore, the correct answer would be that neither is best, and both have their place. Still, the debates rage on as to which is better. More interesting to me is the psychology of why people care. A recent column in MySQL expert Jeremy Zawodny's blog covered just this topic. He had previously mentioned that a certain company used MySQL. A reader got into a froth saying that he knew for certain that they did not, and that they would not use such a poor product. So, the flame wars began (which Jeremy wisely kept out of). I found the whole thread quite amusing, and it was probably the inspiration for this article. It seems there is a level of insecurity that drives people to fight passionately for causes they know nothing about, to insist upon things they have only a partial understanding of. At its extreme, it leads to people charging over trenches, dying for their leaders, but in the IT world it manifests itself in page upon page of ranting, claiming that one technology or another is better. Linux vs FreeBSD, KDE vs Gnome - created dualisms for the insecure. Nevertheless, let's get more specific, and give a hypothetical example from MySQL vs PostgreSQL.
Person A learns to use MySQL for a website, and finds it particularly fast. Much faster in fact than the previous Access database he was using. Person B in the meantime learns to use PostgreSQL, and finds it extremely useful, especially the stored procedures functionality. They meet in a forum, debating which database is the best. Person A claims it is MySQL, and Person B claims PostgreSQL. Neither has used the other product, but they argue vehemently for 'their' particular technology. So why do they waste their energy? The main reason is probably a desire to be on the 'winning' side, to feel a sense of belonging to something important and powerful. Their loyalty to something as abstract as a database technology then becomes completely irrational, and a sort of conservatism comes in, an unwillingness to budge from a preconceived view, which is really just a form of laziness.
To add fuel to the fire, Person A now responds to the criticism and loads PostgreSQL. He quickly sets up a default installation, runs a few benchmarks, and finds PostgreSQL to be 50 (or 100, or 1000) times slower. No consideration is given to the fine-tuning required to get any application to perform optimally. His preconceptions cemented, PostgreSQL is forevermore dog slow. Or the reverse. Even experienced users find benchmarks tricky to evaluate, as an article by Tim Perdue, and the subsequent comments, showed. In benchmarks he ran, he found that PostgreSQL 7.1 was faster than MySQL 3.23.26beta (both versions are now seen as from the ark, so don't get excited by the details). The variety of responses was again interesting, with people supporting 'their' database of choice, usually without much foundation.
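To illustrate how much a benchmark can depend on setup rather than on the database itself, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in so the example is self-contained (it is neither MySQL nor PostgreSQL), and the function name time_inserts is hypothetical. The same rows are inserted twice, once committing after every row and once committing only at the end.

```python
import sqlite3
import time


def time_inserts(rows, commit_each_row):
    """Insert `rows` rows into a fresh in-memory table.

    Returns (elapsed_seconds, row_count). sqlite3 is only a stand-in
    here; the point is the commit strategy, not the engine.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    start = time.perf_counter()
    for i in range(rows):
        conn.execute("INSERT INTO t (v) VALUES (?)", ("row %d" % i,))
        if commit_each_row:
            conn.commit()  # one transaction per row
    conn.commit()  # commit whatever is still pending
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


# Same workload, two configurations; timings vary by machine.
for label, per_row in (("commit per row", True), ("single commit", False)):
    elapsed, count = time_inserts(2000, per_row)
    print("%-15s %d rows in %.4f s" % (label, count, elapsed))
```

Both runs do identical work, yet the timings can differ, and on a disk-backed database the gap is typically far larger than in memory. A benchmark that uses one configuration for one product and a different one for the other says little about either.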
Perhaps I have managed to scare off those who came to the article with preconceived views. However, there may still be some newbies reading, telling me to get on with it, enough of that - let's look at some details of how we can make a choice between PostgreSQL and MySQL. Let's see if I can help.
Features. Here PostgreSQL has the upper hand. The stable version of MySQL does not support subqueries, stored procedures, cursors or views, all of which PostgreSQL does. One of the MySQL developers' more serious mistakes was to justify the exclusion of many of these features (and even more fundamental features such as referential integrity, still only partially integrated) by claiming that they were not necessary. Of course this is true in many cases, but to hardened DBAs many of these features are vital, and their absence gave MySQL a reputation as a 'toy' database, from which it is still recovering today. However, many of the contributors to the flame wars mentioned above have not been keeping up with what MySQL now offers - it does support transactions and referential integrity, in spite of what you read all the time! You can see the roadmap here. MySQL has committed to implementing all ANSI-SQL (standard SQL) features, so in about 2 years MySQL should support all of the listed features. So PostgreSQL seems to 'win' this one, but you need to consider whether you actually need these features. The Open Source databases claim the database market is becoming commoditized, and most databases offer all the features you need. So the other factors assume a greater importance.
Support. Support can mean many things. MySQL is much more widely used, so many more applications support MySQL, and there is a larger community ready to assist you with problems, as well as more books and resources on MySQL. MySQL AB, the commercial company that guides MySQL and employs most of its developers, offers various levels of support contracts. Of course, PostgreSQL has active mailing lists, and there are commercial companies offering support as well, so you are not likely to go too far wrong with either.
Ease-of-use. Another highly contentious issue. Debate usually goes along the lines of "A: MySQL/PostgreSQL is much easier to use because... B: You idiot. PostgreSQL/MySQL is just as easy because...". Often the database a person already uses is simply the one they find easiest, which is not that helpful. An astronaut may find flying the space shuttle easier than writing a document on a PC, but that tells us more about them than about how easy we would find either. If you are migrating to one of the databases, it depends where you come from. And, it depends on what you are doing. If you regularly use sub-selects or triggers, rewriting them in MySQL or a scripting language will seem unnecessarily complex. PostgreSQL's extra functionality can translate into complexity if you do not require any of it. It also depends on what tools you are using - phpMyAdmin for MySQL is a well-developed tool, while phpPgAdmin is not as fully-featured. So if you are looking for a web interface in PHP, and for none of the features MySQL lacks, MySQL would be your choice here. But perhaps you do not need the extra features of phpMyAdmin? They both do everything you want!
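To give a feel for what rewriting a sub-select involves, here is a small sketch in Python. It uses the standard-library sqlite3 module as a stand-in so it runs anywhere (an assumption for illustration only; it is neither MySQL nor PostgreSQL), and the customers and orders tables are hypothetical. The same question, "which customers have placed an order over 100?", is asked first with a sub-select and then with the equivalent join that a database without sub-selects would need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 40.0), (3, 2, 15.0), (4, 3, 120.0);
""")

# Version 1: a sub-select, as you would write it where sub-selects exist.
with_subquery = [row[0] for row in conn.execute("""
    SELECT name FROM customers
    WHERE id IN (SELECT customer_id FROM orders WHERE total > 100)
    ORDER BY name
""")]

# Version 2: the same question rewritten as a join. DISTINCT is needed
# because a customer with several large orders would otherwise repeat.
with_join = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.total > 100
    ORDER BY c.name
""")]

print(with_subquery)
print(with_join)
conn.close()
```

Both queries return the same customers. A simple case like this rewrites cleanly; it is the more involved sub-selects that tend to end up split across several queries in a scripting language, which is where the extra complexity comes from.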
Stability. MySQL claims in its press releases to be extremely stable, but the 'word on the street' is that this isn't true. It is easy to blindly repeat mantras, but again, it depends on your needs. Running a website with 10 users a day? Even MS-Access would be stable! I have used MySQL on high-volume websites, both its non-transactional MyISAM tables, and the transaction-capable InnoDB table types. And yes, I have experienced table corruption numerous times, but this could always (I think!) be blamed on faulty hardware, and I have never had a problem recovering (with the simple REPAIR TABLE command turning me into a master DBA). MySQL is used in extremely high volume environments without problems. PostgreSQL's advanced features are more likely to be stable than the newer MySQL equivalents, having been implemented for longer. However, replication is much newer in PostgreSQL than MySQL, so the reverse applies. But here again, the supposed commoditization of databases means that database stability is taken relatively for granted, and the software tends to be a lot more stable than the hardware it relies on.
Speed. MySQL aimed first to be a fast database, while PostgreSQL aimed to be a fully-featured database, and both are converging in the other's direction. Used appropriately, MySQL's MyISAM tables are indeed extremely lightweight.
Existing skills. Where I work, we use MySQL because that is what we were running when I arrived. The team had MySQL skills, and it made sense to continue this. There was an ill-conceived attempt to move to Informix, but while the team battled to handle the move, I learned to tune MySQL (which taught me most of what I know about MySQL, and provided great case studies for my book), and the move was eventually shelved.
Licensing. MySQL AB is often used as a model for Open Source companies attempting to make money. MySQL is released under the GNU GPL (General Public License), which requires derivative works to be similarly licensed, but MySQL AB also offers commercial licenses for those who do not want to be restricted in this way. PostgreSQL is distributed under the BSD license, which basically allows any use of the code as long as the credits are maintained. BSD vs GPL is another topic for a flame war!
In the interests of transparency, I have to confess to having used MySQL much more than PostgreSQL, so feel free to reread the article to scour for instances of my obviously blatant bias!
In conclusion, if I have to give any advice, it is this: do not listen to anyone without investigating their claims, do not get involved in pointless flame wars, and thoroughly understand both your own present and future needs (crystal balls come in handy) and the capabilities of the software you are considering. Do that, and you are sure to enjoy years of happy databasing with either MySQL or PostgreSQL and, for most people, probably both. Both databases are developing quickly, improving all the time, and can do everything most users require. Good luck!