Open Source Databases: A brief look at the Berkeley DB, Derby, Firebird, Ingres, MySQL and PostgreSQL DBMS'
March 8, 2005
This month I take a brief look at Open Source Database Management Systems. I do not aim to find the 'best' one, or perform an in-depth feature-by-feature analysis, but rather a introduce you to the variety of solutions out there. I am presuming readers of my columns by now already have an understanding of what Open Source is, and why one would use it. If not, see the resources at the end of the article.
So which are the Open Source databases? Most readers have probably heard of MySQL. Its byline is The World's most popular Open Source database after all. There is also PostgreSQL (The World's Most Advanced Open Source Database), as well as Firebird (The Relational Database for the New Millenium). However, there are others too. BerkeleyDB, although not directly comparable, deserves a mention, while Computer Associates recently released Ingres, and IBM did the same to Cloudscape (under the name Derby).
So, in brief then, let's take a brief look at each of these products.
Unlike most of the others listed here, Berkeley DB is not a full DBMS, complete with graphical reporting tools and the like. It does not even contain a query-processing layer. Instead, it is a lightweight embedded relational storage engine, meant for applications that do not require a separate DBMS installation. Users of MySQL have the option to use the Berkeley DB storage engine (in the early days it was the only way to ensure transactional capability).
If Derby has to have a byline, perhaps it should be World's Most Popular Embedded Java Database. That should give you an idea of the focus of this niched product. IBM released the commercial Cloudscape database as Derby under an Open Source license to the Apache foundation, where it remains under consideration as an incubator project, shortly after Computer Associates released Ingres. Cloudscape is a Java-oriented embedded database, meaning that the DBMS becomes part of the Java application, has a very light footprint and is designed to run without the need for a dedicated DBA. It does not aim to be 'enterprise quality', but rather to serve the needs of Java developers in particular.
Inprise (now Borland) were one of the first commercial companies to release a database under an Open Source license when they did so with their Interbase product in early 2000. Interbase 6 was released under a variant of the Mozilla license. Borland have since released version 6.5, 7, 7.1 and 7.5 of the commercial product, but Firebird has continued to develop based on the Interbase 6.0 source code. Firebird has a small but active user base, and until recently, it was usually forgotten in the MySQL vs PostgreSQL debates. Firebird has always been more fully featured than MySQL, and has, unlike PostgreSQL, it has always worked well on Windows as well as Linux and other 'Nix variants. Firebird was however slightly hamstrung by Borland upon its release, as they did not release all the elements, and the build scripts did not even work! One important lack was a replication engine (though Firebird does have a shadowing feature, which keeps an identical copy of the live database). There are projects afoot, many commercial, but the Open Source ones do not seem that mature, or integrated into the main Firebird code. Firebird development has also been relatively slow, with the recently released 1.5 aimed mainly at rewriting the code in C++ (from C), and not adding many new features. However, Firebird was a relatively mature DBMS to start with, and its aim remains replacing Oracle and SQL server in commercial applications.
There are two flavors available: Classic and Super Server, with the Super server being less resource-intensive, and more suitable for high-volume environments (as well as where most of the development seems to be focused). There are also three variants of SQL, varying in differing degrees in their adherence to ANSI standards,
Firebird has a loyal user-base, and is a good product to work on, and if it can increase its development speed, as well as improve its marketing, it will be a strong contender.
Ingres began as a relational database project at the University of California, Berkeley (the same American university directly or indirectly responsible for so many contributions to the Open Source world, including BSD Unix, Postgres, vi and Tcl. It became one of the most influential projects in the database world. With code released under a variant of the BSD license, the proprietary products Sybase, SQL Server and Informix all have their roots in Ingres, as does PostgreSQL. In 1982, Michael Stonebraker, one of the founders of the project, started Ingres Corporation, and attempted to commercialize the product. ASK and, eventually, Computer Associates bought the product, and continued to market and develop it. However, with Open Source databases making serious inroads in the commercial database world, they released it under an Open Source license in 2004.
Ingres is known as a reliable and scalable database, but suffers from a reputation that it lags behind in features, with Computer Associates seen as not having done much development since they bought it in 1994. Arguably, once ahead of Oracle, it is still deployed in more enterprise environments than any other Open Source database, and is ahead of MySQL in features, although online documentation is poor in comparison to the others. Important unusual features include:
CA also put some financial commitment behind the project in the form of the Million Dollar Challenge, whereby the Open Source community was invited to create solutions enabling Oracle, SQL Server, Sybase, DB2, Informix, and/or MySQL users to migrate to Ingres. Six winners will be announced at CA's annual user conference, April 17-21, 2005. Once at the top of the pile, releasing Ingres under an Open Source license has given it a new lease of life, and I expect to see some rapid growth in the product.
MySQL's byline of most popular is certainly correct, as, fuelled by the Internet boom, it has become widely-used for websites. Initially fast and easy-to-use, but with limited functionality, MySQL has worked on adding features to the product, and although it still lags the enterprise-level databases, its popularity, and huge user-base mean there are lots of tools and applications that work with MySQL, with lots of unofficial support to add to MySQL's official support. MySQL lacks views, triggers and stored procedures, important features for many users. Have a look at the full list of features, as well as what is lacking in the article "MySQL, Still Tomorrow's Contender?".
MySQL achieved an important coup when SAP gave MySQL their SAPDB database. SAPDB is a full-featured, mature DBMS that, although not yet well integrated into the rest of MySQL, gives MySQL an enterprise-ready answer to their critics. MySQL has tremendous momentum behind them, and if they can quickly add the missing features, whilst maintaining the product's ease of use and speed, they may have an unbeatable combination.
PostgreSQL first existed as a name in 1996, when Postgres95 was renamed, reflecting the importance of the recently added SQL interpreter. Before that Postgres, also a project of the famous Michael Stonebraker and designed as a successor to Ingres, used QUEL, which was very similar to SQL, and arguably more consistent in structure. However, the dominance of the SQL-based Oracle and DB2 meant that QUEL fell from favor. Fortunately, Postgres was released under a BSD license, and even though the Berkeley-led project had ended, the code was freely available, and the project has continued to evolve into what it is today. Until the recently released version 8.0, PostgreSQL was not available for Windows, but I expect its market share to increase quickly as Windows users start to discover it. Its claim of most advanced is not lightly made. It has almost all of the features required by enterprise-level databases, as well as a number of other unusual and potentially useful features:
PostgreSQL was slow in not providing native Windows binaries, expecting their full feature-set alone to boost them. Now that they have rectified this, the momentum they have as the number two Open Source database may see them experience some rapid growth.