In the last piece of our database comparison, we hit upon
various areas such as triggers, views, and stored procedures. Though
the feature sets certainly differ between Oracle, Postgresql, and
MySQL, most of your daily needs are met in these areas. This time around, we’ll hit upon some areas where the platforms differ
more significantly, most importantly in their handling of complex SQL
and optimizing choices.
Complex SQL (optimizing engines)
SQL is the fundamental, and most
crucial way you will interact with your database, whichever one you
choose. It also happens to be the area where the three
platforms really start to part ways. Oracle supports tremendously
complex queries, with almost limitless numbers of tables, joins of all types,
and combinations. What’s more Oracle’s bread and butter, it’s real crown jewel is it’s Cost Based Optimizer, which
analyzes SQL, rewrites and simplifies if possible, chooses indexes based on
cost, decides on driving tables, and all manner of other magic under the
hood.
Read MySQL’s documentation and you
will find admissions to performance biases and these are the types of vendor
specific details which make optimization and
performance tuning complex on any platform. MySQL’s
stated maximum is 61 tables in any JOIN or in a VIEW. Again
I personally feel a query with that many tables would be unwieldy in any
application anyway, so as mentioned above it is indeed the optimizer that wins
the day not the query max table specs, and so on.
Versions of 8.x Postgresql support the full SQL92
specification with few limitations. Again, I think where you’ll see one database standing out over the others is in
areas of optimization. Complex queries get messy, and query plans become
your best friends for diagnosing performance bottlenecks.
Index Types
Index technology is very crucial to database performance, and Oracle has plenty of options to choose from. There are a plethora of different index types from standard b-tree, to reverse key, function- based, the oft-misused bitmap indexes, and even index-only tables. As add-on technologies, the DBAs have Oracle Text available, providing indexes, which allow you to search CLOB (character large objects), and Oracle Spatial providing indexes for location based data.
In MySQL, we find b-tree, hash, fulltext, and GIS indexes (for location based data). There are also cluster indexes, but if my experience with Oracle is any indication, these are not often relevant for most applications. Heck, most of the time b-tree indexes are the only ones I see in Oracle, MySQL or Postgres applications. In addition, although for instance, function-based indexes are not available in MySQL, they can be simulated by creating another column with the data using that function, and then adding a trigger to keep it populated.
Postgresql offers b-tree and hash, as well as r-tree, and its own custom GiST index type, which allows for user defined types, and function based indexes to be created. Oracle has offered a similar type of functionality where its function-based indexes can be used for pl/sql based functions instead of just the standard pre-defined system functions such as trunc, UPPER, that you might otherwise use. Beware though that indexes like these are likely to be extremely slow access-wise, and when it comes to getting data in and out of your table, you don’t even want the word slow to enter into the discussion.
Again, it is really the implementation, and how the optimizer chooses indexes where you really win out with Oracle.
Auditing
Oracle allows you to enable auditing to a table, or a file
via the audit trail facility. Once enabled, you can audit insert, update,
or delete on a specific table, for instance, or logins, or even all access by a particular user. There are quite a few
options, and it’s easy to enable.
Postgresql too has this functionality and it seems to be as
flexible and configurable as Oracle’s.
MySQL on the other hand does not seem to provide this out of
the box, however you can certainly build your
own stored procedures, and triggers to do what you like, and dump relevant info
into a table, it’s just a little more work.
Datatypes
Oracle, MySQL and Postgresql all support large binary and
text data up to 4GB. All of the datatypes we
know and love are also available, such as number, character, and date.
Each to some degree also offers some custom data types although I rarely see
these used in applications.
Now one thing I’ll add is that
Postgresql and MySQL have gotten beyond the legacy roots, and are not afraid to
implement a good auto incrementing column type which we use so often.
Oracle’s argument is that sequences do the job, and are more efficient, but
still. Oracle also doesn’t have a SET datatype, which is great to have. Nor does it have a
time-only datatype, which Postgresql and MySQL both
have. But functionally you’ll find you can do
everything you want with dates and time on all three of these database
platforms from working with timezones, to dealing
with intervals and so on.
Another thing I like about Postgresql and MySQL is their
support for various math optimal numeric types from smallint
to decimal, real, double, and so on. These
take advantage of the underlying architecture implementations, and match what
is available in a programming language like C for instance.
Transaction Support
In the database arena, proper transaction handling has
gotten the acronym to ACID compliant, which means Atomicity, Consistency,
Isolation, and Durability. Atomicity means a transaction is one complete
unit, all committed or all rolled back. Consistency means you move from
one *VALID* state to another, ie you implement proper
constraints that can enforce business logic. Isolation means one
transaction cannot see what another is doing, until it is complete
(committed). Durability means once committed, the change is permanent,
and *crucially* you are protected from hardware failure.
I have a few things to say about this issue, hopefully sidestepping the culture wars here. I’ve seen sweeping statements like any enterprise-class
application would never use a database that doesn’t implement full ACID
compliance, and non-transactional tables are completely useless.
These are fools statements. For instance, Oracle’s own performance
views in its data dictionary are not transactional. The
reason? Performance of the overall system would be abysmal if they
tried that. Secondly, they’re just not necessary
in that context. There are many applications like that. I read
about an airline ticketing system, which needed to upgrade and add a second
server for hosting Oracle. They were looking at all the associated
licensing costs for the software, and hardware costs of big iron. Then
they stepped back and looked at the application. Someone had the bright
idea to realize that 90% of the traffic on the airline site was browsing for
flights (read-only) and only 10% was for actually
buying the ticket. So, they setup a
fleet of low-cost MySQL servers for browsing flights, and they reroute requests
to the big-iron Oracle box for doing the ticketing. What a great hybrid
solution!
Yes MySQL has come a long way in
the transaction department with its InnoDB
tables. That may explain why Oracle purchased Innobase.
Some still argue that MySQL is just an SQL interface for LDAP or NFS.
Nevertheless, MySQL has indeed come a long way and continues to challenge the
detractors.
Postgresql is more complete in this regard, so I would say
primarily you will only see performance differences from Oracle, and that’s about it.
Conclusion
As you can see from our cross-section of database platforms,
there is a lot to consider when choosing a database platform. From feature completeness, to vendor support and community
support, to performance and optimization. Don’t
over invest before you’ve learned enough about the application you’re building,
and what’s its real requirements are. In the end those may be nebulous, and
hard to pin down, but with a little creativity, and hard thinking on the topic,
and a good development sandbox, you should be able to flesh out a solution
which is both cost-effective, and robust.