Expert One-on-One: Oracle: Pt. 2
September 27, 2001
I was working on a project recently where they decided to use only the latest, greatest technologies: everything was coded in Java with EJBs. The client application would talk to the database server using beans — no Net8. They would not be passing SQL back and forth between client and server, just EJB calls using Remote Method Invocation (RMI) over Internet Inter-Orb Protocol (IIOP).
If you are interested in the details of RMI over IIOP you can refer to http://java.sun.com/products/rmi-iiop/.
This is a perfectly valid approach. This functionality works and can be extremely scalable. The people responsible for the architecture understood Java, EJBs, the protocols involved — all of that stuff. They felt they were in a strong position to successfully build such a project. When their application would not scale beyond a couple of users they decided that the database was at fault and severely doubted Oracle's claim to be the 'most scaleable database ever'.
The problem was not the database but a lack of knowledge of how the database worked — a lack of knowledge that meant that certain key decisions were made at design time that doomed this particular application to failure. In order to deploy EJBs in the database Oracle must be configured to run in MTS mode rather than dedicated server mode. What the team did not understand, crucially, was how using MTS with EJBs in the database would affect them. Without this understanding, and without a general knowledge of how Oracle worked, two key decisions were made:
These two seemingly minor decisions guaranteed that the project would fail — utterly guaranteed it. They made it so that a highly scalable database would literally fall over and fail with a very small user load. A lack of knowledge of how the database worked more then overwhelmed their intricate knowledge of Java beans and distributed processing. If they had taken time to learn a bit more about the way Oracle worked, and consequently followed the following two simple guidelines, then their project would have had a much better chance of success the first time out.
Do not run Long Transactions Under MTS
The decision to run 45+ second transactions under MTS betrayed a lack of understanding of what MTS was designed to do and how it works in Oracle. Briefly, MTS works by having a shared pool of server processes that service a larger pool of end users. It is very much like connection pooling — since process creation and management are some of the most expensive operations you can ask an operating system to perform, MTS is very beneficial in a large-scale system. So, I might have 100 users but only five or ten shared servers.
When a shared server gets a request to run an update, or execute a stored procedure, then that shared server is dedicated to that task until completion. No one else will use that shared server until that update completes or that stored procedure finishes execution. Thus, when using MTS your goal must be to have very short statements. MTS is designed to scale up On-Line Transaction Processing (OLTP) systems — a system characterized by statements that execute with sub-second response times. You'll have a single row update, insert a couple of line items, and query records by primary key. You won't (or shouldn't) run a batch process that takes many seconds or minutes to complete.
If all of our statements execute very rapidly, then MTS works well. We can effectively share a number of processes amongst a larger community of users. If, on the other hand, we have sessions that monopolize a shared server for extended periods of time then we will see apparent database 'hangs'. Say we configured ten shared servers for 100 people. If, at some point, ten people simultaneously execute the process that takes 45 seconds or longer then every other transaction (including new connections) will have to wait. If some of the queued sessions want to run that same long process, then we have a big problem — the apparent 'hang' won't last 45 seconds, it will appear to last much longer for most people. Even if we only have a few people wanting to execute this process simultaneously rather than ten, we will still observe what appears to be a large degradation in performance from the server. We are taking away, for an extended period of time, a shared resource and this is not a good thing. Instead of having ten shared servers processing quick requests on a queue, we now have five or six (or less). Eventually the system will be running at some fraction of its capability, solely due to this resource being consumed.
The 'quick and dirty' solution was to start up more shared servers, but the logical conclusion to this is that you need a shared server per user and this is not a reasonable conclusion for a system with thousands of users (as this system was). Not only would that introduce bottlenecks into the system itself (the more servers you have to manage — the more processing time spent managing), but also it is simply not the way MTS was designed to work.
The real solution to this problem was simple: do not execute long running transactions under MTS. Implementing this solution was not. There was more then one way to implement this and they all required fundamental architectural changes. The most appropriate way to fix this issue, requiring the least amount of change, was to use Advanced Queues (AQ).
AQ is a message-oriented middleware hosted in the Oracle database. It provides the ability for a client session to enqueue a message into a database queue table. This message is later, typically immediately after committing, 'dequeued' by another session and the content of the message is inspected. This message contains information for the other session to process. It can be used to give the appearance of lightening fast response times by decoupling the long running process from the interactive client.
So, rather than execute a 45-second process, the bean would place the request, along with all its inputs, on a queue and execute it in a loosely coupled (asynchronous) rather than tightly coupled (synchronous) fashion. In this way, the end user would not have to wait 45 seconds for a response — the system would apparently be much more responsive.
While this approach sounds easy — just drop in 'AQ' and the problem is fixed — there was more to it than that. This 45- second process generated a transaction ID that was required by the next step in the interface in order to join to other tables — as designed, the interface would not work without it. By implementing AQ, we were not waiting for this transaction ID to be generated here — we were just asking the system to do it for us at some point. So, the application was stuck. On the one hand, we could not wait 45 seconds for the process to complete, but on the other hand, we needed the generated ID in order to proceed to the next screen and we could only get that after waiting 45 seconds. To solve this problem, what we had to do was to synthesize a pseudo-transaction ID, modify the long running process to accept this generated pseudo ID and have it update a table when it was done, by which mechanism the real transaction ID was associated with the pseudo id. That is, instead of the transaction ID being an output of the long running process, it would be an input to it. Further, all 'downstream' tables would have to use this pseudo-transaction ID — not the real one (since the real one would not be generated for a while). We also had to review the usage of this transaction ID in order to see what impact this change might have on other modules and so on.
Another consideration was the fact that, while we were running synchronously, if the 45-second process failed then the end user was alerted right away. They would fix the error condition (fix the inputs, typically) and resubmit the request. Now, with the transaction being processed asynchronously under AQ, we don't have that ability. New functionality had to be added in order to support this delayed response. Specifically, we needed some workflow mechanism to route any failed transaction to the appropriate person.
The upshot of all this is that we had to undertake major changes in the database structure. New software had to be added (AQ). New processes had to be developed (workflows and such). On the plus side, the removal of 45 seconds of lag time from an interactive process not only solved the MTS architecture issue, it enhanced the user experience — it meant that the end user got much faster 'apparent' response times. On the down side, all of this delayed the project considerably because none of it was detected until immediately before deployment, during scalability testing. It is just too bad that it was not designed the right way from the beginning. With knowledge of how MTS worked physically, it would have been clear that the original design would not scale very well.