Providing a definition for database workload can be a
daunting task, and much depends on what you’re trying to accomplish.
Early in my database administration days, I purchased a
book entitled ‘The Art of Computer Systems Performance Analysis’. That book
was very influential then, and probably still is, in shaping my understanding
of how to tune not only databases but complete systems. Part of the complete
picture of performance evaluation is picking a workload that is appropriate so
that you can effectively understand if your system is performing well, if your
tuning efforts are working, and if you can scale a system. With this article, I’d
like to dive into what a workload is as there seems to be some confusion in the
database community.
To me, it is always interesting to check a dictionary
definition first. Dictionary definitions are not always right within the context
of database systems, but they do tend to serve a purpose (as we will see). So, from
dictionary.reference.com we get the following:
Workload – The amount of
work that a machine produces or can produce in a specified time period.
This, to me, is a very interesting definition, and we should
not be confused or sidetracked by it. When evaluating this definition, we must
keep in mind that while a machine is mentioned, it in no way relates to a
computer or even a database. The machine here is static: it has no tuning options
and no chance of improved performance. For instance, a copier
can copy only so many pages per minute regardless of how high we stack the
paper. For a closed and static machine like a copier, the workload/load/work produced
is always the same.
When we look at computer systems, workload takes on a completely
new meaning. Workload is, in its purest sense, the total requests made by users
and applications of a system. For an Oracle database, the workload would be all
of the SQL statements that have been submitted to the instance, regardless of
whether they were executed, returned results back to their respective users,
or were even actually seen within the Oracle database itself. This concept is
very difficult for many to grasp, mostly because it is hard to tune
anything we cannot see. After all, we as database administrators can only tune
what we see.
Therefore, as database administrators, when all we are
working with is the database, we must begin to take off our purist hats and
take a more liberal definition on what workload means to us. In this case, we
need to define workload as what the database sees regardless of what is
happening outside the database. This means we must sample, measure, track, etc.
those statistics within Oracle that measure, quantify, or characterize what the
workload is. It is critical that we choose statistics within Oracle that
clearly measure workload, as opposed to statistics that depend on the
performance of our database system. For instance, CPU usage,
elapsed time, and various wait event times would not be good measures of
workload because they depend too greatly on the database system the SQL
statements are being executed on. More appropriate workload metrics would be
groupings of things like the number of SQL executions, active sessions, users, or
applications.
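As a concrete sketch, the following queries sample two such workload metrics from Oracle’s dynamic performance views (statistic names can vary slightly between Oracle versions, so treat this as illustrative rather than definitive):

```sql
-- Cumulative counts of SQL executions and user calls since instance
-- startup: sample twice and take the delta to get a per-interval rate.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('execute count', 'user calls');

-- Number of currently active user sessions, a point-in-time
-- snapshot of the demand being placed on the instance.
SELECT COUNT(*) AS active_sessions
FROM   v$session
WHERE  status = 'ACTIVE'
AND    type   = 'USER';
```

Sampling these on a fixed interval and keeping the deltas gives a workload time series that stays meaningful regardless of how well (or poorly) the system happens to be performing.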
When we define a workload, regardless of the workload
metric chosen, we must ensure that the workload we track can be related to the
resources being consumed within our database systems. When we do this, we can
then tell what performance impact is experienced, or will be experienced, as
the workload increases or decreases.
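One way to make that connection, assuming AWR is licensed and populated on your system, is to line up per-snapshot deltas of a workload statistic against a resource statistic (the statistic names here are illustrative and vary by version):

```sql
-- Per-snapshot delta of 'execute count' (a workload metric) alongside
-- 'CPU used by this session' (a resource metric), from AWR history.
SELECT snap_id,
       stat_name,
       value - LAG(value) OVER (PARTITION BY stat_name
                                ORDER BY snap_id) AS delta
FROM   dba_hist_sysstat
WHERE  stat_name IN ('execute count', 'CPU used by this session')
ORDER  BY snap_id, stat_name;
```

Plotting (or regressing) the resource deltas against the workload deltas shows roughly how much resource each unit of workload costs, which is exactly the relationship needed to predict the impact of workload growth.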
Ok, now I know what you are thinking here. You’re thinking
that since I’ve just tied workload to the resources being used, why can’t you just
measure the resources being used and forget about taking the purist point of view
of workload? Well, you are right in one way and wrong in another.
When you look at a database, it really is a closed system in
a sense. We have a static set of parameters that interact with
“mostly” static external resources such as CPU, memory, and disk. As with the
copier example mentioned previously, in a static system it doesn’t matter how
many applications we stack against our database; it can only respond
and return result sets to applications and users so fast. Thus, our consumption
of resources within the database can nearly be thought of as a set of workload
metrics; and if you’re only concerned about tuning your database, this is 100% ok
in my eyes.
However, in cases where you might want to do some
scalability testing or port to a new platform, it is really hard to determine if
scalability or performance is better, worse, or unchanged. This is where
trying to conform to a more purist workload definition is needed. Look at
Oracle’s Database Replay feature. While I have not used this feature, it is
aimed at exactly this use case. You need to capture a workload from one
system and then replay that workload on another system for testing, benchmarking,
load analysis, etc. This is the only way to say whether one system will perform
better than another.
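For reference, capturing a workload with Database Replay centers on the DBMS_WORKLOAD_CAPTURE package; a minimal sketch might look like the following (the capture name and directory object are hypothetical, and parameter details vary by Oracle version, so check the documentation before relying on this):

```sql
-- Start capturing the workload into a pre-created directory object.
BEGIN
  DBMS_WORKLOAD_CAPTURE.START_CAPTURE(
    name     => 'peak_hour_capture',   -- capture name (hypothetical)
    dir      => 'CAPTURE_DIR',         -- directory object (hypothetical)
    duration => 3600);                 -- seconds; NULL runs until stopped
END;
/

-- Stop the capture explicitly if no duration was given.
BEGIN
  DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE;
END;
/
```

The captured files are then moved to the test system, where the companion DBMS_WORKLOAD_REPLAY package processes and replays them.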
That being said, if you look at Oracle’s Database Replay
feature, you will notice that even it does not hold to a completely purist view
of workload capture. It too is restricted to capturing only those requests seen
by the database. Not only that, but Database Replay has some restrictions that might
skew or negate any attempts to benchmark workloads on other systems.
So, what are we left with? It seems to me that regardless of
how you define workload, the only real imperative is that, at the end of the
day, you are able to tie the resources consumed within the database to
higher-level SQL statements, users, applications, server farms, etc. Those high-level
ties provide extreme power in tuning a database properly.