A fairly
common event in a database’s lifecycle is that of the migration from version "older"
to version "newer." Migrating from one version to another may be as
simple as exporting the old and importing into the new, but chances are there
is a lot more involved than first meets the eye. It is not uncommon to also
incorporate other significant changes such as an operating system change, a
schema modification, and changes to related applications. Each change has its
own inherent risk, but lumping them together in one operation flies in the face
of common sense, even more so without having tested the migration from start to
end. Amazingly, this situation occurs all too often.
From a
software engineering standpoint, is it safe or a best practice to heap so many
significant changes together in one step? Further, wouldn’t it seem obvious
that you would want to one, not only practice the migration, but two, test the
changes before actually applying them to your live/production environment?
Here is
something else to consider: break a dependency chain before it breaks you and
the migration process. Given the scenario of migrating from Oracle8i to 10g,
changing the underlying operating system to Linux from Solaris, modifying major
tables within a schema, and running newer/modified versions of related
applications, where are the places you can break the dependency chain? Put
another way, what are the safer/well-known/"charted by many others before
you" steps, and which are the uncharted/"applies only to you"
steps?
Separate the known from the unknown (where versus how)
For
non-leading edge/early adopter/early implementers ("sure, we’re more than
happy to provide our production environment as a beta testing ground for the
rest of the world") of a new version of Oracle, by the time you (and your
company) are ready to migrate from an older version of the RDBMS software to a
newer one, many others will have gone before you. Likewise, many others have
already crossed over to the dark side by having adopted Linux as their
underlying OS.
Considering
the combined RDBMS/OS version change as the known, this combination is also the
"where" part of "where versus how." Where your production
database lives in terms of version and OS is a logical place to break the
dependency chain. In an all-or-nothing do-or-die migration scenario, failure
means losing the time spent on what is perhaps the simplest part of the
scenario, namely, the hours spent on exporting and importing. If you can
separate the overall migration into at least two distinct stages, you will have
broken the dependency chain into smaller chains. The guiding principle/lesson
to be learned here is to move from point A to D via safe, incremental steps.
Unfortunately,
no one can authoritatively tell you what the best approach is for "how."
How your database operates with respect to schema and application interaction
is up to you to determine. Until you have thoroughly test driven schema and
application changes, this part of the overall migration process stays in the
realm of the unknown. Going live and finding out – for the first time – that
the new application/database code results in cascading triggers (thereby
bringing an instance to its knees, so to speak) is obviously a poor time to
become aware of this situation. Developers and testers using 100 records as a
test size when the production environment contains tens of millions records is
hardly a thorough test.
Export and Import via a proactive approach
With respect
to the export and import utilities, you do not have to accept the default
parameters. In fact, you owe it to yourself to use quite a few non-default
settings, and doing so makes the process easier to perform and saves time when
it is time do it for real. Let’s look at the indexfile parameter as a start. There
are (at least) four excellent reasons to use indexfile=filename on an import.
The first
is that the output documents the storage of tables and indexes (all or some,
depends on what was included in the export dump file). Where is your source
code for schema creation? If you do not have source code, this parameter (along
with a fairly simple query that returns everything else) goes a very long way
towards providing that information. The query part is spooling out the contents
of all or user_source. Code for packages, package bodies, procedures,
functions, and triggers will be included in the output. With very little
editing such as adding "create or replace" and cleaning up SQL*Plus
artifacts (i.e., feedback, heading, page breaks – if these weren’t suppressed
to begin with), you are left with the current source for a significant portion
of a schema.
The
second is that if you are going to do any housecleaning or rearranging of
tables and indexes, now is the time to edit the indexfile and update tablespace
mappings and storage parameters. If the logical layout is to remain the same,
then the third reason comes into play.
Separate
the tables from the indexes, that is, separate the SQL create statements (one
script for tables, the other for indexes). Do as much as you can on the target
database before it is time to do the actual migration. Part of this includes
creating the same/new tablespaces and running the create tables script. Run the
create tables script ahead of time for two reasons: one is to validate the
logical layout, the other is to help speed up the import (concepts question:
how does import work if an object exists or does not exist?).
The
fourth reason comes back to the indexes listed in the indexfile.
Performance-wise, when doing bulk inserts, is it better to have indexes or not?
What happens when a new record is inserted? One or more indexes have to be
updated (assuming there is at least a primary key for that record). Oracle’s recommendation
is that (for large databases) you should hold off on creating indexes until
after all the data has been inserted. Again, this comes back to the importance
of the indexfile because it is the link between export using "indexes=n"
(the default is y) and your being able to re-create the indexes after the data
has been loaded.
In Closing
In the
next article about migration, I will provide a checklist/plan covering steps
and procedures for the before, during, and after phases. Even if you are forced
to bundle together four major changes at the same time, there are proactive
measures you can take to mitigate and reduce risk.