Oracle 10gR2 Transportable Tablespace Enhancements: Tablespace Versioning and Data Jukeboxes

Synopsis. Oracle 10g Release 2 (10gR2) augments
the capabilities of Transportable Tablespaces to provide the ultimate in data
availability: the capability to store and retrieve multiple versions of any
tablespace’s data. The final article in this series discusses how to utilize
these new features to create a data “jukebox,” as well as how to transport
tablespaces directly from Recovery Manager backup files without incurring any
database unavailability.

In the previous
article, I demonstrated how Oracle 10gR1’s new transportable tablespace
features make it simple to “plug in” a single tablespace into another
database, even when that tablespace is on a different platform and operating
system. I also showed that with a few simple commands, Oracle 10gR1 can easily
transport an entire database between platforms.

Although these enhancements expand greatly the
flexibility and range of transportable tablespaces, they do have one
unfortunate drawback: the database on the source platform is either slightly or
completely unavailable until transport operations are completed. For example, to
transport a tablespace between platforms, that tablespace must be put into READ
ONLY mode on the source platform, and to transport a database between
platforms, the entire database must be opened in READ ONLY mode on the source
platform.

While these interruptions in availability are certainly
temporary, it does mean that some operations are prohibited while Oracle
prepares the necessary transport scripts and the appropriate datafile(s) are
copied between the source and target platforms. Many IT shops have increasingly
adopted extremely tight time frames under which scheduled maintenance may
occur; in fact, the trend toward 100% uptime has continued to accelerate over
the past several years. Therefore, I may simply be unable to limit my database’s
availability for any length of time while I prepare it for transportation to
another platform.

Transporting a Tablespace from RMAN Backups
Tablespace Versioning: One Tablespace, Many Versions
Building a Data “Jukebox” with DBMS_FILE_GROUP
Creating Tablespace Versions With Tablespace Cloning
“Plugging In” a Tablespace Version
“Unplugging” a Tablespace Version
Managing the Data Jukebox
Querying the Data Jukebox’s Contents
Conclusion

Transporting a Tablespace from RMAN Backups

Thankfully, Oracle 10gR2 offers a new transportable
tablespace feature that addresses this concern: the ability to create a
transportable tablespace set directly from existing RMAN backup sets.
Oracle leverages all available RMAN backups to restore a copy of the
tablespaces’ datafile(s), and then it simply applies all available archived
redo logs and online redo logs to roll forward all available transactions to
bring the tablespace up to its most current status. Even better, I can override
this default behavior and create the transportable tablespaces as of a
particular log sequence number, timestamp, or SCN.

To illustrate, in Listing
2.1 I have created one additional tablespace, LMT_SFH, that contains one new table, SH.SALES_FORCE_HIERARCHY,
that is then populated with the contents of the current sales force hierarchy.
Once the tablespace and its objects were created, I then created an RMAN backup
of the new tablespace for its eventual transport.

Listing
2.2 shows the RMAN command script that creates a transportable
tablespace from the available RMAN backups using the new TRANSPORT TABLESPACE
command. Note that RMAN automatically determines which backups, archived redo
logs, and online redo logs to use to recover the tablespace to the appropriate
point in time. RMAN then automatically recovers the LMT_SFH tablespace by creating an auxiliary
instance in the directory specified by the AUXILIARY DESTINATION keyword, and the
transportable tablespace itself is stored in the directory specified by the TABLESPACE DESTINATION
keyword. Output
2.1 shows the results of executing this script in the Windows NT
environment, and Output
2.2 shows the similar results when run in a Linux environment.

Tablespace Versioning: One Tablespace, Many Versions

One of the biggest advantages that IBM mainframe file
storage systems have over UNIX-based or NT-based file storage systems – and one
that I truly miss! – is the capability to store

multiple versions of a file in a generation data group
(GDG). When I wanted to keep, say, up to five different versions of a daily
payroll transaction file, I would simply create a base GDG entry, and then tell
the mainframe’s operating system that I wanted to add a new version of the file
into that GDG. As each new version of the file arrived and was placed into the
GDG, the OS kept track of the total number of versions available, and when the
sixth version of the file arrived, the original version simply “rolled off” the
GDG.

The good news is that Oracle 10gR2 has implemented a similar
structure to the GDG known as a file group repository that lets me keep
multiple versions of a file within that repository. Even better, I can now
store several related files together — collectively known as a version
– within a file group repository so that at a later time, I can retrieve and
restore those files for use. This feature is at the heart of a new set of tools
designed to facilitate a concept called information provisioning: the
ability to create, retain, recall, and manage multiple versions of the same
data structures over time. (Section 8 of the Oracle 10gR2 Streams Concepts and
Administration guide provides an excellent high-level discussion of
information provisioning and how these new features fulfill those requirements,
by the way.)

Finally, Oracle 10gR2 gives me one more powerful tool for
information provisioning: the ability to create a file group repository that
stores different versions of tablespace sets. In this context, a tablespace
version includes all the tablespaces’ datafiles, all DataPump
metadata that was exported for those tablespaces, and the DataPump export
log file generated when I created the tablespace version. These
tablespace versions can be stored within the same Oracle 10gR2 database that
typically needs to “plug in” the selected version of the tablespace; however,
they can also be stored in a remote Oracle 10gR2 database to provide a central
repository for retrieval of tablespace versions for direct plug-in” to any
other Oracle 10gR2 database.

With this feature, I can now retain a plethora of
information over time, but manage it easily should it need to be retrieved for
use in the near (or far!) future. Here are a few business scenarios that
benefit from information provisioning through tablespace versioning:

Retaining historical data to meet long-term legal
requirements. The onset of Sarbanes-Oxley’s Section 404 requirements have
shown in recent years that it is important to retain data for long periods of
time to handle requests from our auditors. Since Section 404 requirements tend
to be vague, it is not unlikely that we could be asked to provide data from
several years (or even decades) hence. Since it is impractical to keep that
data inside one database, some sort of repository structure simply makes good
business sense. Some industries (financial, banking, and insurance, to name
just a few) also require long-term data retention to handle transaction queries
from dark in a client’s past, to enable tracking of fraudulent claims, or to
comply with US federal or state government guidelines and requirements.

Creating a “fallback position” during large-scale
software deployments. I have participated in many deployments of complex
applications during my career, and that means I have seen my share of
occasionally horrendous failures during these deployments, too. During those
failures, I had wished that I had a reasonably quick method to roll back all
the changes made to the database during the deployment of the application
release, perhaps to a point just prior to the related changes. With tablespace
versioning, I can now simply create versions of the affected tablespace sets
just before the application changes are deployed, and that gives me a fallback
position in case my application developers and quality assurance analysts have
completely dropped the ball during their development and testing efforts.

Building “data jukeboxes” for complex data warehousing
requirements. As data warehouses continue to grow ever larger into the petabyte
and exabyte range, many DBAs have turned to partitioning tables as a solution
for purging old data that’s no longer deemed useful. Once the old data has been
rolled off from a table partition into just another table, it can be exported
and then deleted from the database. However, I have been in situations when my
application users suddenly need to get access to the data that has just been
purged with this method – perhaps to fulfill an emergency audit request, or
some type of specialized data warehousing report that needs all available data.

These special requests can be time-consuming to complete
because the data needs either to be restored to the same database from exports
or backups, or restored to an auxiliary instance. With tablespace versioning,
however, I can create what I like to call a data jukebox that has all
available data stored within separate file groups. I can then easily “plug in”
just the needed versions of the tablespace(s) to an auxiliary instance, give my
users access privileges to that auxiliary instance, and once their reporting
requests are complete, I can just as easily “unplug” the tablespace versions
from the auxiliary instance.

Building a Data “Jukebox” with DBMS_FILE_GROUP

To illustrate how tablespace versioning can work to my
advantage in this third scenario, I will construct a data jukebox to supply
information provisioning capabilities for a selected set of tablespaces. Before
I start, though, let’s review the four components of any tablespace version and
file group, from the highest to lowest level of granularity:

File Group Repository. At the highest level is the file
group repository. When a file group repository is storing tablespace versions, it
is also termed a tablespace repository.
File Group. Each file group repository consists of
one or more file groups.
Versions. Multiple versions of each
tablespace set can be created within the file group. Each version corresponds
to a specific directory within the platform’s file system.
Files. Each tablespace version consists of three files:
the actual copy of the tablespace’s datafile, the DataPump export
dumpset that contains the metadata for that tablespace, and the DataPump
export log file that lists how DataPump exported the metadata for
the tablespace set.

To initialize the repository, I will create some new DIRECTORY objects:

DBFILES_ORCL102
to represent the directory in which all of the datafiles for my database’s
tablespaces are stored
RPTREPOS_TSV_DIR_V1
for storage of the first version of my tablespace version’s files
RPTREPOS_TSV_DIR_V2
for storage of the second version of my tablespace version’s files
RPTREPOS_TSV_DIR_V3
for storage of the third version of my tablespace version’s files

I will then call DBMS_FILE_GROUP.CREATE_FILE_GROUP to create a new
file group repository called SH.GRP_RPTREPOS.
Listing
2.3 contains the code that illustrates these initial housekeeping
tasks. Output
2.3 and Output 2.4
show the resulting entries that are written into my database’s alert log and
export log, respectively, when this code has been executed.

Creating Tablespace Versions With Tablespace Cloning

Now that my tablespace repository has been set up, I will
use the tablespaces I created for demonstration purposes for both articles in
this series as my sources for versioned tablespaces. I will invoke the DBMS_STREAMS_TABLESPACE_ADM.CLONE_TABLESPACES procedure
to clone a copy of the LMT_REF, LMT_SFH,
and LMT_XACT
tablespaces into the tablespace repository as the first version of these
tablespaces.

Listing
2.4 shows the code I executed to achieve this. I have also included a
snippet from my database’s alert log and the export log file that Oracle
generated when it created this version of the tablespaces. Note that Oracle
automatically determined that the target tablespaces were not in READ ONLY mode
when the process began, so it automatically made the tablespaces READ ONLY,
then performed the export of the tablespace’s data files and metadata, and
finally brought the tablespaces back into READ WRITE mode when the tablespace
version was completed.

Once the first version is completed, I reloaded the SH.AGGR_SALES
table in the LMT_XACT
tablespace with new data reflecting the next period of sales, and then updated
a few rows in the SH.SALES_FORCE_HIERARCHY
table in the LMT_SFH tablespace.
I then created the second version of these tablespaces with a similar
invocation of the CLONE_TABLESPACES
procedure. The code I used to update the tables and then create the next
version of the tablespace set is shown in Listing 2.5.

“Plugging In” a Tablespace Version

Now that I have multiple tablespace versions to choose
from, I will utilize a sample Windows NT database as my target for plugging in
the tablespace versions. To accomplish this, I need to first create a database
link on the target server that references the database on the source server.
Then I will create directories on the target server with the same names as
those on the source server. (Of course, I can use different directories on the
target server for storage of the datafiles and metadata export files.)

Once these setup steps are complete, I will invoke DBMS_FILE_TRANSFER.GET_FILE
to pull the corresponding files from the source server’s directories and copy
them to my target database server’s directories. To attach the first version of
these tablespaces to the target database, I will run the DBMS_STREAMS_TABLESPACE_ADM.ATTACH_TABLESPACES
procedure. This procedure will plug these tablespaces into my target database
and make them immediately available for access. The code in Listing
2.6 shows how to perform all of these steps, and Output
2.5 shows the results of querying the first version of the tablespace.

To plug in the second version of the tablespace set,
however, I first need to drop the tablespaces, including their objects and
datafiles, on the target database. I then simply run the code shown in Listing 2.7
to plug in the second version of this tablespace set; Output
2.6 shows the results of querying the second version of these
tablespaces.

“Unplugging” a Tablespace Version

I can also “unplug” an existing tablespace set from a
database; however, be aware that this also has the possible undesirable
side-effect of physically removing the detached tablespaces from the database. I
have illustrated how to perform this at the end of Listing
2.7.

I will first remove the second version of the tablespace
set from the repository via the DBMS_FILE_GROUP.DROP_VERSION procedure, and then I
will recreate the version with the DBMS_STREAMS_TABLESPACE_ADM.DETACH_TABLESPACES
procedure. When it has completed executing, the tablespaces will only exist in
the tablespace repository and will have been physically removed from the
database.

Managing the Data Jukebox

DBMS_FILE_GROUP
offers several procedures for maintenance of the tablespace repository. Here is
a brief summary of some of the more pertinent procedures:

Procedure Name	Description
ALTER_FILE_GROUP	Modifies the attributes of a file group or tablespace repository, including retention period, minimum number of versions to retain, and maximum number of versions to retain
ALTER_FILE_VERSION	Modifies the attributes of a file version or tablespace repository version
ALTER_FILE	Modifies the attributes of a file that’s part of a file version or tablespace repository version
PURGE_FILE_GROUP	Purges all obsolete versions from a file group or tablespace repository based on the predefined retention criteria
REMOVE_FILE	Removes an existing file from a version in a tablespace repository or file group
DROP_VERSION	Removes a complete set of tablespaces within one tablespace version from a tablespace repository, or a complete set of files from a file group
DROP_FILE _GROUP	Removes all versions, all files, and the specified file group

Listing
2.8 shows sample invocations of these procedures.

Querying the Data Jukebox’s Contents

Finally, I have included some useful queries in Listing
2.9 for gathering information from the database’s data dictionary on
repository file groups, tablespace versions, and the files and directories that
correspond to these objects. Here’s a list of the
data dictionary views that will be helpful when working with tablespace
versioning:

View Name	View Description
DBA_FILE_GROUPS	Shows high-level metadata about file groups in the repository
DBA_FILE_GROUP_VERSIONS	Shows high-level metadata about versions stored in each file group
DBA_FILE_GROUP_EXPORT_INFO	Shows export-related information for each tablespace version
DBA_FILE_GROUP_FILES	Shows file sets for each versioned file group
DBA_FILE_GROUP_TABLESPACES	Shows information about the tablespaces present in the file set for versions that contain a DataPump dump file
DBA_FILE_GROUP_TABLES	Shows information about the tables that could be imported using the file set

Output
2.7 shows the results of running these queries against a sample of my
tablespace repository metadata.

Conclusion

Oracle 10gR2 provides powerful data retention
capabilities by allowing the storage of multiple versions of a tablespace – a
flexible solution for creating data jukeboxes for complex data warehousing
requirements, providing a “fallback position” during large-scale software
deployments, and retaining data historically to meet long term legal
requirements. The ability to perform cross-platform transfers of individual
tablespaces while incurring absolutely no downtime by creating transportable tablespaces
from RMAN backups offers extremely high availability for the source database as
well.

References and Additional Reading

Even though I have hopefully provided enough technical
information in this article to encourage you to explore with these features, I
also strongly suggest that you first review the corresponding detailed Oracle
documentation before proceeding with any experiments. Actual implementation of
these features should commence only after a crystal-clear clear understanding
exists. Please note that I have drawn upon the following Oracle 10gR2
documentation for the deeper technical details of this article:

B14191-02 Oracle Database
10gR2 Backup and Recovery Advanced User’s Guide

B14194-03 Oracle Database
10gR2 Backup and Recovery Reference

B14214-01 Oracle Database
10gR2 New Features Guide

B14229-01 Oracle 10gR2
Streams Concepts and Administration

B14231-01 Oracle Database
10gR2 Administrator’s Guide

B14257-01 Oracle 10gR2
Streams Advanced Queuing User’s Guide and Reference

B14258-01 PL/SQL
Packages and Types Reference (10gR2)

»

See All Articles by ColumnistJim Czuprynski