Archiving Oracle Databases

Archiving older data is a complex task; local, national and sometimes international regulations dictate when, how and for how long the archived data must remain available. Add to that the seemingly insurmountable task of storing all of this data electronically and what appears, from those outside the IT arena, to be a simple act can end up as anything but simple. Within the context of an Oracle database there are methods of archiving data, some simple, some a bit more complex but still within the realm of possibility. Let’s look at those options and what they can, and cannot, offer.

The first option that comes to mind (mine, anyway) involves partitioning, an Enterprise Edition option (which should not be a surprise since companies who generate reams of data to archive usually install this edition). Archiving in this scenario is fairly easy: convert the relevant partitions to stand-alone tables in their own tablespace, separate from the ‘live’ production data. If this data is now on its own storage it can even be moved to another database server to facilitate access and not impact daily production. Let’s look at the steps involved with this option. First let’s create a partitioned table:

 CREATE TABLE archive_test
dusty DATE,
vol VARCHAR2(60),
PARTITION really_old
VALUES LESS THAN ( TO_DATE('01-apr-1999','dd-mon-yyyy'))
TABLESPACE older_than_dirt,
PARTITION quite_old
VALUES LESS THAN ( TO_DATE('01-jul-2004','dd-mon-yyyy'))
TABLESPACE old_as_dirt,
PARTITION sorta_new
VALUES LESS THAN ( TO_DATE('01-oct-2009','dd-mon-yyyy'))
PARTITION really_new
VALUES LESS THAN ( TO_DATE('01-jan-2012','dd-mon-yyyy'))

-- Create local prefixed index

CREATE INDEX i_archives_l ON archive_test ( dusty,vol )
PARTITION i_otd_one TABLESPACE i_otd_one,
PARTITION i_oad_two TABLESPACE i_oad_two,
PARTITION i_nwr_three TABLESPACE i_nwr_three,
PARTITION i_nwst_four TABLESPACE i_nwst_four

The last partition of our table is set to accept all data through 01/01/2012 so archiving data simply involves converting the desired partition to a stand-alone table, preferably stored on a different diskgroup or array than the current production data. For the sake of illustration let’s put the destination tablespace, ARCHIVED_TS, in a separate ASM diskgroup (doing this allows for the movement of the diskgroup to another physical server for use by a separate Oracle instance). To archive the partition REALLY_OLD to a stand-alone table named REALLY_OLD_TBL:

-- Create empty table matching partition definition
create table really_old_tbl
dusty date,
vol varchar2(60),
info number
) tablespace archived_ts; -- Tablespace created in separate ASM diskgroup or on separate storage

-- Check row count in desired partition

select count(*)
from archive_test partition(really_old);

-- Move partition data to stand-alone table
alter table archive_test
exchange partition really_old with table really_old_tbl with validation;

-- Verify all rows written to destination table

select count(*)
from really_old_tbl;

-- Drop now-empty partition presuming row counts match
alter table archive_test drop partition really_old;

The data is now archived to a separate table and will no longer be available in ARCHIVE_TEST; this, however, makes partition QUITE_OLD the first partition resulting in any DUSTY value less than the upper partition limit being stored there, including values that should have been in REALLY_OLD. This may not be an issue as values that old may no longer be generated but it is an aspect to consider when archiving older data from a partitioned table.

A second method is available for those not using partitioning, which involves creating an archive table from the source table by selecting the desired data (this will also work for partitioned tables and may be the option of choice if a single archive table is desired as the above illustrated method creates a new table for each partition to be archived):

-- Create table and copy data
create table really_old_tbl
tablespace archived_ts
as select *
from archive_test
where dusty <= [some date here];

-- Verify all data copied successfully
select *
from archive_test
where (dusty,vol,info) not in (select * from really_old_tbl)
and dusty <= [some date here];

-- Delete from source table
from archive_test
where dusty <= [some date here];


The data is now archived to a separate table. Changing the create table statement to an insert statement can allow for ‘newer’ archived data to be stored in the same archive table; again a similar condition exists as any data within the archived range can still be inserted into the source table as no date limits may exist to restrict inserts. A trigger can be used to restrict such inserts as shown below:

SQL> create or replace trigger ins_chk_trg
2 before insert on archive_test
3 for each row
4 declare
5 mindt date;
6 begin
7 select max(dusty) into mindt from really_old_tbl;
8 if :new.dusty <= mindt then
9 raise_application_error(-20987,'Data ('||:new.dusty||') outside of acceptable date range', true);
10 end if;
11 end;
12 /

Trigger created.

SQL> insert into archive_test
2 values (to_date('&mindt', 'RRRR-MM-DD HH24:MI:SS') - 1, 'Test value x', -1,to_date('&mindt', 'RRRR-MM-DD HH24:MI:SS') );
old 2: values (to_date('&mindt', 'RRRR-MM-DD HH24:MI:SS') - 1, 'Test value x', -1,to_date('&mindt', 'RRRR-MM-DD HH24:MI:SS') )
new 2: values (to_date('1999-03-08 14:17:40', 'RRRR-MM-DD HH24:MI:SS') - 1, 'Test value x', -1,to_date('1999-03-08 14:17:40', 'RRRR-MM-DD HH24:MI:SS') )
insert into archive_test
ERROR at line 1:
ORA-20987: Data (07-MAR-99) outside of acceptable date range
ORA-06512: at "BLORPO.INS_CHK_TRG", line 6
ORA-04088: error during execution of trigger 'BLORPO.INS_CHK_TRG'


Such a trigger can be used on partitioned and non-partitioned tables to police the inserts and reject those bearing dates present in the archive table. As the archive table data increases (due to subsequent inserts) the trigger will recognize the new maximum and use it to reject inserts.

Lest we forget the external utilities both exp/imp and expdp/impdp can be used to archive data; the QUERY option to both exp and expdp allows extraction of specific data from a given table so that only the oldest data will be exported. Oracle recommends using a parameter file when using the QUERY option to avoid operating system specific escape characters. Additionally, expdp allows for one query per table and multiple table:query pairs when specified with the schema.table:query format. A sample parameter file is shown below:

TABLES=employees, departments
QUERY=employees:'"WHERE department_id > 10 AND salary > 10000"'
QUERY=departments:'"WHERE department_id > 10"'

This creates tables with the source names and a limited subset of the source data, which can be imported into a different schema or different database. The imported tables can be renamed with the usual command (in releases 10g and later):

rename employees to employees_arch;
rename departments to departments_arch;

If you’re using a release older than 10g the process is a bit more time consuming, involving creating a new table with the desired name from the imported table then copying any index/constraint definitions to the new table, finally dropping the imported table once you’re certain the ‘renamed’ table has all necessary indexes and constraints in place.

Archiving older data is not a terribly difficult task (at least in an Oracle database) but it does take planning and attention to detail to ensure all of the desired data is properly archived and available for the end users. Maintaining the archived table (or tables) also takes planning as applications may need to be written to directly access the archive and, in the case of multiple archive tables, be ‘smart’ enough to be able to access the newer additions as they arrive. Remember, too, that the specifications for the archiving revolve around local, state, federal (in the U.S.) and possibly international regulations and the archiving scheme must be flexible enough to provide the required ‘window’ of access. It’s also true that archived data may outlast the regulations that established it (unless legal issues preclude maintaining the archive beyond the prescribed date range); in such cases a sound storage strategy is a must and it’s not unusual for archived data to go from Tier II (slower, cheaper disk) storage to Tier I (tape) as long as the data is still accessible as access speed is not a requirement for archived data.

See all articles by David Fitzjarrell

David Fitzjarrell
David Fitzjarrell
David Fitzjarrell has more than 20 years of administration experience with various releases of the Oracle DBMS. He has installed the Oracle software on many platforms, including UNIX, Windows and Linux, and monitored and tuned performance in those environments. He is knowledgeable in the traditional tools for performance tuning – the Oracle Wait Interface, Statspack, event 10046 and 10053 traces, tkprof, explain plan and autotrace – and has used these to great advantage at the U.S. Postal Service, American Airlines/SABRE, ConocoPhilips and SiriusXM Radio, among others, to increase throughput and improve the quality of the production system. He has also set up scripts to regularly monitor available space and set thresholds to notify DBAs of impending space shortages before they affect the production environment. These scripts generate data which can also used to trend database growth over time, aiding in capacity planning. He has used RMAN, Streams, RAC and Data Guard in Oracle installations to ensure full recoverability and failover capabilities as well as high availability, and has configured a 'cascading' set of DR databases using the primary DR databases as the source, managing the archivelog transfers manually and montoring, through scripts, the health of these secondary DR databases. He has also used ASM, ASMM and ASSM to improve performance and manage storage and shared memory.

Latest Articles