Creating NOT NULL Columns in Huge Oracle Tables - Page 2June 10, 2004 Parallel ProcessingParallel processing can be used for running huge updates. This is achieved by setting the table's degree to a higher value. This would result in an UPDATE statement being fired in multiple threads. The number of threads invoked would be equal to what is set as the table degree. The maximum threads used by Oracle does not exceed the value specified by the initialization parameter PARALLEL_MAX_SERVERS. The tables V$PX_PROCESS, V$PX_SESSION, V$PX_SESSTAT and V$PX_PROCESS_SYSSTAT are used to monitor parallel processing activities. Before running the update process, one large rollback segment is made available for it, all the rest are made offline. If using Undo tablespace, the size should be verified and increased if necessary. In my case, I keep a rollback segment of 500MB online. session 1:
SQL> select name, value from v$parameter
where name = 'parallel_max_servers';
NAME VALUE
-------------------------------------- --------------
parallel_max_servers 8
SQL> alter rollback segment rbs01 offline;
-- This is done for all rbs (not system!)
Session 2 (While the above statement is running):
SQL> select * from v$px_process;
SERV STATUS PID SPID SID SERIAL#
---- --------- ---------- --------- ---------- ----------
P000 IN USE 62 5478 26 19690
P001 IN USE 65 5480 81 946
P002 IN USE 68 5482 78 1465
P003 IN USE 70 5484 9 16174
P004 IN USE 71 5486 56 6246
P005 IN USE 72 5488 82 272
6 rows selected.
SQL> select sid, serial# from v$session where program like '%P00%';
SID SERIAL#
---------- ----------
9 16174
26 19690
56 6246
78 1465
81 946
82 272
6 rows selected.
SQL> select saddr, sid, serial#, server#, degree, req_degree
2 from v$px_session
3 where sid in (9, 26, 56, 78, 81, 82);
SADDR SID SERIAL# SERVER# DEGREE REQ_DEGREE
-------- ---------- ---------- ---------- ---------- ----------
8C0A7ED8 26 19690 1 6 6
8C0C4E0C 81 946 2 6 6
8C0C34C8 78 1465 3 6 6
8C09EFAC 9 16174 4 6 6
8C0B7B80 56 6246 5 6 6
8C0C5678 82 272 6 6 6
6 rows selected.
SQL> select * from v$px_process_sysstat;
STATISTIC VALUE
------------------------------ ----------
Servers In Use 6
Servers Available 0
Servers Started 18
Servers Shutdown 12
Servers Highwater 6
Servers Cleaned Up 0
Server Sessions 6
Memory Chunks Allocated 1
Memory Chunks Freed 0
Memory Chunks Current 1
Memory Chunks HWM 1
STATISTIC VALUE
------------------------------ ----------
Buffers Allocated 104146
Buffers Freed 104128
Buffers Current 18
Buffers HWM 28
15 rows selected.
Session 1:
The session completes update.
2205961 rows updated.
SQL> commit;
Commit complete.
SQL> alter table mtl_trx modify lovind default 'X' not null;
Table altered.
SQL> alter table mtl_trx noparallel;
Table altered.
SQL> select degree from dba_tables where table_name = 'MTL_TRX';
DEGREE
----------
1
The last step of setting the degree back to 1 should not be skipped, as it may affect online queries. The total process took around 16-18 minutes (excluding statistics generation). This can be further reduced by increasing the number of threads. Advantage: Oracle provided threading option is used, process time reduced. Disadvantage: Resource intensive, undo segment errors are common if sizing is not done before hand, no commits are possible in intervals, the complete process fails if a single thread returns an error (rolling back changes takes considerable time), temporary alteration of degree is required for the table (the initialization parameter PARALLEL_MAX_SERVERS may also need to be set). Exp/ImpData can be exported and re-imported with a default value on the column. I personally don't recommend this, as the idea of taking data out of the table in a production database is risky and sometimes dependencies exists between objects (e.g.: materialized views, foreign key relations etc.). A copy could be made for safety sake but this would depend on space availability. Running export at OS level: $ exp userid=apps/apps file=mtl_trx.dmp log=1.log tables=mtl_trx rows=y triggers=y grants=y indexes=y Export: Release 8.1.7.4.0 - Production on Wed May 26 15:06:32 2004 (c) Copyright 2000 Oracle Corporation. All rights reserved. Connected to: Oracle8i Enterprise Edition Release 8.1.7.4.0 - Production With the Partitioning option JServer Release 8.1.7.4.0 - Production Export done in US7ASCII character set and US7ASCII NCHAR character set About to export specified tables via Conventional Path ... . . exporting table MTL_TRX 2205961 rows exported Export terminated successfully without warnings. Instead of truncating, I rename the tables and the indexes and create a new structure. This will have the new column with the default value. SQL> rename mtl_trx to bkp_mtl_trx; Table renamed. SQL> alter index mtl_trx_u1 rename to bkp_mtl_trx_u1; Index altered. SQL> create table mtl_trx as select * from bkp_mtl_trx where 1 = 2; Table created. SQL> alter table mtl_trx add xtcind varchar2(1) default 'N' not null; Table altered. data imported back: $ imp userid=apps/apps file=mtl_trx.dmp log=2.log buffer=2097152 ignore=y commit =y tables=mtl_trx Import: Release 8.1.7.4.0 - Production on Wed May 26 15:59:18 2004 (c) Copyright 2000 Oracle Corporation. All rights reserved. Connected to: Oracle8i Enterprise Edition Release 8.1.7.4.0 - Production With the Partitioning option JServer Release 8.1.7.4.0 - Production Export file created by EXPORT:V08.01.07 via conventional path import done in US7ASCII character set and US7ASCII NCHAR character set . importing APPS's objects into APPS . . importing table "MTL_TRX" 2205961 rows imported Import terminated successfully without warnings. The new column is populated with the default value when the import takes place. The total process takes around 20-24 minutes (excluding statistics generation). Advantage: Process time reduced, commits are possible at intervals with BUFFER and COMMIT setting in import, no Undo-space issues. Disadvantage: Complete downtime is required, structural changes are done, sufficient space is required at both OS and database level, a risk involved as data is being moved out of the database, the more indexes that are present on the table the more the process time increases, connected objects like triggers and materialized views have to be evaluated for their role. |