Oracle Hybrid Columnar Compression in Exadata

Hybrid Columnar Compression (HCC) is a compression option available only for tables residing on storage that supports it, which besides Exadata includes the Pillar Axiom, the SPARC SuperCluster and the Sun ZFS Storage Appliance. If you try it on any other system you will receive the following informative message:

SQL> alter table emp move compress for query high;
alter table emp move compress for query high
            *
ERROR at line 1:
ORA-64307: hybrid columnar compression is not supported for tablespaces on this storage type


SQL>

The phrase ‘this storage type’ means any storage other than those listed at the beginning of this section.
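
Even without qualifying storage you can estimate how well a table would compress under HCC: from 11.2.0.2 onward the compression advisor, DBMS_COMPRESSION.GET_COMPRESSION_RATIO, can estimate the HCC levels on ordinary storage. The anonymous block below is a minimal sketch using the 11gR2 parameter and constant names; the 'USERS' scratch tablespace is an assumption and should be replaced with a tablespace that exists in your database:

set serveroutput on
declare
    l_blkcnt_cmp    pls_integer;
    l_blkcnt_uncmp  pls_integer;
    l_row_cmp       pls_integer;
    l_row_uncmp     pls_integer;
    l_cmp_ratio     number;
    l_comptype_str  varchar2(100);
begin
    --
    -- Estimate the QUERY HIGH ratio for EMP; 'USERS' is a placeholder
    -- scratch tablespace where the advisor builds its sample tables
    --
    dbms_compression.get_compression_ratio(
        scratchtbsname => 'USERS',
        ownname        => user,
        tabname        => 'EMP',
        partname       => null,
        comptype       => dbms_compression.comp_for_query_high,
        blkcnt_cmp     => l_blkcnt_cmp,
        blkcnt_uncmp   => l_blkcnt_uncmp,
        row_cmp        => l_row_cmp,
        row_uncmp      => l_row_uncmp,
        cmp_ratio      => l_cmp_ratio,
        comptype_str   => l_comptype_str);

    dbms_output.put_line('Estimated ratio for '||l_comptype_str||': '||round(l_cmp_ratio,1));
end;
/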

HCC works differently from Basic or OLTP compression: rather than compressing rows within a single block, it reorganizes data into Compression Units (CUs), logical structures spanning multiple blocks in which column values are stored together. There are two basic types under the HCC banner, QUERY and ARCHIVE, each offering HIGH and LOW options. The first pair of options we will discuss falls under the QUERY type; we will then discuss the ARCHIVE type.
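
All four levels are requested with the same ALTER TABLE ... MOVE statement used throughout this article; only the COMPRESS FOR clause changes, and the MOVE always leaves existing indexes unusable:

alter table emp move compress for query low;
alter table emp move compress for query high;
alter table emp move compress for archive low;
alter table emp move compress for archive high;

--
-- MOVE relocates the segment, so any indexes on the table are left
-- unusable and must be rebuilt afterwards
--
alter index emp_idx rebuild;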

Query

This type is the less aggressive of the two, but it still initially compresses data more than OLTP compression. We first compress our EMP table for query high:

SQL> 
SQL> --
SQL> -- Current storage for the EMP table
SQL> -- (this is simply a test table for this example)
SQL> --
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                   713031680
EMP_IDX                               478150656
SALGRADE                                  65536

Elapsed: 00:00:00.82
SQL> 
SQL> --
SQL> -- Compress the table for query high (use HCC)
SQL> --
SQL> -- Note elapsed time to compress
SQL> --
SQL> alter table emp move compress for query high;

Table altered.

Elapsed: 00:00:35.65
SQL> 
SQL> --
SQL> -- Index is now invalid
SQL> --
SQL> -- Must rebuild to make it usable
SQL> --
SQL> -- Note elapsed time
SQL> --
SQL> alter index emp_idx rebuild;

Index altered.

Elapsed: 00:01:13.70
SQL> 
SQL> --
SQL> -- Current compression type, storage for table/index
SQL> -- initially after compression is enabled
SQL> --
SQL> select table_name, compression, compress_for
  2  from user_tables;

TABLE_NAME                     COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
DEPT                           DISABLED
BONUS                          DISABLED
SALGRADE                       DISABLED
DUMMY                          DISABLED
EMP                            ENABLED  QUERY HIGH

Elapsed: 00:00:00.20
SQL> 
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                     8388608
EMP_IDX                               260046848
SALGRADE                                  65536

Elapsed: 00:00:00.03
SQL>

The resulting size is much smaller than either Basic or OLTP compression produces. Even at the QUERY LOW level the table is still smaller than OLTP compression can make it:

SQL> 
SQL> --
SQL> -- Initial storage
SQL> --
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                   713031680
EMP_IDX                               478150656
SALGRADE                                  65536

Elapsed: 00:00:00.25
SQL> 
SQL> --
SQL> -- Compress for QUERY LOW
SQL> --
SQL> -- Note elapsed time
SQL> --
SQL> alter table emp move compress for query low;

Table altered.

Elapsed: 00:00:16.16
SQL> 
SQL> alter index emp_idx rebuild;

Index altered.

Elapsed: 00:00:43.08
SQL> 
SQL> --
SQL> -- These figures are the same as those generated
SQL> -- AFTER the HCC compressed data was updated the first time
SQL> --
SQL> select table_name, compression, compress_for
  2  from user_tables;

TABLE_NAME                     COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
DEPT                           DISABLED
BONUS                          DISABLED
SALGRADE                       DISABLED
DUMMY                          DISABLED
EMP                            ENABLED  QUERY LOW

Elapsed: 00:00:00.02
SQL> 
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                    14680064
EMP_IDX                               260046848
SALGRADE                                  65536

Elapsed: 00:00:00.02
SQL>
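
Since USER_SEGMENTS reports bytes, a slight variation on the storage query used above (a convenience query only, not part of the original runs) makes the comparison easier to read in megabytes; the original 713031680 bytes work out to 680 MB, against 8 MB for QUERY HIGH and 14 MB for QUERY LOW:

select segment_name,
       round(sum(bytes)/1048576, 2) total_mb
from user_segments
group by segment_name
order by segment_name;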

Archive

The ARCHIVE type is definitely the more aggressive of the two in terms of space savings, but it is intended for read-only data that is, or will be, archived for occasional use; it certainly isn't intended for actively updated tables. We'll compress our standard EMP table for ARCHIVE HIGH and see what space savings are generated:

SQL> --
SQL> -- Current storage for the EMP table
SQL> -- (this is simply a test table for this example)
SQL> --
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                   713031680
EMP_IDX                               478150656
SALGRADE                                  65536

Elapsed: 00:00:00.02
SQL> 
SQL> --
SQL> -- Compress the table for archive high
SQL> --
SQL> -- Note elapsed time to compress
SQL> --
SQL> alter table emp move compress for archive high;

Table altered.

Elapsed: 00:00:38.55
SQL> 
SQL> --
SQL> -- Index is now invalid
SQL> --
SQL> -- Must rebuild to make it usable
SQL> --
SQL> -- Note elapsed time
SQL> --
SQL> alter index emp_idx rebuild;

Index altered.

Elapsed: 00:00:39.45
SQL> 
SQL> --
SQL> -- Current compression type, storage for table/index
SQL> -- initially after compression is enabled
SQL> --
SQL> select table_name, compression, compress_for
  2  from user_tables;

TABLE_NAME                     COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
DEPT                           DISABLED
BONUS                          DISABLED
SALGRADE                       DISABLED
DUMMY                          DISABLED
EMP                            ENABLED  ARCHIVE HIGH

Elapsed: 00:00:00.02
SQL> 
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                     4194304
EMP_IDX                               260046848
SALGRADE                                  65536

Elapsed: 00:00:00.01
SQL>

The space savings are substantial, taking the table from its original size of 680 megabytes down to 4 megabytes, a savings of 99.41 percent. Using ARCHIVE LOW instead of ARCHIVE HIGH still produces impressive results:

SQL> --
SQL> -- Initial storage
SQL> --
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                   713031680
EMP_IDX                               478150656
SALGRADE                                  65536

Elapsed: 00:00:01.31
SQL> 
SQL> --
SQL> -- Compress for ARCHIVE LOW
SQL> --
SQL> -- Note elapsed time
SQL> --
SQL> alter table emp move compress for archive low;

Table altered.

Elapsed: 00:00:34.16
SQL> 
SQL> alter index emp_idx rebuild;

Index altered.

Elapsed: 00:00:48.44
SQL> 
SQL> --
SQL> -- These figures are the same as those generated
SQL> -- AFTER the HCC compressed data was updated the first time
SQL> --
SQL> select table_name, compression, compress_for
  2  from user_tables;

TABLE_NAME                     COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
DEPT                           DISABLED
BONUS                          DISABLED
SALGRADE                       DISABLED
DUMMY                          DISABLED
EMP                            ENABLED  ARCHIVE LOW

Elapsed: 00:00:00.03
SQL> 
SQL> select segment_name, sum(bytes) total_space
  2  from user_segments
  3  group by segment_name;

SEGMENT_NAME                        TOTAL_SPACE
----------------------------------- -----------
DEPT                                      65536
DUMMY                                     65536
EMP                                     8388608
EMP_IDX                               260046848
SALGRADE                                  65536

Elapsed: 00:00:00.02
SQL>

Using ARCHIVE LOW, the table ends up twice as large as it did with ARCHIVE HIGH, but it is still 98.82 percent smaller than it was before compression. This is the same level of compression afforded by QUERY HIGH.
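
To put the four runs side by side, here are the EMP sizes reported by USER_SEGMENTS throughout this article, with the savings computed from those figures:

COMPRESSION LEVEL        EMP SIZE (BYTES)   EMP SIZE (MB)   SPACE SAVED
----------------------   ----------------   -------------   -----------
None (original)                 713031680             680            --
QUERY HIGH                        8388608               8       98.82 %
QUERY LOW                        14680064              14       97.94 %
ARCHIVE HIGH                      4194304               4       99.41 %
ARCHIVE LOW                       8388608               8       98.82 %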

Updates to either of these compression types introduce some interesting effects: the updated data silently changes compression type to OLTP (even though the data dictionary still reports the original HCC setting), and the table consumes more space, although not as much as it did before compression. OLTP compression provides automatic re-compression in what I would call a ‘batch mode’: as a block fills to its threshold (governed by PCTFREE) it is compressed, making room for more uncompressed rows. The cycle continues until no additional data can fit in the block. This leaves blocks in mixed states, some compressed, some not, so the table is not likely to be as small after updates as it was initially.
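
The silent switch can be confirmed at the row level with DBMS_COMPRESSION.GET_COMPRESSION_TYPE, which reports how an individual row is actually stored regardless of what USER_TABLES shows. The anonymous block below is a minimal sketch using the 11gR2 constant names; it examines one arbitrary row of EMP and is not part of the original test runs:

set serveroutput on
declare
    l_rowid  rowid;
    l_type   number;
begin
    --
    -- Pick an arbitrary row; after updates, changed rows typically
    -- no longer report the original HCC level
    --
    select rowid into l_rowid from emp where rownum = 1;

    l_type := dbms_compression.get_compression_type(user, 'EMP', l_rowid);

    dbms_output.put_line(
        case l_type
            when dbms_compression.comp_nocompress       then 'No compression'
            when dbms_compression.comp_for_oltp         then 'OLTP compression'
            when dbms_compression.comp_for_query_low    then 'HCC query low'
            when dbms_compression.comp_for_query_high   then 'HCC query high'
            when dbms_compression.comp_for_archive_low  then 'HCC archive low'
            when dbms_compression.comp_for_archive_high then 'HCC archive high'
            else 'Other ('||l_type||')'
        end);
end;
/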

Is HCC for everyone? No, and it wasn’t intended to be. If you feel you MUST use it, consider the QUERY type on tables that aren’t heavily updated (HCC was originally intended for Data Warehouse applications), remembering that after updates the changed data reverts to OLTP compression. The ARCHIVE type should be reserved for archived data: data that is no longer updated, that you need to keep, but whose ‘footprint’ you want to shrink.
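
One practical way to follow that advice is to mix compression levels across partitions, keeping current data on OLTP compression and moving aged, static partitions to ARCHIVE. The statements below are only a sketch against a hypothetical range-partitioned SALES table; the table and partition names are illustrative, not from this article:

--
-- Recent, still-changing data: leave it on OLTP compression
--
alter table sales move partition sales_2012_q1 compress for oltp;

--
-- Older, read-only quarters: move them to HCC archive compression
--
alter table sales move partition sales_2010_q4 compress for archive low;
alter table sales move partition sales_2009_q4 compress for archive high;

--
-- As with the EMP examples above, moved partitions leave their local
-- index partitions unusable and they must be rebuilt
--
alter table sales modify partition sales_2010_q4 rebuild unusable local indexes;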

HCC provides a wonderful level of compression, but in an actively updated database consider making OLTP compression your first choice.
