Saturday, August 28, 2010

HYBRID COLUMNAR COMPRESSION - OVERVIEW



I couldn't find much from Oracle Documentation.However there are very good bits and pieces on the net. Here are my takeaways from these readings.

1). HYBRID COLUMNAR COMPRESSION [HCC] is only for EXADATA.There seems to be some technical difficulties in implementing it for the traditional storage (Don't know what it is!).

2). Concept:

Compression is very effficient for a data which is quite repetetive. With repetitive values, we can replace the Values with the smaller values (eg Symbols/Special Characters) and create the metadata to map these values; this metadata can be used to decompress the data.In a nutshell ,Compression saves space at the cost of processing power (CPU Usage).

Below diagram shows a typical Oracle Data block Stores the Data.C1-5 are Columns.



Values with Lower Cardinality (more duplicated elements) are the potential candidates for the efficient compression. With this, Columns are set to have more duplicated values than the rows.

So with HCC, Columns are grouped together in a logical area called Compression Unit.Which is a logical grouping of multiple datablocks.

So the compression unit would look like.


3). There are 2 types of HCC

- Warehouse Compression - On Average, 10 Times Space Savings.

- Archive Compression - On Average, 15 Times Space Savings(70% at some customer sites)

4). Warehouse Compression

-   Syntax: 
                 CREATE TABLE X
                                        COMPRESS FOR QUERY [LOW  | HIGH]
                                        AS
                                        SELECT * FROM Y;
 
                   Where   HIGH | LOW - Specifies the Extent of Compression.
 
                - Warehouse Compression is optimized for Query Performance.

5). Archive Compression

-   Syntax: 
                 CREATE TABLE X
                                        COMPRESS FOR ARCHIVE [LOW  | HIGH]
                                        AS
                                        SELECT * FROM Y;
 
                   Where   HIGH | LOW - Specifies the Extent of Compression.
 
                 - Archive Compression is optimised for Storage savings.

6). When a session places lock in the particular row HCC locks the whole Compression Unit Resulting in multiple row lock. Hence it may not be suitable for OLTP application where frequent update/insert occurs.

7). A Partition Table can have different type of HCC based on the Partition .For example. The latest partition can have COMPRESS BY QUERY whereas the older partition which hardly being queried can create using COMPRESS BY ARCHIVE.

8). The existing table can be converted to get benefit from HCC.However only newly added rows will be compressed .To compress the existing rows you need to do ALTER TABLE MOVE ;

9). You can check the characteristics of a row using DBMS_COMPRESSION package. To know more about the package click here


10)CPU Consumption for compression and decompression of the data is offset'd by less logical reads. Archive Compression consumes relatively more CPU compared to Warehouse Compression.

No comments: