Oracle performs data compression in a variety of ways to improve storage requirements of data for indexes and tables. The benefits of compression are greatest when the data has many repeated values (such as tables with foreign keys) and if data values are not updated frequently.
For b*tree indexes, Oracle performs key compression by storing the common prefix of index key values in an index block, only once. The number of columns to be used in the prefix is equal to the number of key columns - 1 for unique indexes and equal to the number of key columns for nonunique indexes. Index-key compression must be enabled using the COMPRESS keyword on the CREATE INDEX statement.
For bitmap indexes, the bitmap for each data value is automatically compressed by Oracle. For columns whose cardinality is much less than the number of rows in the table, the bitmaps tend to be dense (having a lot of 1s) and can be compressed greatly. Thus, a bitmap index can occupy much less space than the corresponding b*tree index.
Another very useful type of compression is data segment compression, introduced in Oracle 9i Release 2. Data segment compression identifies repeated values of data in a data block and places them in a lookup table at the beginning of the block. The data rows then point to the value in the lookup table. This can significantly reduce storage used when the data contain large repeating values, such as names. It is especially useful for materialized views with aggregation, since the data values for the grouping columns are often repeated.
To enable data segment compression you must specify the COMPRESS keyword when creating a tablespace, a table, or individual partitions of a range or list-partitioned table. It is not available for hash partitioned or index-organized tables. The following example shows a range-partitioned table, PURCHASES, whose first three partitions are compressed and the last one is left uncompressed.
CREATE TABLE easydw.purchases (product_id varchar2(8), time_key date, customer_id varchar2(10), purchase_date date, purchase_time number(4,0), purchase_price number(6,2), shipping_charge number(5,2), today_special_offer varchar2(1)) PARTITION by RANGE (time_key) (partition purchases_jan2002 values less than (TO_DATE('01-02-2002', 'DD-MM-YYYY')) pctfree 0 pctused 99 storage (initial 64k next 64k pctincrease 0) tablespace purchases_jan2002 COMPRESS, partition purchases_feb2002 values less than (TO_DATE('01-03-2002', 'DD-MM-YYYY')) pctfree 0 pctused 99 storage (initial 64k next 64k pctincrease 0) tablespace purchases_feb2002 COMPRESS, partition purchases_mar2002 values less than (TO_DATE('01-04-2002', 'DD-MM-YYYY')) pctfree 0 pctused 99 storage (initial 64k next 64k pctincrease 0) tablespace purchases_mar2002 COMPRESS, partition purchase_catchall values less than (MAXVALUE) pctfree 0 pctused 99 storage (initial 64k next 64k pctincrease 0) tablespace purchases_maxvalue NOCOMPRESS);
If you are adding a partition with segment compression for the first time to a previously uncompressed partitioned table, you need to rebuild any bitmap indexes on that table.
To achieve better compression, when adding data into a new partition, sort the data first so that repeating values appear in the same block as much as possible.
You can change the compress attribute of a table by doing an ALTER TABLE MOVE command. For example, to compress the purchase_catchall partition of the purchases table, you issue:
ALTER TABLE purchases MOVE PARTITION purchases_catchall COMPRESS;
If new rows are added to a compressed table using a conventional INSERT statement, the new data is not compressed efficiently. Similarly, DELETE and UPDATE statements cannot maintain compression efficiently. However, if data is added using DIRECT PATH insert (using SQL Loader or an INSERT /*+APPEND */ statement), the new data is inserted into new blocks, which can be compressed. Hence, segment compression is well suited for data warehousing environments that do batch loads of data or for partitions that are mostly read only.
You may want to merge and compress partitions corresponding to older data, since they are not likely to change very much. This allows you to keep more data online.