DB2-Specific Data Warehousing Guidelines

 <  Day Day Up  >  

DB2 offers specific features and functionality that can be exploited for data warehousing. The guidelines in this section should be used as rules of thumb when you're designing, implementing, and using your DB2-based data warehouse.

Use Good DB2 Database Design Techniques

Use the DB2 DDL design techniques presented in Chapter 5 in conjunction with the guidelines presented in this chapter to ensure an optimal DB2 data warehouse implementation.

Ensure Read-Only Data

Create the data warehouse as a decision support vehicle. The data should be periodically updated and summarized. If your design calls for a data warehouse in which all the data is modified immediately as it is changed in production, you need to rethink your data warehouse design.

Consider starting DB2 data warehouse databases as ACCESS(RO) to ensure read-only access. Doing so has the additional effect of eliminating locking on the read-only databases. When the data warehouse is refreshed, the databases have to be restarted in read/write mode.

Consider Using Dirty Reads

Because the data warehouses are read-only in nature, locking is not truly required. You can specify ISOLATION(UR) for all plans, packages, and queries used in the data warehouse environment. With ISOLATION(UR) , DB2 will take fewer locks, thereby enhancing performance. However, DB2 might read uncommitted data when ISOLATION(UR) is specified. This should not be a major concern in the read-only data warehouse.

CAUTION

Just a reminder that all of your indexes must be Type 2 indexes before you can use ISOLATION(UR) . If you are running DB2 V7 or V8 this should not be a problem.


Avoid Referential Integrity, Triggers, and Check Constraints

Because data is cleansed and scrubbed during the data transformation process, implementing data integrity mechanisms such as referential integrity (RI), triggers, and check constraints on data warehouse tables is not efficient. Even without a comprehensive cleansing during data transformation, the data in the warehouse will be as good as the data in the source operational systems (which should utilize RI and check constraints).

NOTE

Triggers can be useful in data warehouses as a reporting or auditing tool, but not as a data integrity tool. For example, you might create a trigger that records a log containing the timestamp of the last change to data in the data warehouse. This log then can be queried by users to determine the freshness of the data in the warehouse.


Encourage Parallelism

Consider using partitioned table spaces and specifying DEGREE(ANY) to encourage I/O, CPU, and Sysplex parallelism. Parallelism helps to reduce overall elapsed time when accessing large databases such as those common in a data warehouse.

Consider partitioning simple and segmented table spaces to take advantage of DB2's parallelism features. Additionally, consider repartitioning partitioned table spaces to take full advantage of DB2 parallelism based on the usage patterns of your data warehouse access.

Consider Data Compression

DB2's hardware-based data compression techniques are optimal for the data warehousing environment. Consider compressing tables that are infrequently accessed to save disk space. Furthermore, consider compressing all tables if possible.

Utilize Materialized Query Tables

graphics/v8_icon.gif

MQTs can be used to greatly improve the elegance and efficiency of DB2-based data warehouses. Recall that an MQT is basically a query whose data is stored physically for reuse by future queries.


Queries that can utilize MQTs can experience a significant reduction in elapsed time. This is especially the case for often-repeated queries that aggregate the same, or related data.

Refer back to the previous section on Materialized Query Tables for more details on their usage and benefits.

 <  Day Day Up  >  


DB2 Developers Guide
DB2 Developers Guide (5th Edition)
ISBN: 0672326132
EAN: 2147483647
Year: 2004
Pages: 388

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net