When you look closely at the type of processing that is done by most relational databases on a day-to-day basis, you will discover that they tend to support one of two types of workloads: On-Line Transaction Processing (OLTP) and data warehousing. Earlier, we saw that DB2 Data Warehouse Server Edition is the top-of-the-line DB2 Edition for dynamic data warehousing and that it is designed for today's data center environments, where OLTP and decision support are merged into single, integrated information management systems. But just what characteristics does a data warehouse have, and how are they different from the characteristics of an OLTP database?
Data warehousing involves storing and managing large volumes of data (often historical in nature) that is used primarily for analysis. Workloads in a data warehouse vary; they can consist of bulk load operations, short-running queries, long-running complex queries, random queries, occasional updates to data, and the execution of online utilities. To handle these types of workloads, most data warehouse environments have the following characteristics:
Performance: Population of a data warehouse and incremental updates should be performed as fast as possible. Any query should be satisfied at any time without degrading the performance of mission-critical or time-sensitive workloads. Complex queries should handle aggregations, full-table scans, and multiple table joins with little or no performance impact. Likewise, multidimensional queries should be satisfied without any impact. Database is optimized for query processing; multiple indexes are used.
Scalability: Both the hardware and the software components used in building a data warehouse should be scalable. Scalability enables you to grow your data warehouse without impacting performance and throughput rates.
Availability: A data warehouse should be available 24 hours a day, 7 days a week, 365 days a year.
Manageability: A data warehouse environment should be flexible and extensible, while minimizing the administrative costs of a high-volume database.
In contrast, OLTP systems are designed to support day-to-day, mission-critical business activities such as order entry, stock trading, inventory management, and banking. This typically involves hundreds to thousands of users issuing millions of transactions per day against databases that vary in size. Response time requirements tend to be subsecond, and workloads tend to be a mix of real-time Data Manipulation Language (DML) operations (inserts, updates, and deletes).
OLTP workloads tend to have the following characteristics:
High Performance: High throughput, measured in hundreds of transactions per second, with the requirement for subsecond end-user response time. Simple transactions, with each transaction issuing a limited number of straightforward SQL statements that access a small number of rows, and perform few, if any I/Os. Transactions perform a great deal of concurrent read and update activity. Database is optimized for transaction processing-performance of OLTP workloads is considerably enhanced by minimizing I/Os, optimizing CPU utilization, eliminating sorts, and improving concurrency between transactions.
High Volume: Volume of data may be very large (hundreds of gigabytes to a few terabytes) or just tens of gigabytes. (Data tends to be current.) Hundreds to thousands of concurrent users.
High Availability: An OLTP environment must be available 24 hours a day, 7 days a week, 365 days a year.