Both online transaction processing (OLTP) database design and online analytical processing (OLAP) database design follow generally accepted design methodologies that promote effective designs. This lesson looks at OLTP databases. It is important to look at OLTP databases for the following reasons:
After this lesson, you will be able to:
- Describe the design characteristics of OLTP systems
Estimated lesson time: 20 minutes
Before beginning a study of data warehouse design, you should understand three specific design characteristics and their associated issues:
Figure 4-1 shows a typical OLTP system database design diagram. This diagram is a SQL Server 7.0 Enterprise Manager Database Diagram.
Figure 4.1 Entity-relationship diagrams
A principal characteristic of the design of OLTP systems is the use of E-R diagrams. The E-R modeling technique allows designers to visually represent the structure and context of the database. E-R models are built around fundamental standards. Let s take a look at these standards:
These represent the basis for joins between tables.
Relationship Cardinality | Description |
---|---|
One-to-one | A single instance in one table relates to exactly one instance in another table. |
One-to-many | A single instance in one table relates to at least one instance in another table. |
Many-to-many | A single instance in one table relates to at least one instance in another table, and vice versa. |
Zero or one | Modifies the One cardinality specification to allow no instances in one table (for example, zero or one-to-many). |
TIP
There are many types of database diagramming notation, such as E-R (entity-relationship), IDEF1X (ICAM Definition Method 1 Extended), and IE (information engineering). Each has a strict set of rules regarding the elements used to depict logical database design. The preceding guidelines describe in general terms how entity relationships can be diagrammed. Although the interface elements differ slightly from methodology to methodology, they are all fairly similar to the preceding descriptions.
Normalization is a set of sequentially implemented relational database design rules that
The following table describes the characteristics of the first three normal forms. Other normal forms have been defined but they are beyond the scope of this discussion.
Normalization Step | Characteristics | Formal Definition |
---|---|---|
First normal form | Tables must be two-dimensional (columns and rows). Similar to a spreadsheet. One value per data cell (intersection of column and row). Columns must have a single meaning. No multiple uses/types allowed (for example, Order Status/Completion Date). | A relation is in first normal form (1NF) if and only if all attributes contain atomic values only. |
Second normal form | Nonkey columns (attributes) must be fully dependent on the primary key. Same rule applies to each key element of a multipart primary key. | A relation is in second normal form (2NF) if and only if it is in 1NF and every nonkey attribute is fully dependent on the primary key. |
Third normal form | Variation on second normal form. Columns are not dependent on other columns in this or any other table. | A relation is in third normal form (3NF) if and only if it is in 2NF and every nonkey attribute is nontransitively dependent on the primary key. |
Figure 4.2 illustrates how normalization is an incremental process that optimizes the table structure for use in a relational environment.
Figure 4.2 Normalization process
In the graphic, the Purchases file consists of records that contain data about individual purchase orders. A set of fields exists within the file for each part ordered. Descriptive information for suppliers and parts is included in each record. Moving from left to right, the Purchases file goes through these common stages of normalization.
In the first normal form:
In the second normal form:
In the third normal form:
In this exercise, you will view the Northwind database that is used by Northwind Traders. You will view the current operational database and note how it is designed, as well as observing the relationships that have been created.
In this procedure, you will use SQL Server Enterprise Manager to view the database diagram of the Northwind database.
What portions of the Supplier table violate the normal forms discussed above?
How many tables would need to be joined to answer the query, "How many of each product category did each customer order in November?"
The OLTP design approach focuses on individual transaction response time and data update accuracy rather than analytical querying. From a business perspective, the design approach used in OLTP systems has both advantages and disadvantages.
The advantages of the OLTP design approach are that it
The disadvantages of the OLTP design approach are as follows:
Before we can understand dimensional schema database design, it is important to understand the predominant database type: the relational database. A relational database can and often does serve as the starting point for a dimensional database. It is important, however, not to try to make a dimensional database take a relational form. In fact, good dimensional database design violates the guidelines for normalization mentioned previously.