Lesson 1: Review of OLTP System Design

Both online transaction processing (OLTP) database design and online analytical processing (OLAP) database design follow generally accepted design methodologies that promote effective designs. This lesson looks at OLTP databases. It is important to look at OLTP databases for the following reasons:

  • To contrast OLTP design against OLAP design. It is a mistake to employ normalized design methods in designing an OLAP database.
  • To see how an OLTP database is decomposed into an OLAP database. This is useful because most OLAP databases are derived from OLTP databases.

After this lesson, you will be able to:

  • Describe the design characteristics of OLTP systems

Estimated lesson time: 20 minutes

Before beginning a study of data warehouse design, you should understand three specific design characteristics and their associated issues:

  • Entity-relationship (E-R) diagrams
  • Database table normalization
  • OLTP system design issues

Entity-Relationship Diagrams

Figure 4-1 shows a typical OLTP system database design diagram. This diagram is a SQL Server 7.0 Enterprise Manager Database Diagram.

click to view at full size

Figure 4.1 Entity-relationship diagrams

A principal characteristic of the design of OLTP systems is the use of E-R diagrams. The E-R modeling technique allows designers to visually represent the structure and context of the database. E-R models are built around fundamental standards. Let s take a look at these standards:

  • Each box represents an entity.
  • The entity name is above the box.
  • The box contains a list of the attributes.
  • Primary key attributes are listed first and indicated by a line or a highlight.
  • Lines that connect boxes represent a relationship between entities.
  • These represent the basis for joins between tables.

  • Symbols at line end points represent the cardinality of the relationship. Various symbols or notations are used in different tools but all of the cardinality types are summarized in the following table.
Relationship CardinalityDescription
One-to-oneA single instance in one table relates to exactly one instance in another table.
One-to-manyA single instance in one table relates to at least one instance in another table.
Many-to-manyA single instance in one table relates to at least one instance in another table, and vice versa.
Zero or oneModifies the One cardinality specification to allow no instances in one table (for example, zero or one-to-many).

TIP
There are many types of database diagramming notation, such as E-R (entity-relationship), IDEF1X (ICAM Definition Method 1 Extended), and IE (information engineering). Each has a strict set of rules regarding the elements used to depict logical database design. The preceding guidelines describe in general terms how entity relationships can be diagrammed. Although the interface elements differ slightly from methodology to methodology, they are all fairly similar to the preceding descriptions.

Database Table Normalization

Normalization is a set of sequentially implemented relational database design rules that

  • Eliminate duplicate information in tables
  • Accommodate efficient changes to table structures
  • Minimize the impact of table changes to user applications

The following table describes the characteristics of the first three normal forms. Other normal forms have been defined but they are beyond the scope of this discussion.

Normalization StepCharacteristicsFormal Definition
First normal formTables must be two-dimensional (columns and rows).

Similar to a spreadsheet.

One value per data cell (intersection of column and row).

Columns must have a single meaning.

No multiple uses/types allowed (for example, Order Status/Completion Date).

A relation is in first normal form (1NF) if and only if all attributes contain atomic values only.
Second normal form Nonkey columns (attributes) must be fully dependent on the primary key. Same rule applies to each key element of a multipart primary key. A relation is in second normal form (2NF) if and only if it is in 1NF and every nonkey attribute is fully dependent on the primary key.
Third normal formVariation on second normal form. Columns are not dependent on other columns in this or any other table. A relation is in third normal form (3NF) if and only if it is in 2NF and every nonkey attribute is nontransitively dependent on the primary key.

Normalization Example

Figure 4.2 illustrates how normalization is an incremental process that optimizes the table structure for use in a relational environment.

click to view at full size

Figure 4.2 Normalization process

In the graphic, the Purchases file consists of records that contain data about individual purchase orders. A set of fields exists within the file for each part ordered. Descriptive information for suppliers and parts is included in each record. Moving from left to right, the Purchases file goes through these common stages of normalization.

First Normal Form

In the first normal form:

  • Repeating part fields are split away from purchase order header fields.
  • Two tables are created to accommodate the segregated data.
  • The Purchase Order table contains the po key and header attributes.
  • The PO-Part table contains a compound key (po key, part key) and part line item attributes.
  • The po key column in the PO-Part table becomes a foreign key to the Purchase Order table.

Second Normal Form

In the second normal form:

  • The part description column of the PO-Part table is dependent on only the part key column of the primary key and is split into a separate table.
  • The Part table contains the part key and the part description attribute.
  • The part key column in the PO-Part table becomes a foreign key to the Part table.

Third Normal Form

In the third normal form:

  • The supplier name column of the Purchase Order table is dependent only on the supplier key column and is split into a separate table.
  • The Supplier table contains the supplier key and the supplier name attribute.
  • The supplier key column in the Purchase Order table becomes a foreign key to the Supplier table.

Exercise: Viewing an OLTP Database

In this exercise, you will view the Northwind database that is used by Northwind Traders. You will view the current operational database and note how it is designed, as well as observing the relationships that have been created.

  • To view the Northwind database
  • In this procedure, you will use SQL Server Enterprise Manager to view the database diagram of the Northwind database.

    1. Open SQL Server Enterprise Manager.
    2. Expand your server, expand Databases, and then expand Northwind.
    3. Click the Diagrams folder, and then, in the details pane, double-click Relationships.
    4. Review the current E-R design of the Northwind database.
    5. What portions of the Supplier table violate the normal forms discussed above?

      How many tables would need to be joined to answer the query, "How many of each product category did each customer order in November?"

    OLTP System Design Issues

    The OLTP design approach focuses on individual transaction response time and data update accuracy rather than analytical querying. From a business perspective, the design approach used in OLTP systems has both advantages and disadvantages.

    Advantages

    The advantages of the OLTP design approach are that it

    • Prevents update anomalies Anomalies can occur when a specific data change is not applied to all appropriate tables.
    • Ensures data consistency across transactions Data of a specific type and meaning (such as customer name/address and product description) is stored in only one place.
    • Minimizes the number of database accesses to fulfill a business event.
    • Handles discrete transactions Relational database management systems (RDBMS) like DB2, Informix, Oracle, Sybase, and others have been tuned to handle discrete transactions.
    • Has access paths that are clearly defined because all data is stored in a single location. Has fewer tables and less code to be changed to implement a modification. For example, changing the length or data type for a given field is simpler if data is stored in a single location.

    Disadvantages

    The disadvantages of the OLTP design approach are as follows:

    • It is difficult to design analytical reports Traditional E-R data models are optimized for single-row queries but not for multiconstraint, multijoin queries. For example, refer to Figure 4.1 and consider the simple business request, "Which categories of product have been ordered by a specified customer?" You would need to create a query that joins the Customers, Orders, Order Details, Products, and Categories tables to answer this request.
    • Indexes reduce performance Indexes that are required to improve analytical reporting degrade OLTP performance. For example, to answer analytical questions about customer demographics may require indexes on the City, Region, and PostalCode columns of the Customers table in Figure 4.1. These indexes are not required for OLTP.

    Lesson Summary

    Before we can understand dimensional schema database design, it is important to understand the predominant database type: the relational database. A relational database can and often does serve as the starting point for a dimensional database. It is important, however, not to try to make a dimensional database take a relational form. In fact, good dimensional database design violates the guidelines for normalization mentioned previously.



    Microsoft Corporation - Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit
    Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit
    ISBN: 0735606706
    EAN: 2147483647
    Year: 1999
    Pages: 114

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net