Creating an Entity Relationship

Denormalization

There are tradeoffs between a highly normalized database design and an intelligently denormalized design. When a database is highly normalized, the large number of links between related tables puts more demands on server resources during operations such as queries. Selective and intelligent denormalization can improve performance by making more efficient use of processor resources. Several strategies can be used to denormalize a database, as shown in Figure 6.18.


Figure 6.18 Denormalization Strategies

Introduce Redundancy

If the normalized design results in many four-way or greater join relationships, consider introducing redundancy at the attribute (column) or entity (table) level, as follows:
  • Add duplicate attributes (columns) to the database entities (tables).
  • Add derived attributes (columns), such as aggregates or maximum values, to the database entities (tables).

Adding data redundancy to denormalize a database eliminates the overhead of accessing additional tables.
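To make the idea concrete, here is a minimal Transact-SQL sketch. The Orders, Customers, and OrderDetails tables and their columns are hypothetical names assumed for illustration, not part of the lesson. It adds a duplicated CustomerName column and a derived OrderTotal column to Orders so that common queries no longer need to join to the other two tables.

    -- Hypothetical schema: Orders would normally join to Customers for
    -- the customer name and to OrderDetails for line-item totals.

    -- Duplicate attribute: carry the customer name in Orders.
    ALTER TABLE Orders ADD CustomerName varchar(40) NULL;

    -- Derived attribute: store a precomputed order total.
    ALTER TABLE Orders ADD OrderTotal money NULL;

    -- Populate the redundant columns; they must be maintained (for
    -- example, by triggers) whenever the source data changes.
    UPDATE o
    SET    o.CustomerName = c.CustomerName
    FROM   Orders AS o
           JOIN Customers AS c ON c.CustomerID = o.CustomerID;

    UPDATE o
    SET    o.OrderTotal = d.Total
    FROM   Orders AS o
           JOIN (SELECT OrderID, SUM(Quantity * UnitPrice) AS Total
                 FROM   OrderDetails
                 GROUP  BY OrderID) AS d ON d.OrderID = o.OrderID;

With these columns in place, a report that lists order totals by customer name reads only the Orders table.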

Redefine Attributes

Redefine attributes to reduce the data width of an entity as follows:
  • Add contrived attributes (columns) to the database entities (tables).

    A multiple-attribute key on which many joins take place can be replaced with a smaller contrived key. Appropriate candidates are large keys and text fields; a text field can be redefined or augmented by an abstract field of 255 characters or less.

This technique narrows table columns, reducing the server resources used during database queries and other operations.
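The following sketch shows both ideas; the table names, column names, and sizes are assumptions chosen only for illustration. A contrived CustomerID key replaces a wide multi-column natural key in joins, and a 255-character NotesAbstract column summarizes a large text field so routine queries need not read the full text.

    -- Contrived key: a compact surrogate stands in for the wide
    -- natural key (LastName, FirstName, BirthDate) in all joins.
    CREATE TABLE Customers
    (
        CustomerID    int IDENTITY(1, 1) PRIMARY KEY,   -- contrived key
        LastName      varchar(40)  NOT NULL,
        FirstName     varchar(40)  NOT NULL,
        BirthDate     datetime     NULL,
        Notes         text         NULL,                -- large text field
        NotesAbstract varchar(255) NULL                 -- abstract of Notes
    );

    -- Dependent tables join on the narrow contrived key rather than
    -- the three-column natural key, keeping rows and indexes small.
    CREATE TABLE Orders
    (
        OrderID    int IDENTITY(1, 1) PRIMARY KEY,
        CustomerID int      NOT NULL REFERENCES Customers (CustomerID),
        OrderDate  datetime NOT NULL
    );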

Redefine Entities

Redefine entities (tables) to reduce the overhead of extraneous attribute (column) data or row data. Appropriate types of denormalization, both illustrated in the sketch after this list, are:
  • Segment an entity (table) by attributes (columns) into two entities (tables), separating high-access data from low-access data.

    Under this scenario, the primary key has to be duplicated in each new table. Such a design results in narrower tables and may improve concurrency.

  • Segment an entity (table) by rows into two entities (tables).

    This scenario works well for entities (tables) that contain large volumes of data. This is also a good tactic if data rows are accessed as subsets by logical workgroups (department, market segment, geography, and so on). Additionally, if a subset of the larger data set is the most active, the subset is a candidate for segmentation.
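Both forms of segmentation are sketched below; the employee and order entities and their columns are hypothetical, chosen only to illustrate the technique. The first pair of tables segments a wide employee entity by attributes, duplicating the primary key in each new table; the second pair segments an order entity by rows, separating the small active subset from the much larger historical volume.

    -- Segment by attributes (columns): high-access data in one narrow
    -- table, low-access data in another; the primary key is duplicated.
    CREATE TABLE EmployeeCore               -- frequently accessed columns
    (
        EmployeeID int PRIMARY KEY,
        LastName   varchar(40) NOT NULL,
        DeptID     int         NOT NULL
    );

    CREATE TABLE EmployeeDetail             -- rarely accessed columns
    (
        EmployeeID  int PRIMARY KEY REFERENCES EmployeeCore (EmployeeID),
        HomeAddress varchar(120) NULL,
        Photo       image        NULL,
        Resume      text         NULL
    );

    -- Segment by rows: queries against current activity never touch
    -- the much larger historical table.
    CREATE TABLE OrdersCurrent
    (
        OrderID    int PRIMARY KEY,
        CustomerID int      NOT NULL,
        OrderDate  datetime NOT NULL
    );

    CREATE TABLE OrdersHistory
    (
        OrderID    int PRIMARY KEY,
        CustomerID int      NOT NULL,
        OrderDate  datetime NOT NULL
    );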

Lesson Summary

Once a relational database has been designed, it should be normalized. An optimal relational database is typically developed using entity-relationship modeling and normalization.

Entity-relationship modeling defines the structure of a relational database. Normalization is a process used to develop a well-designed, optimized, and logical database schema before construction of the database begins.

In some cases, such as to relieve demands on system resources during queries, you may want to denormalize a database. Selective and intelligent denormalization can improve performance by making more efficient use of processor resources.



