Denormalization


Denormalization is the deliberate addition of redundant data or other fields that violate the normal forms. It is more a part of physical design and is typically revisited during implementation to improve performance. The concept is covered here for continuity and to contrast it with data normalization. After you have a completely normalized logical design, you will rarely keep it in that state as you proceed to the physical design of the actual database.
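
As a minimal sketch of the idea, assume hypothetical Customers and Orders tables in which each order stores only a CustomerID. A denormalized design might also copy the customer's name onto each order row so that common order listings no longer need a join; the table and column names here are illustrative only:

    -- Add a redundant copy of the customer name to each order row.
    -- This violates third normal form on purpose, trading storage and
    -- update effort for simpler, faster read queries.
    ALTER TABLE Orders
        ADD CustomerName varchar(50) NULL
    GO

    -- Populate the redundant column from the normalized source table.
    UPDATE o
    SET    o.CustomerName = c.CustomerName
    FROM   Orders o
    JOIN   Customers c ON c.CustomerID = o.CustomerID

From this point on, the redundant column must be kept in step with Customers by the application or by a trigger, which is exactly the maintenance cost that denormalization accepts.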

Although normalization gives you a great deal of storage efficiency and might result in increased performance in some situations, there are some drawbacks to a completely normalized database. You should consider the trade-offs in storage efficiency, performance, and maintainability in your final design.


If you go too far with the normalization process, you might actually reverse the effect you're trying to achieve. Although normalization reduces data redundancy, results in smaller tables with fewer rows, and provides a logical and consistent form, it also requires table joins at query time and will not allow for the summary, duplicate, or other data that a user might expect to find in a single table. Normalizing a database design too far can decrease performance, make it difficult to alter the underlying table structure, and make the data harder to work with.
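
For example, against the same hypothetical Orders and OrderDetails tables, a fully normalized design forces every request for an order total through a join and an aggregate, whereas a denormalized design can store the total directly on the order row (again, the names are illustrative, and the summary column would have to be maintained whenever detail rows change):

    -- Normalized: the order total must be derived every time it is needed.
    SELECT o.OrderID, SUM(d.Quantity * d.UnitPrice) AS OrderTotal
    FROM   Orders o
    JOIN   OrderDetails d ON d.OrderID = o.OrderID
    GROUP BY o.OrderID

    -- Denormalized: keep a redundant summary column on the order itself.
    ALTER TABLE Orders
        ADD OrderTotal money NULL
    GO

    SELECT OrderID, OrderTotal      -- no join or aggregation needed
    FROM   Orders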

Denormalization may occur at any number of levels. At the absolute extreme, a database schema can be completely duplicated to a number of servers across the network by implementing replication. This can be warranted if you need to distribute access to the data across slow network links or to multiple remote locations. Replication offers many advantages because the data is more readily available at the locations where it will be used. The drawback is the increased maintenance of a number of servers. If replication isn't configured properly, it can also monopolize a WAN, and network problems or a poor setup can leave the data too far out of synchronization to stay up to date. Data can be maintained as an exact duplicate across a number of servers, but doing so requires a high-speed network and the configuration of a two-phase commit.

Other, simpler examples of planned denormalization include maintaining complete address information for customers, suppliers, employees, and so on in the same tables as the rest of their general information. This is what most users expect, and a separate address table is difficult to maintain. There are no defined rules for denormalization, but some definite guidelines will help you determine what level is appropriate in a given situation.
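
A sketch of such a table, with purely illustrative column names, keeps the address columns inline rather than in a separate Addresses table joined one-to-one:

    CREATE TABLE Customers (
        CustomerID   int IDENTITY(1, 1) PRIMARY KEY,
        CustomerName varchar(50) NOT NULL,
        Phone        varchar(20) NULL,
        -- Denormalized address data stored with the general information
        Street       varchar(60) NULL,
        City         varchar(30) NULL,
        Region       varchar(30) NULL,
        PostalCode   varchar(10) NULL,
        Country      varchar(30) NULL
    )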

Data warehousing schemas often use a denormalized approach referred to as a star or snowflake schema. This structure supports typical decision-support queries by using one central "fact" table for the subject area and many dimension tables containing denormalized descriptions of the facts.
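
The following sketch shows the shape of a minimal star schema; the table and column names are illustrative only:

    -- Dimension tables hold denormalized, descriptive attributes.
    CREATE TABLE DimDate (
        DateKey      int PRIMARY KEY,       -- for example, 20050131
        CalendarDate datetime NOT NULL,
        MonthName    varchar(15) NOT NULL,
        CalendarYear smallint NOT NULL
    )

    CREATE TABLE DimProduct (
        ProductKey   int PRIMARY KEY,
        ProductName  varchar(50) NOT NULL,
        CategoryName varchar(30) NOT NULL   -- denormalized description
    )

    -- The central fact table holds the measures, keyed to the dimensions.
    CREATE TABLE FactSales (
        DateKey     int NOT NULL REFERENCES DimDate (DateKey),
        ProductKey  int NOT NULL REFERENCES DimProduct (ProductKey),
        Quantity    int NOT NULL,
        SalesAmount money NOT NULL
    )

    -- A typical decision-support query joins the fact to its dimensions.
    SELECT d.CalendarYear, p.CategoryName, SUM(f.SalesAmount) AS Sales
    FROM   FactSales f
    JOIN   DimDate d    ON d.DateKey = f.DateKey
    JOIN   DimProduct p ON p.ProductKey = f.ProductKey
    GROUP BY d.CalendarYear, p.CategoryName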

There are also several other situations to consider. If queries routinely require joining more than three tables, denormalization should be considered. In situations where the number of columns in a table grows very large, a denormalized structure splits the table into more easily handled portions connected by a one-to-one relationship.
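
A sketch of that second case, using a hypothetical Employees table, moves rarely used, bulky columns into a companion table that shares the primary key, so the relationship is one-to-one:

    CREATE TABLE Employees (
        EmployeeID int PRIMARY KEY,
        LastName   varchar(30) NOT NULL,
        FirstName  varchar(30) NOT NULL,
        Title      varchar(40) NULL
    )

    -- The shared primary key enforces the one-to-one relationship.
    CREATE TABLE EmployeeDetails (
        EmployeeID int PRIMARY KEY
            REFERENCES Employees (EmployeeID),
        Photo      image NULL,
        Notes      text NULL
    )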

The completed structure will have to be modified over time as the live use of the database warrants. Never consider a database design to be perfect or complete. It often takes several years of actual use to determine the best levels of normalization and denormalization to use.

Some forms of replication and data transfer accommodate redundancy by allowing the data to reside on two servers simultaneously. In that case, one server is usually treated as a read-only server and is used to offload query processing from the updatable server. This is discussed in depth later in the book and is definitely an exam topic to be prepared for.



