Normalization


Normalization is the process of optimizing the relationships between tables in your database. The process involves several stages of splitting a database into smaller components and adding the appropriate relationships.

Many reasons exist for normalizing your database from the outset. Normalization helps to improve the performance of the database by using only the most efficient relationships between tables. It also improves maintainability by reducing repetitive data.

The first stage of normalization involves eliminating any columns that might contain multiple values. For example, consider the following table. This shows a variation on the customers table that also includes the data that is in the customer_contacts table.

Figure 16.1. A table with multiple values in a column.


As you can see, the contacts column contains multiple names, separated using commas. This type of table structure must be eliminated.

Notice also that this structure enables you to store only a contact name. Another approach is to have one row in the customers table per contact. This satisfies the requirement of not having multiple values in a column, but this structure will include a lot of duplicated data, as shown in Figure 16.2.

Figure 16.2. A table with duplicated data.


One major pitfall with this approach is that it is possible for some of the contacts for a customer to have different name values than others. There is no need for the company name to be duplicated in this table; the next stage of normalization will eliminate this.

You should remove duplicate rows by creating a related table; this is howthe customers and customer_contacts tables are constructed in the sample database you have been following in this book.

The customer_code column is a primary key for customers and a foreign key for customer_contacts, with a one-to-many relationship. This structure ensures that each customer record appears only once in customers but that each customer_code can appear many times in customer_contacts; in this way, many contact names are added for each customer.

The normalized tables from this part of the database are shown inFigure 16.3

Figure 16.3. The properly normalized tables.


The contact_id column is included in customer_contacts as an auto-incrementing unique primary key for this table. We also decided to split the first_name and last_name columns in the final database.




Sams Teach Yourself MySQL in 10 Minutes
Sams Teach Yourself MySQL in 10 Minutes
ISBN: 0672328631
EAN: 2147483647
Year: 2006
Pages: 165
Authors: Chris Newman

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net