Flylib.com

Books Software

 
 
 

Sams Teach Yourself MySQL in 10 Minutes - page 91


Summary

In this lesson, you learned how to use the Database Manipulation Language in MySQL and how to use transactions to ensure data integrity. In the next lesson, we discuss some of the factors to consider when designing a new database.



Lesson 16. Designing Your Database

In this lesson, we discuss some of the factors that you should consider when designing a database.



Relationships

Now that you know how to create database tables and manipulate data, it'stime to take a look at the principles of relational databases and see how to apply good design practices to your own databases.

Data in a relational database is split across several tables in a structured way. Relationships are the rules that determine how data in one table is related to data in another. When you join tables in SQL, you specify the relationship between the tables in the query. Refer to Lesson 11, "Joining Tables," for more information on joins.

Consider the relationship between two tables in the sample database. The orders table is related to customers . Each record in the orders table contains a customer_id , which is a foreign key column that corresponds to the primary key in customers .

Therelationship between customers and orders is known as a one-to-many relationship because one row in the customers table can be related to many rows in the orders table. Think of what the database is trying to achieve herea customer must not be limited to placing only one order!

Many-to-One

A one-to-many relationship could be considered a many-to-one relationship, depending on which way you look at it. You could think of the relationship between products and customers as many-to-one.


The relationship between products and orders is a many-to-many relationship. An order can contain the same product many timesalthough the quantity column should help to avoid the need to do thisand, of course, each product can appear on many orders.

In the sample table, the many-to-many relationship between products and orders is actually implemented using two separate relationships and the order_lines table. There is a one-to-many relationship between orders and order_lines , and also a one-to-many relationship between products and order_lines .

As you might expect, a one-to-one relationship is also possible, although there is no relationship of this kind in the sample database. A one-to-one relationship uses a key that is unique in both tables; therefore, the data could actually be stored in one table instead of two.

Usually you use a one-to-one relationship for performance reasons. For example, you could keep one table highly optimized by storing only the most commonly used columns. The second table would contain larger amounts of data and would be joined to the first only when those columns were required. In particular, you can often improve performance by separating large text or binary data columns from the body of a table using a one-to-one relationship.



Normalization

Normalization is the process of optimizing the relationships between tables in your database. The process involves several stages of splitting a database into smaller components and adding the appropriate relationships.

Many reasons exist for normalizing your database from the outset. Normalization helps to improve the performance of the database by using only the most efficient relationships between tables. It also improves maintainability by reducing repetitive data.

The first stage of normalization involves eliminating any columns that might contain multiple values. For example, consider the following table. This shows a variation on the customers table that also includes the data that is in the customer_contacts table.

Figure 16.1. A table with multiple values in a column.


As you can see, the contacts column contains multiple names , separated using commas. This type of table structure must be eliminated.

Notice also that this structure enables you to store only a contact name . Another approach is to have one row in the customers table per contact. This satisfies the requirement of not having multiple values in a column, but this structure will include a lot of duplicated data, as shown in Figure 16.2.

Figure 16.2. A table with duplicated data.


One major pitfall with this approach is that it is possible for some of the contacts for a customer to have different name values than others. There is no need for the company name to be duplicated in this table; the next stage of normalization will eliminate this.

You should remove duplicate rows by creating a related table; this is howthe customers and customer_contacts tables are constructed in the sample database you have been following in this book.

The customer_code column is a primary key for customers and a foreign key for customer_contacts , with a one-to-many relationship. This structure ensures that each customer record appears only once in customers but that each customer_code can appear many times in customer_contacts ; in this way, many contact names are added for each customer.

The normalized tables from this part of the database are shown inFigure 16.3

Figure 16.3. The properly normalized tables.


The contact_id column is included in customer_contacts as an auto-incrementing unique primary key for this table. We also decided to split the first_name and last_name columns in the final database.