In this lesson, you learned how to use the Database Manipulation Language in MySQL and how to use transactions to ensure data integrity. In the
Lesson 16. Designing Your Database
In this lesson, we discuss some of the factors that you should consider when designing a database.
Now that you know how to create database tables and manipulate data, it'stime to take a look at the principles of relational databases and see how to apply good design practices to your own databases.
Data in a relational database is split across several tables in a structured way. Relationships are the rules that determine how data in one table is
Consider the relationship between two tables in the sample database. The orders table is related to customers . Each record in the orders table contains a customer_id , which is a foreign key column that corresponds to the primary key in customers .
Therelationship between customers and orders is known as a one-to-many relationship because one row in the customers table can be related to many rows in the orders table. Think of what the database is trying to achieve herea customer must not be limited to placing only one order!
The relationship between products and orders is a many-to-many relationship. An order can contain the same product many timesalthough the quantity column should help to avoid the need to do thisand, of course, each product can appear on many orders.
In the sample table, the many-to-many relationship between products and orders is actually implemented using two separate relationships and the order_lines table. There is a one-to-many relationship between orders and order_lines , and also a one-to-many relationship between products and order_lines .
As you might expect, a one-to-one relationship is also possible, although there is no relationship of this kind in the sample database. A one-to-one relationship uses a key that is unique in both tables; therefore, the data could actually be stored in one table instead of two.
Usually you use a one-to-one relationship for performance reasons. For example, you could keep one table highly optimized by storing only the most commonly used columns. The second table would contain larger amounts of data and would be joined to the first only when those columns were required. In particular, you can often improve performance by separating large text or binary data
Normalization is the process of optimizing the relationships between tables in your database. The process involves several stages of splitting a database into smaller
Many reasons exist for normalizing your database from the outset. Normalization helps to improve the performance of the database by using only the most efficient relationships between tables. It also
The first stage of normalization involves eliminating any
Figure 16.1. A table with multiple values in a column.
As you can see, the
column contains multiple
Notice also that this structure enables you to store only a contact
Figure 16.2. A table with duplicated data.
One major pitfall with this approach is that it is possible for some of the contacts for a customer to have different
values than others. There is no need for the company name to be duplicated in this table; the
You should remove duplicate rows by creating a
The customer_code column is a primary key for customers and a foreign key for customer_contacts , with a one-to-many relationship. This structure ensures that each customer record appears only once in customers but that each customer_code can appear many times in customer_contacts ; in this way, many contact names are added for each customer.
The normalized tables from this part of the database are shown inFigure 16.3
Figure 16.3. The properly normalized tables.
The contact_id column is included in customer_contacts as an auto-incrementing unique primary key for this table. We also decided to split the first_name and last_name columns in the final database.