Normalization was developed by an IBM researcher named E.F. Codd in the early 1970s (he also invented the relational database). A relational database is merely a collection of data, organized in a particular manner; Dr. Codd created a series of rules called normal forms that help define that organization. In this chapter I will discuss the first three of the normal forms, which is sufficient for most database designs. Before you begin normalizing your database, you must define the role of the application being developed. Whether it means that you thoroughly discuss the subject with a client or figure it out for yourself, understanding how the information will be accessed dictates the modeling. Thus, this chapter will require paper and pen, rather than the MySQL software itself (to be clear, database design is applicable to any relational database, not just MySQL). Database design texts commonly use examples such as music or book collections (indeed, I use the latter in my book PHP Advanced for the World Wide Web: Visual QuickPro Guide), but I will create a more business-oriented accounting database here. The primary purpose of the database will be to track invoices and expenses, but it could easily be expanded to log work hours on projects or whatever. Table 3.1 reflects a sample record showing the type of data to be stored.
Before designing the database, some basic terminology is required. The next two sections will discuss the notions of keys and relationships. Tips
|