Suppose you're a company that sells food items, like Northwind. You have a Products table with all the items you sell, with one row for each item: ice cream, yogurt, nondairy creamer, and so on. Among the columns, you include a field to store all the flavors for that item: vanilla, blueberry, Irish crème (see Table 2.14).
Or consider a table of Russian authors. Each row is dedicated to one author. You have fields for first name, last name, and birthday. You also have a column for the author's major works (see Table 2.14). The Flavors field in the Products table and the Major Works field in the Authors table are known as multivalue fields. These are fields that store more than one value in each row. They're usually easy to identify because, as in the examples, they use commas to separate the various values. The fields are also usually named in the pluralflavors, characters, works, and so on. I think multivalue fields are among the most difficult database problems to handle, both conceptually and practically. At first glance, the Flavors field in the Products table and the Major Works field in the Authors table both seem like fine, even elegant, solutions to a common problem. If you know that the number of values for a single characteristicproduct flavors, authors' major works, and so onwill vary among records, why try to predict the number of columns you need? Instead of using threeor six, or eight, or who knows how manycolumns to store values, wouldn't the best solution be to store all values in a single field and separate them by commas? But you are designing your database so you can quickly and efficiently retrieve the data you need. Suppose you decide to discontinue an experimental flavorsay, asparagusfrom all your products. If Asparagus is always a discrete record, it's easy to delete all occurrences of Asparagus from your database. If Asparagus is but one value in a multivalue field, however, you need to edit each of those records. That task is more difficult and, equally important, more likely to result in error. The threat to data integrity is substantial. Or suppose you want to retrieve all records on Dostoevsky's Crime and Punishment. Surrounded as it is between The Brothers Karamazov and The Idiot, it will be inefficient to find information on that one book. Or say you want to sort all the books by title. A multivalue field will make it difficult to put all the books in alphabetical order. Put simply, multivalue fields are difficult to edit, sort, and delete. They violate the essence of good relational database design. You need to get rid of them. Multivalue Fields in the Classic TV DatabaseEarlier you saw that the Programs table had a multipart field, Character Names/Character Occupations. To resolve this multipart field, you created separate Character Names and Character Occupations fields (see Table 2.15). Splitting the field into two columns resolved the problem of having two different types of data in the same field. But the Character Names and Character Occupations remain multivalue fields. To repeat, they have the two telltale signs of multivalue fields: The values are separated by commas, and the field name is plural. Resolving Multivalue FieldsSo how do you fix these multivalue fields? First, take both the Character Names and Character Occupations fields out of the Programs table and put them in a new table called Roles. Second, take the primary key from the Programs table and put it in the Roles table as well. Finally, let's split the new Character Name (now singular) field into Last Name and First Name fields to eliminate the multipart field. A separate field is assigned to the character occupation (see Table 2.16).
NOTE Next to the ProgramID, I've included in italics the name of the TV show the ID refers to. These names are not part of the table. I've (reluctantly) included them only so you can see the program the ID refers to. This solution seems counterintuitive; it provides a complex, even convoluted, solution to what seems to be a simple problem. But using keys to resolve multivalue fields gives you a small taste of how you are going to integrate data in your database. As you'll see, putting the primary key of the original table into the new table enables you to establish a relationship between the original and new tables. By using primary keys to represent the values in each record, you make it possible to bring together values from different tables and integrate the information in your database.
|