Many-to-many relationships are both more difficult to understand and more difficult to resolve than one-to-many relationships. Don't be frustrated if you need to reread this section a couple times before you understand how they work. Again, I introduce the relationship type by using an example from the order-entry system. (I've slightly edited two tables you saw earlier in the chapter.) I then give you two detailed examples from the Classic TV database. Order-Entry SystemLet's say that you have an Orders table (see Table 3.10) and a Products table (see Table 3.11). Each row in the Orders table represents a single order; each row in the Products table represents a single product. (If you want a better idea of what these tables might look like, choose Help, Samples, Northwind Sample Database, and view the Orders and Products tables they contain.)
Consider how products and orders relate to each other. Any single order can be made up of many products. If this were a one-to-many relationship, I'd have finished the last sentence with "and any product can be in only one order." But that, of course, is not true. Any single product can also be included in many orders. You can say that the tables have a many-to-many relationship. This is very different from the relationship between Customers and Orders, where every order had only one customer, or Categories and Products, where every product had just one category. In those cases, you resolved the relationship by adding the primary key of the table on the "one" side of the relationship to the table on the "many" side of the relationship as a foreign key. But that solution won't work for Orders and Products. Suppose you add the primary key ProductID to the Orders table as a foreign key. If an order included three products, as it does for OrderID 10075 in Table 3.12, you would need three separate records in the Orders table for the three products. So you would have to repeat all the data, such as CustomerID, order date, and ship date, for all three records. If an order included 10 products, you would need 10 rows for a single order.
Now let's try it the other way around. Suppose you added the primary key OrderID to the Products table (see Table 3.13). If a product was included in three orders, as it is for ProductID 00123, you would need a separate record in the Products table for each order in which the product was included. Each time a product was ordered, you'd have to repeat data such as product name, CategoryID, and supplier in the Products table. If a product was included in 10 orders, you would need 10 rows for just that product.
Create a Linking TableYou can see that merely putting the primary key of the "one" table as a foreign key in the "many" table is not a solution because the two tables are on both the "one" and the "many" sides. So what is the solution? You create an entirely new table called a linking table (or a join table) that includes both the primary key from the Products table and the primary key from the Orders table. As in the Northwind database, let's call the table Order Details. In the Order Details table, the two foreign keys OrdersID and the ProductsID together are the primary key. In other words, this composite primary key has two fields. Note that this linking table includes two one-to-many relationships. As shown in Table 3.14, the OrderID field can appear many times in the Order Details table but, as you've seen, only once in the Orders table. Similarly, a ProductID can appear many times in the Order Details table but only once in the Products table.
Is the primary key of the Order Details table unique? Let's review the three key requirements of a primary key:
The linking table that resolves a many-to-many relationship might or might not have additional fields besides the primary key. Look at Table 3.15. It includes two fields that are exclusively identified by OrderID and ProductID. The value in the Quantity field can be determined only by knowing both the order and the product. Simply knowing either one is not sufficient to determine the quantity. The Price field is also exclusively identified by the composite primary key. Prices for any individual product might vary, so it's necessary to know the order in addition to the product to determine the price.
This might not seem an elegant solution; it might seem that tables are proliferating across your database, to little advantage. But the overall objective of associating tables and bringing together the data each contains is achieved. Classic TV DatabaseEarlier we considered the relationship between TV shows and networks. Initially, it appeared to be a simple one-to-many relationship. CBS aired such classics as The Lucy Show, Twilight Zone, Northern Exposure, and more. But The Lucy Show and the rest didn't appear on CBS one week, NBC the next, and ABC in week 3. So it would seem that many TV shows could appear on one network, but a single TV show was seen on only one network. But that isn't wholly true. As noted earlier, a few shows started on one network and then moved to another. For example, Leave It To Beaver began on CBS in 1957 but soon moved to ABC, where it ran for 6 years. So although one network can broadcast many TV shows, one TV show can appear on two or more networks. The number of TV shows certainly exceeds the number of networks, but it's nonetheless true that Programs and Networks actually have a many-to-many relationship. You already know one alternative: Choose one network as the primary network (such as the station on which the show originated, the network on which it aired longest, and so on) and simply ignore the second network. In other words, you might decide that the extra data provided by recording subsequent networks simply isn't worth the trouble or expense. The one-to-many design solution remains intact. But that's not a solution if you want to record the entire broadcast history of programs, which would include the start year and end year for the show on each network that it aired. If you included the primary key of the Networks table as a foreign key in the Programs table, you'd have duplicate values in fields such as Location (see Table 3.16). And if you used the ProgramID as a foreign key in the Networks table, you'd have duplicate values for fields such as Founder (see Table 3.17). (To develop a better example, I've used imaginary programs and air dates in both tables.)
Again, the solution is to create a linking table (say, Broadcasts) that includes keys from both tables. The primary key of the Broadcasts table is a composite key. It comprises the primary keys of both the Programs and Networks tables, which in the Broadcasts table become foreign keys. Table 3.18 shows the new table, Broadcasts, with the composite primary key. (The names of the programs and networks are not part of the table; I've included them only to make the table easier to understand.)
Let me put that another way: The Programs table has a one-to-many relationship with the Broadcasts table through the ProgramID field. The primary key of the Programs table is ProgramID; in the Broadcasts table, ProgramID is a foreign key. The Networks table also has a one-to-many relationship with the Broadcasts table. The primary key of the Networks table is NetworkID; in the Broadcasts table, NetworkID is a foreign key. The composite primary key of ProgramID and NetworkID would be unique because no program can run on the same network more than once at the same time. As with the Order Details table, you would want to include in the Broadcasts table any fields that are exclusively described by the composite primary key. In this case, those would be the Year Started and Year Ended fields (see Table 3.19again, the program and network names in parentheses are included only to make the table easier to understand). Because a program can move to a different network, only a primary key that comprises both primary keys can exclusively identify the Year Started and Year Ended fields.
A Three-Field Primary KeyConsider this possibility: Suppose that a program aired on one network, then aired on another network, and then returned to its original network. I don't know if this has ever occurred (probably not), but it is conceivable. What consequences would this have for your table design? The Broadcasts table in Table 3.20 is the same as that in Table 3.19, with one difference: It includes one extra record for Stanley & Son. Instead of ending its run in 1964, it returned to its original network, CBS, for its final year.
In that case, the composite primary key would no longer be unique because the same program aired on the same network twice. In other words, the ProgramID and NetworkID would be identical for two records: the original broadcast of the program (19591961), and the broadcast when it returned to its original network (19651966). Even a primary key with two fields wouldn't be sufficient to identify every record uniquely. One good solution is to include Year Started as an additional field in what would be a three-field composite primary key. Together, the ProgramID, NetworkID, and Year Started fields would uniquely identify the broadcasts. Another possible solution would be to add a field showing the sequence of broadcasts (see Table 3.21). The first broadcast of a program would be 1, the next airing would be 2, and so on. The three fields of ProgramID, NetworkID, and Sequence Number together would be the primary key and would uniquely identify each record. The advantage of this scheme is that you could use the sequence number to search for only first-run or second-run broadcasts. Other database designers would disagree, however, and note that this design needlessly introduces additional complexity into the database.
In the Classic TV database, I'll assume that TV shows never return for a run on their earlier networks.
One More Many-to-Many RelationshipAnother many-to-many relationship in the Classic TV database needs to be resolved. Consider the Actors table and the Programs table. Obviously, any one program can have several actors. It's almost as obvious that any one actor can be on several TV shows. For example, Bob Newhart has been the star of the eponymous The Bob Newhart Show, Newhart, and Bob. (He once joked that his next show would be called simply B.) Because one program can have several actors and one actor can be on several TV programs, the Actors and Programs tables have a many-to-many relationship. Again, you can resolve the many-to-many relationship by using a linking table. (I think by now you understand that simply placing the primary key of the Actors table into the Programs table as a foreign key, or vice versa, is not a solution.) Table 3.22 shows the current status of the database. Reconsider the Roles table, which you created to resolve the Character Name/Character Occupation field (if you need to refresh your memory, take a look back at the "Resolving Multivalue Fields" section in Chapter 2). You can use this table as a linking table for Actors and Programs. Here are the fields we have so far:
Now add three more fields to the Roles table:
Table 3.23 shows a few records from the revised Roles table, which now includes the ActorID field. Again, I emphasize that the names in parentheses are not part of the table.
An Alternative SolutionI think the Classic TV database suggested would fill the needs of most couch potatoes. But it's a less-than-perfect solution. For example, fans of The Patty Duke Show will remember that the actor Patty Duke played both Cathy (who has lived almost everywhere, from Zanzibar to Barclay Square) and her identical cousin, Patty (who has seen only the sights a girl can see from Brooklyn Heights). In that case, the ProgramID and the ActorID would be the same for both characters, and thus the primary key would be inadequate. More frequently, the same actor on a program plays both a main character and one or two other characters that appear infrequently. For example, on the show Bewitched, Elizabeth Montgomery played both the lovely witch Samantha, the main character, and her irritating cousin Serena, who made brief, intermittent appearances. It's also possible for the same character to migrate to other television shows. For example, Frasier Crane is a character on both Cheers and Frasier. You might want to include fields that were exclusive to the character (say, personality) that didn't vary by program or actor. One solution would be to set up a separate Characters table, with fields describing the character alonefirst and last name, personality, and so on (see Table 3.24). The table would have a CharacterID field as a primary key. You could then place the primary key of the Characters table in the Roles table as a foreign key (see Table 3.25, where again the text in parentheses is not part of the table). The primary key for the Roles tables would then comprise three keys: ProgramID, ActorID, and CharacterID. The Programs, Actors, and Characters tables would all have one-to-many relationships with the Roles table through their respective primary keys.
Let's keep things simple, however, by assuming that the program and the actor together is sufficient for identifying roles. |