Many-to-Many Relationships


Many-to-many relationships are both more difficult to understand and more difficult to resolve than one-to-many relationships. Don't be frustrated if you need to reread this section a couple times before you understand how they work.

Again, I introduce the relationship type by using an example from the order-entry system. (I've slightly edited two tables you saw earlier in the chapter.) I then give you two detailed examples from the Classic TV database.

Order-Entry System

Let's say that you have an Orders table (see Table 3.10) and a Products table (see Table 3.11). Each row in the Orders table represents a single order; each row in the Products table represents a single product. (If you want a better idea of what these tables might look like, choose Help, Samples, Northwind Sample Database, and view the Orders and Products tables they contain.)

Table 3.10. This Orders Table Contains Data About Orders

OrderID (PK)

CustomerID (FK)

Order Date

Ship Date

<<other fields>>

10075

002

10/16/2004

10/18/2004

 

10076

003

10/19/2004

10/21/2004

 

10077

002

10/21/2004

10/23/2004

 

10078

007

10/22/2004

10/24/2004

 

10079

008

10/22/2004

10/25/2004

 

PK=Primary key; FK=Foreign key


Table 3.11. This Products Table Contains Data About Products

ProductID (PK)

Product Name

CategoryID (FK)

Supplier

<<otherfields>>

00122

Travel Clock

4

High Noon Timepieces

 

00123

Hiking Socks

5

Mountain Footwear

 

00124

Duffle Bag

4

In The Bag, Inc.

 

00125

Garment Bag

4

Luggage 'n More

 

00126

Toiletries Kit

4

Hi Jean Products

 

PK=Primary key; FK=Foreign key


Consider how products and orders relate to each other. Any single order can be made up of many products. If this were a one-to-many relationship, I'd have finished the last sentence with "and any product can be in only one order." But that, of course, is not true. Any single product can also be included in many orders.

You can say that the tables have a many-to-many relationship.

This is very different from the relationship between Customers and Orders, where every order had only one customer, or Categories and Products, where every product had just one category. In those cases, you resolved the relationship by adding the primary key of the table on the "one" side of the relationship to the table on the "many" side of the relationship as a foreign key.

But that solution won't work for Orders and Products. Suppose you add the primary key ProductID to the Orders table as a foreign key. If an order included three products, as it does for OrderID 10075 in Table 3.12, you would need three separate records in the Orders table for the three products. So you would have to repeat all the data, such as CustomerID, order date, and ship date, for all three records. If an order included 10 products, you would need 10 rows for a single order.

Table 3.12. This Orders Table Includes the Individual Products in the Order. It Is an Example of Bad Database Design.

OrderID (PK)

ProductID (FK)

CustomerID (FK)

Order Date

Ship Date

<<otherfields>>

10075

00123

002

10/16/2004

10/18/2004

 

10075

00124

002

10/16/2004

10/18/2004

 

10075

00125

002

10/16/2004

10/18/2004

 

10076

00124

003

10/19/2004

10/21/2004

 

10077

00126

002

10/21/2004

10/23/2004

 

PK=Primary key; FK=Foreign key


Now let's try it the other way around. Suppose you added the primary key OrderID to the Products table (see Table 3.13). If a product was included in three orders, as it is for ProductID 00123, you would need a separate record in the Products table for each order in which the product was included. Each time a product was ordered, you'd have to repeat data such as product name, CategoryID, and supplier in the Products table. If a product was included in 10 orders, you would need 10 rows for just that product.

Table 3.13. This Products Table Contains Data About the Orders in Which the Product Is Included. It Is an Example of Bad Database Design.

ProductID (PK)

OrderID (FK)

Product Name

CategoryID (FK)

Supplier

<<otherfields>>

00123

10075

Hiking Socks

5

Mountain Footwear

 

00123

10076

Hiking Socks

5

Mountain Footwear

 

00123

10077

Hiking Socks

5

Mountain Footwear

 

00124

10077

Duffle Bag

4

In The Bag, Inc.

 

00125

10078

Garment Bag

4

Luggage'n More

 

PK=Primary key; FK=Foreign key


Create a Linking Table

You can see that merely putting the primary key of the "one" table as a foreign key in the "many" table is not a solution because the two tables are on both the "one" and the "many" sides. So what is the solution?

You create an entirely new table called a linking table (or a join table) that includes both the primary key from the Products table and the primary key from the Orders table. As in the Northwind database, let's call the table Order Details. In the Order Details table, the two foreign keys OrdersID and the ProductsID together are the primary key. In other words, this composite primary key has two fields.

Note that this linking table includes two one-to-many relationships. As shown in Table 3.14, the OrderID field can appear many times in the Order Details table but, as you've seen, only once in the Orders table. Similarly, a ProductID can appear many times in the Order Details table but only once in the Products table.

Table 3.14. The Order Details Table Includes the Primary Keys from Both the Orders and Products Tables. Both Fields Are on the "Many" Side of One-to-Many Relationships.

OrderID CPK/FK

ProductID CPK/FK

10075

00123

10075

00124

10075

00125

10076

00124

10077

00126

CPK=Composite primary key; FK=Foreign key


Is the primary key of the Order Details table unique? Let's review the three key requirements of a primary key:

  • Does it uniquely identify each record? Yes, it does. Individually, you'll find the same OrderID more than once in the OrderID column, and the same ProductID included more than once in the ProductID column. But each combination of OrderID and ProductID in a single row is unique.

  • Can it contain null values? No. For every product, there has to be at least one order, and for every order, there has to be at least one product.

  • Will the primary key change only rarely? After an OrderID has been assigned to an order and a Product ID has been assigned to a product, there should be little reason to change them.

The linking table that resolves a many-to-many relationship might or might not have additional fields besides the primary key. Look at Table 3.15. It includes two fields that are exclusively identified by OrderID and ProductID. The value in the Quantity field can be determined only by knowing both the order and the product. Simply knowing either one is not sufficient to determine the quantity. The Price field is also exclusively identified by the composite primary key. Prices for any individual product might vary, so it's necessary to know the order in addition to the product to determine the price.

Table 3.15. A Linking Table Can Have Additional Fields That Are Uniquely Described by the Composite Primary Key

Order Details, Including Fields Besides the Composite Primary Key

OrderID CPK/FK

ProductID CPK/FK

Quantity

Price

10075

00123

12

$ 2.99

10075

00124

3

$14.95

10075

00125

2

$19.73

10076

00124

3

$14.95

10077

00126

2

$ 4.95

CPK=Composite primary key; FK=Foreign key


This might not seem an elegant solution; it might seem that tables are proliferating across your database, to little advantage. But the overall objective of associating tables and bringing together the data each contains is achieved.

Classic TV Database

Earlier we considered the relationship between TV shows and networks. Initially, it appeared to be a simple one-to-many relationship. CBS aired such classics as The Lucy Show, Twilight Zone, Northern Exposure, and more. But The Lucy Show and the rest didn't appear on CBS one week, NBC the next, and ABC in week 3. So it would seem that many TV shows could appear on one network, but a single TV show was seen on only one network.

But that isn't wholly true. As noted earlier, a few shows started on one network and then moved to another. For example, Leave It To Beaver began on CBS in 1957 but soon moved to ABC, where it ran for 6 years. So although one network can broadcast many TV shows, one TV show can appear on two or more networks. The number of TV shows certainly exceeds the number of networks, but it's nonetheless true that Programs and Networks actually have a many-to-many relationship.

You already know one alternative: Choose one network as the primary network (such as the station on which the show originated, the network on which it aired longest, and so on) and simply ignore the second network. In other words, you might decide that the extra data provided by recording subsequent networks simply isn't worth the trouble or expense. The one-to-many design solution remains intact.

But that's not a solution if you want to record the entire broadcast history of programs, which would include the start year and end year for the show on each network that it aired.

If you included the primary key of the Networks table as a foreign key in the Programs table, you'd have duplicate values in fields such as Location (see Table 3.16). And if you used the ProgramID as a foreign key in the Networks table, you'd have duplicate values for fields such as Founder (see Table 3.17). (To develop a better example, I've used imaginary programs and air dates in both tables.)

Table 3.16. This Programs Table Includes NetworkID as a Foreign Key. Because Programs Such As ProgramID 102 Aired First on One Station and Then on Another, You Need to Repeat Data Such As the Program's Location (Kansas City). This Is Not Good Database Design.

ProgramID (PK)

Program

NetworkID (FK)

Location

Year Started

Year Ended

<otherfields>

101

That's My Millie

1

Milwaukee

1960

1968

 

102

Stanley and Son

1

Kansas City

1959

1961

 

102

Stanley and Son

2

Kansas City

1962

1964

 

103

Dean's Place

3

Los Angeles

1986

1991

 

104

The Zoo

3

San Diego Zoo

1988

1995

 

PK=Primary key; FK=Foreign key


Table 3.17. The Networks Table with ProgramID Included as a Foreign Key. Because Many Programs Appeared on Each Station, You Have a Great Deal of Repeating Data in the Network Fields (Only Popular Name and Founder Are Shown Here).

NetworkID (PK)

ProgramID (FK)

Popular Name

Founder

Year Started

Year Ended

<<otherfields>>

1

101

CBS

William Paley

1955

1960

 

1

102

CBS

William Paley

1959

1961

 

2

102

NBC

David Sarnoff

1962

1964

 

3

103

ABC

Leonard Goldenson

1986

1991

 

3

104

ABC

Leonard Goldenson

1988

1995

 

PK=Primary key; FK=Foreign key


Again, the solution is to create a linking table (say, Broadcasts) that includes keys from both tables. The primary key of the Broadcasts table is a composite key. It comprises the primary keys of both the Programs and Networks tables, which in the Broadcasts table become foreign keys. Table 3.18 shows the new table, Broadcasts, with the composite primary key. (The names of the programs and networks are not part of the table; I've included them only to make the table easier to understand.)

Table 3.18. The Broadcasts Table Is a Linking Table That Includes the Primary Keys of Both the Programs and Network Tables

ProgramID CPK/FK

NetworkID CPK/FK

101 (That's My Millie)

1 (CBS)

102 (Stanley&Son)

1 (CBS)

102 (Stanley&Son)

2 (NBC)

103 (Dean's Place)

3 (ABC)

104 (The Zoo)

3 (ABC)

CPK=Composite primary key; FK=Foreign key


Let me put that another way: The Programs table has a one-to-many relationship with the Broadcasts table through the ProgramID field. The primary key of the Programs table is ProgramID; in the Broadcasts table, ProgramID is a foreign key.

The Networks table also has a one-to-many relationship with the Broadcasts table. The primary key of the Networks table is NetworkID; in the Broadcasts table, NetworkID is a foreign key. The composite primary key of ProgramID and NetworkID would be unique because no program can run on the same network more than once at the same time.

As with the Order Details table, you would want to include in the Broadcasts table any fields that are exclusively described by the composite primary key. In this case, those would be the Year Started and Year Ended fields (see Table 3.19again, the program and network names in parentheses are included only to make the table easier to understand). Because a program can move to a different network, only a primary key that comprises both primary keys can exclusively identify the Year Started and Year Ended fields.

Table 3.19. The Broadcasts Table with Fields That Are Exclusively Identified by the Primary Key

ProgramID CPK/FK

NetworkID CPK/FK

Year Started

Year Ended

101 (That's My Millie)

1 (CBS)

1960

1968

102 (Stanley&Son)

1 (CBS)

1959

1961

102 (Stanley&Son)

2 (NBC)

1962

1964

103 (Dean's Place)

3 (ABC)

1986

1991

104 (The Zoo)

3 (ABC)

1988

1995

CPK=Composite primary key; FK=Foreign key


A Three-Field Primary Key

Consider this possibility: Suppose that a program aired on one network, then aired on another network, and then returned to its original network. I don't know if this has ever occurred (probably not), but it is conceivable. What consequences would this have for your table design?

The Broadcasts table in Table 3.20 is the same as that in Table 3.19, with one difference: It includes one extra record for Stanley & Son. Instead of ending its run in 1964, it returned to its original network, CBS, for its final year.

Table 3.20. The Composite Primary Key of ProgramID and NetworkID Is Insufficient to Uniquely Identify Each Record If a Program Returns to Its Original Network

Broadcasts: Stanley & Son

ProgramID CPK/FK

NetworkID CPK/FK

Year Started

Year Ended

101 (That's My Millie)

1 (CBS)

1960

1968

102 (Stanley&Son)

1 (CBS)

1959

1961

102 (Stanley&Son)

2 (NBC)

1962

1964

102 (Stanley&Son)

1 (CBS)

1965

1966

103 (Dean's Place)

3 (ABC)

1986

1991

104 (The Zoo)

3 (ABC)

1988

1995

CPK=Composite primary key; FK=Foreign key


In that case, the composite primary key would no longer be unique because the same program aired on the same network twice. In other words, the ProgramID and NetworkID would be identical for two records: the original broadcast of the program (19591961), and the broadcast when it returned to its original network (19651966). Even a primary key with two fields wouldn't be sufficient to identify every record uniquely.

One good solution is to include Year Started as an additional field in what would be a three-field composite primary key. Together, the ProgramID, NetworkID, and Year Started fields would uniquely identify the broadcasts.

Another possible solution would be to add a field showing the sequence of broadcasts (see Table 3.21). The first broadcast of a program would be 1, the next airing would be 2, and so on. The three fields of ProgramID, NetworkID, and Sequence Number together would be the primary key and would uniquely identify each record. The advantage of this scheme is that you could use the sequence number to search for only first-run or second-run broadcasts. Other database designers would disagree, however, and note that this design needlessly introduces additional complexity into the database.

Table 3.21. The Sequence Number Column Lets You Uniquely Identify Each Record but Makes the Database More Complex

Broadcasts with Sequence Numbers

ProgramID CPK/FK

NetworkID CPK/FK

Sequence Number CPK

Year Started

YearEnded

101 (That's My Millie)

1 (CBS)

1

1960

1968

102 (Stanley&Son)

1 (CBS)

1

1959

1961

102 (Stanley&Son)

2 (NBC)

2

1962

1964

102 (Stanley&Son)

1 (CBS)

3

1965

1966

103 (Dean's Place)

3 (ABC)

1

1986

1991

104 (The Zoo)

3 (ABC)

1

1988

1995

CPK=Composite primary key; FK=Foreign key


In the Classic TV database, I'll assume that TV shows never return for a run on their earlier networks.

Q&A

Q1:

I think I know the answer to this question, but please remind me one more time. If relational databases are so great, why do you have to add data in parentheses, which isn't actually included in the tables, to show us what's going on? In other words, why don't any of these tables contain data I can actually use as is, rather than be filled with ID numbers that mean nothing to me?

A1:

A good table isn't one that meets a specific information need. It is a table that helps the database as a whole satisfy all your information needs. It's irrelevant whether any single table by itself can be used to meet any of your information requirements because you use tables to store data, not to manipulate or display it. Those tasks are the work of the queries, forms, and reports you will create in later chapters.


One More Many-to-Many Relationship

Another many-to-many relationship in the Classic TV database needs to be resolved. Consider the Actors table and the Programs table. Obviously, any one program can have several actors. It's almost as obvious that any one actor can be on several TV shows. For example, Bob Newhart has been the star of the eponymous The Bob Newhart Show, Newhart, and Bob. (He once joked that his next show would be called simply B.)

Because one program can have several actors and one actor can be on several TV programs, the Actors and Programs tables have a many-to-many relationship. Again, you can resolve the many-to-many relationship by using a linking table. (I think by now you understand that simply placing the primary key of the Actors table into the Programs table as a foreign key, or vice versa, is not a solution.)

Table 3.22 shows the current status of the database. Reconsider the Roles table, which you created to resolve the Character Name/Character Occupation field (if you need to refresh your memory, take a look back at the "Resolving Multivalue Fields" section in Chapter 2). You can use this table as a linking table for Actors and Programs. Here are the fields we have so far:

  • ActorID We can add the primary key from the Actors table as a foreign key in the Roles table.

  • ProgramID The table already includes ProgramID from the Programs table as a foreign key, but you haven't yet assigned a primary key to the table.

    As with the Broadcasts linking table, the primary key of the Roles table will be a composite primary key that includes the primary keys of the two linked tables as foreign keys. In other words, the primary key of the Roles table will be a composite primary key comprising ActorID and ProgramID, the two foreign keys from the linked tables.

  • First Name and Last Name You need to know both the program and the actor to determine the character's name. The character's first name and last name are exclusively identified by the primary key of ProgramID and ActorID.

  • Occupation You need to know both the program and the actor to determine the character's occupation. The character's occupation is exclusively identified by the primary key of ProgramID and ActorID.

Table 3.22. Current Status of the Classic TV Database

Current Field Lists

Programs

Genres

Broadcasts

Networks

Actors

Roles

ProgramID PK

GenreID PK

ProgramID CPK/FK

NetworkID PK

ActorID PK

ProgramID FK

GenreID FK

Genre

NetworkID CPK/FK

Official Name

First Name

First Name

Name

Description

Year Started

Popular Name

Last Name

Last Name

Location

 

Year Ended

Founder

Gender

Occupation

Synopsis

  

Notes

Biography

 

PK=Primary key; CPK=Composite primary key; FK=Foreign key


Now add three more fields to the Roles table:

  • Year Started and Year Ended The years the actor made his or her appearance and exited the show are pieces of information you'd like to know.

  • Notes You might want to put in additional comments about the nature of the actor's role, as you will see shortly.

Table 3.23 shows a few records from the revised Roles table, which now includes the ActorID field. Again, I emphasize that the names in parentheses are not part of the table.

Table 3.23. A Sample of the Roles Table

ActorID CPK/FK

ProgramID CPK/FK

Last Name

First Name

Occupation

<<other fields>>

1 (Andy Griffith)

1 (TAGS)

Taylor

Andy

Sheriff

 

1 (Andy Griffith)

9 (Matlock)

Matlock

Benjamin

Defense attorney

 

2 (Don Knotts)

1 (TAGS)

Fife

Barney

Sheriff's deputy

 

2 (Don Knotts)

9 (Matlock)

Calhoun

Les

Matlock's neighbor

 

3 (Ron Howard)

1 (TAGS)

Taylor

Opie

Schoolboy

 

3 (Ron Howard)

2 (Happy Days)

Cunningham

Richie

Student

 

4 (Jim Nabors)

1 (TAGS)

Pyle

Gomer

Filling station attendant

 

5 (Henry Winkler)

2 (Happy Days)

Fonzarelli

Arthur

Motorcyclist

 

TAGS=The Andy Griffith Show; CPK=Composite primary key; FK=Foreign key


An Alternative Solution

I think the Classic TV database suggested would fill the needs of most couch potatoes. But it's a less-than-perfect solution.

For example, fans of The Patty Duke Show will remember that the actor Patty Duke played both Cathy (who has lived almost everywhere, from Zanzibar to Barclay Square) and her identical cousin, Patty (who has seen only the sights a girl can see from Brooklyn Heights). In that case, the ProgramID and the ActorID would be the same for both characters, and thus the primary key would be inadequate.

More frequently, the same actor on a program plays both a main character and one or two other characters that appear infrequently. For example, on the show Bewitched, Elizabeth Montgomery played both the lovely witch Samantha, the main character, and her irritating cousin Serena, who made brief, intermittent appearances.

It's also possible for the same character to migrate to other television shows. For example, Frasier Crane is a character on both Cheers and Frasier. You might want to include fields that were exclusive to the character (say, personality) that didn't vary by program or actor.

One solution would be to set up a separate Characters table, with fields describing the character alonefirst and last name, personality, and so on (see Table 3.24). The table would have a CharacterID field as a primary key. You could then place the primary key of the Characters table in the Roles table as a foreign key (see Table 3.25, where again the text in parentheses is not part of the table). The primary key for the Roles tables would then comprise three keys: ProgramID, ActorID, and CharacterID. The Programs, Actors, and Characters tables would all have one-to-many relationships with the Roles table through their respective primary keys.

Table 3.24. A Characters Table That Stores Values About the Subject Characters

CharacterID (PK)

Last Name

First Name

Personality

<<other fields>>

141

Bunker

Archie

Bigoted but lovable

 

142

Crane

Frasier

Pretentious yet warm-hearted

 

143

Lane

Patty

Happy, fun-loving

 

144

Lane

Cathy

Proper, thoughtful

 

145

Pyle

Gomer

Simple, goofy

 


Table 3.25. The ProgramID, ActorID, and CharacterID Are All Necessary to Distinguish the Role of Patty Lane from Cathy Lane on The Patty Duke Show

Roles with Character IDs

ProgramID (CPK/FK)

ActorID (CPK/FK)

CharacterID (CPK/FK)

<<otherfields>>

32 (Cheers)

39 (Kelsey Grammer)

142 (Frasier Crane)

 

33 (Frasier)

39 (Kelsey Grammer)

142 (Frasier Crane)

 

34 (The Patty Duke Show)

40 (Patty Duke)

143 (Patty Lane)

 

34 (The Patty Duke Show)

40 (Patty Duke)

144 (Cathy Lane)

 

35 (All in the Family)

41 (Carroll O'Connor)

141 (Archie Bunker)

 

CPK=Composite primary key; FK=Foreign key


Let's keep things simple, however, by assuming that the program and the actor together is sufficient for identifying roles.




Hands-On Microsoft Access(c) A Practical Guide to Improving Your Access Skills
Hands-On Microsoft Access: A Practical Guide to Improving Your Access Skills
ISBN: 0321245458
EAN: 2147483647
Year: 2005
Pages: 169
Authors: Bob Schneider

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net