Joining Parent and Child Records

The buildings discussed in the previous section are probably subject to municipal codes and other regulations that require the regular inspection of doors. Those regulations of course require you to correct any problems that are found. Outside doors need to work properly in emergencies. Doors separating hallways are often fire doors, and need to retard a fire's progress. Doors to closets often must be secure, especially if the closets might contain hazardous materials.

In turn, that means that each of those doors must be inspected regularly and a record made of its condition, including any maintenance action taken. Further, when an occupant of the building complains that, say, a bathroom door won't lock, that complaint needs to be recorded. So does the subsequent repair of the door. Complaints don't follow tidy schedules the way that regular maintenance does.

CASE STUDY
Choosing the Parent Record

Given this situation a common one, by the way how do you intend to store the information in the Doors database that you've set up for Ballou Realty? Does a door constitute a record? If so, how do you handle the fact that a door gets maintenance attention some indeterminate number of times during one calendar year?

How many fields should you allocate to record those maintenance actions? Bear in mind that each time a door receives maintenance, you need to store information about the technician, the date the work was done, what action was taken, whether it's covered by warranty, and so on. Your door-record is going to need a lot of fields. Figure 5.7 shows how your Facilities staff might have used a flat file structure to store the information.

Figure 5.7. A database designer who doesn't understand relational structures (or prefers not to use them) might employ this design.

graphics/05fig07_alt.jpg

graphics/05fig07.jpg

NOTE

The term flat file means a data set that is not relational. An Excel list is a flat file: Different records occupy different rows, and different fields occupy different columns. It's two-dimensional or flat.

Sometimes the layout shown in Figure 5.7 makes good sense. If you know, for example, that a door will be maintained no more than four times during the time it's in service, you might opt for the simplicity of a flat file. Many designs that are wrong in theory are right in practice.

More often, though, this sort of layout creates problems for you. After all, each instance of maintenance really is a different record, and to treat it as a different field dismembers the reality of the situation. When it comes time to analyze all that data on maintenance, you'll need to convert it to individual records.

Suppose that you want to calculate the number of times that doors are repaired and the number of times that they're replaced. If you have each instance of maintenance stored as a separate record, it's easy: You just import, say, the MaintenanceAction field into Excel. Then use a pivot table (or an array formula) to count the number of records with "Repair" and the number of records with "Replace."

But if the maintenance records are stored in separate fields instead of separate records, you have a problem. It's solvable, yes, but it's still a problem. You have to import, for each record, the first, second, third,…Nth instance of maintenance, retrieving MaintenanceAction1, MaintenanceAction2, MaintenanceAction3,…, MaintenanceActionN. Then you have to convert all those values to list format before you can bring a pivot table to bear on the problem.

When you can't predict how many maintenance records you'll need to allow for, you need a different solution. Perhaps your record should instead represent each instance of a door's maintenance. That way, you wind up with an indeterminate number of records, not fields.

But then you need to repeat all the information about the door itself. You need to know which door was worked on, and there's quite a bit of information that gets dragged along with that. You'll need a field that uniquely identifies the door, one that shows its installation date, another for the name of the manufacturer, the warranty expiration date, whether it's a smoke door, whether the door is keyed, and so on. It's wasteful and unnecessary to repeat all that static information in every detail record that's intended to describe periodic maintenance.

Using Joins to Create Relational Structures

The solution to this problem, as to most similar problems, is to structure the database so that you have one table that contains information that doesn't change, and another that contains information that does. In this case, you would have one table with static information about the door (its ID, its installation date, and so on), and another table that contains information about door maintenance (the date that work was done, who did the work, and so on).

The Doors table stores information that either won't or is unlikely to change for a particular door: its unique ID, its manufacturer, its location, and so on. These are termed parent records.

The Door Maintenance table stores information that you expect to change from record to record: the action taken (inspection or repair, for example), a technician's name, the date when the action was taken, the action's outcome, and so on. These are termed child records. Each child record belongs to a particular parent record.

What's crucial in this setup is that the two tables each have a field that enables you to link, or join, them in such a way that if you focus on a particular parent record, you automatically get only those child records that belong to the parent.

In this case study, you want to make sure that when you call for information about the entrance door on the north side of the building's first floor, the only inspection and repair records that appear are the ones that belong to that door.

This relationship is shown graphically in the table pane, whether you're using Microsoft Query or a database manager such as Access (see Figure 5.8).

Figure 5.8. The line between the two tables is termed a join.

Notice in Figure 5.8 that the table named Doors stores relatively static information about a door: its rating, whether or not it's a fire door, its floor, and so on. The table named DoorDetails stores information that changes in this case, the data changes over time: the date that a door was inspected, the date it was repaired, what the deficiency was, and so on.

In particular, notice in Figure 5.8 that both the Doors table and the DoorDetails table include a field named DoorID. It's this shared field that establishes a relationship between the two tables, and that establishes a relational structure. It is no longer simply a flat file.

NOTE

It is useful but not required that the two instances of the field have the same name. In the Doors table, it could have been named DoorID and in the DoorDetails table, it could have been named DoorIdentifier.

Understanding Inner Joins

When two or more tables are joined, as they are in Figure 5.8, your query can return records from one table and related records from the other table. Records are related if they have the same value on the fields that are on either end of the join. Figure 5.9 shows records returned from Microsoft Access into Microsoft Excel by this query.

Figure 5.9. It's not necessary to return the join fields from the query. They are shown here only for clarity.

graphics/05fig09.gif

Notice that the value of DoorID from the table of parent records (door records) is always the same as the value of DoorID from the table of child records (door maintenance records).

Also notice in Figure 5.9 that the door with DoorID 1A0A003 appears six times. The data in columns A and B comes from the Doors table, which contains only one instance of that particular door. That door appears six times in DoorDetails: once for each time that a Ballou technician has serviced the door. By relating the two tables, the query is able to display static information such as its floor and zone along with changing information such as the date the door was inspected.

The relational structure enables you to avoid the two problems discussed in the prior section: putting a large number of fields into a record when they might or might not be used, and unnecessarily repeating static information across many detail records.

In Figure 5.8, the join line connecting the DoorID fields in the two tables is the default join type: an inner join. An inner join returns a record only if the same value exists in both join fields. For example, Figure 5.8 shows a record with the value 1A0A321 in the DoorID field of the table Doors, and also in the DoorID field of the table DoorDetails. Given the join type, the query would not return that record if either table had no record with that value. Put another way, an inner join will not return a record unless it has the same value in both join fields.

There are a few aspects to keep in mind about the fields that are joined:

The fields must be of the same data type; one can't be numeric and the other text, for example.
If the fields have the same name and data type, and one of them is a table's primary key, both Microsoft Query and Microsoft Access are able to join the tables automatically.
If the join is the default join type (an inner join), the query returns only those records with identical values on the join fields.

NOTE

A primary key is a field in a table that uniquely identifies each record. For example, if a particular Social Security number is assigned to exactly one person, the Social Security Number field could be a table's primary key. In the Doors case study that this chapter has used, DoorID is the primary key of the Doors table: It uniquely identifies a door. It cannot be the primary key of the DoorDetails table because each door can appear more than once in that table; a primary key's values must all be unique. Primary keys have broad applicability in database design.

To find more information about primary and other keys, see "Establishing Keys," p. 238.

Understanding Outer Joins

There are two other join types: a left outer join and a right outer join. The meaning of these terms is not intuitively rich, and the only reason they're even mentioned here is so that you'll recognize them if you ever see them in a SQL statement.

In Figure 5.8, suppose that the Doors table contains a door whose DoorID is 1A0A321A, and that the DoorDetails table has no records with that DoorID (perhaps because no work has been done on that door this year). The query will return no record with DoorID 1A0A321A from DoorDetails because there are none in that table. And the record that does exist in the Doors table with DoorID 1A0A321A will not be returned: The inner join requires that a match exist if it is to return a record at all.

Figure 5.10 shows what happens if, in Microsoft Query, you choose Joins from the Table menu (or if you double-click the join line).

Figure 5.10. The first option specifies the default, an inner join. The second and third options specify outer joins.

graphics/05fig10_alt.jpg

graphics/05fig10.gif

Suppose that you select the second option, All Values From 'Doors' and Only Records from 'DoorDetails' Where Doors.DoorID = DoorDetails.DoorID, and then click Add, and then click Close. Figure 5.11 shows what results in the Microsoft Query window.

Figure 5.11. Note the arrowhead at the end of the join line: It points to the table that might not match a join value.

There are two main differences between Figures 5.8 and 5.11. One is the additional records in Figure 5.11's data pane. Those additional records have values in the DoorID field in the Doors table, but none for the DoorID field in the DoorDetails table. And that's what this join calls for: all values from Doors, and only records from DoorDetails with matching values on DoorID. In place of a DoorID and a DateInspected value from DoorDetails, the query returns null values placeholders, in a sense.

Contrast this result with that shown in Figure 5.8. There the record with the DoorID value of, for example, 1A0A321A did not appear because there was no matching record in DoorDetails. In Figure 5.11, though, the join calls for all records from the Doors table, regardless of whether they have matching records in DoorDetails.

TIP

This type of join provides a convenient way of finding records in one table that are not matched by records in another table. In Figure 5.11, you could supply the criterion Is Null for the DoorID field in DoorDetails. Executing the query would return all the records (and only those records) with values of DoorID in the Doors table that were not matched by a value in the DoorID field of DoorDetails.

The other main difference between Figures 5.8 and 5.11 is the arrowhead at the end of the join line in Figure 5.11. In Microsoft Query and Microsoft Access, when you see the arrow head on a query's join line, you know that an outer join has been specified. You also know that the table that the arrow points to is the table that will return null values for those records that lack matching values on the join field.

The difference between a left outer join and a right outer join is trivial: It's merely a question of which table is mentioned before the join or after the join. These two joins are equivalent:

 FROM Doors LEFT OUTER JOIN DoorsDetails ON Doors.DoorID = DoorsDetails.DoorID; FROM DoorsDetails RIGHT OUTER JOIN Doors ON DoorsDetails.DoorID = Doors.DoorID;

In a left join, the table named to the left of the JOIN returns all its records; in a right join, the table named to the right of the JOIN returns all its records. (The word OUTER is optional in the SQL.)

Table of content

Managing Data with Microsoft Excel

ISBN: 789731002
EAN: N/A

Year: 2003
Pages: 134

BUY ON AMAZON

Introducing Microsoft Office InfoPath 2003 (Bpg-Other)

Software Configuration Management

VBScript Programmers Reference

Network Security Architectures

101 Microsoft Visual Basic .NET Applications

802.11 Wireless Networks: The Definitive Guide, Second Edition

CASE STUDY Choosing the Parent Record

Figure 5.7. A database designer who doesn't understand relational structures (or prefers not to use them) might employ this design.

Using Joins to Create Relational Structures

Figure 5.8. The line between the two tables is termed a join.

Understanding Inner Joins

Figure 5.9. It's not necessary to return the join fields from the query. They are shown here only for clarity.

Understanding Outer Joins

Figure 5.10. The first option specifies the default, an inner join. The second and third options specify outer joins.

Figure 5.11. Note the arrowhead at the end of the join line: It points to the table that might not match a join value.

CASE STUDY
Choosing the Parent Record