At this point, you’ve been through all the variations on a single theme-queries on a single table. It’s easy to build on this knowledge to retrieve related information from many tables and to place that information in a single view. You’ll find this ability to select data from multiple tables very useful in designing forms and reports.
The examples in this chapter are based on the tables and data in HousingDataCopy.accdb and ContactsDataCopy.accdb on the companion CD included with this book. These databases are copies of the data from the Housing Reservations and Conrad Systems Contacts sample applications, respectively, and they contain the sample queries used in this chapter. The query results you see from the sample queries you build in this chapter might not exactly match what you see in this book if you have changed the sample data in the files. Also, all the screen images in this chapter were taken on a Windows Vista system with the display theme set to Blue. Your results might look different if you are using a different operating system or a different theme.
A join is the link you need to define between two related tables in a query so that the data you see makes sense. If you don’t define the link, you’ll see all rows from the first table combined with all rows from the second table (also called the Cartesian product). When you use an inner join (the default for joins in queries), you won’t see rows from either table that don’t have a matching row in the other table. This type of query is also called an equi-join query, meaning that you’ll see rows only where there are equal values in both tables. For example, in a query that joins departments and employees, you won’t see any departments that have no employees or any employees that aren’t assigned to a department. To see how to build a query that returns all rows from one of the tables, including rows that have no match in the related table, see “Using Outer Joins” on page 425.
Correctly designing your tables requires you to split out (normalize) your data into separate tables to avoid redundant data and problems updating the data. (For details about designing your tables, see Article 1, “Designing Your Database Application,” on the companion CD.) For many tasks, however, you need to work with the data from multiple tables. For example, in the Housing Reservations application, to work with employees and the departments to which they are assigned, you can’t get all the information you need from just tblEmployees. Sure, you can see the employee’s DepartmentID, but what about the department name and location? If you want to sort or filter employees by department name, you need both tblDepartments and tblEmployees.
In the previous chapter, we built several queries on tblReservations. Because we defined Lookup properties in that table, you can see the facility name in the reservation record, but as you discovered, that’s really a numeric field. If you want to sort on the real name value or see other information about the facility, you must create a query that uses both tblFacilities and tblReservations.
Try the following example, in which you combine information about a reservation and about the facility in which the reservation was confirmed. Start by opening the HousingDataCopy.accdb database. Click the Query Design button in the Other group on the Create tab. Microsoft Office Access 2007 immediately opens the Show Table dialog box. In this dialog box, you select the tables and queries that you want in your new query. Select the tblFacilities and tblReservations tables (hold down the Ctrl key as you click each table name), click the Add button, and then close the dialog box.
How can I be sure I’m using the correct table in the query designer?
Whenever your query is based on more than one table, it’s a good idea to select the Show Table Names option in the Query Design section of the Object Designers category of the Access Options dialog box. (Click the Microsoft Office Button, click Access Options, and then click the Object Designers category in the left column. Below Query Design, make sure that the Show Table Names check box is selected.) Because you might have the same field name in more than one of the tables, showing table names in the design grid helps to ensure that your query refers to the field you intend it to.
Whenever you have relationships defined, the query designer automatically links (joins) multiple tables on the defined relationships. You might also want to select the Enable AutoJoin check box in the Object Designers category of the Access Options dialog box. When you enable this option and build a query on two tables that aren’t directly related, the query designer attempts to link the two tables for you. The query designer looks at the primary key of each table. If it can find a field with the same name and data type in one of the other tables you added to the query designer, the query designer builds the link for you. Some advanced users might prefer to always create these links themselves.
The two tables, tblFacilities and tblReservations, aren’t directly related to each other. If you look in the Relationships window (click Relationships on the Database Tools tab), you’ll see a relationship defined between tblFacilities and tblFacilityRooms on the FacilityID field. There’s also a relationship between tblFacilityRooms and tblReservations on the combination of the FacilityID and the RoomNumber fields. So, tblFacilities is related to tblReservations via the FacilityID field, but indirectly. In other words, the FacilityID field in tblReservations is a foreign key that points to the related row in tblFacilities. So, it’s perfectly legitimate to build a query that links these two tables on the FacilityID field.
|Inside Out-Query Joins Don’t Always Need to Match Relationships|| |
It’s a good idea to define relationships between related tables to help ensure the integrity of your data. However, you don’t need to define a relationship between the foreign key in a table and the matching primary key in every other related table. For example, if table A is related to table B, and table B, is related to table C, you don’t necessarily need a relationship defined between table A and table C even though table C might contain a foreign key field that relates it to table A. Even when you haven’t explicitly defined a relationship between table A and table C, it is perfectly valid to join table A to table C in a query as long as there’s a legitimate matching field in both tables.
The upper part of the Query window in Design view should look like that shown in Figure 8–1. Office Access 2007 first links multiple tables in a query based on the relationships you have defined. If no defined relationship exists, and you have Enable AutoJoin selected in the Object Designers category in the Access Options dialog box (this option is enabled by default), then Access attempts to match the primary key from one table with a field that has the same name and data type in the other table.
Figure 8–1: This query selects information from the tblFacilities and tblReservations tables.
Access 2007 shows the links between tables as a line drawn from the primary key in one table to its matching field in the other table. As already noted, no direct relationship exists between these two tables in this example. With Enable AutoJoin selected, however, Access sees that the FacilityID field is the primary key in tblFacilities and finds a matching FacilityID field in tblReservations. So, it should create a join line between the two tables on the FacilityID fields. If you don’t see this line, you can click FacilityID in tblFacilities and drag and drop it on FacilityID in tblReservations, just like you learned to do in “Defining Relationships” on page 181.
If you haven’t defined relationships, when you create a query that uses two tables that are related by more than one field (for example, tblFacilityRooms and tblReservations in this database), the Join Properties dialog box lets you define only one of the links at a time. You must click the New button again to define the second part of the join. If you’re using drag and drop, you can do this with only one field at a time even though you can select multiple fields in either table window. We’ll discuss the Join Properties dialog box in “Building a Simple Outer Join” on page 425.
In this example, you want to add to the query the FacilityID and FacilityName fields from the tblFacilities table and the ReservationID, EmployeeNumber, FacilityID, RoomNumber, CheckInDate, and CheckOutDate fields from the tblReservations table. (We resized the columns on the query grid in Figure 8–1 so that you can see all the fields.)
When you run the query, you see the recordset shown in Figure 8–2. The fields from the tblFacilities table appear first, left to right. We resized the columns displayed in Figure 8–2 so that you can see all the fields. You might need to scroll right in the datasheet to see them on your computer.
Figure 8–2: Here you can see the recordset of the query shown in Figure 8–1. The facility information in the drop-down list comes from the Lookup properties defined in the tblReservations table.
INSIDE OUT-A Query Is Really Defined by Its SQL
The query designer converts everything you build in a query grid into SQL-the lingua franca of database queries. Access 2007 actually stores only the SQL and rebuilds the query grid each time you open a query in Design view. Later in this chapter, we’ll examine some of the actual syntax behind your queries, and you can study the full details of SQL in Article 2, “Understanding SQL,” on the companion CD.
Notice the facility name in the column for the FacilityID field from the tblReservations table. (The caption for the field is Facility.) If you check the definition of the FacilityID field in the tblReservations table, you’ll see a Lookup combo box defined-the query has inherited those properties. Click in a field in the Facility column in this datasheet, and the combo box appears. If you choose a different Facility name from the list, you will change the FacilityID field for that reservation. But the room number already in the row might not exist in the new Facility, so you might get an error if you try to save the row.
This is yet another example of a business rule that you will have to ensure is verified in a form you create for users to edit this data.
If you choose a different Facility (the FacilityID field in tblReservations), you can see the ID and Name change on the left side of the datasheet. When you change the value in a foreign key field (in this case, the FacilityID field in tblReservations) in a table on the many side of a one-to-many query (there are many reservations for each facility), Access performs an AutoLookup to retrieve the related row from the one side (tblFacilities) to keep the data synchronized. You’ll find this feature handy later when you build a form to display and edit information from a query like this.
Try changing the entry in the Facility column (remember, this is the FacilityID field from tblReservations; you see the name of the facility because we defined Lookup properties on this field.) in the first row from Main Campus Housing A to South Campus Housing C. When you select the new value for the FacilityID field in tblReservations, you should see the ID field from tblFacilities change to 3 and the Name entry (the FacilityName field from tblFacilities) change from Main Campus Housing A to South Campus Housing C. Note that in this case you’re changing only the linking FacilityID field in tblReservations, not the name of the facility in tblFacilities. Access is retrieving the row from tblFacilities that matches the changed FacilityID value in tblReservations to show you the correct name.
If you change the facility in one of the rows in this query, Access won’t let you save the row if the room number in the reservation doesn’t exist in the new facility you chose.
One interesting aspect of queries on multiple tables is that in many cases you can update the fields from either table in the query. See “Limitations on Using Select Queries to Update Data” on page 468 for a discussion of when joined queries are not updatable. For example, you can change the facility name in the tblFacilities table by changing the data in the Name column in this query’s datasheet.
In most cases, Access lets you edit fields from the table on the one side of the join (in this case, tblFacilities). Because the facility name comes from a table on the one side of a oneto-many relationship (one facility has many reservations, but each reservation is for only one facility), if you change the name of the facility in any row in this query, you change the name for all reservations for the same facility.
Although the dangers to doing this are apparent, this is actually one of the benefits of designing your tables properly. If the facility is renamed (perhaps it gets renamed in honor of a beloved ex-president of the company), you need to change the name in only one place. You don’t have to find all existing reservations for the facility and update them all.
You might use a query like the one in Figure 8–1 as the basis for a report on reservations by facility. However, such a report would probably also need to include the employee name and department to be truly useful. Switch back to Design view, click the Show Table button in the Query Setup group of the Design contextual tab and add tblEmployees and tblDepartments to the query. You can also drag and drop any table or query from the Navigation Pane to your query.
You’ll run into a small problem when you add tblDepartments to your query: There are two relationships defined between tblDepartments and tblEmployees. First, each employee must have a valid department assigned, so there’s a relationship on DepartmentID. Also, any manager for a department must be an employee of the department, and an employee can manage only one department, so there’s a second relationship defined between EmployeeNumber in tblEmployees and ManagerNumber in tblDepartments.
The query designer doesn’t know which relationship you want to use as a join in this query, so it includes them both in the query grid. If you leave both join lines in your query, you’ll see only reservations for managers of departments because the join between EmployeeNumber in tblEmployees and ManagerNumber in tblDepartments forces the query to include only employees who are also managers. Click on the join line between EmployeeNumber in tblEmployees and ManagerNumber in tblDepartments and press the Delete key to remove the join.
If you’re going to use this query in a report, you probably don’t need EmployeeNumber and FacilityID from tblReservations, so you can delete them. Next click on RoomNumber in the design grid and click Insert Columns in the Query Setup group on the Design tab to give you a blank column to work with. You need the employee name for your report, but you most likely don’t need the separate FirstName, MiddleName, and LastName fields. Use the blank column to create an expression as follows:
EmpName: tblEmployees.FirstName & " " & (tblEmployees.MiddleName +" ") & LastName
Note that you’re using the little trick you learned in the previous chapter: using an arithmetic operation to eliminate the potential extra blank when an employee has no middle name. Drag and drop the Department field from tblDepartments on top of RoomNumber in the design grid-this should place it between the EmpName field you just defined and RoomNumber. Your query grid should now look like Figure 8–3.
Figure 8–3: In this example we are creating a complex query using four tables.
You can switch to Datasheet view to see the results of your work as shown in Figure 8–4
Figure 8–4: In Datasheet view you can see the recordset of the query shown in Figure 8–3.
Do you notice anything strange about the sequence of rows? Why aren’t the rows sorted by facility name and then perhaps employee name? If you were to put the EmployeeNumber field back into the query grid and take a look at the data again, you would discover that Access sorted the rows by EmployeeNumber and then by ReservationID. Access looked at all the information you requested and then figured out the quickest way to give you the answer-probably by fetching rows from tblEmployee first (which are sorted on the primary key, EmployeeID) and then fetching the matching rows from tblReservations.
Remember from the previous chapter that the only way you can guarantee the sequence of rows is to specify a sort on the fields you want. In this case, you might want to sort by facility name, employee last name, employee first name, and check-in date. You buried first name and last name in the EmpName expression, so you can’t use that field to sort the data by last name. You can find the correct answer saved as qxmplSortReservations in the sample database. (Hint: You need to add tblEmployees.LastName and tblEmployees.FirstName to the grid to be able to specify the sort.)
When you’re building a very complex query, sometimes it’s easier to visualize the solution to the problem at hand by breaking it down into smaller pieces. In some cases, building a query on a query might be the only way to correctly solve the problem.
For this set of examples, let’s switch to the data in the Conrad Systems Contacts database. Start Access 2007 and open ContactsDataCopy.accdb. Customers who purchase a Single User copy of the BO$$ software marketed by Conrad Systems can later decide to upgrade to the Multi-User edition for a reduced price. Assume you’re a consultant hired by Conrad Systems, and the company has asked you to produce a list of all customers and their companies who purchased a Single User copy and then later purchased the upgrade.
To solve this, you might be tempted to start a new query on tblContacts and add the tblCompanies, tblContactProducts, and tblProducts tables. You would include the CompanyID and CompanyName fields from tblCompanies; the ContactID, FirstName, and LastName fields from tblContacts; and the ProductID and ProductName fields from tblProducts. Then you would place a criterion like “BO$$ Single User” And “Upgrade to BO$$ Multi-User” on the Criteria line under ProductName. However, any one row in your query will show information from only one contact and product, so one row can’t contain both “BO$$ Single User” and “Upgrade to BO$$ Multi-User.” (See the discussion in “AND vs. OR” on page 357.) Your query will return no rows.
Your next attempt might be to correct the criterion to “BO$$ Single User” Or “Upgrade to BO$$ Multi-User”. That will at least return some data, but you’ll get an answer similar to Figure 8–5. (You can find this query saved as qxmplTwoProductsWrong in the sample database.)
Figure 8–5: This query is an attempt to find out which contacts have purchased both BO$$ Single User and the BO$$ Multi-User upgrade.
Because there aren’t very many rows in this table, you can scan this 19-row result and see that the correct answer is four contacts-Joseph Matthews, John Smith, Daniel Koczka, and Mark Hanson. But if there were thousands of rows in the database, you wouldn’t be able to easily find the contacts who purchased both products. And if you need to display the output in a report, you really need a single row for each contact that meets your criteria.
One way to solve this sort of problem is to build a query that finds everyone who owns BO$$ Single User and save it. Then build another query that finds everyone who purchased the BO$$ Multi-User upgrade and save that. Finally, build a third query that joins the first two results to get your final answer. Remember that a simple join returns only the rows that match in both tables-or queries. So, someone who appears in both queries clearly owns both products! Here’s how to build the solution.
Build the first query to find customers who own BO$$ Single User.
Start a new query on tblContactProducts and add tblProducts to the query. You should see a join line between tblProducts and tblContactProducts on the ProductID field because there’s a relationship defined.
From tblContactProducts, include the CompanyID and the ContactID fields.
Add ProductName from tblProducts, and enter “BO$$ Single User” on the Criteria line under this field. Save this query and name it qrySingle.
Build the second query to find customers who bought the upgrade.
Start another query on tblContactProducts and add tblProducts to the query.
From tblContactProducts, include the CompanyID and the ContactID fields.
Add ProductName from tblProducts, and enter “Upgrade to BO$$ MultiUser” on the Criteria line under this field. Save this query and name it qryMultiUpgrade.
Build the final solution query.
Start a new query on tblCompanies. Add your new qrySingle and qryMultiUpgrade queries.
The query designer will link tblCompanies to both queries on CompanyID, but you don’t need a link to both. Click on the join line between tblCompanies and qryMultiUpgrade and delete it.
You do need to link qrySingle and qryMultiUpgrade. Drag and drop CompanyID from qrySingle to qryMultiUpgrade. Then, drag and drop ContactID from qrySingle to qryMultiUpgrade. Because you are defining an inner join between the two queries, the query fetches rows only from the two queries where the CompanyID and ContactID match.
Add tblContacts to your query. Because there’s a relationship defined between ContactID in tblContacts and ReferredBy in tblCompanies, the query designer adds this join line. You don’t need this join (the query would return only contacts who have made referrals), so click on the line and delete it. The query designer does correctly create a join line between tblContacts and qrySingle.
On the query design grid, include CompanyName from tblCompanies, and FirstName and LastName from tblContacts. Your result should look like Figure 8–6.
Figure 8–6: Here you are solving the “contacts who own two products” problem the right way by building a query on queries.
Switch to Datasheet view, and, sure enough, the query gives you the right answer as shown in Figure 8–7.
Figure 8–7: You can now correctly see the four contacts who purchased a BO$$ Single User edition and later upgraded.
This works because you’re using the two queries to filter each other via the join. The qrySingle query finds contacts who own BO$$ Single User. The qryMultiUpgrade query finds contacts who bought the BO$$ Multi-User upgrade. The join lines between these two queries ask the final query to return rows only where the CompanyID and ContactID in the first two queries match, so you won’t see rows from qrySingle that don’t have a matching combination of CompanyID and ContactID in qryMultiUpgrade, and vice versa.
For more examples of building queries on queries, see Article 2, “Understanding SQL,” on the companion CD.
Most queries that you create to request information from multiple tables will show results on the basis of matching data in one or more tables. For example, the Query window in Datasheet view shown in Figure 8–4 contains the names of facilities that have reservations in the tblReservations table-and it does not contain the names of facilities that don’t have any reservations booked. As explained earlier, this type of query is called an equi-join query, meaning that you’ll see rows only where there are equal values in both tables. But, what if you want to display facilities that do not have any reservations in the database? Or, how do you find employees who have no reservations? You can get the information you need by creating a query that uses an outer join. An outer join lets you see all rows from one of the tables even if there’s no matching row in the related table. When no matching row exists, Access returns the special Null value in the columns from the related table.
To create an outer join, you must modify the join properties. Let’s see if we can find any employees who don’t have any reservations booked. Start a new query on tblEmployees in the HousingDataCopy database. Add tblReservations to the query. Double-click the join line between the two tables in the upper part of the Query window in Design view to see the Join Properties dialog box, shown in Figure 8–8.
Figure 8–8: The Join Properties dialog box allows you to change the join properties for the query.
The default setting in the Join Properties dialog box is the first option-where the joined fields from both tables are equal. You can see that you have two additional options for this query: to see all employees and any reservations that match, or to see all reservations and any employees that match. If you entered your underlying data correctly, you shouldn’t have reservations for employees who aren’t defined in the database. If you asked Access to enforce referential integrity (discussed in Chapter 4, “Creating Your Database and Tables”) when you defined the relationship between the tblEmployees table and the tblReservations table, Access won’t let you create any reservations for nonexistent employees.
Select the second option in the dialog box. (When the link between two tables involves more than one field in each table, you can click the New button to define the additional links.) Click OK. You should now see an arrow on the join line pointing from the tblEmployees field list to the tblReservations field list, indicating that you have asked for an outer join with all records from tblEmployees regardless of match, as shown in Figure 8–9. For employees who have no reservations, Access returns the special Null value in all the columns for tblReservations. So, you can find the employees that aren’t planning to stay in any facility by including the Is Null test for any of the columns from tblReservations. When you run this query, you should find exactly two employees who have no reservations, as shown in Figure 8–10. The finished query is saved as qxmplEmployeesNoReservations in the HousingDataCopy.accdb database.
Figure 8–9: This query design finds employees who have no reservations.
Figure 8–10: This recordset shows the two employees who have no reservations.
As discussed earlier in this chapter, you know that to solve certain types of problems you must first build one query to define a subset of data from your tables and then use that query as input to another query to get the final answer. For example, suppose you want to find out which employees have no reservations in a certain time period. You might guess that an outer join from the tblEmployees table to the tblReservations table will do the trick. That would work fine if the tblReservations table contained reservations for only the time period in question. Remember, to find employees who haven’t booked a room, you have to look for a special Null value in the columns from tblReservations. But to limit the data in tblReservations to a specific time period-let’s say June and July 2007-you have to be able to test real values. In other words, you have a problem because a column from tblReservations can’t be both Null and have a date value at the same time. (You can find an example of the wrong way to solve this problem saved as qxmplEmpNotBookedJunJulWRONG in the sample database.)
To solve this problem, you must first create a query that contains only the reservations for the months you want. As you’ll see in a bit, you can then use that query with an outer join in another query to find out which employees haven’t booked a room in June and July 2007. Figure 8–11 shows the query you need to start with, using a. tblReservations. This example includes the EmployeeNumber field as well as the FacilityID and RoomNumber fields, so you can use it to search for either employees or facilities or rooms that aren’t booked in the target months.
Figure 8–11: You can add a filter to your query to list reservation data for particular months.
Notice that if you truly want to see all reservations in these two months, you need to specify a criterion on both CheckInDate and CheckOutDate. Anyone who checked in on or before July 31, 2007-provided they didn’t check out before June 1, 2007-is someone who stayed in a room between the dates of interest. You don’t want anyone who checked out on June 1. (who stayed the night of May 31, but didn’t stay over into June), which explains why the second criterion is >#6/1/2007# and not >=#6/1/2007#. This query is saved as qxmplBookingsJunJul in the HousingDataCopy.accdb database.
You might be looking at the problem of finding any reservation that crosses into or is contained within a certain date span and scratching your head. You want any reservation that meets one of these criteria:
The reservation begins before the start of the date span but extends into the date span.
The reservation is contained wholly within the date span.
The reservation begins before the end of the date span but extends beyond the date span.
The reservation starts before the beginning of the date span and ends after the end of the date span.
You can see these four conditions in the following illustration.
You might be tempted to include four separate criteria in your query, but that’s not necessary. As long as a reservation begins before the end of the span and ends after the beginning of the span, you’ve got them all! Try out the criteria shown in Figure 8–11 to see if that simple test doesn’t find all the previous cases.
After you save the first query, click the Query Design button in the Other group on the Create tab to start a new query. In the Show Table dialog box add tblEmployees to the design grid by double-clicking on tblEmployees. Click the Queries tab in the Show Table dialog box and then double-click on qxmplBookingsJunJul to add it to the query grid. Click Close to close the Show Table dialog box. Access should automatically link tblEmployees to the query on matching EmployeeNumber fields. Double-click the join line to open the Join Properties dialog box, and select option 2 to see all rows from tblEmployees and any matching rows from the query. The join line’s arrow should point from tblEmployees to the query, as shown in Figure 8–12.
Figure 8–12: An outer join query searches for employees not booked in June and July 2007.
As you did in the previous outer join example, include some fields from the tblEmployees table and at least one field from the query that contains reservations only from June and July 2007. In the field from the query, add the special Is Null criterion. When you run this query (the results of which are shown in Figure 8–13), you should find six employees who haven’t booked a room in June and July 2007-including the two employees that you found earlier who haven’t booked any room at all. This query is saved as qxmplEmpNotBookedJunJul in the HousingDataCopy.accdb database.
Figure 8–13: These employees have no bookings in June and July 2007.
Let’s study another example. When you’re looking at reservation requests, and each request indicates the particular type of room desired, it might be useful to know either which facilities have this type of room or which facilities do not. (Not all facilities have all the different types of rooms.) You can find a complete list of the room types in the tlkpRoomTypes table in the Housing Reservations application.
To find out which room types aren’t in a facility, you might try an outer join from tlkpRoomTypes to tblFacilityRooms and look for Null again, but all you’d find is that all room types exist somewhere-you wouldn’t know which room type was missing in what facility. In truth, you need to build a query first that limits room types to one facility. Your query should look something like Figure 8–14.
Figure 8–14: This query lists all the rooms and their room types in Facility 1.
Now you can build your outer join query to find out which room types aren’t in the first housing facility. Start a new query by clicking the Query Design button in the Other group on the Create tab. In the Show Table dialog box add tlkpRoomTypes to the design grid by double-clicking tlkpRoomTypes. Click the Queries tab in the Show Table dialog box and then double-click the qxmplRoomsFacilityl query to add it to the query grid. Close the Show Table dialog box. Double-click the join line and ask for all rows from tlkpRoomTypes and any matching rows from the query. Add the RoomType field from tlkpRoomTypes and the FacilityID field from the query to the grid. Under FacilityID, place a criterion of Is Null. Your query should look like Figure 8–15.
If you run this query, you’ll find that Facility 1. has no one-bedroom suites with king bed, no one-bedroom suites with two queen beds, and no two-bedroom suites with a king bed, queen bed, and kitchenette. In the sample database, you’ll find sample queries that return the room types for the other three facilities, so you can build queries like the one in Figure 8–15 to find out what room types are missing in those facilities.
Figure 8–15: This query determines which room types are not in Facility 1.