Joining Tables to Create Multitable Queries

Before you can create joins between tables, you must know which fields are related by common values. As mentioned in Chapter 5, assigning identical names to primary-key and foreign-key fields in different tables that contain related data is a common practice. This approach, used by Microsoft when creating the Northwind sample database, makes determining relationships and creating joins among tables easier. The CustomerID field in the Customers table and the CustomerID field in the Orders table, for example, are used to join orders with customers. A join between tables requires that one field in each table have a common set of values CustomerID codes for this example.

Figure 11.1 shows the structure of the Northwind.mdb database with a graphical display of the relationships between the tables. Access indicates relationships with lines between field names of different tables. Bold type indicates primary-key fields. Each relationship usually involves at least one primary-key field. Relationships define potential joins between tables, but it's not necessary to have a predefined relationship to create a join.

Figure 11.1. The Relationships window displays the relationships between primary keys and foreign keys in the Northwind database with the HRActions table added. Tables with composite primary keys (such as Order Details and HRActions) show each field of the primary key in bold type.

graphics/11fig01.jpg

graphics/relationships.gif You can display the structure of the joins between the tables in Access 2003's Northwind database by giving the Database window the focus (press F11) and then clicking the Relationship button on the toolbar or by choosing Tools, Relationships. The 1 above the line that shows the join between two tables in Figure 11.1 indicates the "one" side of a one-to-many relationship; the infinity symbol ([is]) indicates the "many" side.

graphics/show_direct_relationship.gif graphics/autoformat.gif You can choose between displaying only the direct relationships for a single table (the Show Direct Relationships button on the toolbar) or all relationships for all tables in a database (the Show All Relationships button). All tables of Northwind.mdb appear by default when you open the Relationships window of the Northwind sample database. In this case, clicking the Show Direct Relationships button has no effect.

Tip

To show relationships for only one table, click the Clear Layout toolbar button, click the Show Table button to display the Show Table dialog, select the table to display in the Tables list, and then click Add and Close. Click the Show Direct Relationships button to display the relationships for the selected table. Clearing the layout of the Relationships windows doesn't affect the underlying relationships between the tables. The Show Direct Relationships feature is useful primarily with databases that contain many related tables. Close the Relationships window and don't save the changes.


Access supports four types of joins in the Query Design window:

  • Inner joins are the most common join for creating select queries. The most common type of an inner join is a natural join (also called an equi-join), which displays all the records in one table that have corresponding records in another table. The correspondence between records is determined by identical values (WHERE field1 = field2 in SQL) in the fields that join the tables. In most cases, joins are based on a unique primary-key field in one table and a foreign-key field in the other table in a one-to-many relationship. If none of the table's records that act as the many side of the relationship has a field value that corresponds to a record in the table of the one side, the corresponding records in the one side don't appear in the query result.

Note

Access automatically creates natural joins between tables if the tables share a common field name that's a primary key of one of the tables.


  • Outer joins display records in one member of the join, regardless of whether corresponding records exist on the other side of the join.

  • Self-joins relate data within a single table. You create a self-join in Access by adding to the query a duplicate of the table (Access provides an alias for the duplicate), and then you create a join to the field(s) of the duplicate table.

  • Theta joins relate data by using comparison operators other than =. Theta joins include not-equal joins (<>) used in queries designed to return records that don't have corresponding values. It's easier to implement theta joins by WHERE criteria rather than by the SQL JOIN reserved word. The Query Design window doesn't indicate theta joins by drawing lines between field names.

graphics/power_tools.gif

The 15.5MB Oakmont.mdb database, in the \Seua11\Oakmont folder of the accompanying CD-ROM, has a circular set of relationships. Open Oakmont.mdb, either from the CD-ROM or from a copy on your fixed disk, and then click the Relationships button to open the Relationships window (see Figure 11.2). Courses are one-to-many related to Courses, Departments are one-to-many related to Courses, and Employees are one-to-many related to Sections. You also can see a circular relationship between Courses, Enrollments, Students, Grades, and Courses. Oakmont.mdb is useful when you want to test the performance of queries with a large number of records. The fictitious Oakmont University in Navasota, Texas, has about 30,000 students, 2,320 employees, and offers 1,770 sections of 590 courses in 14 academic departments.

Figure 11.2. The Oakmont.mdb database has a circular set of relationships between the Courses, Enrollments, Students, Grades, and Course tables.

graphics/11fig02.jpg

Creating Conventional Single-Column Inner Joins

Joins based on one column in each table are known as single-column inner equi-joins. The following list details the basic rules for designing a database that lets you use simple single-column inner equi-join for all queries:

  • Each table on the one side of the relationship must have a primary key with a No Duplicates index to maintain referential integrity. Access automatically creates a No Duplicates index on the primary-key field(s) of a table.

  • Many-to-many relationships, such as the relationship of Orders to Products, are implemented by an intermediary table (in this case, Order Details) having a one-to-many relationship (Orders to Order Details) with one table and a many-to-one relationship (Order Details to Products) with another.

  • Duplicated data in tables, where applicable, is extracted to a new table that has a primary-key, no-duplicates, one-to-many relationship with the table from which the duplicate data is extracted. Using a multicolumn primary key to identify extracted data uniquely often is necessary because individual key fields might contain duplicate data. The combination (also known as concatenation) of the values of the key fields, however, must be unique. Access 2003's Table Analyzer Wizard locates and extracts most duplicate data automatically.

For more information on Make-Table queries, see "Creating New Tables with Make-Table Queries," p. 490.


If you're not sure how to create relationships, see "Establishing Relationships Between Tables," p. 189.


All joins in the Northwind database, shown earlier by the lines that connect field names of adjacent tables in Figure 11.1, are single-column inner joins between tables with one-to-many relationships. Figure 11.2 illustrates the two-column relationship between the CourseID and SectionID fields of the Sections and Enrollments tables. Access uses the ANSI SQL-92 reserved words INNER JOIN to identify conventional inner joins, and LEFT JOIN or RIGHT JOIN to specify outer joins.

Among the most common uses for queries based on inner joins is matching customer names and addresses with orders received. You might want to create a simple report, for example, that lists the customer name, order number, order date, and amount. To create a conventional one-to-many, single-column inner join query that relates Northwind's customers to their orders, sorted by company and order date, follow these steps:

  1. With Northwind.mdb open, close all windows except the Database window.

  2. graphics/query.gif Click the Queries shortcut of the Database window and then double-click the Create Query in Design View shortcut. Access displays the Show Table dialog superimposed on an empty Query Design window.

  3. graphics/show_table.gif Select the Customers table from the Show Table list and click the Add button. Alternatively, you can double-click the Customers table name to add the table to the query. Access adds the Field Names list for Customers to the Query Design window.

  4. Double-click the Orders table in the Show Table list and then click the Close button. Access adds to the window the Field Names list for Orders, plus a line that indicates a join between the CustomerID fields of the two tables. Access creates the join automatically because Access found a relationship to the CustomerID field (a foreign key) in the Orders table.

  5. To identify each order with the customer's name, select the CompanyName field of the Customers table and drag the field symbol to the Field row of the Query Design grid's first column.

  6. Select the OrderID field of the Orders table and drag the field symbol to the second column's Field row. Drag the OrderDate field to the third column. Your query design appears as shown in Figure 11.3.

    Figure 11.3. Access automatically creates the inner join on the CustomerID field between the Customers and Orders table.

    graphics/11fig03.jpg

  7. graphics/run_toolbar.gif graphics/datasheet_view.gif Click the Run or Datasheet view button to display the result of the query, the Recordset shown in Figure 11.4. Notice that the field headers of the query result set show the captions for the table fields, which include spaces, rather than the actual field names, which don't have spaces.

    Figure 11.4. The Datasheet view of the query design of Figure 11.3 displays the three fields added to the grid.

    graphics/11fig04.jpg

Specifying a Sort Order and Top Values Limit

Access displays query result sets in the order of the index on the primary-key field, unless you specify sorting on another field or a different sort direction on the primary key field. If more than one column represents a primary-key field, Access sorts simple query result sets in left-to-right key-field column precedence. Because CompanyName is the leftmost primary-key field, the query result set displays all orders for a single company in order-number sequence. You can override the primary-key display order by adding a sort order to the query. For example, if you want to see the most recent orders first, you can specify a descending sort by the order date.

For more information on primary-key indexes, see "Adding Indexes to Tables," p. 194.


You can use the Top Values option to limit the number of rows returned by the query to those that are likely to be of most interest. For this example with a descending sort, only the most recent orders are relevant. Minimizing the number of rows returned by a query is especially important with client/server queries against large tables or when creating networked applications for remote users having slow dial-up connections.

To add this sort sequence and row limit to your query, follow these steps:

  1. graphics/design_view.gif Click the Design View button.

  2. Place the cursor in the Sort row of the Order Date column of the Query Design grid and click the arrow or press Alt+ to open the drop-down list.

  3. Select Descending from the drop-down list to specify a descending sort on date latest orders first (see Figure 11.5).

    Figure 11.5. Add a descending sort on the OrderDate field to display the latest orders first.

    graphics/11fig05.jpg

  4. graphics/all.gif Open the Top Values list of the toolbar and select 5%. Adding a Top Values constraint doesn't affect the Query Design grid.

  5. graphics/run_toolbar.gif Click the Run button or the View button to display the query result set with the new sort order and row limit (see Figure 11.6).

    Figure 11.6. Orders appear in descending date sequence in this Datasheet view. With 5% set in the Top Values list, the query returns only 44 rows.

    graphics/11fig06.jpg

  6. graphics/sql.gif Open the View list button on the toolbar, and choose SQL View to open the SQL window, which displays the Jet SQL statement for the query.

Jet SQL

graphics/sql.gif The Jet SQL statement for the sorted query with the Top Values limit is

 SELECT TOP 5 PERCENT Customers.CompanyName,   Orders.OrderID, Orders.OrderDate FROM Customers   INNER JOIN Orders   ON Customers.CustomerID=Orders.CustomerID ORDER BY Orders.OrderDate DESC; 

The INNER JOIN Orders clause specifies a join with the Customers table, and the ON Customers.CustomerID=Orders.CustomerID qualifier names the joined fields.

A pre-SQL-92 alternative method for creating joins is to use the WHERE clause to specify a join. If you edit the SQL statement as follows, you achieve the same result:

 SELECT TOP 5 PERCENT Customers.CompanyName,   Orders.OrderID, Orders.OrderDate FROM Customers, Orders WHERE Customers.CustomerID=Orders.CustomerID ORDER BY Orders.OrderDate DESC; 

Using WHERE clauses to specify INNER, LEFT, and RIGHT JOINs no longer is common practice, because result sets created by WITH clauses aren't updatable.

Designing Nested Queries

Jet lets you use a saved query (QueryDef objects) in lieu of that query's tables (TableDef objects) in other queries. The only significant difference between these two objects from a query design standpoint is that queries don't have primary keys. Prior to executing the top-level query, Jet executes the QueryDef objects of lower-level (nested) queries, and then creates the join with other tables.

To add a saved query, Northwind.mdb's sample Order Subtotals query for this example, as a nested query in the customer/orders query you created in the preceding section, do this:

  1. graphics/show_table.gif Return to Query Design view and click the Show Table button to open the dialog.

  2. Click the Queries tab of the Show Tables dialog, double-click the Order Subtotals entry in the list, and click Close.

  3. Double-click the Subtotal column of the Order Subtotals query to add it to the grid. Optionally, double-click the Freight field of the Orders table to add a Freight column to the query (see Figure 11.7). The join line represents a one-to-one relationship between the OrderID fields of the Orders table and the Order Subtotals query.

    Figure 11.7. Adding a query instead of a table as a query data source adds a relationship between columns and fields of the same name. In this case, the relationship is one-to-one.

    graphics/11fig07.jpg

  4. graphics/run_toolbar.gif Click the Run button to display the result set (see Figure 11.8).

    Figure 11.8. The query design of Figure 11.7 adds the Order Subtotals' Subtotal column and the optional Freight field of the Orders table.

    graphics/11fig08.jpg

  5. Choose Tools, Relationships, to open the Relationships window, click the Show Tables button, click the Queries tab, double-click the Orders Subtotals item in the list, and click Close. Unlike the Query Design process, Access doesn't automatically display the relationship between queries and tables.

  6. Drag the OrderID field from the Orders table and drop it on the OrderID column of the Orders Subtotals column to display the Edit Relationships dialog (see Figure 11.9). You can't enforce referential integrity between tables and queries. Click OK to close the dialog.

    Figure 11.9. Creating a join between a query and a table disables the referential integrity options of the Edit Relationships dialog.

    graphics/11fig09.jpg

  7. Close the Relationships window, and save the layout changes. Then close your query, and save it with a descriptive name, such as qryOrderAmountsRecentTop5%.

Jet SQL

graphics/sql.gif The Jet SQL statement for the nested query is

 SELECT TOP 5 PERCENT Customers.CompanyName, Orders.OrderID,   Orders.OrderDate, [Order Subtotals].Subtotal, Orders.Freight FROM Customers   INNER JOIN (Orders     INNER JOIN [Order Subtotals]     ON Orders.OrderID = [Order Subtotals].OrderID)   ON Customers.CustomerID = Orders.CustomerID ORDER BY Orders.OrderDate DESC; 

Square bracket pairs ([])surround table or query names having spaces or SQL-illegal punctuation. Indenting the INNER JOIN statements at the same level as the ON prepositions makes the syntax easier to understand.

Creating Queries from Tables with Indirect Relationships

You can create queries that return indirectly related records, such as the categories of products purchased by each customer. You must include in the queries each table that serves as a link in the chain of joins. If you're designing queries to display the categories of products purchased by each customer, for example, include each of the tables that link the chain of joins between the Customers and Categories tables. This chain includes the Customers, Orders, Order Details, Products, and Categories tables. You often need indirect relationships for data analysis queries.

Tip

Queries with indirect relationships are especially useful to create PivotTable and PivotChart views of data. Several of the next chapter's PivotTable and PivotChart examples use this and related queries as data sources.


To create a query that you can use to analyze customers purchases by category, which requires specifying fields of indirectly related records, follow these steps:

  1. graphics/query_design_window.gif In the Queries list of the Database window, click the Create Query in Design View shortcut.

  2. Add the Customers, Orders, Order Details, Products, and Categories tables to the query, in sequence; then click the Close button of the Add Table dialog. Access automatically creates a chain of joins between Customers and Categories based on relationships between the primary-key field of each intervening table and the identically named foreign-key field in the adjacent table.

    Tip

    As you add tables to the Query Design window, the table field lists might not appear in the upper pane. Use the upper pane's vertical scroll bar to display the "hidden" tables. You can drag the table field lists to the top of the upper pane and then rearrange the field lists to match the appearance of the upper pane of Figure 11.10.

    Figure 11.10. The query design shown here calculates the net purchases of each product by every customer.

    graphics/11fig10.gif

  3. Double-click the CompanyName and CategoryName fields from the Customers and Categories tables, to add them to the first two columns of the grid.

  4. In the Field row of the third column, type Amount: CCur([Order Details].[UnitPrice]*[Quantity]*(1-[Discount])) to calculate the net amount of the purchase of each line item in the Orders Details table (see Figure 11.10).

  5. graphics/run_toolbar.gif Click the Run button to test the query at this intermediate point of the design (see Figure 11.11). The query returns 2,155 rows, which is the number of records in the Order Details table.

    Figure 11.11. The Datasheet view of the query design of Figure 11.10 has too much detail to be usable for sales analysis of product categories.

    graphics/11fig11.gif

  6. graphics/design_view.gif graphics/auto_sum.gif Return to Design view and click the Totals button to group the data by CategoryName and CustomerName, and generate total sales by category for each customer. Apply an ascending sort to the CategoryName column, and select Sum from the drop-down list in the Group By row of the Amount column (see Figure 11.12).

    Figure 11.12. To reduce the amount of detail, group the records by the CustomerName and CategoryName fields, and calculate the sum of the Amount column.

    graphics/11fig12.jpg

  7. graphics/run_toolbar.gif Run the query to display the summary (aggregated) result set, which now contains 598 records (see Figure 11.13). 598 records are too many for most people to analyze by inspection.

    Figure 11.13. This summary query result set totals product sales by category and customer to reduce the number of rows from 2,155 to 598.

    graphics/11fig13.gif

  8. graphics/design_view.gif graphics/run_toolbar.gif Return to Query Design view, open the Field list of the first column, substitute Country for CustomerName to reduce the number of records to 165, and run the query (see Figure 11.14).

    Figure 11.14. Aggregating sales by country and category displays 165 records. If customers in all 21 countries had made purchases in all eight categories, the result set would have 168 records.

    graphics/11fig14.gif

  9. Close the Query and save it as qryOrdersByCountryAndCategory.

Jet SQL

graphics/sql.gif The Jet SQL statement for the aggregate query is

 SELECT Customers.Country, Categories.CategoryName,    Sum(CCur([Order Details].[UnitPrice]*    [Quantity]*(1-[Discount]))) AS Amount FROM (Categories   INNER JOIN Products    ON Categories.CategoryID = Products.CategoryID)  INNER JOIN ((Customers INNER JOIN Orders   ON Customers.CustomerID = Orders.CustomerID)  INNER JOIN [Order Details]   ON Orders.OrderID = [Order Details].OrderID)   ON Products.ProductID = [Order Details].ProductID GROUP BY Customers.Country, Categories.CategoryName ORDER BY Categories.CategoryName; 

If you write SQL statements for queries with several joins instead of using Access's graphical query design window, it's easier to use a pre-SQL-92 WHERE clause to define the joins, as in:

 SELECT Customers.Country, Categories.CategoryName,    Sum(CCur([Order Details].UnitPrice*    [Quantity]*(1-[Discount]))) AS Amount FROM Customers, Orders, [Order Details], Products, Categories WHERE Categories.CategoryID=Products.CategoryID   AND Customers.CustomerID=Orders.CustomerID   AND Orders.OrderID=[Order Details].OrderID   AND Products.ProductID=[Order Details].ProductID GROUP BY Customers.Country, Categories.CategoryName ORDER BY Categories.CategoryName; 

The two preceding SQL statements produce the same result set, but using the WHERE clause causes the join lines to disappear from the Query Design pane. Notice that the WHERE clause elements are identical to the ON elements. Updatability isn't a factor in this case, because aggregate queries aren't updatable.

Queries that use SQL aggregate functions are the foundation of Jet crosstab queries. Access data projects (ADP) don't support crosstab queries, because T-SQL lacks the Jet SQL reserved words needed to create crosstabs directly. Instead, ADP use PivotTables to display aggregate query result sets in crosstab format.

For more information on summary queries, see "Using the SQL Aggregate Functions," p. 435.


To learn more about crosstab queries, see "Creating Crosstab Queries," p. 442.


Tip

Access's graphical Query Design features are much more comprehensive than those included with Windows programming languages, such as Visual Basic 6.0 or Visual Basic .NET. If you're a Visual Basic programmer (or plan to learn Visual Basic to create database front ends for Jet databases), use Access to write your programs' Jet SQL statements. SQL Server is the production back end preferred by most Visual Basic programmers, but Jet remains an effective database engine for storing and manipulating local data on Windows clients.


Creating Multicolumn Inner Joins and Selecting Unique Values

You can't have more than one join between a pair of tables, but you can have joins on multiple fields. You might, for example, want to create a query that returns the names of customers who have the same billing and shipping addresses. The billing address is the Address field of the Customers table, and the shipping address is the ShipAddress field of the Orders table. Therefore, you need to match the CustomerID fields in the two tables and Customers.Address with Orders.ShipAddress. This task requires a multicolumn inner join.

To create this example of an address-matching, multicolumn inner join, follow these steps:

  1. graphics/query_design_window.gif Open a new query in Design view.

  2. Add the Customers and Orders tables to the query and close the Add Tables dialog. Access creates the join on the CustomerID fields.

  3. Click and drag the Address field of the Customers table's Field List box to the ShipAddress field of the Orders table's Field List box. This creates another join criterion, indicated by the new line between Address and ShipAddress (see the top pane of Figure 11.15). The new join line between Address and ShipAddress has dots at both ends, indicating that the join is between a pair of fields that doesn't have a specified relationship, the same field name, or a primary-key index.

    Figure 11.15. This query has an inner join on two fields. You must manually add join criteria between fields with dissimilar names.

    graphics/11fig15.gif

  4. Drag the Customers table's CompanyName and Address fields to the Field row of the first and second query columns and then drop the fields. Drag the Orders table's ShipAddress field to the query's third column and drop the field in the Field row (refer to the lower pane of Figure 11.15).

  5. graphics/run_toolbar.gif Click the Run button. Figure 11.16 shows the query's result set.

    Figure 11.16. The result set displays records for all orders in which billing and shipping addresses are the same.

    graphics/11fig16.jpg

  6. graphics/design_view.gif To eliminate the duplicate rows, you must use the Unique Values option of the Query Properties window. To display the Query Properties window, click the Design View button, and then double-click an empty area in the Query Design window's upper pane. (If the title bar of the Properties window displays Field Properties or Field List, click an empty area in the Query Design window's upper pane so that the title bar displays Query Properties.) Alternatively, right-click an empty region of the upper pane and select Properties from the pop-up menu.

  7. By default, both the Unique Records query property and the Unique Values property are set to No. Open the Unique Values list and select Yes (see Figure 11.17). Setting the Unique Values property to Yes adds the ANSI SQL reserved word DISTINCT to the query. Close the Query Properties window.

    Figure 11.17. Setting the Unique Values property to Yes adds the DISTINCT qualifier to the query to display only rows that have different contents.

    graphics/11fig17.gif

    Tip

    Alternatively, you can change the property settings for the Unique Records and Unique Values properties by double-clicking their text boxes in the Properties window. All properties with Yes/No values let you toggle their value by double-clicking.

  8. graphics/run_toolbar.gif Click the Run button. The result set no longer contains duplicate rows, as shown in Figure 11.18.

    Figure 11.18. This result set demonstrates the effect of adding the DISTINCT qualifier to a query.

    graphics/11fig18.jpg

  9. Click the Close Window button to close the query, and save it as qryShipBillAddresses for use later in the chapter.

Because most of the orders have the same billing and shipping addresses, a more useful query is to find the orders for which the customer's billing and shipping addresses differ. You can create a not-equal join for this purpose by changing the (Customers.Address = Orders.ShipAddress) criterion to (Customers.Address <> Orders.ShipAddress). If you make this change, Access displays an error message in Query Design view.

graphics/troubleshooting.gif

If you encounter the Enter Parameter dialog when attempting to execute the preceding query, see the "Missing Objects in Queries" member of the "Troubleshooting" section near the end of this chapter.




Special Edition Using Microsoft Office Access 2003
Special Edition Using Microsoft Office Access 2003
ISBN: 0789729520
EAN: 2147483647
Year: 2005
Pages: 417

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net