Lesson 2: Advanced Joins

After this lesson, you will be able to:

  • Combine data from more than two tables using joins
  • Join a table to itself
  • Combine multiple result sets into one result set by using the UNION operator
  • Create tables by using the SELECT INTO statement

Estimated lesson time: 90 minutes

Joining More Than Two Tables

It is possible to join up to 256 tables in a single query. A table can be joined to many other tables within the same query. The joins can use either the same common column or different common columns.

When you use multiple joins, it is easier to consider each join independently if, for example, Tables A, B, and C are joined. The first join combines Tables A and B to produce a result set; this result set is combined with Table C in the second join to produce the final result set. Columns from any of the tables can be specified in the other clauses of the SELECT statement. If two tables have a column with the same name, qualify each column name using the table_name.column_name format.

NOTE
The query optimizer may produce a plan that processes the tables in a different order, but the result set is the same.

Figure D.4 shows a query joining three tables. The first join is between the buyers table and the sales table, and the join condition uses the common buyer_id column. The second join is between the sales table and the produce table, and the join condition uses the common prod_id column.

click to view at full size

Figure D.4 Joining more than two tables

Example 1

This example returns the buyer_name, prod_name, and qty columns from the buyers, sales, and produce tables. The buyer_id column is common to the buyers and sales tables and is used to join these two tables. The prod_id column is common to the sales and produce tables and is used to join the produce table to the result of the join between the buyers and sales tables.

USE joindb SELECT buyer_name, prod_name, qty FROM buyers JOIN sales ON buyers.buyer_id = sales.buyer_id JOIN produce ON sales.prod_id = produce.prod_id

Result

buyer_name                prod_name                 qty ------------------------- ------------------------- ----------- Erin O'Melia            Apples                   37 Adam Barr              Pears                 15 Erin O'Melia            Pears                  1003 Adam Barr              Oranges                   5 Eva Corets              Peaches                   11  (5 row(s) affected)

Example 2

This example displays information from the Orders and Products tables using the Order Details table as a link. For example, if you want a list of products that are ordered each day, you need information from the Orders and Products tables. An order can consist of many products, and a product can be ordered on many orders. Therefore, the Products and Orders tables have a many-to-many relationship.

To retrieve information from both the Orders and Products tables, you need to use an intermediate table and two joins. Although you do not need to retrieve any columns from the Order Details table, you must include this table in the query to relate the Orders table to the Products table. In this example, the OrderID column is common to the Orders and Order Details tables, and the ProductID column is common to the Order Details and Products tables.

USE Northwind SELECT OrderDate, ProductName FROM Orders O INNER JOIN [Order Details] OD ON O.OrderID = OD.OrderID JOIN Products P  ON OD.ProductID = P.ProductID WHERE OrderDate = '6/16/97'

Result

OrderDate                   ProductName --------------------------- ---------------------------------------- 1997-06-16 00:00:00.000     Gorgonzola Telino 1997-06-16 00:00:00.000     Lakkalikööri  (2 row(s) affected)

Exercise 1: Joining Several Tables and Ordering the Results

In this exercise, you will write and execute a query on the title, item, and copy tables that returns the isbn, copy_no, on_loan, title, translation, and cover values for rows in the copy table with an ISBN of 1, 500 (five hundred), or 1000 (one thousand). Order the results by the ISBN column. The C:\SQLDW\Exercise\AppD\join.SQL file contains a completed script for this exercise.

  • To write a SELECT statement that uses multiple joins and the ORDER BY clause
    1. Write the select list of the query. Qualify the name of each column with a table alias of at least two characters (for example, ti.title_no for title.title_no).
    2. Write a FROM clause that creates a join between the title and copy tables on the title_no columns. Set up the table aliases in the FROM clause that you used in the select list.
    3. Add a second JOIN clause to create a join between the item and copy tables on the ISBN columns.
    4. Compose a WHERE clause to restrict the rows that are retrieved from the copy table to those with an ISBN of 1 (one), 500 (five hundred), or 1000 (one thousand).
    5. Write the ORDER BY clause to sort the result by ISBN.
    6. Execute the script.
    7. USE library SELECT   co.isbn,   co.copy_no,   co.on_loan,   ti.title,   it.[translation],   it.cover FROM copy co JOIN title ti ON co.title_no = ti.title_no JOIN item it ON co.isbn = it.isbn WHERE (co.isbn IN (1, 500, 1000)) ORDER BY co.isbn

    Performance

    When joining more than two tables, especially if the tables are large, you should analyze the query using the SET SHOWPLAN_TEXT, SET SHOWPLAN_ALL, and SET STATISTICS IO options and the graphical execution plan tools. Small changes to the query can have a significant effect on the number of logical reads that are needed to process the query.

    If you refer to columns that are common to two or more tables in the WHERE clause of these queries, the table prefix you choose will affect the performance of the query.

    Example

    In this example, a list of titles and the names of the members who have borrowed each title is retrieved by joining the title, loanhist, and member tables in the library database. A search condition on the member number is specified in the WHERE clause. The result set is the same whether the member_no column of the member table or the member_no column of the loanhist table is specified in the WHERE clause. However, if you turn on Show stats I/O and Show stats time in the Current Connection Options menu in SQL Server Query Analyzer, you will see that the query that specifies the member_no column of the member table executes about three times faster.

    The following is a query specifying the member_no column of the member table in the WHERE clause:

    USE library SELECT DISTINCT title, firstname + ' ' + lastname FROM title t JOIN loanhist l ON t.title_no = l.title_no JOIN member m ON m.member_no = l.member_no WHERE m.member_no < 100

    Executing this query yielded the following Show stats I/O and stats time results:

    Table 'title'. Scan count 624, logical reads 1248... Table 'loanhist'. Scan count 99, logical reads 2071... Table 'member'. Scan count 1, logical reads 101... SQL Server Execution Times:    CPU time = 160 ms,  elapsed time = 157 ms.

    The following is a query specifying the member_no column of the loanhist table in the WHERE clause:

    USE library SELECT DISTINCT title, firstname + ' ' + lastname FROM title t JOIN loanhist l ON t.title_no = l.title_no JOIN member m ON m.member_no = l.member_no WHERE l.member_no < 100

    Executing this query yielded the following Show stats I/O and stats time results:

    Table 'title'. Scan count 624, logical reads 1248... Table 'member'. Scan count 624, logical reads 1872... Table 'loanhist'. Scan count 1, logical reads 511... SQL Server Execution Times:    CPU time = 561 ms,  elapsed time = 565 ms.

    Joining a Table to Itself

    If you want to find rows that have values in common with other rows in the same table, you can join a table to another instance of itself. This process is known as a self-join.

    When to Use Self-Joins

    Self-joins are used to represent hierarchies or tree structures. For example, a company employment structure is typically hierarchical. Each employee has a manager who is also an employee. In an employee table, the primary key is the employee ID, and the manager ID column is the foreign key that relates the employee table to itself.

    Self-joins are also useful for finding matching data in a table. For example, in the pubs database, you can create a query on the authors table that lists for each author the other authors who live in the same city.

    When you use self-joins, consider the following guidelines:

    • Table aliases are required to reference two copies of the table because they must be referenced uniquely. Remember that table aliases are different from column aliases. A table alias is specified in the FROM clause after the table name.
    • It may be necessary to use conditions in the WHERE clause to filter out duplicate matching rows if the same column is used more than once in the select list.

    The next three examples develop the same query to show how conditions in the WHERE clause are used to create a meaningful self-join query.

    Example 1

    This example displays a list of buyers who purchased the same products. The first, third, fourth, fifth, and seventh rows of the result set are simply rows that match themselves. The sixth row mirrors the second row. These rows should be eliminated from the result set. Figure D.5 shows how the rows of the buyer table are self-joined to produce the result set that is developed in Examples 1, 2, and 3.

    USE joindb SELECT a.buyer_id AS buyer1, a.prod_id, b.buyer_id AS buyer2 FROM sales a JOIN sales b ON a.prod_id = b.prod_id

    click to view at full size

    Figure D.5 Joining a table to itself

    Result

    buyer1      prod_id     buyer2 ----------- ----------- ----------- 1           2           1 4           2           1 1           3           1 4           1           4 3           5           3 1           2           4 4           2           4  (7 row(s) affected)

    Example 2

    This example displays a list of buyers who all purchased the same products, and it eliminates some of the extra rows that occurred in Example 1. Compare the result sets of Examples 1 and 2. The WHERE clause with the not equal to (<>) operator eliminates rows in the result set that match themselves. However, duplicate rows that are mirror images of one another are still returned in the result set.

    USE joindb SELECT a.buyer_id AS buyer1, a.prod_id, b.buyer_id AS buyer2 FROM sales a JOIN sales b ON a.prod_id = b.prod_id WHERE a.buyer_id <> b. buyer_id

    Result

    buyer1      prod_id     buyer2 ----------- ----------- ----------- 4           2           1 1           2           4  (2 row(s) affected)

    Example 3

    This example displays a list of buyers who all purchased the same products. The WHERE clause with the greater than (>) operator eliminates the mirrored row in addition to the self-matched rows that occurred in Example 1.

    USE joindb SELECT a.buyer_id AS buyer1, a.prod_id, b.buyer_id AS buyer2 FROM sales a JOIN sales b ON a.prod_id = b.prod_id WHERE a.buyer_id > b. buyer_id

    Result

    buyer1      prod_id     buyer2 ----------- ----------- ----------- 4           2           1  (1 row(s) affected)

    Example 4

    This example lists the names of employees and their managers.

    USE Northwind SELECT E.FirstName, E.LastName, M.FirstName, M.LastName FROM Employees E LEFT OUTER JOIN Employees M ON M.EmployeeID = E.ReportsTo

    Result

    FirstName  LastName             FirstName  LastName ---------- -------------------- ---------- -------------------- Nancy      Davolio              Andrew     Fuller Andrew     Fuller               NULL       NULL Janet      Leverling            Andrew     Fuller Margaret   Peacock              Andrew     Fuller Steven     Buchanan             Andrew     Fuller Michael    Suyama               Steven     Buchanan Robert     King                 Steven     Buchanan Laura      Callahan             Andrew     Fuller Anne       Dodsworth            Steven     Buchanan  (9 row(s) affected)

    Combining Multiple Result Sets

    The UNION operator combines the result of two or more SELECT statements into a single result set. Unlike joins, which combine rows from the base tables into single rows, UNION appends rows from the result sets after each other.

    Use the UNION operator when the data that you want to retrieve cannot be accessed with a single query. UNION is often used to recombine tables that have been partitioned. For example, old rows from an orders table can be moved to archive tables that each hold orders from a three-month period. To retrieve a single result set from more than one of these tables, create individual SELECT statements and combine them with the UNION operator.

    When you use the UNION operator, consider the following facts and guidelines:

    • The result sets must have matching columns. The columns needn t have the same names, but there must be the same number of columns in each result set and the columns must have compatible data types and be in the same order.
    • SQL Server removes duplicate rows from the result set. However, if you use the ALL option, all rows (including duplicates) are included in the result set.
    • The column names in the result set are taken from the first SELECT statement. Therefore, if you want to define new column headings for the result set, you must create the column aliases in the first SELECT statement.
    • If you want the entire result set to be returned in a specific order, you must specify a sort order by including an ORDER BY clause after the last SELECT statement. Otherwise, the result set may not be returned in the order that you want. SQL Server generates an error if you specify an ORDER BY clause for any of the other SELECT statements in the query.
    • An ORDER BY clause used with the UNION operator can only use items that appear in the select list of the first SELECT statement.
    • You may experience better performance if you break a complex query into multiple SELECT statements and then use the UNION operator to combine them.

    Syntax

    select_statement UNION [ALL] select_statement [,...n]

    Example 1

    This example combines two result sets. The first result set returns the name, city, and postal code of each employee from the Employees table. The second result set returns the name, city, and postal code of each customer from the Customers table. Notice that the customers were listed before the employees, although the Employees table was referenced first in the statement. This is because ordering is not guaranteed unless you specify an ORDER BY clause.

    USE Northwind SELECT FirstName + ' ' + LastName AS [Name], City, PostalCode FROM Employees UNION SELECT CompanyName, City, PostalCode FROM Customers

    Result

    Name                                     City            PostalCode ---------------------------------------- --------------- ---------- Alfreds Futterkiste                      Berlin          12209 Ana Trujillo Emparedados y helados       México D.F.     05021 Antonio Moreno Taquería             México D.F.     05023 Michael Suyama                           London          EC2 7JR Anne Dodsworth                           London          WG2 7LT  (100 row(s) affected)

    Example 2

    This example adds an ORDER BY clause to the statement from Example 1.

    USE Northwind SELECT FirstName + ' ' + LastName AS [Name], City, PostalCode FROM Employees UNION SELECT CompanyName, City, PostalCode FROM Customers ORDER BY [Name]

    Result

    Name                                     City           PostalCode ---------------------------------------- --------------- ---------- Alfreds Futterkiste                      Berlin          12209 Ana Trujillo Emparedados y helados       México D.F.     05021 Andrew Fuller                            Tacoma          98401 Wilman Kala                              Helsinki        21240 Wolski  Zajazd                           Warszawa        01-012  (100 row(s) affected)

    Exercise 2: Producing a Single Result Set from Two SELECT Statements

    In this exercise, you will write a query to retrieve a single list of members, both adult and juvenile, who have reserved books with ISBN numbers 1, 43, or 288. The list must include the isbn, title, member_no, and name (lastname and firstname) of each member who has the reservation. Additionally, the list should indicate whether the member is an adult or a juvenile. The C:\SQLDW\Exercise\AppD\union.SQL file contains a completed script for this exercise.

  • To write a query that combines two SELECT statements by using the UNION operator
    1. Write a SELECT statement that returns information about juvenile members from the reservation, item, title, member, and juvenile tables.
    2. Write a SELECT statement that returns information about adult members from the reservation, item, title, member, and adult tables.
    3. Add WHERE clauses to both statements to restrict the rows that are returned to those with ISBN numbers of 1, 43, and 288.
    4. Combine these statements with the UNION operator, add an ORDER BY clause to sort the results by ISBN, and then execute the query.
    5. USE library SELECT   re.isbn,   ti.title,   me.member_no,   me.lastname + ', ' + substring(firstname,1,1) AS [name],   'juvenile' AS age FROM reservation re JOIN item it ON re.isbn = it.isbn JOIN title ti ON it.title_no = ti.title_no JOIN member me ON re.member_no = me.member_no JOIN juvenile ju ON re.member_no = ju.member_no WHERE re.isbn in (1,43, 288) UNION SELECT   re.isbn,   ti.title,   me.member_no,   me.lastname + ', ' + substring(firstname,1,1) AS [name],   'adult' AS age FROM reservation re JOIN item it ON re.isbn = it.isbn JOIN title ti ON it.title_no = ti.title_no JOIN member me ON re.member_no = me.member_no JOIN adult ad ON re.member_no = ad.member_no WHERE re.isbn in (1,43,288) ORDER BY re.isbn

    Creating a Table from a Result Set

    You can place the result set of any query into a new table with the SELECT INTO statement.

    Use the SELECT INTO statement to create and populate new tables. You can also use the SELECT INTO statement to create temporary tables and break down complex problems that require a data set from various sources. If you first create a temporary table, the queries that you execute on it may be simpler than those you would execute on multiple tables or databases.

    When you use the SELECT INTO statement, consider the following facts and guidelines:

    • SQL Server creates a table and inserts a result set into the table.
    • Ensure that the table name that is specified in the SELECT INTO statement is unique. If a table exists with the same name, the SELECT INTO statement fails.

    • Set to on the select into/bulkcopy database option in order to create a permanent table.
    • You must create column aliases for calculated columns in the select list. You may use aliases to rename other columns; otherwise, the column name will be used.

    • You can create a local or global temporary table.
    • Create a local temporary table by preceding the table name with a number sign (#) or create a global temporary table by preceding the table name with a double number sign (##). To decide which to use, consider the facts listed below:

      • A local temporary table is available only to the connection that created it.
      • A global temporary table is available to all connections.
      • A local temporary table is deleted when the user closes the connection.
      • A global temporary table is deleted when the table is no longer used by any connections.

    Syntax

    The syntax for the SELECT INTO statement is as follows:

    SELECT select_list INTO new_table_name  FROM table_source  [WHERE search_condition]

    Example

    This example creates a local temporary table based on a query made on the Products table.

    USE Northwind SELECT ProductName, UnitPrice AS Price, (UnitPrice * 0.1) AS Tax INTO #pricetable FROM Products

    No results are returned by the SELECT INTO statement. Query the new table to see its data. For example:

    SELECT * FROM #pricetable

    Result

    ProductName                           Price            Tax ------------------------------------- ---------------- ---------------- -- Chai                                  18.0000          1.80000 Chang                                 19.0000          1.90000 Aniseed Syrup                         10.0000          1.00000 Lakkalikööri                        18.0000          1.80000 Original Frankfurter grüne Soße       13.0000          1.30000  (77 row(s) affected)

    Exercise 3: Creating and Populating a Temporary Table by Using the SELECT INTO Statement

    In this exercise, you will create and populate a temporary table named #overdue by using the INTO clause of the SELECT statement.

  • To write a SELECT INTO statement
    1. Write and execute a query that returns the member_no, lastname, out_date, due_date, and title columns from the loan, member, and title tables. Convert the datetime values from the out_date and due_date columns to char (12), format 101. Do not add the INTO clause. The C:\SQLDW\Exercise\AppD\into.SQL file contains a completed script for this step.
    2. Write a WHERE clause that restricts the results to past-due loans. Use the GETDATE() function (which returns the current date and time) in the search argument to check for past-due loans. Do not add the INTO clause. Execute the query and confirm that it returns approximately 1120 overdue loans. The C:\SQLDW\Exercise\AppD\into_1.SQL file contains a completed script for this step.
    3. USE library SELECT   lo.member_no,   me.lastname,   CONVERT(char(12),lo.out_date,101) AS out_date,   CONVERT(char(12),lo.due_date,101) AS date_due,   ti.title FROM loan lo JOIN member me ON lo.member_no = me.member_no JOIN title ti ON lo.title_no = ti.title_no WHERE (lo.due_date < GETDATE())

    4. Now that you have tested your query, add an INTO clause that creates a temporary table called #overdue. Leave the query window open after you create the temporary table. (If you don t, the connection will be closed and the temporary table will no longer be available.) The C:\SQLDW\Exercise\AppD\into_2.SQL file contains a completed script for this step.
    5. Write and execute a query that returns the member number and a calculated column called pastdue that lists the number of past-due loans of each member from the #overdue table. Use the COUNT(*) aggregate function and group the results by the member_no column. (You will learn more about the COUNT function and the GROUP BY clause in Appendix F.) Use the same query window that you used to create the temporary table. The C:\SQLDW\Exercise\AppD\into_3.SQL file contains a completed script for this step.
    6. USE library SELECT member_no, COUNT(*) AS pastdue FROM #overdue GROUP BY member_no

    Lesson Summary

    You can join up to 256 tables in a single query. When joining more than two tables, each successive join joins another table to the result set produced by the previous join. Self-joins are useful for analyzing the data within a table and for representing hierarchical structures. Result sets from SELECT queries that are column-compatible can be combined into a single result set with the UNION operator. The INTO clause of the SELECT statement allows you to create and populate a new permanent or temporary table with a result set from a query.



    Microsoft Corporation - Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit
    Microsoft SQL Server 7 Data Warehousing Technical Support Training Kit
    ISBN: 0735606706
    EAN: 2147483647
    Year: 1999
    Pages: 114

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net