Multi-Table Joins

   

So far, all the queries that you've seen involve a single table. Most databases contain multiple tables and there are relationships between these tables. This sample database has an example:

 movies=# \d rentals             Table "rentals"   Attribute       Type      Modifier -------------+--------------+----------  tape_id      character(8)  not null  rental_date  date          not null  customer_id  integer       not null 

Here's a description of the rentals table from earlier in this chapter:

"When a customer comes in to rent a tape, we will add a row to the rentals table to record the transaction. There are three pieces of information that we need to record for each rental: the tape_id , the customer_id , and the date that the rental occurred. Notice that each row in the rentals table refers to a customer ( customer_id ) and a tape ( tape_id )."

You can see that each row in the rentals table refers to a tape ( tape_id ) and to a customer ( customer_id ). If you SELECT from the rentals table, you can see the tape ID and customer ID, but you can't see the movie title or customer name . What you need here is a join . When you need to retrieve data from multiple tables, you join those tables.

PostgreSQL (and all relational databases) supports a number of join types. The most basic join type is a cross-join (or Cartesian product). In a cross join, PostgreSQL joins each row in the first table to each row in the second table to produce a result table. If you are joining against a third table, PostgreSQL joins each row in the intermediate result with each row in the third table.

Let's look at an example. We'll cross-join the rentals and customers tables. First, I'll show you each table:

 movies=# SELECT * FROM rentals;  tape_id   rental_date  customer_id ----------+-------------+-------------  AB-12345  2001-11-25             1  AB-67472  2001-11-25             3  OW-41221  2001-11-25             1  MC-68873  2001-11-20             3 (4 rows) movies=# SELECT * FROM customers;  id     customer_name       phone    birth_date  balance ----+----------------------+----------+------------+---------   1  Jones, Henry          555-1212  1970-10-10     0.00   2  Rubin, William        555-2211  1972-07-10    15.00   3  Panky, Henry          555-1221  1968-01-21     0.00   4  Wonderland, Alice N.  555-1122  1969-03-05     3.00   5  Funkmaster, Freddy    555-FUNK                7  Gull, Jonathon LC     555-1111  1984-02-05    8  Grumby, Jonas         555-2222  1984-02-21  (7 rows) 

Now I'll join these tables. To perform a cross-join, we simply list each table in the FROM clause:

 movies=# SELECT rentals.*, customers.id, customers.customer_name movies-#   FROM rentals, customers;  tape_id   rental_date  customer_id  id     customer_name ----------+-------------+-------------+----+----------------------  AB-12345  2001-11-25             1   1  Jones, Henry  AB-12345  2001-11-25             1   2  Rubin, William  AB-12345  2001-11-25             1   3  Panky, Henry  AB-12345  2001-11-25             1   4  Wonderland, Alice N.  AB-12345  2001-11-25             1   5  Funkmaster, Freddy  AB-12345  2001-11-25             1   7  Gull, Jonathon LC  AB-12345  2001-11-25             1   8  Grumby, Jonas  AB-67472  2001-11-25             3   1  Jones, Henry  AB-67472  2001-11-25             3   2  Rubin, William  AB-67472  2001-11-25             3   3  Panky, Henry  AB-67472  2001-11-25             3   4  Wonderland, Alice N.  AB-67472  2001-11-25             3   5  Funkmaster, Freddy  AB-67472  2001-11-25             3   7  Gull, Jonathon LC  AB-67472  2001-11-25             3   8  Grumby, Jonas  OW-41221  2001-11-25             1   1  Jones, Henry  OW-41221  2001-11-25             1   2  Rubin, William  OW-41221  2001-11-25             1   3  Panky, Henry  OW-41221  2001-11-25             1   4  Wonderland, Alice N.  OW-41221  2001-11-25             1   5  Funkmaster, Freddy  OW-41221  2001-11-25             1   7  Gull, Jonathon LC  OW-41221  2001-11-25             1   8  Grumby, Jonas  MC-68873  2001-11-20             3   1  Jones, Henry  MC-68873  2001-11-20             3   2  Rubin, William  MC-68873  2001-11-20             3   3  Panky, Henry  MC-68873  2001-11-20             3   4  Wonderland, Alice N.  MC-68873  2001-11-20             3   5  Funkmaster, Freddy  MC-68873  2001-11-20             3   7  Gull, Jonathon LC  MC-68873  2001-11-20             3   8  Grumby, Jonas (28 rows) 

You can see that PostgreSQL has joined each row in the rentals table to each row in the customers table. The rentals table contains four rows; the customers table contains seven rows. The result set contains 4 x 7 or 28 rows.

Cross-joins are rarely useful ”they usually don't represent real-world relationships.

The second type of join, the inner-join , is very useful. An inner-join starts with a cross-join, and then throws out the rows that you don't want. Take a close look at the results of the previous query. Here are the first seven rows again:

 tape_id   rental_date  customer_id  id     customer_name ----------+-------------+-------------+----+----------------------  AB-12345  2001-11-25             1   1  Jones, Henry  AB-12345  2001-11-25             1   2  Rubin, William  AB-12345  2001-11-25             1   3  Panky, Henry  AB-12345  2001-11-25             1   4  Wonderland, Alice N.  AB-12345  2001-11-25             1   5  Funkmaster, Freddy  AB-12345  2001-11-25             1   7  Gull, Jonathon LC  AB-12345  2001-11-25             1   8  Grumby, Jonas     .            .             .         .       .     .            .             .         .       .     .            .             .         .       . 

These seven rows were produced by joining the first row in the rentals table:

 tape_id   rental_date  customer_id ----------+-------------+-------------  AB-12345  2001-11-25             1 

with each row in the customers table. What is the real-world relationship between a rentals row and a customers row? Each row in the rentals table contains a customer ID. Each row in the customers table is uniquely identified by a customer ID. So, given a rentals row, we can find the corresponding customers row by searching for a customer where the customer ID is equal to rentals.customer_id . Looking back at the previous query, you can see that the meaningful rows are those WHERE customers.id = rentals.customer_id .

Qualifying Column Names

Notice that this WHERE clause mentions two columns with similar names ( customer_id and id ). You may find it helpful to qualify each column name by prefixing it with the name of the corresponding table, followed by a period. So, customers.id refers to the id column in the customers table and rentals.customer_id refers to the customer_id column in the rentals table. Adding the table qualifier is required if a command involves two columns with identical names, but is useful in other cases.

Now you can construct a query that will show us all of the rentals and the names of the corresponding customers:

 movies=# SELECT rentals.*, customers.id, customers.customer_name movies-#   FROM rentals, customers movies-#   WHERE customers.id = rentals.customer_id;  tape_id   rental_date  customer_id  id  customer_name ----------+-------------+-------------+----+---------------  AB-12345  2001-11-25             1   1  Jones, Henry  OW-41221  2001-11-25             1   1  Jones, Henry  AB-67472  2001-11-25             3   3  Panky, Henry  MC-68873  2001-11-20             3   3  Panky, Henry (4 rows) 

To execute this query, PostgreSQL could start by creating the cross-join between all the tables involved, producing an intermediate result table. Next, PostgreSQL could throw out all the rows that fail to satisfy the WHERE clause. In practice, this would be a poor strategy: Cross-joins can get very large quickly. Instead, the PostgreSQL query optimizer analyzes the query and plans an execution strategy to minimize execution time. I'll cover query optimization in Chapter 4, "Query Optimization."

Join Types

We've seen two join types so far: cross-joins and inner-joins. Now we'll look at outer-joins . An outer-join is similar to an inner-join: a relationship between two tables is established by correlating a column from each table.

In an earlier section, you wrote a query that answered the question: "Which customers are currently renting movies?" How would you answer the question: "Who are my customers and which movies are they currently renting ?" You might start by trying the following query:

 movies=# SELECT customers.*, rentals.tape_id movies-#   FROM customers, rentals movies-#   WHERE rentals.customer_id = customers.id;  id  customer_name   phone    birth_date  balance  tape_id ----+---------------+----------+------------+---------+----------   1  Jones, Henry   555-1212  1970-10-10     0.00  AB-12345   1  Jones, Henry   555-1212  1970-10-10     0.00  OW-41221   3  Panky, Henry   555-1221  1968-01-21     0.00  AB-67472   3  Panky, Henry   555-1221  1968-01-21     0.00  MC-68873 (4 rows) 

Well, that didn't work. This query showed you which customers are currently renting movies (and the movies that they are renting). What we really want is a list of all customers and, if a customer is currently renting any movies, all the movies rented. This is an outer-join. An outer-join preserves all the rows in one table (or both tables) regardless of whether a matching row can be found in the second table.

The syntax for an outer-join is a little strange . Here is an example:

 movies=# SELECT customers.customer_name, rentals.tape_id movies-#   FROM customers LEFT OUTER JOIN rentals movies-#   ON customers.id = rentals.customer_id;     customer_name      tape_id ----------------------+----------  Jones, Henry          AB-12345  Jones, Henry          OW-41221  Rubin, William         Panky, Henry          AB-67472  Panky, Henry          MC-68873  Wonderland, Alice N.   Funkmaster, Freddy     Gull, Jonathon LC      Grumby, Jonas         (9 rows) 

This query is a left outer-join . Why left? Because you will see each row from the left table (the table to the left of the LEFT OUTER JOIN phrase). An inner-join would list only two customers ("Jones, Henry" and "Panky, Henry") ”the other customers have no rentals.

A RIGHT OUTER JOIN preserves each row from the right table. A FULL OUTER JOIN preserves each row from both tables.

The following query shows a list of all customers , all tapes , and any rentals :

 movies=# SELECT customers.customer_name, rentals.tape_id, tapes.title movies-#   FROM customers FULL OUTER JOIN rentals movies-#     ON customers.id = rentals.customer_id movies-#   FULL OUTER JOIN tapes movies-#     ON tapes.tape_id = rentals.tape_id;     customer_name      tape_id          title ----------------------+----------+----------------------  Jones, Henry          AB-12345  The Godfather  Panky, Henry          AB-67472  The Godfather                                  Rear Window                                  American Citizen, An  Panky, Henry          MC-68873  Casablanca  Jones, Henry          OW-41221  Citizen Kane  Rubin, William                   Wonderland, Alice N.             Funkmaster, Freddy               Gull, Jonathon LC                Grumby, Jonas                                                    Sly                                  Stone (13 rows) 
   


PostgreSQL
PostgreSQL (2nd Edition)
ISBN: 0672327562
EAN: 2147483647
Year: 2005
Pages: 220
Authors: Korry Douglas

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net