Performing Joins


Because relational databases are more complexly structured, they sometimes require special query statements to retrieve the information you need most. For example, if you wanted to know what categories the www.php.net URL is filed under, you would need to find its url_id from the urls table, use that to retrieve all the url_category_ids from the url_associations table, and then retrieve the actual category values from the url_categories table for those url_category_ids. As you can see, this requires several steps and queries. By using a joinan SQL query performed by cross-referencing tablesyou can accomplish all of that in one fell swoop.

Several types of joins are conceivable according to SQL. Beginning to intermediate users will find that the two most basic joins, which I'll teach in this chapter, will suffice for almost every application. The most used join is called an inner join.

 SELECT * FROM url_associations,  url_categories WHERE url_associations.  url_category_id = url_categories.  url_category_id 

The benefit of this join is that it will retrieve all of the information from both the url_associations and url_categories tables wherever a url_associations.url_category_id is the same as the url_categories.url_category_id (Figure 5.17). In other words, the query will replace the url_category_id foreign key in the url_associations table with all of the information for that type from the url_categories table. An inner join like this will return records only where a match is made (so if a URL category were not being used by an existing URL, it would not be displayed).

Figure 5.17. This join returns every column from both tables where the url_category_id values are equal.


When selecting from multiple tables and columns, you must use the dot syntax (table.column) if they have columns with the same name. This is normally the case when dealing with relational databases because a primary key from one table will have the same name as a foreign key in another. If you are not explicit when referencing your columns, you'll get an error (Figure 5.18).

Figure 5.18. Generically referring to a column present in multiple tables will cause an ambiguity error.


The second type of join I'll discussan outer or left joindiffers from an inner join in that it could return records not matched by a conditional. An example of a left join is

 SELECT * FROM url_categories LEFT JOIN  url_associations ON url_associations.  url_category_id = url_categories.  url_category_id 

Note that the comma from the inner join is replaced by the words LEFT JOIN and the word WHERE is replaced with ON. The most important consideration with left joins is which table gets named first. In this example, all of the url_categories records will be returned along with all of the url_associations information, if a match is made. If no url_associations match exists for a url_categories row, then NULL values will be returned instead (Figure 5.19).

Figure 5.19. This left join returns more records than the comparative inner join (see Figure 5.17).


If both tables in a left join have the same column name, you can simplify your query with

 SELECT * FROM url_categories LEFT JOIN  url_associations USING  (url_category_id) 

Because of the complicated syntax with joins, the SQL concept of an aliasintroduced in the Chapter 4will come in handy when writing them.

To use joins

1.

Retrieve the URL category for every record in the url_associations table (Figure 5.20).

 SELECT category FROM url_  associations AS ua, url_categories  AS uc WHERE ua.url_category_id=uc.  url_category_id; 

Figure 5.20. A basic inner join that returns only one column of values.


This query, which includes an inner join, will effectively replace the url_category_id value from the url_associations table with the corresponding category value from the url_categories table for each of the records in the url_associations table. The end result is that it displays the textual version of the category for each stored URL.

2.

Retrieve the URL and the URL category for every record in the url_associations table, sorted by category (Figure 5.21).

 SELECT category, url FROM urls AS u,  url_categories AS uc,  url_associations AS ua WHERE ua.url_category_id=uc.  url_category_id AND u.url_id=ua.url_id ORDER BY uc.category ASC; 

Figure 5.21. A slightly more complicated version of an inner join, using all three tables in the content database.


This join has two differences from its predecessor. First, I also want to return the actual URL as well, which means that I need to join in the urls table. Second, I've added an ORDER BY to sort the results.

Take note of how a three-table inner join is written (with an AND conditional) and how the aliases are used for shorthand when referring to the three tables and their columns. In simplest terms, what is returned by this query is every record from the urls table, with the textual version of the category replacing the url_category_id value and the textual version of the URL replacing the url_id value.

3.

Retrieve the URL and the URL type for every URL type (Figure 5.22).

 SELECT category, url FROM  url_categories LEFT JOIN  url_associations USING  (url_category_id) LEFT JOIN urls  USING (url_id) ORDER BY  url_categories.category; 

Figure 5.22. This left join returns more records than the comparative inner join (see Figure 5.21).


This is a left join variant on the inner join from Step 2. Whereas that join returned only the URL and category for records in the urls table, this one returns all of the URLs and categories based upon the url_categories table. The end result is that one more record is returned here (General Database), and the URL, which is not matched through an inner join (no URL has been assigned this type), is given a NULL value.

Tips

  • You can even join a table with itself (a self-join)!

  • Joins can be created using conditionals involving any columns, not just the primary and foreign keys, as I have here.

  • You can perform joins across multiple databases using the database.table.column syntax, as long as every database is on the same server (you cannot do this across a network).

  • Joins that do not include a WHERE clause (e.g., SELECT * FROM urls, url_associations) are called full joins and will return every record from both tables. This construct can have unwieldy results with larger tables.

  • A NULL value in a column referenced in a join will never be returned, because a NULL matches no other value, including NULL.

  • Because the syntax of joins is more complicated than standard SELECT queries, successfully writing them takes some practice and experimentation. If your PHP script will be using a join, make sure it works in the mysql client (or phpMyAdmin) first.




    PHP and MySQL for Dynamic Web Sites. Visual QuickPro Guide
    PHP and MySQL for Dynamic Web Sites: Visual QuickPro Guide (2nd Edition)
    ISBN: 0321336577
    EAN: 2147483647
    Year: 2005
    Pages: 166
    Authors: Larry Ullman

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net