Finding Rows with No Match in Another Table

12.6.1 Problem

You want to find rows in one table that have no match in another. Or you want to produce a list on the basis of a join between tables, but you want the list to include an entry even when there are no matches in the second table.

12.6.2 Solution

Use a LEFT JOIN. As of MySQL 3.23.25, you can also use a RIGHT JOIN.

12.6.3 Discussion

The preceding sections focused on finding matches between two tables. But the answers to some questions require determining which records do not have a match (or, stated another way, which records have values that are missing from the other table). For example, you might want to know which artists in the artist table you don't yet have any paintings by. The same kind of question occurs in other contexts, such as:

  • You're working in sales. You have a list of potential customers, and another list of people who have placed orders. To focus your efforts on people who are not yet actual customers, you want to find people in the first list that are not in the second.
  • You have one list of baseball players, another list of players who have hit home runs, and you want to know which players in the first list have not hit a home run. The answer is determined by finding those players in the first list who are not in the second.

For these types of questions, you need to use a LEFT JOIN.

To see why, let's determine which artists in the artist table are missing from the painting table. At present, the tables are small, so it's easy to examine them visually and determine that you have no paintings by Monet or Picasso (there are no painting records with an a_id value of 2 or 4):

mysql> SELECT * FROM artist ORDER BY a_id;
+------+----------+
| a_id | name     |
+------+----------+
|    1 | Da Vinci |
|    2 | Monet    |
|    3 | Van Gogh |
|    4 | Picasso  |
|    5 | Renoir   |
+------+----------+
mysql> SELECT * FROM painting ORDER BY a_id, p_id;
+------+------+-------------------+-------+-------+
| a_id | p_id | title             | state | price |
+------+------+-------------------+-------+-------+
|    1 |    1 | The Last Supper   | IN    |    34 |
|    1 |    2 | The Mona Lisa     | MI    |    87 |
|    3 |    3 | Starry Night      | KY    |    48 |
|    3 |    4 | The Potato Eaters | KY    |    67 |
|    3 |    5 | The Rocks         | IA    |    33 |
|    5 |    6 | Les Deux Soeurs   | NE    |    64 |
+------+------+-------------------+-------+-------+

But as you acquire more paintings and the tables get larger, it won't be so easy to eyeball them and answer the question by inspection. Can you answer the question using SQL? Sure, although first attempts at solving the problem generally look something like the following query, using a WHERE clause that looks for mismatches between the two tables:

mysql> SELECT * FROM artist, painting WHERE artist.a_id != painting.a_id;
+------+----------+------+------+-------------------+-------+-------+
| a_id | name     | a_id | p_id | title             | state | price |
+------+----------+------+------+-------------------+-------+-------+
|    2 | Monet    |    1 |    1 | The Last Supper   | IN    |    34 |
|    3 | Van Gogh |    1 |    1 | The Last Supper   | IN    |    34 |
|    4 | Picasso  |    1 |    1 | The Last Supper   | IN    |    34 |
|    5 | Renoir   |    1 |    1 | The Last Supper   | IN    |    34 |
|    2 | Monet    |    1 |    2 | The Mona Lisa     | MI    |    87 |
|    3 | Van Gogh |    1 |    2 | The Mona Lisa     | MI    |    87 |
|    4 | Picasso  |    1 |    2 | The Mona Lisa     | MI    |    87 |
|    5 | Renoir   |    1 |    2 | The Mona Lisa     | MI    |    87 |
|    1 | Da Vinci |    3 |    3 | Starry Night      | KY    |    48 |
|    2 | Monet    |    3 |    3 | Starry Night      | KY    |    48 |
|    4 | Picasso  |    3 |    3 | Starry Night      | KY    |    48 |
|    5 | Renoir   |    3 |    3 | Starry Night      | KY    |    48 |
|    1 | Da Vinci |    3 |    4 | The Potato Eaters | KY    |    67 |
|    2 | Monet    |    3 |    4 | The Potato Eaters | KY    |    67 |
|    4 | Picasso  |    3 |    4 | The Potato Eaters | KY    |    67 |
|    5 | Renoir   |    3 |    4 | The Potato Eaters | KY    |    67 |
|    1 | Da Vinci |    3 |    5 | The Rocks         | IA    |    33 |
|    2 | Monet    |    3 |    5 | The Rocks         | IA    |    33 |
|    4 | Picasso  |    3 |    5 | The Rocks         | IA    |    33 |
|    5 | Renoir   |    3 |    5 | The Rocks         | IA    |    33 |
|    1 | Da Vinci |    5 |    6 | Les Deux Soeurs   | NE    |    64 |
|    2 | Monet    |    5 |    6 | Les Deux Soeurs   | NE    |    64 |
|    3 | Van Gogh |    5 |    6 | Les Deux Soeurs   | NE    |    64 |
|    4 | Picasso  |    5 |    6 | Les Deux Soeurs   | NE    |    64 |
+------+----------+------+------+-------------------+-------+-------+

That's obviously not the correct result! The query produces a list of all combinations of rows from the two tables where the a_id values aren't the same, but what you really want is a list of values in artist that aren't present at all in painting. The trouble here is that a regular join can only produce combinations from values that are present in the tables. It can't tell you anything about values that are missing.
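
To see where those rows come from, it can help to count them. The following is a minimal sketch, not part of the original recipe: it assumes Python's standard sqlite3 module with an in-memory SQLite database standing in for the MySQL server, loaded with the same artist and painting rows shown earlier. The helper name count_pairs is hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (a_id INTEGER, name TEXT);
    CREATE TABLE painting (a_id INTEGER, p_id INTEGER, title TEXT,
                           state TEXT, price INTEGER);
    INSERT INTO artist VALUES
        (1, 'Da Vinci'), (2, 'Monet'), (3, 'Van Gogh'),
        (4, 'Picasso'), (5, 'Renoir');
    INSERT INTO painting VALUES
        (1, 1, 'The Last Supper', 'IN', 34),
        (1, 2, 'The Mona Lisa', 'MI', 87),
        (3, 3, 'Starry Night', 'KY', 48),
        (3, 4, 'The Potato Eaters', 'KY', 67),
        (3, 5, 'The Rocks', 'IA', 33),
        (5, 6, 'Les Deux Soeurs', 'NE', 64);
""")


def count_pairs(where=""):
    """Count the rows produced by a comma join, optionally filtered."""
    sql = "SELECT COUNT(*) FROM artist, painting " + where
    return conn.execute(sql).fetchone()[0]


all_pairs = count_pairs()  # every artist paired with every painting
matching = count_pairs("WHERE artist.a_id = painting.a_id")
mismatched = count_pairs("WHERE artist.a_id != painting.a_id")

# How many different artists show up in the "mismatch" result?
artists_in_mismatches = conn.execute(
    "SELECT COUNT(DISTINCT artist.name) FROM artist, painting "
    "WHERE artist.a_id != painting.a_id"
).fetchone()[0]

print(all_pairs, matching, mismatched, artists_in_mismatches)
```

The mismatch query returns every artist/painting pairing except the matching ones, and every artist appears in it, including those you do have paintings by. That is why it cannot isolate Monet and Picasso.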

When faced with the problem of finding values in one table that have no match in (or that are missing from) another table, you should get in the habit of thinking, "aha, that's a LEFT JOIN problem." A LEFT JOIN is similar to a regular join in that it attempts to match rows in the first (left) table with the rows in the second (right) table. But in addition, if a left table row has no match in the right table, a LEFT JOIN still produces a row: one in which all the columns from the right table are set to NULL. This means you can find values that are missing from the right table by looking for NULL. It's easier to observe how this happens by working in stages. First, run a regular join to find matching rows:

mysql> SELECT * FROM artist, painting
    -> WHERE artist.a_id = painting.a_id;
+------+----------+------+------+-------------------+-------+-------+
| a_id | name     | a_id | p_id | title             | state | price |
+------+----------+------+------+-------------------+-------+-------+
|    1 | Da Vinci |    1 |    1 | The Last Supper   | IN    |    34 |
|    1 | Da Vinci |    1 |    2 | The Mona Lisa     | MI    |    87 |
|    3 | Van Gogh |    3 |    3 | Starry Night      | KY    |    48 |
|    3 | Van Gogh |    3 |    4 | The Potato Eaters | KY    |    67 |
|    3 | Van Gogh |    3 |    5 | The Rocks         | IA    |    33 |
|    5 | Renoir   |    5 |    6 | Les Deux Soeurs   | NE    |    64 |
+------+----------+------+------+-------------------+-------+-------+

In this output, the first a_id column comes from the artist table and the second one comes from the painting table.

Now compare that result with the output you get from a LEFT JOIN. A LEFT JOIN is written in somewhat similar fashion, but you separate the table names by LEFT JOIN rather than by a comma, and specify which columns to compare using an ON clause rather than a WHERE clause:

mysql> SELECT * FROM artist LEFT JOIN painting
    -> ON artist.a_id = painting.a_id;
+------+----------+------+------+-------------------+-------+-------+
| a_id | name     | a_id | p_id | title             | state | price |
+------+----------+------+------+-------------------+-------+-------+
|    1 | Da Vinci |    1 |    1 | The Last Supper   | IN    |    34 |
|    1 | Da Vinci |    1 |    2 | The Mona Lisa     | MI    |    87 |
|    2 | Monet    | NULL | NULL | NULL              | NULL  |  NULL |
|    3 | Van Gogh |    3 |    3 | Starry Night      | KY    |    48 |
|    3 | Van Gogh |    3 |    4 | The Potato Eaters | KY    |    67 |
|    3 | Van Gogh |    3 |    5 | The Rocks         | IA    |    33 |
|    4 | Picasso  | NULL | NULL | NULL              | NULL  |  NULL |
|    5 | Renoir   |    5 |    6 | Les Deux Soeurs   | NE    |    64 |
+------+----------+------+------+-------------------+-------+-------+

The output is similar to that from the regular join, except that the LEFT JOIN also produces an output row for artist rows that have no painting table match. For those output rows, all the columns from painting are set to NULL.

Next, to restrict the output only to the non-matched artist rows, add a WHERE clause that looks for NULL values in the painting column that is named in the ON clause:

mysql> SELECT * FROM artist LEFT JOIN painting
    -> ON artist.a_id = painting.a_id
    -> WHERE painting.a_id IS NULL;
+------+---------+------+------+-------+-------+-------+
| a_id | name    | a_id | p_id | title | state | price |
+------+---------+------+------+-------+-------+-------+
|    2 | Monet   | NULL | NULL | NULL  | NULL  |  NULL |
|    4 | Picasso | NULL | NULL | NULL  | NULL  |  NULL |
+------+---------+------+------+-------+-------+-------+

Finally, to show only the artist table values that are missing from the painting table, shorten the output column list to include only columns from the artist table:

mysql> SELECT artist.* FROM artist LEFT JOIN painting
    -> ON artist.a_id = painting.a_id
    -> WHERE painting.a_id IS NULL;
+------+---------+
| a_id | name    |
+------+---------+
|    2 | Monet   |
|    4 | Picasso |
+------+---------+
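
If you issue the query from a program rather than from the mysql client, the same statement applies unchanged. The following is a minimal sketch under stated assumptions: it uses Python's standard sqlite3 module with an in-memory SQLite database in place of a MySQL connection, and the helper name artists_without_paintings is hypothetical. An ORDER BY clause is added only to make the row order predictable.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (a_id INTEGER, name TEXT);
    CREATE TABLE painting (a_id INTEGER, p_id INTEGER, title TEXT,
                           state TEXT, price INTEGER);
    INSERT INTO artist VALUES
        (1, 'Da Vinci'), (2, 'Monet'), (3, 'Van Gogh'),
        (4, 'Picasso'), (5, 'Renoir');
    INSERT INTO painting VALUES
        (1, 1, 'The Last Supper', 'IN', 34),
        (1, 2, 'The Mona Lisa', 'MI', 87),
        (3, 3, 'Starry Night', 'KY', 48),
        (3, 4, 'The Potato Eaters', 'KY', 67),
        (3, 5, 'The Rocks', 'IA', 33),
        (5, 6, 'Les Deux Soeurs', 'NE', 64);
""")


def artists_without_paintings(connection):
    """Return (a_id, name) for artist rows with no painting row."""
    return connection.execute("""
        SELECT artist.a_id, artist.name
        FROM artist LEFT JOIN painting
            ON artist.a_id = painting.a_id
        WHERE painting.a_id IS NULL   -- keep only the unmatched rows
        ORDER BY artist.a_id
    """).fetchall()


missing = artists_without_paintings(conn)
for a_id, name in missing:
    print(a_id, name)
```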

The preceding LEFT JOIN lists those left-table values that are not present in the right table. A similar kind of operation can be used to report each left-table value along with an indicator of whether or not it's present in the right table. To do this, perform a LEFT JOIN to count the number of times each left-table value occurs in the right table. A count of zero indicates the value is not present. The following query lists each artist from the artist table, and whether or not you have any paintings by the artist:

mysql> SELECT artist.name,
    -> IF(COUNT(painting.a_id)>0,'yes','no') AS 'in collection'
    -> FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    -> GROUP BY artist.name;
+----------+---------------+
| name     | in collection |
+----------+---------------+
| Da Vinci | yes           |
| Monet    | no            |
| Picasso  | no            |
| Renoir   | yes           |
| Van Gogh | yes           |
+----------+---------------+
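
Note that the query counts painting.a_id rather than using COUNT(*). COUNT(column) ignores NULL values, so an unmatched artist gets a count of zero, whereas COUNT(*) would count the NULL-filled row the LEFT JOIN produces. The following minimal sketch illustrates the difference. It assumes Python's standard sqlite3 module with an in-memory SQLite database in place of MySQL, and it substitutes a CASE expression for MySQL's IF() function, which is not available in every SQLite version.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (a_id INTEGER, name TEXT);
    CREATE TABLE painting (a_id INTEGER, p_id INTEGER, title TEXT,
                           state TEXT, price INTEGER);
    INSERT INTO artist VALUES
        (1, 'Da Vinci'), (2, 'Monet'), (3, 'Van Gogh'),
        (4, 'Picasso'), (5, 'Renoir');
    INSERT INTO painting VALUES
        (1, 1, 'The Last Supper', 'IN', 34),
        (1, 2, 'The Mona Lisa', 'MI', 87),
        (3, 3, 'Starry Night', 'KY', 48),
        (3, 4, 'The Potato Eaters', 'KY', 67),
        (3, 5, 'The Rocks', 'IA', 33),
        (5, 6, 'Les Deux Soeurs', 'NE', 64);
""")

# One row per artist: yes/no indicator, COUNT(column), and COUNT(*).
rows = conn.execute("""
    SELECT artist.name,
           CASE WHEN COUNT(painting.a_id) > 0 THEN 'yes' ELSE 'no' END,
           COUNT(painting.a_id),   -- skips the NULLs of unmatched rows
           COUNT(*)                -- counts the NULL-filled row too
    FROM artist LEFT JOIN painting ON artist.a_id = painting.a_id
    GROUP BY artist.name
    ORDER BY artist.name
""").fetchall()

in_collection = {name: flag for name, flag, _, _ in rows}
column_counts = {name: n for name, _, n, _ in rows}
star_counts = {name: n for name, _, _, n in rows}

for row in rows:
    print(row)
```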

As of MySQL 3.23.25, you can also use RIGHT JOIN, which is like LEFT JOIN but reverses the roles of the left and right tables. In other words, RIGHT JOIN forces the matching process to produce a row for each row in the right table, even in the absence of a corresponding row in the left table. This means you would rewrite the preceding LEFT JOIN as follows to convert it to a RIGHT JOIN that produces the same results:

mysql> SELECT artist.name,
    -> IF(COUNT(painting.a_id)>0,'yes','no') AS 'in collection'
    -> FROM painting RIGHT JOIN artist ON painting.a_id = artist.a_id
    -> GROUP BY artist.name;
+----------+---------------+
| name     | in collection |
+----------+---------------+
| Da Vinci | yes           |
| Monet    | no            |
| Picasso  | no            |
| Renoir   | yes           |
| Van Gogh | yes           |
+----------+---------------+

Elsewhere in this book, I'll generally refer in discussion only to LEFT JOIN for brevity, but the discussions apply to RIGHT JOIN as well if you reverse the roles of the tables.

Other Ways to Write LEFT JOIN and RIGHT JOIN Queries

When the names of the columns to be matched are the same in both tables, an alternative to ON can be used for writing LEFT JOIN and RIGHT JOIN queries. This syntax substitutes USING for ON. For example, the following two queries are equivalent:

SELECT t1.n, t2.n FROM t1 LEFT JOIN t2 ON t1.n = t2.n;
SELECT t1.n, t2.n FROM t1 LEFT JOIN t2 USING (n);

As are these:

SELECT t1.n, t2.n FROM t1 RIGHT JOIN t2 ON t1.n = t2.n;
SELECT t1.n, t2.n FROM t1 RIGHT JOIN t2 USING (n);

In the special case that you want to base the comparison on all columns that appear in both tables, you can use NATURAL LEFT JOIN or NATURAL RIGHT JOIN:

SELECT t1.n, t2.n FROM t1 NATURAL LEFT JOIN t2;
SELECT t1.n, t2.n FROM t1 NATURAL RIGHT JOIN t2;
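
One way to convince yourself that the ON, USING, and NATURAL forms are interchangeable is to run them against the same data and compare the results. The following is a minimal sketch under stated assumptions: it uses Python's standard sqlite3 module with an in-memory SQLite database in place of MySQL, the contents of t1 and t2 are made up for illustration, and an ORDER BY clause is added to make the row order predictable. Only the LEFT JOIN forms are exercised, because SQLite versions before 3.39 do not support RIGHT JOIN.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (n INTEGER);
    CREATE TABLE t2 (n INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);   -- illustrative values
    INSERT INTO t2 VALUES (2), (3), (4);   -- 1 has no match in t2
""")

# The three spellings of the same left join on column n.
queries = {
    "ON": "SELECT t1.n, t2.n FROM t1 LEFT JOIN t2 ON t1.n = t2.n "
          "ORDER BY t1.n",
    "USING": "SELECT t1.n, t2.n FROM t1 LEFT JOIN t2 USING (n) "
             "ORDER BY t1.n",
    "NATURAL": "SELECT t1.n, t2.n FROM t1 NATURAL LEFT JOIN t2 "
               "ORDER BY t1.n",
}

results = {label: conn.execute(sql).fetchall()
           for label, sql in queries.items()}

for label, rows in results.items():
    print(label, rows)
```

Here NATURAL LEFT JOIN behaves like USING (n) only because n is the sole column the two tables have in common. If the tables shared other column names, the NATURAL form would compare those as well.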

12.6.4 See Also

As shown in this section, LEFT JOIN is useful for finding values with no match in another table, or for showing whether each value is matched. LEFT JOIN may also be used for producing a summary that includes all items in a list, even those for which there's nothing to summarize. This is very common for characterizing the relationship between a master table and a detail table. For example, a LEFT JOIN can produce "total sales per customer" reports that list all customers, even those who haven't bought anything during the summary period. (See Recipe 12.9.)

Another application of LEFT JOIN is for performing consistency checking when you receive two datafiles that are supposed to be related, and you want to determine whether they really are. (That is, you want to check the integrity of the relationship.) Import each file into a MySQL table, then run a couple of LEFT JOIN statements to determine whether there are unattached records in one table or the other, that is, records that have no match in the other table. (If there are any such records and you want to delete them, see Recipe 12.22.)
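
The two-way check just described might look something like the following. This is a minimal sketch, not the book's own example: it assumes Python's standard sqlite3 module with an in-memory SQLite database in place of MySQL, and the tables master and detail, their columns, and their contents are all hypothetical stand-ins for the two imported datafiles.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables standing in for the two imported datafiles.
    CREATE TABLE master (id INTEGER);
    CREATE TABLE detail (id INTEGER, master_id INTEGER);
    INSERT INTO master VALUES (1), (2), (3);
    INSERT INTO detail VALUES (10, 1), (11, 1), (12, 4);
""")

# Master rows that no detail row refers to.
masters_without_detail = conn.execute("""
    SELECT master.id
    FROM master LEFT JOIN detail ON master.id = detail.master_id
    WHERE detail.master_id IS NULL
    ORDER BY master.id
""").fetchall()

# Detail rows that refer to a master row that does not exist.
# Same technique, with the roles of the two tables swapped.
details_without_master = conn.execute("""
    SELECT detail.id, detail.master_id
    FROM detail LEFT JOIN master ON detail.master_id = master.id
    WHERE master.id IS NULL
    ORDER BY detail.id
""").fetchall()

print("masters with no detail rows:", masters_without_detail)
print("detail rows with no master:", details_without_master)
```

If both queries return no rows, every record in each table has a counterpart in the other.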

MySQL Cookbook
ISBN: 059652708X
Year: 2005
Pages: 412
Authors: Paul DuBois