13.4. Comparing Subquery Results to Outer Query Columns


The scalar subquery examples shown in previous sections use the = equality operator to compare a single column to the value returned by the subquery. But you are not limited to using the = equality operator. When comparing the values in the outer query with those returned by a scalar subquery, you can use any of the usual comparison operators, such as =, <, >, <>, and >=.

When a comparison requires a scalar subquery, it is an error if the subquery returns more than a single value. Suppose that we wanted to find out whether there is a country that has a city with a population of less than 100, using the following subquery:

 mysql> SELECT Code c, Name     -> FROM Country     -> WHERE 100 > (SELECT Population     ->              FROM City     ->              WHERE CountryCode = c); ERROR 1242 (21000): Subquery returns more than 1 row 

The subquery returns more than one value, so the statement fails.

Other subquery comparison operations do not require scalar subqueries. The following sections describe operations that allow column subqueries. Section 13.5, "Comparison Using Row Subqueries," discusses how to compare rows.

13.4.1. Using ALL, ANY, and SOME

To perform a comparison between a scalar value and a subquery that returns several rows of data in a single column (a column subquery), we must use a quantified comparison. The quantifier keywords ALL, ANY, and SOME allow comparison to multiple-row results.

Using the ALL keyword in a comparison with a column subquery limits the result set to only those records where the comparison is true for all values produced by the subquery. Consider the following statement, which tells us the average country population for each of the world's continents:

 mysql> SELECT Continent, AVG(Population)     -> FROM Country     -> GROUP BY Continent; +---------------+-----------------+ | Continent     | AVG(Population) | +---------------+-----------------+ | Asia          |   72647562.7451 | | Europe        |   15871186.9565 | | North America |   13053864.8649 | | Africa        |   13525431.0345 | | Oceania       |    1085755.3571 | | Antarctica    |          0.0000 | | South America |   24698571.4286 | +---------------+-----------------+ 

Now, suppose that we would like to know all the countries in the world where the population is larger than the average country population of all of the world's continents. To get this information, we can use ALL in conjunction with the > operator to compare the value of the country population with every average continent population from the preceding result:

 mysql> SELECT Name, Population     -> FROM Country     -> WHERE Population > ALL (SELECT AVG(Population)     ->                         FROM Country     ->                         GROUP BY Continent)     -> ORDER BY Name; +--------------------+------------+ | Name               | Population | +--------------------+------------+ | Bangladesh         |  129155000 | | Brazil             |  170115000 | | China              | 1277558000 | | Germany            |   82164700 | | India              | 1013662000 | | Indonesia          |  212107000 | | Japan              |  126714000 | | Mexico             |   98881000 | | Nigeria            |  111506000 | | Pakistan           |  156483000 | | Philippines        |   75967000 | | Russian Federation |  146934000 | | United States      |  278357000 | | Vietnam            |   79832000 | +--------------------+------------+ 

Note that Continent has been removed from the subquery's SELECT clause, because a quantified subquery can produce only a single column of values. If the subquery is written to select both the Continent column and the calculated column, MySQL cannot tell which one to use in the comparison and issues a complaint:

 mysql> SELECT Name     -> FROM Country     -> WHERE Population > ALL (SELECT Continent, AVG(Population)     ->                         FROM Country     ->                         GROUP BY Continent)     -> ORDER BY Name; ERROR 1241 (21000): Operand should contain 1 column(s) 

The keyword ANY (as well as the other quantified comparison keywords) is not limited to working with the = operator. Any of the standard comparison operators (=, <, >, <>, >=, and so forth) may be used for the comparison.

Comparisons using the word ANY will, as the name implies, succeed for any values in the column of data found by the subquery which succeed in the comparison. The following example finds the countries on the European continent, and, for each one, tests whether the country is among the worldwide list of countries where Spanish is spoken:

 mysql> SELECT Name     -> FROM Country     -> WHERE Continent = 'Europe'     ->   AND Code = ANY (SELECT CountryCode     ->                   FROM CountryLanguage     ->                   WHERE Language = 'Spanish')     -> ORDER BY Name; +---------+ | Name    | +---------+ | Andorra | | France  | | Spain   | | Sweden  | +---------+ 

Compare that query to the following one using ALL: We run the same query, changing = ANY to = ALL to see if the European continent covers all those countries where Spanish is spoken:

 mysql> SELECT Name     -> FROM Country     -> WHERE Continent = 'Europe'     ->   AND Code = ALL (SELECT CountryCode     ->                   FROM CountryLanguage     ->                   WHERE Language = 'Spanish')     -> ORDER BY Name; Empty set (0.00 sec) 

Because the result is empty, we can conclude that the European continent is not the only one where Spanish is spoken.

The word SOME is an alias for ANY, and may be used anywhere that ANY is used. The SQL standard defines these two words with the same meaning to overcome a limitation in the English language. Consider the following statement, in which we use <> ANY to negate the sense of the previous ANY example. As you read the example, try to form a sentence in your head to describe the output you would expect from the query (the output has been reduced to enhance readability):

 mysql> SELECT Name     -> FROM Country     -> WHERE Continent = 'Europe'     ->   AND Code <> ANY (SELECT CountryCode     ->                    FROM CountryLanguage     ->                    WHERE Language = 'Spanish')     -> ORDER BY Name; +-------------------------------+ | Name                          | +-------------------------------+ | Albania                       | | Andorra                       | | Austria                       | .... | Finland                       | | France                        | | Germany                       | ... | Svalbard and Jan Mayen        | | Sweden                        | | Switzerland                   | | Ukraine                       | | United Kingdom                | | Yugoslavia                    | +-------------------------------+ 

You probably expected this query to find "all the countries on the European continent where Spanish is not spoken," or something similar. Yet the query actually finds every single country on the European continent.

In the English language, we expect "not any" to mean "none at all." However, in SQL, <> ANY means "one or more do not match." In other words, the statement is really saying "return all the countries, where there are some people that do not speak Spanish." In our example, for all of the four countries where there are Spanish speakers, we do in fact also find speakers of other languages.

To alleviate the confusion that might arise from the use of <> ANY, the SQL standard includes the SOME keyword as a synonym for ANY. Using the <> SOME construct makes it easier to understand the expected outcome of the SQL statement:

 SELECT Name FROM Country WHERE Continent = 'Europe'   AND Code <> SOME (SELECT CountryCode                     FROM CountryLanguage                     WHERE Language = 'Spanish') ORDER BY Name; 

13.4.2. Using IN

From Section 10.6.1, "Comparison Functions," you are already familiar with the variant of IN that may be used in an expression, as shown in the following example:

 mysql> SELECT Name     -> FROM Country     -> WHERE Code IN ('DEU', 'USA', 'JPN'); +---------------+ | Name          | +---------------+ | Germany       | | Japan         | | United States | +---------------+ 

In this case, using IN is merely a shorthand for writing WHERE Code='DEU' OR Code='USA' OR Code='JPN'. It has nothing to do with subqueries.

When IN is used with a subquery, it is functionally equivalent to = ANY (note that the = sign is part of the equivalence). Many consider IN to be more readable than = ANY, because what you really want to know is "does this value appear in the subquery?" As an example, consider the equivalent IN version of the = ANY example shown in the previous section:

 SELECT Name FROM Country WHERE Continent = 'Europe'   AND Code IN (SELECT CountryCode                FROM CountryLanguage                WHERE Language = 'Spanish') ORDER BY Name; 

IN cannot be combined with any comparison operators such as = or <>.

NOT IN is another "shorthand." However, it is not an alias of <> ANY as you might otherwise expect. It is an alias of <> ALL. In other words, NOT IN is only true if none of the records of the subquery can be matched by the outer query. In the example for SOME, we demonstrated that <> ANY would return records of countries where some people didn't speak Spanish. The same query, using NOT IN (that is, <> ALL) will return only those countries where Spanish is not spoken at all. Although it may seem logically flawed that IN and NOT IN are aliases of two very different statements, it fits better with the way that we usually understand the equivalent English terms.

13.4.3. Using EXISTS

The EXISTS predicate performs a simple test: It tells you whether the subquery finds any rows. It does not return the actual values found in any of the rows, it merely returns TRUE if any rows were found. As does one of our previous examples, the following example finds countries on the European continent where Spanish is spoken. But with this query, no actual comparison is made between the data in the outer query and the rows found in the inner query.

 mysql> SELECT Code c, Name     -> FROM Country     -> WHERE Continent = 'Europe'     ->   AND EXISTS (SELECT *     ->               FROM CountryLanguage     ->               WHERE CountryCode = c     ->                 AND Language = 'Spanish'); +-----+---------+ | c   | Name    | +-----+---------+ | AND | Andorra | | ESP | Spain   | | FRA | France  | | SWE | Sweden  | +-----+---------+ 

The use of SELECT * in EXISTS subqueries is purely by tradition. You can use a different column list as long as the subquery is syntactically correct. No column values are ever needed for comparison, so MySQL never actually evaluates the column list given in the subquery SELECT. For example, you could replace the * with a constant value such as 1, 0, or even NULL.

EXISTS can be negated using NOT EXISTS, which, as the name implies, returns TRUE for subquery result sets with no rows. Replacing EXISTS with NOT EXISTS in the previous example shows those 42 countries on the European continent in which Spanish is not spoken at all.



MySQL 5 Certification Study Guide
MySQL 5.0 Certification Study Guide
ISBN: 0672328127
EAN: 2147483647
Year: 2006
Pages: 312

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net