# Choosing Appropriate LIMIT Values

3.19.1 Problem

LIMIT doesn't seem to do what you want it to.

3.19.2 Solution

Be sure you understand what question you're asking. It may be that LIMIT is exposing some interesting subtleties in your data that you have not considered or are not aware of.

3.19.3 Discussion

LIMIT n is useful in conjunction with ORDER BY for selecting the smallest or largest values from a result set. But does that actually give you the rows with the n smallest or largest values? Not necessarily! It does if the values in the sorted column are unique, but not if some of them are duplicated, because the cutoff can then fall in the middle of a group of tied rows. You may find it necessary to run a preliminary query first to help you choose the proper LIMIT value.

To see why this is, consider the following dataset, which shows the American League pitchers who won 15 or more games during the 2001 baseball season:

mysql> SELECT name, wins FROM al_winner
-> ORDER BY wins DESC, name;
+----------------+------+
| name           | wins |
+----------------+------+
| Mulder, Mark   |   21 |
| Clemens, Roger |   20 |
| Moyer, Jamie   |   20 |
| Garcia, Freddy |   18 |
| Hudson, Tim    |   18 |
| Abbott, Paul   |   17 |
| Mays, Joe      |   17 |
| Mussina, Mike  |   17 |
| Sabathia, C.C. |   17 |
| Zito, Barry    |   17 |
| Buehrle, Mark  |   16 |
| Milton, Eric   |   15 |
| Pettitte, Andy |   15 |
| Sele, Aaron    |   15 |
+----------------+------+
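
One way to run the kind of preliminary query mentioned earlier is to count how many rows share each wins value. The following is a minimal sketch, not part of the original recipe. It assumes Python with its built-in sqlite3 module as a stand-in for a MySQL connection, and the helper names make_conn and tie_summary are made up for illustration. The GROUP BY statement itself uses only standard syntax that MySQL also accepts.

```python
import sqlite3

# Data copied from the al_winner table shown above.
ROWS = [
    ("Mulder, Mark", 21), ("Clemens, Roger", 20), ("Moyer, Jamie", 20),
    ("Garcia, Freddy", 18), ("Hudson, Tim", 18), ("Abbott, Paul", 17),
    ("Mays, Joe", 17), ("Mussina, Mike", 17), ("Sabathia, C.C.", 17),
    ("Zito, Barry", 17), ("Buehrle, Mark", 16), ("Milton, Eric", 15),
    ("Pettitte, Andy", 15), ("Sele, Aaron", 15),
]


def make_conn():
    """Build an in-memory SQLite copy of al_winner (assumed stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE al_winner (name TEXT, wins INTEGER)")
    conn.executemany("INSERT INTO al_winner VALUES (?, ?)", ROWS)
    return conn


def tie_summary(conn):
    """Return (wins, pitchers, running_total) per distinct wins value, largest first."""
    summary, running = [], 0
    query = ("SELECT wins, COUNT(*) FROM al_winner "
             "GROUP BY wins ORDER BY wins DESC")
    for wins, count in conn.execute(query):
        running += count  # rows that LIMIT would have returned so far
        summary.append((wins, count, running))
    return summary


for wins, count, running in tie_summary(make_conn()):
    print(f"{wins:>4} wins: {count} pitcher(s), {running} row(s) so far")
```

A LIMIT n cutoff avoids splitting a group of tied rows only when n equals one of the running totals, which for this dataset are 1, 3, 5, 10, 11, and 14.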

If you want to know who won the most games, adding LIMIT 1 to the preceding query gives you the correct answer, because the maximum value is 21 and only one pitcher has that value (Mark Mulder). But what if you want the four highest game winners? The proper query depends on what you mean by that, and there are several possible interpretations:

• If you just want the first four rows, sort the records and add LIMIT 4:

mysql> SELECT name, wins FROM al_winner
-> ORDER BY wins DESC, name
-> LIMIT 4;
+----------------+------+
| name           | wins |
+----------------+------+
| Mulder, Mark   |   21 |
| Clemens, Roger |   20 |
| Moyer, Jamie   |   20 |
| Garcia, Freddy |   18 |
+----------------+------+

That may not suit your purposes because LIMIT imposes a cutoff that occurs in the middle of a set of pitchers with the same number of wins (Tim Hudson also won 18 games).

• To avoid making a cutoff in the middle of a set of rows with the same value, select rows with values greater than or equal to the value in the fourth row. Find out what that value is with LIMIT 3, 1 (skip the first three rows and return the next one), then use it in the WHERE clause of a second query to select rows:

mysql> SELECT wins FROM al_winner
-> ORDER BY wins DESC, name
-> LIMIT 3, 1;
+------+
| wins |
+------+
|   18 |
+------+
mysql> SELECT name, wins FROM al_winner
-> WHERE wins >= 18
-> ORDER BY wins DESC, name;
+----------------+------+
| name           | wins |
+----------------+------+
| Mulder, Mark   |   21 |
| Clemens, Roger |   20 |
| Moyer, Jamie   |   20 |
| Garcia, Freddy |   18 |
| Hudson, Tim    |   18 |
+----------------+------+
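
• If you want to know all the pitchers with the four largest wins values, another approach is needed. Determine the fourth-largest distinct value with DISTINCT and LIMIT, then use it to select rows. Sort the first query by wins alone: name is not in the select list, and newer MySQL versions by default reject a DISTINCT query that sorts by a column it does not select:

mysql> SELECT DISTINCT wins FROM al_winner
-> ORDER BY wins DESC
-> LIMIT 3, 1;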
+------+
| wins |
+------+
|   17 |
+------+
mysql> SELECT name, wins FROM al_winner
-> WHERE wins >= 17
-> ORDER BY wins DESC, name;
+----------------+------+
| name           | wins |
+----------------+------+
| Mulder, Mark   |   21 |
| Clemens, Roger |   20 |
| Moyer, Jamie   |   20 |
| Garcia, Freddy |   18 |
| Hudson, Tim    |   18 |
| Abbott, Paul   |   17 |
| Mays, Joe      |   17 |
| Mussina, Mike  |   17 |
| Sabathia, C.C. |   17 |
| Zito, Barry    |   17 |
+----------------+------+
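
The two-step approach in the last two interpretations (look up a cutoff value, then select with it) can also be carried out from a program rather than by hand. The sketch below is an illustration under stated assumptions, not the recipe's own code: it uses Python's built-in sqlite3 module in place of a MySQL connection (SQLite accepts the same LIMIT offset, count form), and the function names cutoff_wins and winners_at_or_above are hypothetical.

```python
import sqlite3

# Data copied from the al_winner table in this recipe.
ROWS = [
    ("Mulder, Mark", 21), ("Clemens, Roger", 20), ("Moyer, Jamie", 20),
    ("Garcia, Freddy", 18), ("Hudson, Tim", 18), ("Abbott, Paul", 17),
    ("Mays, Joe", 17), ("Mussina, Mike", 17), ("Sabathia, C.C.", 17),
    ("Zito, Barry", 17), ("Buehrle, Mark", 16), ("Milton, Eric", 15),
    ("Pettitte, Andy", 15), ("Sele, Aaron", 15),
]


def make_conn():
    """Build an in-memory SQLite copy of al_winner (assumed stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE al_winner (name TEXT, wins INTEGER)")
    conn.executemany("INSERT INTO al_winner VALUES (?, ?)", ROWS)
    return conn


def cutoff_wins(conn, n, distinct=False):
    """Preliminary query: wins value of row n, or the nth distinct wins value."""
    select = "SELECT DISTINCT wins" if distinct else "SELECT wins"
    # LIMIT n-1, 1 skips n-1 rows and returns the next one.
    row = conn.execute(
        f"{select} FROM al_winner ORDER BY wins DESC LIMIT ?, 1", (n - 1,)
    ).fetchone()
    return None if row is None else row[0]


def winners_at_or_above(conn, cutoff):
    """Second query: every pitcher whose wins value reaches the cutoff."""
    return conn.execute(
        "SELECT name, wins FROM al_winner WHERE wins >= ? "
        "ORDER BY wins DESC, name", (cutoff,)
    ).fetchall()


conn = make_conn()
# Interpretation 1: just the first four rows.
first_four = conn.execute(
    "SELECT name, wins FROM al_winner ORDER BY wins DESC, name LIMIT 4"
).fetchall()
# Interpretation 2: do not split the tie at the fourth row.
with_ties = winners_at_or_above(conn, cutoff_wins(conn, 4))
# Interpretation 3: all pitchers with the four largest wins values.
four_values = winners_at_or_above(conn, cutoff_wins(conn, 4, distinct=True))
print(len(first_four), len(with_ties), len(four_values))
```

Passing the cutoff as a query parameter keeps the second statement fixed while the value found by the first one varies with n.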

For this dataset, each method yields a different result. The moral is that the way you use LIMIT may require some thought about what you really want to know.
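
If issuing two separate statements is inconvenient, the cutoff lookup can be folded into the selecting query as a scalar subquery. This is an alternative that the recipe itself does not show. The sketch assumes a server with subquery support and, so that it can run on its own, uses Python's built-in sqlite3 module as a stand-in for MySQL. The names WITH_TIES and FOUR_VALUES are made up for illustration.

```python
import sqlite3

# Data copied from the al_winner table in this recipe.
ROWS = [
    ("Mulder, Mark", 21), ("Clemens, Roger", 20), ("Moyer, Jamie", 20),
    ("Garcia, Freddy", 18), ("Hudson, Tim", 18), ("Abbott, Paul", 17),
    ("Mays, Joe", 17), ("Mussina, Mike", 17), ("Sabathia, C.C.", 17),
    ("Zito, Barry", 17), ("Buehrle, Mark", 16), ("Milton, Eric", 15),
    ("Pettitte, Andy", 15), ("Sele, Aaron", 15),
]

# The subquery finds the wins value in the fourth row; the outer query
# keeps every row that reaches it, so the tie at 18 is not split.
WITH_TIES = """
    SELECT name, wins FROM al_winner
    WHERE wins >= (SELECT wins FROM al_winner
                   ORDER BY wins DESC LIMIT 3, 1)
    ORDER BY wins DESC, name
"""

# Same shape, but DISTINCT makes the subquery return the fourth-largest
# distinct wins value instead of the value in the fourth row.
FOUR_VALUES = """
    SELECT name, wins FROM al_winner
    WHERE wins >= (SELECT DISTINCT wins FROM al_winner
                   ORDER BY wins DESC LIMIT 3, 1)
    ORDER BY wins DESC, name
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE al_winner (name TEXT, wins INTEGER)")
conn.executemany("INSERT INTO al_winner VALUES (?, ?)", ROWS)

with_ties = conn.execute(WITH_TIES).fetchall()
four_values = conn.execute(FOUR_VALUES).fetchall()
print(len(with_ties), len(four_values))
```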

MySQL Cookbook
ISBN: 059652708X
Year: 2005
Pages: 412
Authors: Paul DuBois
