If you specify no criteria for selecting records from a table, a SELECT statement retrieves every record in the table. This is often more information than you need, particularly for large tables. To be more specific about which rows are of interest, include a WHERE clause that describes the characteristics of those rows. A WHERE clause can be as simple or complex as necessary to identify the rows that are relevant for your purposes. For example, to retrieve records from the Country table for those countries that achieved independence after the year 1990, it's sufficient to use a WHERE clause that specifies a single condition: SELECT * FROM Country WHERE IndepYear > 1990; More complex WHERE clauses specify multiple conditions, which may be combined using logical operators such as AND and OR. The following statement returns rows with Population values in the range from 1 million to 2 million: SELECT * FROM Country WHERE Population >= 1000000 AND Population <= 2000000; For testing values in a range, you can also use the BETWEEN operator: SELECT * FROM Country WHERE Population BETWEEN 1000000 AND 2000000; Some operators have higher precedence than others. For example, AND has a higher precedence than OR. To control the order of evaluation of terms within a complex expression (or simply to make the evaluation order explicit), use parentheses to group expression terms. Consider the following WHERE clause: WHERE GNP < 1000 AND Continent = 'Africa' OR Continent = 'Asia' Because AND has a higher precedence than OR, the preceding expression is equivalent to the following one: WHERE (GNP < 1000 AND Continent = 'Africa') OR Continent = 'Asia' That expression finds all records with a GNP value less than 1000 that also have a Continent value of 'Africa', as well as all records with a Continent value of 'Asia' (regardless of their GNP value). However, a different placement of parentheses results in a very different meaning: WHERE GNP < 1000 AND (Continent = 'Africa' OR Continent = 'Asia') That expression finds records that have a GNP value less than 1000 and a Continent value of either 'Africa' or 'Asia'. More information on writing expressions can be found in Chapter 10, "SQL Expressions." Detailed descriptions of the operators and functions that you can use in expressions are provided in the MySQL Reference Manual. It's possible to prevent SELECT statements that might generate a great deal of output from returning more than 1,000 rows. The mysql client supports this feature if you invoke it with the --safe-updates option. For more information, see Section 2.9, "Using the --safe-updates Option." 9.3.1. Using ORDER BY to Sort Query ResultsBy default, the rows in the result set produced by a SELECT statement are returned by the server to the client in no particular order. When you issue a query, the server is free to return the rows in any convenient order. This order can be affected by factors such as the order in which rows are actually stored in the table or which indexes are used to process the query. If you require output rows to be returned in a specific order, include an ORDER BY clause that indicates how to sort the results. The examples in this section demonstrate ORDER BY using a table t that has the following contents (id is numeric, last_name and first_name are strings, and birth contains dates): mysql> SELECT id, last_name, first_name, birth FROM t; +------+-----------+------------+------------+ | id | last_name | first_name | birth | +------+-----------+------------+------------+ | 1 | Brown | Bill | 1972-10-14 | | 2 | Larsson | Sven | 1965-01-03 | | 3 | Brown | Betty | 1971-07-12 | | 4 | Larsson | Selma | 1968-05-29 | +------+-----------+------------+------------+ ORDER BY provides a great deal of flexibility for sorting result sets. It has the following characteristics:
9.3.2. The Natural Sort Order of Data TypesEach type of data managed by MySQL has its own natural sort order. For the most part, these orders are fairly intuitive. The rules for string sorting are the most complex because they depend on whether the strings are non-binary, binary, or come from ENUM or SET columns.
9.3.3. Limiting a Selection Using LIMITMySQL supports a LIMIT clause in SELECT statements, which tells the server to return only some of the rows selected by the statement. This is useful for retrieving records based on their position within the set of selected rows. LIMIT may be given with either one or two arguments: LIMIT row_count LIMIT skip_count, row_count Each argument must be given as an integer constant. You cannot use expressions, user variables, and so forth. When followed by a single integer, row_count, LIMIT returns the first row_count rows from the beginning of the result set. To select just the first 10 rows of a result set, use LIMIT 10: SELECT * FROM Country LIMIT 10; When followed by two integers, skip_count and row_count, LIMIT skips the first skip_count rows from the beginning of the result set, and then returns the next row_count rows. To skip the first 20 rows and then return the next 10 rows, do this: SELECT * FROM Country LIMIT 20,10; The single-argument form of LIMIT is applicable only when the rows you want to retrieve appear at the beginning of the result set. The two-argument form is more general and can be used to select an arbitrary section of rows from anywhere in the result set. When you need only some of the rows selected by a query, LIMIT is an efficient way to obtain them. For a client application that fetches rows from the server, you get better performance by adding LIMIT to the query than by having the client fetch all the rows and discard all but the ones of interest. By using LIMIT, the unwanted rows never cross the network at all. It's often helpful to include an ORDER BY clause to put the rows in a particular order when you use LIMIT. When ORDER BY and LIMIT are used together, MySQL applies ORDER BY first and then LIMIT. One common use for this is to find the row containing the smallest or largest values in a particular column. For example, to find the row in a table t containing the smallest id value, use this statement: SELECT * FROM t ORDER BY id LIMIT 1; To find the largest value instead, use DESC to sort the rows in reverse: SELECT * FROM t ORDER BY id DESC LIMIT 1; The two-argument form of LIMIT is useful in conjunction with ORDER BY for situations in which you want to process successive sections of a result set. For example, in Web applications, it's common to display the result of a large search across a series of pages that each present one section of the result. To retrieve sections of the search result this way, issue a series of statements that all specify the same number of rows to return in the LIMIT clause, but vary the number of initial rows to skip: SELECT * FROM t ORDER BY id LIMIT 0, 20; SELECT * FROM t ORDER BY id LIMIT 20, 20; SELECT * FROM t ORDER BY id LIMIT 40, 20; SELECT * FROM t ORDER BY id LIMIT 60, 20; ... It's possible to abuse the LIMIT feature. For example, it isn't a good idea to use a clause such as LIMIT 1000000, 10 to return 10 rows from a query that normally would return more than a million rows. The server must still process the query to determine the first million rows before returning the 10 rows. It's better to use a WHERE clause to reduce the query result to a more manageable size, and then use LIMIT to pull rows from that reduced result. This also makes the use of ORDER BY with LIMIT more efficient because the server need not sort as large a row set before applying the limit. The UPDATE and DELETE statements also support the use of LIMIT, which causes only a certain number of rows to be updated or deleted. See Chapter 11, "Updating Data." 9.3.4. Using DISTINCT to Eliminate DuplicatesIf a query returns a result that contains duplicate rows, you can remove duplicates to produce a result set in which every row is unique. To do this, include the keyword DISTINCT after SELECT and before the output column list. Suppose that a query returns a result set that contains duplicated rows: mysql> SELECT last_name FROM t; +-----------+ | last_name | +-----------+ | Brown | | Larsson | | Brown | | Larsson | +-----------+ Adding DISTINCT removes the duplicates and returns only unique rows: mysql> SELECT DISTINCT last_name FROM t; +-----------+ | last_name | +-----------+ | Brown | | Larsson | +-----------+ Duplicate elimination for string values happens differently for non-binary and binary strings. The strings 'ABC', 'Abc', and 'abc' are considered distinct if they're binary strings. If they are non-binary strings, they are considered distinct if they have different values based on their collation. DISTINCT treats all NULL values within a given column as having the same value. Suppose that a table t contains the following rows: mysql> SELECT i, j FROM t; +------+------+ | i | j | +------+------+ | 1 | 2 | | 1 | NULL | | 1 | NULL | +------+------+ For purposes of DISTINCT, the NULL values in the second column are the same, so the second and third rows are identical. Adding DISTINCT to the query eliminates one of them as a duplicate: mysql> SELECT DISTINCT i, j FROM t; +------+------+ | i | j | +------+------+ | 1 | 2 | | 1 | NULL | +------+------+ Using DISTINCT is logically equivalent to using GROUP BY on all selected columns with no aggregate function. For such a query, GROUP BY just produces a list of distinct grouping values. If you display and group by a single column, the query produces the distinct values in that column. If you display and group by multiple columns, the query produces the distinct combinations of values in the column. For example, the following two queries produce the same set of rows: SELECT DISTINCT id FROM t; SELECT id FROM t GROUP BY id; As do these: SELECT DISTINCT id, name FROM t; SELECT id, name FROM t GROUP BY id, name; Another correspondence between the behavior of DISTINCT and GROUP BY is that for purposes of assessing distinctness, DISTINCT considers all NULL values the same. This is analogous to the way that GROUP BY groups NULL values. A difference between DISTINCT and GROUP BY is that DISTINCT doesn't cause row sorting. In MySQL, GROUP BY does cause sorting. DISTINCT can be used with the COUNT() function to count how many distinct values a column contains. In this case, NULL values are ignored: mysql> SELECT j FROM t; +------+ | j | +------+ | 2 | | NULL | | NULL | +------+ mysql> SELECT COUNT(DISTINCT j) FROM t; +-------------------+ | COUNT(DISTINCT j) | +-------------------+ | 1 | +-------------------+ COUNT(DISTINCT) is discussed further in Section 9.4.3, "The COUNT() Aggregate Function." |