Analyzing How MySQL Uses Indexes with EXPLAIN

Analyzing How MySQL Uses Indexes with EXPLAIN

One of the best-kept secrets in MySQL is the EXPLAIN statement. Even people who've been using MySQL for years seem to have missed this statement, and it makes your life a lot easier!

EXPLAIN shows (explains!) how MySQL processes SELECT statements, uses indexes, and joins the tables. It can help you select better indexes and write your queries more optimally.

To use it, place it before a SELECT statement:

mysql> EXPLAIN SELECT surname,first_name FROM customer WHERE id=1; +----------+-------+---------------+---------+---------+-------+------+-------+ | table    | type  | possible_keys | key     | key_len | ref   | rows | Extra | +----------+-------+---------------+---------+---------+-------+------+-------+ | customer | const | PRIMARY       | PRIMARY |       4 | const |    1 |       | +----------+-------+---------------+---------+---------+-------+------+-------+ 1 row in set (0.00 sec)

So what do all these columns mean? See Table 4.2 for an explanation.

Table 4.2: What the EXPLAIN Columns Mean

Column

Description

table

Shows you which table the rest of the row is about (it's trivial in this example, but it's useful when you join more than one table in a query).

type

This is an important column, as it tells you which type of join is being used. From best to worst, the join types are system, const, eq_ref, ref, range, index, and ALL.

possible_keys

Shows which indexes could possibly apply to this table. If it's empty, there are no possible indexes. You could make one available by looking for a relevant field from the WHERE clause.

key

The index that was actually used. If this is NULL, then no index was used. Rarely, MySQL chooses a less optimal index. In this case, you could force the use of an index by employing USE INDEX(indexname) with your SELECT statement or force MySQL toignore an index with IGNORE INDEX(indexname).

key_len

The length of the index used. The shorter you can make this without losing accuracy, the better.

ref

Tells you which column of the index is used, or a constant if this is possible.

rows

Number of rows MySQL believes it must examine to be able to return the required data.

extra

Extra information about how MySQL will resolve the query. This is discussed more fully in Table 4.3, but the bad ones to see here are Using temporary and Using filesort, which mean that MySQL is unable to make much use of an index at all, andthe results will be slow to retrieve.

Table 4.3 looks at what the descriptions returned by the extra column mean.

Table 4.3: What Descriptions in the extra EXPLAIN Column Mean

Extra Column Description

Description

Distinct

Once MySQL has found one row that matches the row combination, it will not search for more.

Not exists

MySQL optimized the LEFT JOIN and once it has found one row that matches the LEFT JOIN criteria, it will not search for more.

range checked for each record (index map: #)

No ideal index was found, so for each row combination from the earlier tables, MySQL checks which index to use and uses this to return rows from the table. This is one of the slowest joins with an index.

Using filesort

When you see this, the query needs to be optimized. MySQL will need to doan extra step to find out how to sort the rows it returns. It sorts by going through all rows according to the join type and storing the sort key and a pointer to the row for all rows that match the condition. The keys are then sorted and finally the rows returned in the sorted order.

Using index

The column data is returned from the table using only information in the index, without having to read the actual row. This occurs when all the required columns for the table are part of the same index.

Using temporary

When you see this, the query needs to be optimized. Here, MySQL needs tocreate a temporary table to hold the result, which usually occurs when you perform an ORDER BY on a different column set than what you did a GROUP BY on.

Where used

A WHERE clause is used restrict which rows will be matched against the nexttable or returned to the client. If you don't want to return all rows from thetable, and the join type is either ALL or index, this should appear, or elsethere may be a problem with your query.

The type column returned by EXPLAIN tells you the type of join being used. Table 4.4 explains the join type, listing them in order of most to least efficient.

Table 4.4: The Different Join Type

Join

Description

system

The table has only one row: a system table. This is a special case of the const join type.

const

A maximum of one record from the table can match this query (the index would be either a primary key or a unique index). With only one row, the value is effectively a constant, as MySQL reads this value first and then treats it identically to a constant.

eq_ref

In a join, MySQL reads one record from this table for each combination of records from the earlier tables in the query. It is used when the query uses all parts of an index that is either a primary key or a unique key.

ref

This join type occurs if the query uses a key that is not a unique or primary key or is only partofone of these types (for instance, makes use of leftmost prefixing). All records that match will be read from this table for each combination of rows from earlier tables. This join type is heavily dependent on how many records are matched by the indexthe fewer the better.

range

This join type uses the index to return rows from within a range, such as what occurs if you search for something using > or <.

index

This join type scans the entire index for each combination of records from the earlier tables (which is better than ALL, as the indexes are usually smaller than the table data).

ALL

This join scans the entire table for each combination of records from earlier tables. This is usually very bad and should be avoided as much as possible.

Let's return to the example:

mysql> EXPLAIN SELECT surname,first_name FROM customer WHERE id=1; +----------+-------+---------------+---------+---------+-------+------+-------+ | table    | type  | possible_keys | key     | key_len | ref   | rows | Extra | +----------+-------+---------------+---------+---------+-------+------+-------+ | customer | const | PRIMARY       | PRIMARY |       4 | const |    1 |       | +----------+-------+---------------+---------+---------+-------+------+-------+ 1 row in set (0.00 sec)

You already have a primary key in the customer table on the id field, and, because there is only one condition in our query, the id field is equivalent to a constant, and the query is as optimal as it can be. The rows column tells you that MySQL only needed to look at one row to return the results. You can't get better than that! Also, the type of the join (in this case, it's not really a join) is const, standing for constant, which is the best type. Let's look at what happens if you perform a similar query on a table with no indexes:

mysql> EXPLAIN SELECT * FROM sales_rep WHERE employee_number=2; +-----------+------+---------------+------+---------+------+------+------------+ | table     | type | possible_keys | key  | key_len | ref  | rows | Extra      | +-----------+------+---------------+------+---------+------+------+------------+ | sales_rep | ALL  | NULL          | NULL |    NULL | NULL |    5 | where used | +-----------+------+---------------+------+---------+------+------+------------+ 1 row in set (0.00 sec)

This query is as bad as you can get. The join type is ALL, the worst of the lot, there were no possible keys, and five rows were examined to return the results (there are currently only five records in the sales_rep table). Let's see how you can improve the situation:

mysql> SHOW COLUMNS FROM sales_rep; +-----------------+-------------+------+-----+---------+-------+ | Field           | Type        | Null | Key | Default | Extra | +-----------------+-------------+------+-----+---------+-------+ | employee_number | int(11)     | YES  |     | NULL    |       | | surname         | varchar(40) | YES  |     | NULL    |       | | first_name      | varchar(30) | YES  |     | NULL    |       | | commission      | tinyint(4)  | YES  |     | NULL    |       | | date_joined     | date        | YES  |     | NULL    |       | | birthday        | date        | YES  |     | NULL    |       | +-----------------+-------------+------+-----+---------+-------+ 6 rows in set (0.00 sec) mysql> SELECT * FROM sales_rep; +-----------------+----------+------------+------------+-------------+------------+ | employee_number | surname  | first_name | commission | date_joined | birthday   | +-----------------+----------+------------+------------+-------------+------------+ |               1 | Rive     | Sol        |         10 | 2000-02-15  | 1976-03-18 | |               2 | Gordimer | Charlene   |         15 | 1998-07-09  | 1958-11-30 | |               3 | Serote   | Mike       |         10 | 2001-05-14  | 1971-06-18 | |               4 | Rive     | Mongane    |         10 | 2002-11-23  | 1982-01-04 | |               5 | Jomo     | Ignesund   |         10 | 2002-11-29  | 1968-12-01 | +-----------------+----------+------------+------------+-------------+------------+ 5 rows in set (0.01 sec) 

An obvious choice for a primary key here is the employee_number field. There are no duplicate values, and you would not want any, so you can make it a primary key without any complications:

mysql> ALTER TABLE sales_rep MODIFY employee_number INT NOT NULL    PRIMARY KEY; Query OK, 5 rows affected (0.00 sec) Records: 5  Duplicates: 0  Warnings: 0

The result is immediately noticeable if you rerun the EXPLAIN on the query:

mysql> EXPLAIN SELECT * FROM sales_rep WHERE employee_number=2; +-----------+-------+---------------+---------+---------+-------+------+-------+ | table     | type  | possible_keys | key     | key_len | ref   | rows | Extra | +-----------+-------+---------------+---------+---------+-------+------+-------+ | sales_rep | const | PRIMARY       | PRIMARY |       4 | const |    1 |       | +-----------+-------+---------------+---------+---------+-------+------+-------+ 1 row in set (0.00 sec)

This is a vast improvement!

EXPLAIN is also a synonym for DESCRIBE tablename or SHOW COLUMNS FROM tablename if used before a table name.

Calculating during a Query

Let's examine some more complex situations. For example, you want to return all the sales representatives who would have a commission of less than 20 percent if you were to give them all a
5-percent increase in commission earned. The following is one possible query with EXPLAIN:

mysql> EXPLAIN SELECT * FROM sales_rep WHERE (commission+5)<20; +-----------+------+---------------+------+---------+------+------+------------+ | table     | type | possible_keys | key  | key_len | ref  | rows | Extra      | +-----------+------+---------------+------+---------+------+------+------------+ | sales_rep | ALL  | NULL          | NULL |    NULL | NULL |    5 | where used | +-----------+------+---------------+------+---------+------+------+------------+ 

Not so good! This query doesn't make use of any indexes. This isn't surprising because there's only one index so far, a primary key on employee_number. It looks like adding an index on the commission field will improve matters. Because the commission field has duplicate values, it cannot be a primary key (of which you are only allowed one per table anyhow) or a unique key. And it's not a character field, so the only key available to you is an ordinary index:

mysql> ALTER TABLE sales_rep ADD INDEX(commission); Query OK, 5 rows affected (0.01 sec) Records: 5  Duplicates: 0  Warnings: 0

Now, if you rerun the check on the query, you get the following result:

mysql> EXPLAIN SELECT * FROM sales_rep WHERE (commission+5)<20; +-----------+------+---------------+------+---------+------+------+------------+ | table     | type | possible_keys | key  | key_len | ref  | rows | Extra      | +-----------+------+---------------+------+---------+------+------+------------+ | sales_rep | ALL  | NULL          | NULL |    NULL | NULL |    5 | where used | +-----------+------+---------------+------+---------+------+------+------------+ 1 row in set (0.00 sec)

There's no improvement! MySQL is still examining every record. The reason is that commission+5 has to be calculated for each record. The commission field is read from each record in order to add 5 to it, and then compared to the constant 20. You should try not to perform any calculations on the index field. The solution is to perform the calculation on the constant, not the indexed field. In algebraic terms, (x + 5 < y) is the same as (x < y – 5), so you can rewrite the query as follows:

mysql> EXPLAIN SELECT * FROM sales_rep WHERE commission<20-5; +-----------+-------+---------------+------------+---------+------+------+------------+ | table     | type  | possible_keys | key        | key_len | ref  | rows | Extra      | +-----------+-------+---------------+------------+---------+------+------+------------+ | sales_rep | range | commission    | commission |       2 | NULL |    3 | where used | +-----------+-------+---------------+------------+---------+------+------+------------+ 1 row in set (0.00 sec)

This is much better. MySQL performs the calculation once, coming up with a constant of 15, and can search the index for values less than this. You could also have entered the query as this:

mysql> EXPLAIN SELECT * FROM sales_rep WHERE commission<15; +-----------+-------+---------------+------------+---------+------+------+------------+ | table     | type  | possible_keys | key        | key_len | ref  | rows | Extra      | +-----------+-------+---------------+------------+---------+------+------+------------+ | sales_rep | range | commission    | commission |       2 | NULL |    3 | where used | +-----------+-------+---------------+------------+---------+------+------+------------+ 1 row in set (0.00 sec)

where you worked out the constant yourself, but this would not be noticeably faster. Subtracting 5 from 20 costs MySQL almost an immeasurably small amount of time (or at least you'd be challenged to take a measurement!). Notice what happens if you wanted to return all the sales representatives who would earn a commission of exactly 20 percent after the 5-percent increase:

mysql> EXPLAIN SELECT * FROM sales_rep WHERE commission=15; +-----------+------+---------------+------------+---------+-------+------+------------+ | table     | type | possible_keys | key        | key_len | ref   | rows | Extra      | +-----------+------+---------------+------------+---------+-------+------+------------+ | sales_rep | ref  | commission    | commission |       2 | const |    1 | where used | +-----------+------+---------------+------------+---------+-------+------+------------+  1 row in set (0.00 sec) 

The query type changes from range to ref, which is a better one to use if possible. This can be easily understood because returning an exact value is less work than returning a range of values (or many exact values).

Using EXPLAIN with Leftmost Prefixing

Let's revisit the topic of leftmost prefixing and see how EXPLAIN can help you understand it better. Consider the following query:

mysql> EXPLAIN SELECT * FROM customer WHERE first_name='Yvonne'; +----------+------+---------------+------+---------+------+------+------------+ | table    | type | possible_keys | key  | key_len | ref  | rows | Extra      | +----------+------+---------------+------+---------+------+------+------------+ | customer | ALL  | NULL          | NULL |    NULL | NULL |    8 | where used | +----------+------+---------------+------+---------+------+------+------------+ 1 row in set (0.01 sec)

Because the first_name field is not the leftmost part of the index, it is useless to you for indexing purposes, and the join type ALL tells you this fact clearly. The next example shows what EXPLAIN makes of leftmost prefixing:

mysql> EXPLAIN SELECT * FROM customer WHERE surname='Clegg'  AND initial='X' AND first_name='Yvonne'; +----------+------+---------------+---------+---------+-------------------+------+------------+ | table    | type | possible_keys | key     | key_len | ref               | rows | Extra      | +----------+------+---------------+---------+---------+-------------------+------+------------+ | customer | ref  | surname       | surname |      78 | const,const,const |    1 | where used | +----------+------+---------------+---------+---------+-------------------+------+------------+ 1 row in set (0.01 sec)

In this example, the full three fields of the index are used, and the join type is ref because the index in question allows duplicates. If the table structure excluded the possibility of a duplicate combination of surname, initial, and first_name, the join type would have been eq_ ref. Note the ref column, const,const,const, indicating that all three parts of the index are compared against a constant value. The next example shows a similar situation:

mysql> EXPLAIN SELECT * FROM customer WHERE     surname='Clegg' AND initial='X';                         +----------+------+---------------+---------+---------+-------------+------+------------+ | table    | type | possible_keys | key     | key_len | ref         | rows | Extra      | +----------+------+---------------+---------+---------+-------------+------+------------+ | customer | ref  | surname       | surname |      47 | const,const |    1 | where used | +----------+------+---------------+---------+---------+-------------+------+------------+ 1 row in set (0.00 sec) 

Again, the index is used correctly, but only the first two fields are used. The key length here is shorter (meaning MySQL has less to scan and is thus a little quicker). The next case makes no use of leftmost prefixing:

mysql> EXPLAIN SELECT * FROM customer WHERE initial='X'; +----------+------+---------------+------+---------+------+------+------------+ | table    | type | possible_keys | key  | key_len | ref  | rows | Extra      | +----------+------+---------------+------+---------+------+------+------------+ | customer | ALL  | NULL          | NULL |    NULL | NULL |    8 | where used | +----------+------+---------------+------+---------+------+------+------------+ 1 row in set (0.00 sec)

This query does not adhere to the principles of leftmost prefixing and makes no use of an index. The next example also makes no use of leftmost prefixing, but does still make use of an index:

mysql> EXPLAIN SELECT * FROM customer WHERE surname='Clegg'  AND first_name='Yvonne'; +----------+------+---------------+---------+---------+-------+------+------------+ | table    | type | possible_keys | key     | key_len | ref   | rows | Extra      | +----------+------+---------------+---------+---------+-------+------+------------+ | customer | ref  | surname       | surname |      41 | const |    1 | where used | +----------+------+---------------+---------+---------+-------+------+------------+ 1 row in set (0.00 sec)

Although leftmost prefixing is not used here, because the first_name field is out of sequence, the surname is still enough to make good use of the index. In this case, it narrows the number of rows that MySQL needs to scan to only one, because the surname Clegg is unique, irrespective of any first names or initials.



Mastering MySQL 4
Mastering MySQL 4
ISBN: 0782141625
EAN: 2147483647
Year: 2003
Pages: 230
Authors: Ian Gilfillan

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net