|< Day Day Up >|
When you submit a request to the MySQL database engine, it must select one of many potential strategies to give you your requested results. These approaches often follow very different processing steps, yet all must arrive at the same results.
A good real-world analogy is to imagine that you are beginning a journey by car. If you only need to drive from your house to the one next door, you don't need to do much route planning: There's only one reasonable way to get there. However, suppose that you want to visit four friends, each of whom lives in a different city, all of whom are spread several hundred kilometers apart. It doesn't matter which friend you visit first, but you must see them all. There are dozens of streets and roads interlacing these cities: Some are high-speed freeways, whereas others are meandering country lanes. Unfortunately, you have very limited time to visit all of these friends, and you've heard rumors that traffic and construction are very bad on some of these roads.
If you were to sit down and map out all potential routes from the thousands of possible permutations, it's likely that it would take you longer to complete your mapping than if you simply picked one friend to visit at random, followed by another, and so on. Despite this overwhelming task, you do need some sort of plan, so the most logical thing to do is to obtain the most recent maps and quickly chart out the best possible course so that you visit all four friends as quickly as possible. When not considering information about construction and traffic, you would likely choose the high-speed freeways over alternate routes so that you could visit each friend as quickly as possible.
In a relational database management system like MySQL, the optimizer is responsible for charting the most efficient course to achieve your results. It must weigh many factors, such as table size, indexes, query columns, and so on before it can generate an accurate query plan.
For example, imagine that you issue a simple query that joins data from a table containing customer records with data from a table containing technical support entries. You ask the engine to find you all customers who live in Belgium (the country field in the customer records table) and have submitted a tech support case within the last six months (the date_registered field in the tech support table). Furthermore, suppose that there is an index in place on the customer records table's country field but not one on the tech support table's date_registered field.
How should the query proceed? Should it first scan all rows in the tech support table to find records with date_registered in the last six months and then use the index to search the customer records table on the country field? Or, should it do the opposite by first finding Belgians, and then finding appropriately time-stamped records from that group?
This is the kind of work that optimizers do: Using all available information about the table(s), pick an action plan that correctly satisfies the user's request in the most efficient way. However, optimizers are not flawless: They often require assistance and maintenance from database administrators, and can also take suggestions from developers on how best to achieve their results.
This chapter examines much more about MySQL's optimizer. First, this chapter explores how to provide the optimizer with the most up-to-date information so it can make the best decisions. Next, it discusses how to enable and interpret the extensive reporting that is available from the optimizer. The chapter then discusses how to give hints to the optimizer and override the automatically generated query plans.
Although there's an obvious interplay among your table design, index structure, and queries, this chapter focuses on the optimizer itself. Best practices for indexes and SQL are explored in Chapters 7 and 8 ("Indexing Strategies" and "Advanced SQL Tips"), respectively.
|< Day Day Up >|