A subset of all possible execution plans can be described as robust. While such plans are not always quite optimum, they are almost always close to optimum in real-world queries, and they have desirable characteristics, such as predictability and a low likelihood of errors during execution. (A nonrobust join can fail altogether, with an out-of-TEMP-space error, if a hash or sort-merge join needs more space than is available.) Robust plans tend to work well across the wide range of likely data distributions that might occur over time or between different database instances running the same application. Robust plans are also relatively forgiving of uncertainty and error; with a robust plan, a moderate error in the estimated selectivity of a filter might lead to a moderately suboptimal plan, but not to a disastrous one. When you use robust execution plans, you can almost always solve a SQL tuning problem once, instead of solving it repeatedly as you encounter different data distributions over time and at different customer sites.

Uncertainty and error are inevitable in the inputs to an optimization problem. For example, even with perfect information at parse time (the time the database generates the execution plan), a condition like Last_Name = :LName has uncertain selectivity, depending on the actual value that will be bound to :LName at execution time. The unavoidability of uncertainty and error makes robustness especially important.
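To make that uncertainty concrete, consider a minimal sketch, assuming a hypothetical Employees table (the table and column names are illustrative assumptions, not from the text). The statement is parsed, and its plan chosen, once, yet its selectivity differs enormously between executions:

    -- Parsed once; the optimizer must commit to a single plan for all bindings.
    SELECT Employee_ID, First_Name, Last_Name
    FROM   Employees
    WHERE  Last_Name = :LName;

    -- At execution time, selectivity depends entirely on the bound value:
    --   :LName := 'Smith'     -- might match thousands of rows (weakly selective)
    --   :LName := 'Zyzmorski' -- might match one row or none (highly selective)
    -- A robust plan degrades only moderately in the first case; a fragile plan
    -- chosen with one case in mind can be disastrous for the other.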
Robust execution plans tend to have the following properties:

- Their execution cost is proportional to the number of rows returned.
- They require almost no sort or hash space in memory.
- They need not change as the tables grow.
- They have moderate sensitivity to data distributions, performing adequately across many instances running the same application and across any given instance as its data changes over time.
- They are particularly good when it turns out that a query returns fewer rows than you expect (when filters are more selective than they appear).

In a sense, a robust plan is optimistic: it assumes that you have designed your application to process a manageably small number of rows, even when it is not obvious how the query narrows the rowset down to such a small number.
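As a sketch of what such a plan looks like in practice, the following Oracle-style example nudges the optimizer toward the robust shape rather than a hash join. The Customers and Orders tables are hypothetical assumptions for illustration; ORDERED and USE_NL are standard Oracle hints:

    -- ORDERED fixes the join order to follow the FROM clause;
    -- USE_NL(O) requests nested loops with Orders as the inner table.
    SELECT /*+ ORDERED USE_NL(O) */
           C.Customer_Name, O.Order_Date, O.Order_Total
    FROM   Customers C, Orders O
    WHERE  C.Region_ID = :Region             -- selective filter on the driving table
    AND    O.Customer_ID = C.Customer_ID;    -- full join key, reached by index

The cost of this plan grows with the rows that match the filter, and it needs no large in-memory hash or sort area, so it cannot fail for lack of TEMP space.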
Robustness requirements imply that you should usually choose to:

- Drive to the first table on a selective index
- Drive to each subsequent table with nested loops on the index of the full join key that points to a table the database has already read, following the links in the query diagram

Nested loops on full join keys generally scale with the number of rows that match the query conditions, and they avoid the memory required for hash or sort-merge joins, which might turn out to be excessive. Such excessive memory use can even lead to out-of-TEMP-space errors if the cached rowsets turn out to be larger than you expect.

Driving down before driving up avoids a potential explosion of rows earlier in the plan than you want, when it turns out that the detail table has more details per master row than you expected.
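To ground these two rules, here is a minimal sketch of the supporting indexes such a plan relies on, continuing the hypothetical Customers/Orders example (the index and column names are assumptions for illustration):

    -- Index on the full filter condition of the driving table:
    CREATE INDEX Customers_Region_IX ON Customers (Region_ID);

    -- Index on the full join key pointing back to an already-read table,
    -- so each subsequent table is reached by an indexed nested loop:
    CREATE INDEX Orders_Customer_IX ON Orders (Customer_ID);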
If you consider only robust plans, robustness rules alone answer the first two questions of finding the best execution plan, leaving only the question of join order:

- You will reach every table through a single index: an index on the full filter condition for the first table, and an index on the join key for each of the other tables.
- You will join all tables by nested loops.

I later discuss when you can sometimes safely and profitably relax the robustness requirement for nested-loops joins, but for now I focus on the only remaining question for robust plans: the join order. I also later discuss what to do when the perfect execution plan is unavailable, usually because of missing indexes, but for now, assume you are looking for a truly optimum robust plan, unconstrained by missing indexes.
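As a practical aside before moving on to join order: if you want to check whether the database actually produced this robust shape, Oracle's EXPLAIN PLAN facility makes the plan visible (the query continues the hypothetical example above):

    EXPLAIN PLAN FOR
    SELECT C.Customer_Name, O.Order_Date
    FROM   Customers C, Orders O
    WHERE  C.Region_ID = :Region
    AND    O.Customer_ID = C.Customer_ID;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
    -- A robust plan shows NESTED LOOPS with an index-driven access step on
    -- each table, rather than HASH JOIN over full table scans.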