The type of join plan used by a DBMS depends on indexes, table sizes, and selectivity. (Selectivity is the measure of the distinct values in an index; see Chapter 9, "Indexes.") Table 5-1 shows the strategies normally used by the Big Eight. Table 5-1. Join Plans Used by DBMSs
Nested-Loop Joins
A nested-loop join is based on some variation of the nested-loop pseudocode shown in Listing 5-1. Let's call the table in the outer loop the " outer table " and the table in the inner loop the " inner table ." This is straightforward terminology, and better than the terms "driven table" and "driving table"the term "driving table" is synonymous with "outer table," but we find that confusing. While studying Listing 5-1, keep this in mind:
Listing 5-1 Nested-Loop Pseudocodefor (each row in Table1) { /* outer loop */ for (each row in Table2) { /* inner loop */ if (Table1 join column matches Table2 join column) pass else fail } } Just using the technique shown in Listing 5-1 results in a huge improvement because the temporary table that results isn't a one-million-row Cartesian jointhe code filters before adding rows. But there are still a million comparisons to do, so we must get even more sophisticated. Consider that the important factor now is not the number of comparisonscomparisons are relatively cheapbut the number of page reads. [1] When we think of the loop in terms of pages, it looks like the pseudocode shown in Listing 5-2.
Listing 5-2 Nested-Loop Page Readsfor (each page in Table1) { /* outer loop */ for (each page in Table2) { /* inner loop */ for (each row in Table1-page) { /* cheap stuff */ for (each row in Table2-page) { if (join column matches) pass else fail } } } } And now, a first aha! moment. If there are 11 pages in the DBMS cache, and the inner table ( Table2 ) has only 10 pages, then after the DBMS has iterated once through the inner loop, the Table2 -page is in the cache when the second iteration begins. In that case, the total number of page reads is not the product of the tables' sizes, but merely the sum of the number of pages in both Table1 and Table2 . Result: end of geometric explosions. We can learn two lessons from this: Lesson for the DBMS maker When two tables differ greatly in size, the smaller table should be the inner table because there's more chance to fit all the pages in the cache, and caching matters only for the inner table. Lesson for you Fitting more rows into small tables has a major effect on some join speeds. Concentrate on making the "small" table smaller. Actually, in a sequential pass through a file, it's common to read multiple pages at a time. That doesn't affect the logic of what we're saying here at all, but you may as well know that people don't talk about "paged nested-loop joins." Instead they talk about "blocked nested-loop joins" because a group of pages is often called a "block." If the cache is too small, the DBMS can look for an index to alleviate the problem. If an index is doing its job, then there's no need to scan every row in the inner table. It's enough to look up the value in the table's index. If the table has a three-layer index, of which two layers are unlikely to be in the cache, then we can estimate that the number of page reads will be: (number of pages in outer table) + (number of pages in inner table * 2) And from this, we learn two more lessons: Lesson for the DBMS maker The table with the good index should be the inner table. Lesson for you Having an index on one table can help a lotbut having indexes on both tables is not quite as important. With a pure nested-loop join plan, one of the tables will be the outer table and the DBMS can go through all its pages sequentially.
So far we've looked only at the general case where entire tables are joined. That's frequent for reports , but a query in a short transaction usually takes this pattern: SELECT * FROM Table1, Table2 WHERE Table1.column1 = Table2.column1 AND Table1.column2 = <literal> The AND Table1.column2 = <literal> expression in this example is known as a restrictive expression . If you apply a restrictive expression, the number of rows decreases (because not all rows in a table will be true for the expression). In contrast, WHERE Table1.column1 = Table2.column1 is a join expression . If you apply a join expression, the number of rows increases (because several rows in Table2 might match the ones in Table1 ). Because of this, it's important to ensure that you restrict before you join. Given that, which should be the outer table in this example? Answer: Table1 , because if Table1 is the inner table, the DBMS would have to apply the restriction every time a row comparison is made in the inner loopwhich is still ( outer * inner rows ) times. Two more lessons can be learned from this: Lesson for the DBMS maker If a table has a restrictive expression on it, that table should be the outer table. Lesson for you Use restrictive expressions on joins whenever you can, because that will reduce the number of iterations of the outer loop. There are some joins that can be replaced by an INTERSECT . . . CORRESPONDING clause, and that works well enough if both tables have restrictive clauses (which can make nested-loop joining difficult because the DBMS has to apply a restriction every time a row comparison is made in the inner loop as well as the outer loop). However, because many DBMSs don't support INTERSECT, we won't discuss this option further. Instead, let's take another look now at the join expression, which so far we've seen only as Table1.column1 = Table2.column1 . Joins, of course, can get a little more complicated than that. First of all, there might be an ANDed expression, as in:
SELECT * FROM Table1, Table2 WHERE Table1.column1 = Table2.column1 AND Table1.column2 = Table2.column2 In this case, it's not enough to ensure that column1 and column2 in the inner table are indexed. Instead, you need to make a compound index on (column1 , column2) in the inner table (GAIN: 7/8 if compound index rather than two separate indexes). It's also important to make sure that the joined columns in Table1 and Table2 have exactly the same data type and the same size (GAIN: 7/8 if the data types are exactly the same). That's not a trivial efficiency mattermany DBMSs won't even use the index if there's a size mismatch. What if the inner table is too big to fit in a cache and has no useful index? Or what if there's a restrictive clause on the inner table too? Well, then you're in trouble! The only bright spot in this situation is if you're using Microsoft or Sybase. These DBMSs will internally "create temporary table which is clustered by the join key" then "populate temporary table by selecting the join columns and using the restrictive clause." Other DBMSs, more prosaically, simply make a temporary index on the inner table. That might sound like a solution, but it's better to have a permanent index than a temporary index that gets created and dropped whenever the query comes along. Another thing you can do if both tables are indexed is encourage the searches to occur in index order. For example, suppose you have this query: SELECT * FROM Table1, Table2 WHERE Table1.column1 = Table2.column1 For this example, assume that Table1.column1 contains these values: {15 , 35 , 5 , 7} . If lookups of Table2.column1 occur in that order, then the disk heads are jumping around. But if you force the DBMS to search Table2 . column1 in numeric order, that problem goes away. To do this, change the query to: SELECT * FROM Table1, Table2 WHERE Table1.column1 = Table2.column1 AND Table1.column1 > 0 GAIN: 4/7 WARNING Don't do this for Informix; it shows a loss. The gain shown is for only seven DBMSs. Remember that this works only if Table1.column1 is indexed and the index is used by the DBMS. The values found are now {5 , 7 , 15 , 35} , and the lookups in the inner loop will be going forward in the index, which is good for caching and which makes read-aheads more useful. A variation of this trick when the inner table is not indexed, is to encourage the searches to occur in ascending ROWID order. This can be done if ROWIDs are accessible to the programmer, but a DBMS can also do this sort of thing automatically. Here are our final three lessons about nested- loops : When your join contains an ANDed expression, make a compound index on the inner table's join columns. Make sure that the joined columns in your inner table and outer table have exactly the same data type and the same size. Search in index order. The Bottom Line: Nested-Loop Join PlansThe choice of inner/outer tables depends on restricting expressions, then index qualities, then table sizes. You can influence the choice or you can go with the flow. When conditions are good, nested-loops can produce result rows only two or three times more slowly than selections from a single unjoined table. Nested-loops are the only strategy that works with weird conjunctions, such as LIKE joins. Because they're flexible and easy to implement, nested-loop joins are supported by all DBMSs and are the default choice in most situations. Typical OLTP queries will involve nested-loop joins 100% of the time, and even reporting queries will involve nested-loop joins more than 50% of the time.
Sort-Merge JoinsA sort-merge join , sometimes known as a merge scan or simply merge join , is a tad more complex than a nested-loop, but we can still describe sort-merge joins in the few lines of pseudocode shown in Listing 5-3. Listing 5-3 Sort-Merge Pseudocodesort (Table1) sort (Table2) get first row (Table1) get first row (Table2) for (;;until no more rows in tables) { /* merge phase */ if (join-column in Table1 < join-column in Table2) get next row (Table1) elseif (join-column in Table1 > join-column in Table2) get next row (Table2) elseif (join-column in Table1 = join-column in Table2) { pass get next row (Table1) get next row (Table2) } } Note: There is an assumption here that no duplicates exist. Duplicates make the sort-merge plan more complex and slow. In the merge phase of a sort-merge join, the DBMS is always going forward. It never needs to get the same row twice, so this is a one-pass rather than a nested-loopa big advantage. The downside is the cost of sorting. However, efficient sorts should be faster than comparisons of everything to everything. In fact it's provable that sort-merge is faster than nested-loop if both tables are largejust compare the number of page reads. Even so, DBMSs will prefer nested-loops because they require less RAM, are more flexible, and have no startup overhead. So all we'll look at here is the ideal sort-merge query. It looks like this: SELECT * FROM Table1, Table2 WHERE Table1.column1 = Table2.column1 What makes this example ideal is that the comparison operator is equals and there are no restrictive expressionsthere are only join expressions, which is typical, in fact, of a query you'd use to produce a report. For such queries, you might gain with sort-merge but be warned that there's extra hassle, which might involve your DBA. (For example, with Sybase you need to explicitly "enable" sort-merge, and with Oracle you must have a large value for SORT_AREA_SIZE in init.ora.) A wonderful opportunity for the sort-merge occurs when Table1 and Table2 are already in order by the same key. That would be true in cases where column1 is the clustered key (that is, the column chosen to be the index key for a clustered index; see Chapter 9, "Indexes") in both tables. However, if only one table is indexed, then nested-loop can take advantage of that index just as readily as sort-merge can. The Bottom Line: Sort-Merge Join PlansSort-merge is excellent when the query type is right, the RAM is sufficient, and the two tables are so similar that neither is an obvious driver. If DBMSs would improve support of sort-merge by taking more advantage of existing indexes, sort-merges would be more important for OLTP work than they are currently. Hash JoinsA hash join is another method for producing a joined table. Given two input tables Table1 and Table2 , processing is as follows :
Thus, a hash join is a type of nested-loop join in which the inner table rows are looked up via a hash algorithm instead of via an index. Most DBMSs don't keep permanent hash tables so here we're talking about a temporary hash table that the DBMS creates in RAM just before beginning the loop and destroys after the end of the loop. For a hash join to work properly, the following conditions should all be true:
It might appear at first glance that Condition #1 is a showstopper when the inner table has more than, say, 100,000 rows. In fact, though, all DBMSs will perform a trick to keep from overflowing the hash areathey will read partitions of the table. Thus, for example, a DBMS with a 400,000-row table to search reads only 100,000 rows at a time, makes the hash list, does the comparisons, then clears the hash list for the next read. It has to perform the nested-loop four times instead of just once, but that's a lot better than bringing in so many rows at once that the hash table overflows and disk swapping occurs. The important condition is Condition #4. An in-memory hash lookup is always faster than an index lookup, truebut only by one or two I/Os for each iteration. Take our earlier example of a query with a restrictive expression: SELECT * FROM Table1, Table2 WHERE Table1.column1 = Table2.column1 AND Table1.column2 = <literal> If the AND Table1.column2 = <literal> expression is true 20 times, and it takes 20 I/Os to read Table2 , then you're at breakeven point. The cost of setting up the hash list would not be made up by the savings in the inner loop when there are so few iterations of the inner loop. The Bottom Line: Hash Join PlansHash joins beat sort-merge joins beat nested-loop joins when the join is an equijoin with no restrictive expression and the tables are largeeven when there are indexes that hash joins and sort-merge joins ignore. On the other hand, when there is any sort of restrictive expression, nested-loops can do the filtering before the joining, and will therefore beat both hash joins and sort-merge joins. Sort-merge is best only if there's lots of RAM for a sort, or there's been a presort, or if an efficient sorting mechanism is on disk. |