Old Style versus ANSI Style

   

Until June 2001, Oracle didn't support ANSI-style joins, but now that Oracle does, each of the Big Eight can work with either old-style joins, ANSI-style joins with USING, or ANSI-style joins with ON. Here are examples of each type:

 old-style join: SELECT * FROM Table1, Table2    WHERE Table1.column1 = Table2.column1  /* old style */ ANSI-style join with USING: SELECT * FROM Table1 JOIN Table2    USING (column1)                        /* ANSI style */ ANSI-style join with ON: SELECT * FROM Table1 JOIN Table2    ON Table1.column1 = Table2.column1     /* ANSI style */ 

Portability

DBMS vendors prefer the second ANSI-style join, with an ON clause. Only MySQL and Oracle also support a join with a USING clause.


The terms "old-style" join and "ANSI-style" join are nobody's official terminology, and we use them only because we needed some names . The fact is that the old style is acceptable in SQL:1999, and the SQL Standard does not deprecate it. ("Deprecate" is a formal word. It means that the SQL Standard foundation document doesn't hint that the old style will be dropped from a future version of the official Standard.) So both styles are acceptable. There remains some debate about which style is "right" between those who prefer the old style (especially Oracle programmers) and those who prefer the ANSI style ( especially Microsoft programmers). The only definite thing is that outer joins should be in ANSI style.

Does the style affect the performance? To answer this question, we wrote the same join two different ways, and tried them both out on the Big Eight. Watch the WHERE clause:

 SELECT * FROM Table1 JOIN Table2   ON Table1.column1 = Table2.column1   WHERE Table2.column2 = 22             /* ANSI style */ SELECT * FROM Table1, Table2   WHERE Table1.column1 = Table2.column1     AND Table2.column2 = 22              /* old style */ GAIN: 1/8 

We found that with most DBMSs, these two statements ran equally fast. But with IBM, the old style is sometimes faster. The reason is hinted at in this quote:

"When you use the ON clause to specify the join, you can use the WHERE clause as a post-join filter. The database server applies the post-join filter of the WHERE clause to the results of the outer join."

IBM Informix Guide to SQL Syntax Documentation

In other words, the restrictive expression is processed after the joining clause. That's what's supposed to happen logically. But if the clauses aren't optimized together at the same time, then the DBMS misses the best join plan: Make Table2 the outer table, find rows in the outer table where Table2. column2 = 22 , and then search for Table1.column1 = Table2.column1 in the inner loop. Our conclusion, then, is that old style is never slower and can be faster.

While we're on the subject of syntax, we should note that it's traditional to put the joining clauses before the restrictive expression, as we've done in all our examples. However, even with rule-based or primitive optimizers, putting the clauses in a different order will not affect performance.

Outer Joins

Only one outer join type is supported by all DBMSs: LEFT joins. And only one syntax is supported by all DBMSs for left joins: ANSI style. Table 5-2 shows the SQL Standard requirements and which options are supported by the Big Eight.

Here's an example of an outer join:

 SELECT Table1.column1   FROM Table1 LEFT JOIN Table2        ON Table1.column1 = Table2.column1 

An outer join can be much slower than an inner join because Table1 must be the outer table and Table2 must be the inner tableotherwise, the join won't work. So even if Table1 is smaller and better indexed and has a restrictive expression, it still can't be the inner table. And that means the DBMS can't pick the optimum join plan.

Table 5-2. ANSI/DBMS Outer Join Support
  Left Right Full
ANSI SQL Yes Yes Yes
IBM Yes Yes Yes
Informix Yes No No
Ingres Yes Yes Yes
InterBase Yes Yes Yes
Microsoft Yes Yes Yes
MySQL Yes Yes No
Oracle Yes Yes Yes
Sybase Yes Yes No

If the DBMS applies a restrictive expression after the join is complete, there won't be any weird problems involving [NOT] NULL. For example, suppose Table1 has a column called column1 that contains the values {1 , 3} . Meanwhile, Table2 also has a column1 , and its values are {1 , 2} . A simple left join of these tables produces these result rows (assume all columns have a default value that is NULL):

Table1.column1 Table2.column1
1 1
3 NULL

A less-simple left join on the same tables can be done with this syntax:

 SELECT Table1.column1, Table2.column1    FROM Table1 LEFT JOIN Table2         ON Table1.column1 = Table2.column1    WHERE Table2.column1 IS NULL 

The result of this query should be these result rows:

Table1.column1 Table2.column1
3 NULL

But that's not what Microsoft used to return! The plausible explanation was that Microsoft did the restrictive expression first. Good for performance; too bad it produced the wrong result. Fortunately, the current version of Microsoft produces the correct result.

If there's one thing that an outer join is definitely faster than, it's UNION. Before the introduction of outer join syntax, the simple left join had to be phrased as:

 SELECT Table1.column1, Table2.column1   /* the inner part */   FROM Table1, Table2   WHERE Table1.column1 = Table2.column1 UNION ALL SELECT column1, CAST(NULL AS INTEGER)   /* the outer part */   FROM Table1   WHERE Table1.column1 NOT IN      (SELECT Table2.column1 FROM Table2) 

If you see this syntax in some old program, replace it with a true left join. We tried that on the Big Eight and got a gain. Here's the equivalent left join statement:

 SELECT Table1.column1, Table2.column1   FROM Table1 LEFT JOIN Table2        ON Table1.column1 = Table2.column1 GAIN: 6/7 

Portability

MySQL doesn't support UNION or subqueries. The gain shown is for only seven DBMSs.


From the old Microsoft fiasco and the poor performance of the UNION, we glean two general rules:

  • A DBMS optimizer will usually optimize only within a single clause (and when it optimizes two clauses at once it can bungle).

  • A DBMS optimizer will always optimize only within a single SELECT.

   


SQL Performance Tuning
SQL Performance Tuning
ISBN: 0201791692
EAN: 2147483647
Year: 2005
Pages: 125

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net