Formatting Your SQL Statement

Team-Fly    

 
Sams Teach Yourself SQL in 24 Hours, Third Edition
By Ronald R. Plew, Ryan K. Stephens
Table of Contents
Hour  17.  Improving Database Performance


Formatting your SQL statement sounds like an obvious statement; as obvious as it may sound, it is worth mentioning. There are several things that a newcomer to SQL will probably not take into consideration when building a SQL statement. The following sections discuss the listed considerations; some are common sense, others are not so obvious:

  • Formatting SQL statements for readability

  • The order of tables in the FROM clause

  • The placement of the most restrictive conditions in the WHERE clause

  • The placement of join conditions in the WHERE clause

graphics/note_icon.gif

Most relational database implementations have what is called an SQL optimizer, which evaluates a SQL statement and determines the best method for executing the statement based on the way a SQL statement is written and the availability of indexes in the database. Not all optimizers are the same. Please check your implementation or consult the database administrator to learn how the optimizer reads SQL code. You should understand how the optimizer works to effectively tune a SQL statement.


Formatting a Statement for Readability

Formatting a SQL statement for readability is pretty obvious, but many SQL statements have not been written neatly. Although the neatness of a statement does not affect the actual performance (the database does not care how neat the statement appears), careful formatting is the first step in tuning a statement. When you look at a SQL statement with tuning intentions, making the statement readable is always the first thing to do. How can you determine whether the statement is written well if it is difficult to read?

Some basic rules for making a statement readable include:

  • Always begin a new line with each clause in the statement For example, place the FROM clause on a separate line from the SELECT clause. Place the WHERE clause on a separate line from the FROM clause, and so on.

  • Use tabs or spaces for indentation when arguments of a clause in the statement exceed one line.

  • Use tabs and spaces consistently.

  • Use table aliases when multiple tables are used in the statement The use of the full table name to qualify each column in the statement quickly clutters the statement and makes reading it difficult.

  • Use remarks sparingly in SQL statements if they are available within your specific implementation Remarks are great for documentation, but too many of them clutter a statement.

  • Begin a new line with each column name in the SELECT clause if many columns are being selected.

  • Begin a new line with each table name in the FROM clause if many tables are being used.

  • Begin a new line with each condition of the WHERE clause You can easily see all conditions of the statement and the order in which they are used.

The following is an example of an unreadable statement:

 graphics/input_icon.gif  SELECT CUSTOMER_TBL.CUST_ID, CUSTOMER_TBL.CUST_NAME,   CUSTOMER_TBL.CUST_PHONE, ORDERS_TBL.ORD_NUM, ORDERS_TBL.QTY   FROM CUSTOMER_TBL, ORDERS_TBL   WHERE CUSTOMER_TBL.CUST_ID = ORDERS_TBL.CUST_ID   AND ORDERS_TBL.QTY > 1 AND CUSTOMER_TBL.CUST_NAME LIKE 'G%'   ORDER BY CUSTOMER_TBL.CUST_NAME;  graphics/output_icon.gif CUST_ID    CUST_NAME                      CUST_PHONE ORD_NUM           QTY ---------- ------------------------------ ---------- ----------------- --- 287        GAVINS PLACE                   3172719991 18D778             10 1 row selected. 

The following is an example of a reformatted statement for improved readability:

 graphics/input_icon.gif  SELECT C.CUST_ID,   C.CUST_NAME,   C.CUST_PHONE,   O.ORD_NUM,   O.QTY   FROM ORDERS_TBL O,   CUSTOMER_TBL C   WHERE O.CUST_ID = C.CUST_ID   AND O.QTY > 1   AND C.CUST_NAME LIKE 'G%'   ORDER BY 2;  graphics/output_icon.gif CUST_ID    CUST_NAME                      CUST_PHONE ORD_NUM           QTY ---------- ------------------------------ ---------- ----------------- --- 287        GAVINS PLACE                   3172719991 18D778             10 1 row selected. 

Both statements are exactly the same, but the second statement is much more readable. The second statement has been greatly simplified by using table aliases, which have been defined in the query's FROM clause. Spacing has been used to align the elements of each clause, making each clause stand out.

Again, making a statement more readable does not directly improve its performance, but it assists you in making modifications and debugging a lengthy and otherwise complex statement. Now you can easily identify the columns being selected, the tables being used, the table joins being performed, and the conditions being placed on the query.

graphics/note_icon.gif

It is especially important to establish coding standards in a multi- user programming environment. If all code is consistently formatted, shared code and modifications to code are much easier to manage.


Proper Arrangement of Tables in the FROM Clause

The arrangement or order of tables in the FROM clause may make a difference, depending on how the optimizer reads the SQL statement. For example, it may be more beneficial to list the smaller tables first and the larger tables last. Some users with lots of experience have found that listing the larger tables last in the FROM clause proves to be more efficient.

The following is an example FROM clause:

 FROM SMALLEST TABLE,       LARGEST TABLE 
graphics/note_icon.gif

Check your particular implementation for performance tips, if any, when listing multiple tables in the FROM clause.


Proper Order of Join Conditions

As you learned in Hour 13, "Joining Tables in Queries," most joins use a BASE TABLE to link tables that have one or more common columns on which to join. The BASE TABLE is the main table that most or all tables are joined to in a query. The column from the BASE TABLE is normally placed on the right side of a join operation in the WHERE clause. The tables being joined to the BASE TABLE are normally in order from smallest to largest, similar to the tables listed in the FROM clause.

Should there not be a BASE TABLE, the tables should be listed from smallest to largest, with the largest tables on the right side of the join operation in the WHERE clause. The join conditions should be in the first position(s) of the WHERE clause followed by the filter clause(s), as shown in the following:

 FROM TABLE1,                                 Smallest Table       TABLE2,                                 to      TABLE3                                  Largest Table, also BASE TABLE WHERE TABLE1.COLUMN = TABLE3.COLUMN          Join condition   AND TABLE2.COLUMN = TABLE3.COLUMN          Join condition [ AND CONDITION1 ]                           Filter condition [ AND CONDITION2 ]                           Filter condition 

In this example, TABLE3 is used as the BASE TABLE. TABLE1 and TABLE2 are joined to TABLE3 for both simplicity and proven efficiency.

graphics/tip_icon.gif

Because joins typically return a high percentage of rows of data from the table(s), join conditions should be evaluated after more restrictive conditions.


The Most Restrictive Condition

The most restrictive condition is typically the driving factor in achieving optimal performance for a SQL query. What is the most restrictive condition? The condition in the WHERE clause of a statement that returns the fewest rows of data. Conversely, the least restrictive condition is the condition in a statement that returns the most rows of data. This hour is concerned with the most restrictive condition simply because it is this condition that filters the data that is to be returned by the query the most.

It should be your goal for the SQL optimizer to evaluate the most restrictive condition first because a smaller subset of data is returned by the condition, thus reducing the query's overhead. The effective placement of the most restrictive condition in the query requires knowledge of how the optimizer operates. The optimizers, in some cases, seem to read from the bottom of the WHERE clause up. Therefore, you would want to place the most restrictive condition last in the WHERE clause, which is the condition that is first read by the optimizer.

 FROM TABLE1,                              Smallest Table       TABLE2,                              to      TABLE3                               Largest Table, also BASE TABLE WHERE TABLE1.COLUMN = TABLE3.COLUMN       Join condition   AND TABLE2.COLUMN = TABLE3.COLUMN       Join condition [ AND CONDITION1 ]                        Least restrictive [ AND CONDITION2 ]                        Most restrictive 
graphics/tip_icon.gif

If you do not know how your particular implementation's SQL optimizer works, the DBA does not know, or you do not have sufficient documentation, you can execute a large query that takes a while to run, and then rearrange conditions in the WHERE clause. Be sure to record the time it takes the query to complete each time you make changes. You should only have to run a couple of tests to figure out whether the optimizer reads the WHERE clause from the top to bottom or bottom to top. If possible, it is best to turn off database caching during the testing for more accurate results.


The following is an example using a phony table:

Table:

TEST

Row count:

95,867

Conditions:

WHERE LAST_NAME = 'SMITH'

returns 2,000 rows

WHERE CITY = 'INDIANAPOLIS'

returns 30,000 rows

Most restrictive condition is:

WHERE LAST_NAME = 'SMITH'

 QUERY1: graphics/input_icon.gif  SELECT COUNT(*)   FROM TEST   WHERE LAST_NAME = 'SMITH'   AND CITY = 'INDIANAPOLIS';  graphics/output_icon.gif COUNT(*) ----------      1,024 
 QUERY2: graphics/input_icon.gif  SELECT COUNT(*)   FROM TEST   WHERE CITY = 'INDIANAPOLIS'   AND LAST_NAME = 'SMITH';  graphics/output_icon.gif COUNT(*) ----------      1,024 

Suppose that QUERY1 completed in 20 seconds, whereas QUERY2 completed in 10 seconds. Because QUERY2 returned faster results and the most restrictive condition was listed last in the WHERE clause, it would be safe to assume that the optimizer reads the WHERE clause from the bottom up.

graphics/note_icon.gif

It is a good practice to try to use an indexed column as the most restrictive condition in a query. Indexes generally improve a query's performance.



Team-Fly    
Top
 


Sams Teach Yourself SQL in 24 Hours
Sams Teach Yourself SQL in 24 Hours (5th Edition) (Sams Teach Yourself -- Hours)
ISBN: 0672335417
EAN: 2147483647
Year: 2002
Pages: 275

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net