WHERE Clause: Setting Horizontal Limits

While selecting everything a table or view could contain might be of value for some operations, most of the time you will be looking for specific information — a person with a particular phone number, data falling into a certain date range, and so on. The table might contain several million rows, and you simply have no time to search for the information all by yourself. The SQL WHERE clause provides a mechanism for setting horizontal limits; specifically, it allows you to limit the number of rows in resultsets returned by a query through specifying some condition or set of conditions. Depending on what conditions you have specified with your query, there might be zero, one, or more records (rows) returned. The search criteria specified in the WHERE clause evaluate to TRUE or FALSE, and all the rules of Boolean algebra are fully applicable there.

Cross-References 

See Appendix L for information on Boolean algebra.

Using comparison operators

To specify conditions in the WHERE clause, SQL employs a number of operators. These are discussed in detail in Chapter 11. Here, we are going to touch them only briefly.

Consider the following query run against the ACME database in Microsoft SQL Server (the syntax and results would be identical in all "big three" databases). It returns some information about a particular customer, uniquely identified by the customer ID field CUST_ID_N. The uniqueness of the customer ID (not that of the address record) value is enforced by the primary key constraint as well as a UNIQUE constraint placed onto the column in the table CUSTOMER. (There could be only one customer associated with any given ID.)

SELECT cust_id_n,        cust_name_s,        cust_status_s                                         FROM   customer WHERE  cust_id_n = 7     CUST_ID_N   CUST_NAME_S           CUST_STATUS_S  ---------- ---------------------- -------------  7          WILE ELECTRONICS INC.   Y     (1 row(s) affected)

You were able to pinpoint the record because only one record satisfied your condition.

Now, nothing prevents a customer from having one or more addresses, which is illustrated by the following query requesting all address records for a customer with a customer ID equal to 7:

SELECT  addr_address_s,          addr_city_s,          addr_state_s,          addr_zip_s FROM    address WHERE   addr_custid_fn = 7     addr_address_s    addr_city_s        addr_state_s addr_zip_s  ----------------- ------------------ ------------ ----------  

411 S LONDON AVE  EGG HARBOR CITY    NJ           08215 232 EEL ST.       EGG HARBOR CITY    NJ           08215 454 OAK ST.       EGG HARBOR CITY    NJ           08215 456 WILLOW ST.    EGG HARBOR CITY    NJ           08215 678 MAPLE AVE.    EGG HARBOR CITY    NJ           08215 . . .              . . .             . . .        . . .  865 CEDAR AVE.    EGG HARBOR CITY    NJ           08215 911 MYRTLE ST.    EGG HARBOR CITY    NJ           08215 777 SITKA AVE.    EGG HARBOR CITY    NJ           08215 999 ELK AVE.      EGG HARBOR CITY    NJ           08215      (11 row(s) affected)

This query yields 11 records, containing all the addresses under which customer number 7 conducts its business.

Comparison operators could also determine a range of values. When you want to know what products in your database are selling for more than $20, use the following query:

SELECT  prod_description_s,          prod_price_n FROM    product WHERE   prod_price_n > 20     PROD_DESCRIPTION_S                           PROD_PRICE_N -------------------------------------------- ------------ STEEL NAILS 6''                                     33.28 STOOL CAPS 5''                                      26.82 BAR RAILS 24X48X128                                 23.10 BAR RAILS 30X45X60                                  27.00 BASES 30X45X60                                      23.10     (5 rows affected) 

This query returns information for all products whose price is over $20.

Compound operators: Using AND and OR

There could be more than one criterion specified with the query. For example, you may want to retrieve all the phone salespersons in your company that are not assigned to a customer. In the table, there are phone numbers and fax numbers; to eliminate the latter, the following query could be used:

SQL> SELECT  phone_salesmanid_fn,               phone_phonenum_s,               phone_type_s FROM         phone WHERE        phone_custid_fn IS NULL AND          phone_type_s = 'PHONE'     PHONE_SALESMANID_FN PHONE_PHONENUM_S     PHONE_TYPE_S ------------------- -------------------- --------------------                  23 (305) 555-8502       PHONE                  24 (626) 555-4435       PHONE                  25 (717) 555-5479       PHONE                  26 (718) 555-7879       PHONE                  27 (718) 555-5091       PHONE                  28 (814) 555-0324       PHONE     6 record(s) selected.

Only records where column PHONE_CUSTID_FN contains NULL and the type of the numbers is PHONE made it to the final resultset.

The records could be selected using the OR compound operator. In the following example, only records for the orders that were placed by customer 63 or that have an order ID equal to 30661 are taken into consideration.

SELECT ordhdr_id_n,        ordhdr_custid_fn FROM   order_header WHERE  ordhdr_id_n = 30661 OR     ordhdr_custid_fn = 63     ORDHDR_ID_N ORDHDR_CUSTID_FN  ----------- ----------------  30613       63 30614       63 30615       63 30661       1     4 records(s) selected. 

This provides you with the functionality to specify disparate selection criteria: the customer ID of the customer who had placed order 30661 is not 7, nevertheless it is present on the list of the records because you've specified that you are interested in the orders placed by customer 63 OR the order #30661, regardless of the customer ID. Using AND in this case would yield no results since order 30661 was not placed by customer 7, and no record satisfying both criteria at the same time would be found.

Using the BETWEEN operator

While it is possible to use a combination of => (greater than or equal to) and <= (less than or equal to) operators to achieve exactly the same results, the BETWEEN operator provides a more convenient (and often more efficient) way for selecting a range of values.

SELECT  prod_description_s,          prod_price_n FROM    product WHERE   prod_price_n BETWEEN 23.10 AND 30     PROD_DESCRIPTION_S                           PROD_PRICE_N -------------------------------------------- ------------ STOOL CAPS 5''                                      26.82 BAR RAILS 24X48X128                                 23.10 BAR RAILS 30X45X60                                  27.00 BASES 30X45X60                                      23.10     4 record(s) selected.

Note that the range of the values is inclusive, that is, the values used in specifying the range are included into the final resultset. This behavior is number-specific, if other database types (e.g., characters, dates, etc.) are used for criteria, results might differ. The operator could be used with virtually any data type: dates, characters, numbers, and so on. (See Chapter 11 for more information.)

Using the IN operator: Set membership test

When there is more than one exact criterion for the WHERE clause, and these criteria do not fit any range of values, you may use an OR statement. Consider the following query:

SELECT  cust_name_s,          cust_credhold_s  FROM    customer WHERE   cust_alias_s = 'MNGA71396' OR         cust_alias_s = 'MNGA71398' OR         cust_alias_s = 'MNGA71400'     CUST_NAME_S                           CUST_CREDHOLD_S ------------------------------------- --------------- WILE SEAL CORP.                       Y MAGNETICS USA INC.                    N MAGNETOMETRIC DEVICES INC.            N     3 record(s) selected.

Any records that correspond to either of the three specified criteria make it into the final resultset. The same result is easier achieved using an IN operator:

SELECT  cust_name_s,          cust_credhold_s  FROM    customer WHERE   cust_alias_s IN          ('MNGA71396', 'MNGA71398', 'MNGA71400')     CUST_NAME_S                        CUST_CREDHOLD_S ---------------------------------- --------------- WILE SEAL CORP.                    Y MAGNETICS USA INC.                 N MAGNETOMETRIC DEVICES INC.         N     3 record(s) selected.

The IN operator makes your life easier by replacing numerous OR statements and speeding up the query execution along the way. All values specified within an IN operator must be of the same data type as they refer to one column.

The NOT operator

The NOT operator negates results of the operator by making it perform a search for the results exactly opposite to those specified. Any of the operators and queries discussed to this point could have produced opposite results if NOT was used. The following example returns all the results that do not match the specified criteria — having the name with the second letter I, third L, and fourth E; only records that do not have such a sequence starting from the second position within the company name are selected:

SQL> SELECT  cust_name_s      FROM    customer      WHERE  cust_name_s  NOT LIKE  '_ILE%'     cust_name_s                                         --------------------------------------------------  MAGNETICS USA INC. MAGNETOMETRIC DEVICES INC. FAIR PARK GARDENS FAIR AND SONS AIR CONDTNG . . . KILBURN GLASS INDUSTRIES CARLTONBATES COMPANY DABAH BROS INC. . . . INSULECTRO INC. INTEGRATED POWER DESIGNS EASTERN SATELLITE COMPANY     32 record(s) selected

Using the IS NULL operator: Special test for NULLS

We have mentioned before that relational databases are using a special value to signify the absence of the data in the database table column — NULL. Since this value does not comply with the rules that all the other values follow (e.g., comparison, operations, etc.), they cannot be detected with the equation/comparison operator =; i.e., the syntax WHERE <column_name> = NULL, while being technically valid in Oracle or DB2 UDB (and valid in Microsoft SQL Server 2000 under certain circumstances), would never yield any data because the equation will always evaluate to FALSE.

The test for NULL is performed with the IS keyword, as in the example below, which retrieves information about salesmen that have customers without a PHONE_CUSTID_FN number.

SELECT  phone_salesmanid_fn,          phone_phonenum_s,          phone_type_s FROM    phone WHERE   phone_custid_fn IS NULL     PHONE_SALESMANID_FN PHONE_PHONENUM_S     PHONE_TYPE_S ------------------- -------------------- --------------------                  23 (305) 555-8502       PHONE                  24 (626) 555-4435       PHONE                  25 (717) 555-5479       PHONE                  26 (718) 555-7879       PHONE                  27 (718) 555-5091       PHONE                  28 (814) 555-0324       PHONE                  23 (305) 555-8501       FAX                  24 (626) 555-4434       FAX                  25 (717) 555-5478       FAX                  26 (718) 555-7878       FAX                  27 (718) 555-5091       FAX                  28 (814) 555-0323       FAX       12 record(s) selected.

Note 

In Microsoft SQL Server 2000 Transact-SQL, a NULL is never equal to another NULL unless you specifically instruct SQL Server to do so by issuing command SET ANSI_NULLS OFF; setting this parameter OFF within the session would allow you to compare a NULL value with another NULL value, setting it back ON (default) brings back the SQL99 standard behavior.

Just as easily, you may test for the absence of the NULL value in the specified column using the NOT operator. The IS NULL test returns FALSE or TRUE, depending on whether the value in the column is or is not a NULL. The syntax follows that of the English language: you cannot use NOT IS NULL (produces error), but IS NOT NULL yields the desired results. The following query produces the records for the salesmen whose customers do have phones (therefore the value in the field PHONE_CUSTID_FN is NOT a NULL):

SELECT  phone_custid_fn,          phone_phonenum_s,          phone_type_s FROM    phone WHERE   phone_custid_fn IS NOT NULL     PHONE_CUSTID_FN PHONE_PHONENUM_S     PHONE_TYPE_S --------------- -------------------- --------------------               1 (909) 555-9957       PHONE               1 (909) 555-9956       FAX               2 (501) 555-5702       PHONE               2 (501) 555-5701       FAX             ...             152 (541) 555-8979       PHONE             152 (541) 555-8978       FAX       74 records

Using subqueries in a WHERE clause

As in the SELECT clause, the subqueries could be used with the WHERE clause to provide missing values (or a set of values). For example, you cannot find information from the ORDER_HEADER table using a customer's name only, because the ORDER_HEADER table contains customer IDs, not the names; thus, the customer ID could be found in the table CUSTOMER using the customer name as a criterion, and then used to select values from the ORDER_HEADER table:

SELECT  ordhdr_nbr_s,         ordhdr_orderdate_d FROM    order_header WHERE   ordhdr_custid_fn =  (SELECT cust_id_n  FROM   customer  WHERE  cust_name_s = 'WILE ELECTRONICS INC.')     ordhdr_nbr_s    ordhdr_orderdate_d                                      ------------- ---------------------------  523733        2002-08-15 00:00:00.000 523734        2002-08-15 00:00:00.000 523735        2002-08-15 00:00:00.000     (3 row(s) affected)

The subquery executes first, thus providing the outer query with a set of values to select from. In this case, a single matching value is expected. If for some reason the subquery returns more than one row (and therefore more than one value could match the ORDHDR_CUSTID field), an error will be generated.

Any of the operators discussed earlier in this chapter could be used with the subqueries, though one has to pay attention to the kind of data required by them: the IN operator would accept a range of values, while LIKE requires a single value to function. For example, the following query uses the IN operator for several values returned by the subquery:

SELECT   phone_phonenum_s,           phone_type_s   FROM     phone WHERE    phone_salesmanid_fn IN (SELECT salesman_id_n  FROM   salesman  WHERE  salesman_code_s BETWEEN '07' and '10')     PHONE_PHONENUM_S     PHONE_TYPE_S -------------------- -------------------- (718) 555-7879       PHONE (718) 555-5091       PHONE (814) 555-0324       PHONE (718) 555-7878       FAX (718) 555-5091       FAX (814) 555-0323       FAX     6 record(s) selected. 

When the subquery returns a set of possible values, and the outer query employs some operator that compares them, how would this query distinguish between these values? SQL introduces two operators ANY and ALL to accomplish this distinction.

There is a view in the ACME database that contains order totals for a single company — WILE BESS COMPANY. The view was created to illustrate the usage of these two operators. Consider the following query:

SELECT    v_total.customer_name,            v_total.total_price FROM      v_customer_totals  v_total  WHERE     v_total.total_price >    ANY (SELECT vctw.total_price         FROM   v_customer_totals_wilebess) ORDER BY  total_price ASC     CUSTOMER_NAME                           TOTAL_PRICE --------------------------------------- ----------- WILE ELECTROMATIC INC.                  7511.00 WILE BESS COMPANY                       7511.00 WILE ELECTROMATIC INC.                  7799.20 WILE BESS COMPANY                       8390.00 WILE BESS COMPANY                       8390.00 . . .                                   . . . WILE SEAL CORP.                         15456.80 WILE ELECTRONICS INC.                   15456.80     29 rows selected.

It is looking for orders with a total greater than ANY order by WILE BESS COMPANY (i.e., greater than any amount present in the set returned by the subquery). That means that records selected by the outer query should be greater than any of the values — effectively meaning that they should be greater than the smallest of the values returned by the subquery (which is 6608.00). The resultset was sorted in ascending order to help you visualize the situation. Here is a query that would produce identical results:

SELECT  v_total.customer_name,          v_total.total_price FROM    v_customer_totals v_total WHERE   v_total.total_price >     (SELECT   MIN(vctw.total_price)     FROM     v_customer_totals_wilebess)     ORDER BY total_price ASC);

The subquery here returns a single value — the minimal TOTAL_PRICE, which could be compared with a single value from the outer query.

Deploying the ALL operator brings up completely different records. It will instruct the outer query to select records whose TOTAL_PRICE is greater than all the values — meaning greater than the greatest value returned by the subquery. The results are arranged in ascending order, to underscore the point that the lowest TOTAL_PRICE column from the view V_CUSTOMER_TOTAL value returned by the outer query is bigger than the greatest:

SELECT  v_total.customer_name,          v_total.total_price FROM    v_customer_totals v_total  WHERE   v_total.total_price >     ALL (SELECT vctw.total_price         FROM v_customer_totals_wilebess vctw) ORDER BY total_price DESC     CUSTOMER_NAME                       TOTAL_PRICE ----------------------------------- ----------- WILE ELECTRONICS INC.               15456.8 WILE SEAL CORP.                     15456.8     2 rows selected.

The same results could be achieved employing MAX function in the subquery, which ensures that only a single value (i.e., 15402.20) would be returned for comparison purposes:

SELECT  customer_name,          total_price FROM    v_customer_totals  WHERE   total_price > (SELECT MAX(total_price)                         FROM v_customer_totals_wilebess) ORDER BY total_price DESC;

Nested subqueries

The subquery could host a subquery in its turn. This is called nested subqueries. There is no theoretical limit on the nesting level — i.e., how many times there could be a query within a query — though some vendors limit it. Subquery is an expensive way (in computer resources terms) to find out information and should be used judiciously.

For example, to find all customers that ever ordered a product called CRATING MATERIAL 12X48X72, you could use the following query:

SELECT cust_name_s,         cust_alias_s  FROM   customer  WHERE  cust_id_n IN    (SELECT ordhdr_custid_fn     FROM order_header     WHERE ordhdr_id_n in      (SELECT ordline_ordhdrid_fn       FROM order_line       WHERE ordline_prodid_fn =        (SELECT prod_id_n          FROM product         WHERE prod_description_s = 'CRATING MATERIAL 12X48X72'        )     )   )  CUST_NAME_S                          CUST_ALIAS_S ------------------------------------ --------------- WILE ELECTROMATIC INC.                ECEL71460 WILE BESS COMPANY                     BSSE71641     2 rows selected.

The first query to be executed would be that with the highest nesting level — three in this case, then the second, then the first, and finally, when all necessary information is gathered, the main query would be executed.

There are also special cases of subquery called correlated subquery. The difference in regular and correlated subqueries is that the values from the outer subquery (host query) can be used as a parameter in the inner subquery. The correlated subqueries will be explained further in the chapter, in the section dealing with combining resultsets of the queries — namely, dealing with INTERSECT and MINUS keywords.




SQL Bible
Microsoft SQL Server 2008 Bible
ISBN: 0470257040
EAN: 2147483647
Year: 2005
Pages: 208

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net