Chapter 4: Restricting, Sorting, and Grouping Data


Unless your database tables are very small, or your data reporting needs are very limited, you will want to restrict the rows returned from your queries. In cases where you want to see the results of the queries in a particular order, you will want to sort the results. Grouping the data—for example, grouping sales figures by month, salary totals by department, and so forth—can be done in conjunction with restricting and sorting the data in a SQL statement.

Scott’s widget company has been growing by leaps and bounds over the past few months, and it has expanded to international locations. While Scott has enjoyed being the data analyst and DBA, he has turned over these roles to Janice. The employee-related database tables have been redesigned and turned over to the HR department. All of our examples from this point on will use the HR schema, which contains the following tables: COUNTRIES, DEPARTMENTS, EMPLOYEES, JOBS, JOB_HISTORY, LOCATIONS, and REGIONS. The names of these tables should be self-explanatory.

The WHERE Clause

A lot happens in the WHERE clause. This is the place where the rows (with columns both actual and derived) from the list specified in the SELECT clause _are trimmed down to only the results you need to see. Starting with the syntax described in Chapter 2, "SQL*Plus and iSQL*Plus Basics," we can expand the SELECT statement syntax as follows:

SELECT * | {[DISTINCT] column | expression [alias], ...}    FROM tablename [WHERE condition ... ];

The WHERE clause may have one or more conditions, separated by AND and OR and optionally grouped in parentheses to override the default precedence.

From the perspective of the table, the SELECT clause slices a table vertically, and the WHERE clause slices it horizontally.

click to expand

Comparison Conditions

A WHERE clause will often compare one column’s value to a constant or compare two of the columns to each other in some way. Table 4.1 lists the comparison operators that are valid within a WHERE clause.

Table 4.1: Comparison Operators

Comparison Operator

Definition

=

Equal to

>

Greater than

>=

Greater than or equal to

<

Less than

<=

Less than or equal to

<>, !=, ^=

Not equal to

In Chapter 3, “Oracle Database Functions,” you learned about operator precedence. The comparison operators are lower in precedence only to the arithmetic operators *, /, +, and and the concatenation operator ||. This makes a lot of sense when you consider how expressions are typically used in WHERE clauses: Some kind of arithmetic operation is performed on one or more columns or constants, and that result is compared to another constant, column, or arithmetic operation on one or more columns or constants. For instance, consider this WHERE clause:

where salary * 1.10 > 24000

This example will evaluate SALARY * 1.10 first, and then do the comparison _to 24000.

In Scott’s widget company, another corporate shakeup has occurred, and King is once again the president of the company. Janice, in her analyst role, is running some reports against the EMPLOYEES table for King, whose first task is to do a thorough salary review for all employees who have salaries that are within $10,000 of his salary. Janice knows that King’s salary is $24,000, so she will specify this numeric literal in the query, along with the $10,000 for the difference in salary:

select employee_id "Emp ID", last_name "Last Name",     salary "Salary" from employees     where salary + 10000 > 24000;     Emp ID Last Name                     Salary ---------- ------------------------- ----------        100 King                           24000        101 Kochhar                        17000        102 De Haan                        17000 3 rows selected. 

A few things come to mind right away. First, King himself is in the list. You will learn how to remove his name in the next section. Janice could have also written the WHERE clause the other way around:

where 24000 < salary + 10000;

and the results of the query would be the same. Janice could have also saved a bit of processing time by calculating the salary cutoff number before writing the query:

where salary > 14000;

How you write your WHERE clause may be about style, readability, and documentation more than it is about processing speeds, which is why the first version of the WHERE clause might be the best choice.

Note

Column aliases are not allowed in the WHERE clause. The actual column names must be used.

AND, OR, and NOT

The WHERE clause using comparison operators is really powerful, but in reality, you usually have more than one condition for selecting rows. Sometimes you need all of the conditions to be true, sometimes you need only one of the conditions to be true, and sometimes you want to specify what you don’t need. You can accomplish this by using AND, OR, and NOT in your WHERE clauses.

Using an AND between two comparison conditions will give you rows from the table that satisfy both conditions. In one of the queries above, Janice noticed that King’s name was returned in the query that was looking for other employees that had salaries close to King’s. There is no need to include King in this query. Since Janice knows King’s employee ID, she can remove him from the results of those queries by adding an AND condition, as follows:

select employee_id "Emp ID", last_name "Last Name",     salary "Salary" from employees     where salary + 10000 > 24000     and employee_id != 100;     Emp ID Last Name                     Salary ---------- ------------------------- ----------        101 Kochhar                        17000        102 De Haan                        17000 2 rows selected. 

The rules of precedence tell us that AND is very low on the list, and therefore the AND operation is performed last in the WHERE clause. However, for clarity, it doesn’t hurt to add parentheses to make the conditional expressions more obvious:

where (salary + 10000 > 24000)     and (employee_id != 100);

There are other ways to remove King from the query. We’ll discuss some of these methods in Chapter 6, “Advanced SQL Queries.”

Now King decides that he wants to include anyone who works in the IT department, in addition to those whose salaries are close to his. Janice recognizes that this is a job for the OR operator. She modifies the query to include those employees who are in the IT department, using the JOB_ID column:

select employee_id "Emp ID", last_name "Last Name",     salary "Salary" from employees     where (salary + 10000 > 24000)     and (employee_id != 100)     or job_id = ‘IT_PROG’;     Emp ID Last Name                     Salary ---------- ------------------------- ----------        101 Kochhar                        17000        102 De Haan                        17000        103 Hunold                          9000        104 Ernst                           6000        105 Austin                          4800        106 Pataballa                       4800        107 Lorentz                         4200 7 rows selected.

Since the AND has a higher priority than the OR, the salary and employee ID comparisons are evaluated to see if they are both true; if so, the row is returned. If either one or the other is not true, the row might still be returned if the employee is in the IT department. Janice can make this WHERE clause more readable by putting in the parentheses, even if they’re not needed:

where ((salary + 10000 > 24000) and (employee_id != 100)) or (job_id = ‘IT_PROG’);
Tip

When in doubt about operator precedence, use parentheses. Extra parentheses add a negligible amount of processing time and provide additional documentation benefits.

Janice expects that the other shoe will drop in a month or two, when King will ask for a report that has everyone else in it. This is a good place to use NOT. Janice can use this operator to negate the entire set of conditions that gave the first set of rows, thus returning the rest of the rows:

select employee_id "Emp ID", last_name "Last Name",    salary "Salary" from employees    where not    (       (salary + 10000 > 24000)       and (employee_id != 100)       or job_id = ‘IT_PROG’    ) ;     Emp ID Last Name                     Salary ---------- ------------------------- ----------        100 King                           24000        108 Greenberg                      12000        109 Faviet                          9000        110 Chen                            8200 ...        203 Mavris                          6500        204 Baer                           10000        205 Higgins                        12000        206 Gietz                           8300 100 rows selected.

Note how Janice merely put the entire previous WHERE clause into parentheses and added a NOT in the front. One query returns a given set of rows, and a second query returns everything but the given set of rows. So, between the two queries, she has covered the entire table. Janice will have this report ready for King when he asks for it.

BETWEEN, IN, and LIKE

The BETWEEN, IN, and LIKE operators provide more ways to trim down the number of rows returned from a query. BETWEEN gives you an easy way to check for a value that falls within a certain range. The IN operator can help you find values in a list. LIKE can help you find character strings that match a certain pattern. Adding NOT to these will give you just the opposite set of rows.

BETWEEN a Rock and a Hard Place

The BETWEEN operator in a WHERE clause will limit the rows to a range that is specified by a beginning value and an ending value; the range is inclusive. The values can be dates, numbers, or character strings. The column values to be compared will be converted to the datatypes of the values in the BETWEEN operator as needed.

Each quarter at Scott’s widget company, employees are recognized for years of service to the company. Janice is in charge of generating the report that lists the employees who have their anniversary within the next three months. Her query will use one of the functions mentioned in the previous chapter, EXTRACT, which returns one of the individual components of a DATE datatype.

select employee_id "Emp ID", department_id "Dept ID",    hire_date "Hire Date",    last_name || ‘, ‘ || first_name "Name" from employees    /* Oct to Dec */      where extract(month from hire_date) between 10 and 12;     Emp ID    Dept ID Hire Date Name ---------- ---------- --------- ----------------------        113        100 07-DEC-99 Popp, Luis        114         30 07-DEC-94 Raphaely, Den        116         30 24-DEC-97 Baida, Shelli        118         30 15-NOV-98 Himuro, Guy        123         50 10-OCT-97 Jasper, Susan Abigail        124         50 16-NOV-99 Mourgos, Kevin        130         50 30-OCT-97 Atkinson, Mozhe        135         50 12-DEC-99 Gee, Ki        138         50 26-OCT-97 Stiles, Stephen        141         50 17-OCT-95 Rajs, Trenna        145         80 01-OCT-96 Russell, John        148         80 15-OCT-99 Cambrault, Gerald        154         80 09-DEC-98 Cambrault, Nanette        155         80 23-NOV-99 Tuvault, Oliver        160         80 15-DEC-97 Doran, Louise        161         80 03-NOV-98 Sewall, Sarath        162         80 11-NOV-97 Vishney, Clara        191         50 19-DEC-99 Perkins, Randall 18 rows selected.

There is a lot going on in this query. First, notice that the columns are all aliased to make the output much more readable.

Janice also uses the concatenation operator || to make the output more readable. She could have used the CONCAT function here, although she would need to use it twice to get the same results.

There is also something else new in this example: the /* and */. These characters denote a comment in Oracle SQL. A comment is used to help document the SQL code that you’re writing. Documenting your SQL code is good not only for other developers who may need to modify your code in the future, but also for you when, months from now, you can’t quite remember why you used a particular table or function!

comment

Documentation for SQL statements. Comments are specified by using the pair /* and */ or by using --.

Alternatively, you can use -- to specify a comment, like this:

select * from employees -- All columns needed for finance

The main difference between using /* */ and -- is that the latter form treats everything to the end of the line as a comment, whereas the former treats everything as a comment until the closing */ is reached, which may be on the same line or several lines later.

Note

Although both /* */ and -- can almost be used interchangeably, the /* */ form must be used after the SELECT keyword when specifying optimizer hints. See Chapter 12, "Making Things Run Fast (Enough)," for details on how to specify hints to the optimizer.

Last, but not least, the query has the BETWEEN operator. The EXTRACT function will return a value from 1 to 12, and if this value falls in the range of 10 to 12, then the row is returned from the query.

What happens if you change the BETWEEN operator slightly and reverse the order of the months?

where extract(MONTH from HIRE_DATE) between 12 and 10;

Your intuition might tell you that this form of the WHERE clause would work, since 11 would still be between 12 and 10, just as 11 is between 10 and 12. But it doesn’t work. This is because of how Oracle’s SQL engine translates the arguments of the BETWEEN operator. When processing the query, Oracle changes BETWEEN to a pair of comparisons joined with an AND, as follows:

where extract(MONTH from HIRE_DATE) >= 12 and       extract(MONTH from HIRE_DATE) <= 10;

Since no number can be at the same time greater than or equal to 12 and less than or equal to 10, no rows will be returned from a query with this WHERE clause.

IN the Thick of Things

The IN operator makes it easy to specify a list of values to search for in a WHERE clause. The IN clause contains a list of one or more values, separated by commas and enclosed in parentheses:

IN (value1, value2, ...)

It is ideal for situations where the values to be selected aren’t in a range that the BETWEEN operator (or a pair of comparisons with an AND) can easily handle.

At Scott’s widget company, one of the vice presidents, one of the store managers, and one of the purchasing managers will be temporarily moving to Chicago to open a new branch office. The employees who report to them will also move. The manager IDs for these positions are 102, 114, and 121. Janice writes a query to identify the people who are moving along with their managers:

select employee_id "Emp ID", manager_id "Mgr ID",    last_name || ‘, ‘ || first_name "Name" from employees    where manager_id in (102, 114, 121);     Emp ID     Mgr ID Name ---------- ---------- ----------------------------        103        102 Hunold, Alexander        115        114 Khoo, Alexander        116        114 Baida, Shelli        117        114 Tobias, Sigal        118        114 Himuro, Guy        119        114 Colmenares, Karen        129        121 Bissot, Laura        130        121 Atkinson, Mozhe        131        121 Marlow, James        132        121 Olson, TJ        184        121 Sarchand, Nandita        185        121 Bull, Alexis        186        121 Dellinger, Julia        187        121 Cabrio, Anthony 14 rows selected.

The IN operator could be rewritten with a series of OR conditions, but once you need to use more than two or three values, the advantages of using IN become apparent.

Note

The Oracle SQL engine converts the IN operator to a series of OR conditions at runtime.

As you might expect, NOT IN is also valid. If the query you want to write sounds something like, "I want all the values except for these two or three…," then NOT IN is probably a good choice.

What’s Not to Like about LIKE?

The LIKE operator lets you do pattern matching in a query. You know how to search for exact strings and numbers, but in some cases, you know only a few digits of the number or a portion of the string you need to find.

pattern matching

Comparing a string in a database column to a string containing wildcard characters. These wildcard characters can represent zero, one, or more characters in the database column string.

The LIKE operator can be used interchangeably with an equal sign, except that the string specified with LIKE can contain wildcard characters. The wildcard characters allowed in LIKE are %, which represents zero or more characters, and _, which represents exactly one character.

For example, the pattern ‘Sm_th%’ will match ‘Smith’ and ‘Smythe’, but not ‘Smooth’. The pattern ‘%o%o%’ will match any string that contains at least two lowercase o characters.

Janice is writing an ad-hoc query for Employee Services that will retrieve the job titles that have the word Manager” somewhere in the title. She uses the LIKE operator:

select job_id, job_title from jobs    where job_title like ‘%Manager%’; JOB_ID     JOB_TITLE ---------- ----------------------------------- FI_MGR     Finance Manager AC_MGR     Accounting Manager SA_MAN     Sales Manager PU_MAN     Purchasing Manager ST_MAN     Stock Manager MK_MAN     Marketing Manager 6 rows selected.
Note

When numbers or dates are used with the LIKE operator, they are converted to character strings using the default conversion rules before comparing to the LIKE string.

What happens when you want to search for the _ or % characters themselves? The job IDs in Scott’s corporate database use underscores, so Janice would get erroneous results if she specified ‘ST_’ in the LIKE string to find store-related jobs. This would also return jobs that had ‘ASSISTANT’ or ‘COSTMGR’ in the job ID. To solve this problem, she uses the ESCAPE option of the LIKE clause. The ESCAPE option lets you define a special character—one that you don’t expect to find in your strings—to use before _ or % to indicate that you’re actually looking for a _ or % character. To find all the job descriptions for jobs that are store-related, and therefore begin with ‘ST_’, Janice uses the following query:

select job_id, job_title from jobs where job_id like ‘ST\_%’ escape ‘\’; JOB_ID     JOB_TITLE ---------- ----------------------------------- ST_MAN     Stock Manager ST_CLERK   Stock Clerk 2 rows selected.

The ESCAPE option is used only with LIKE, and it tells the SQL engine to treat the character that follows literally instead of as a wildcard character. Notice in the above example that the underscore is "escaped," but the % acts as it normally does and specifies that zero or more characters follow.

Warning

DBAs should keep an eye out for queries that use LIKE extensively. While this operator is very easy and intuitive for the user, queries with LIKE will scan the entire table, rather than use an index, unless there are no wildcards at the beginning of the string in the LIKE operator.

IS NULL and IS NOT NULL

As mentioned in previous chapters, NULLs can be very useful in the database for saving disk space and for identifying values that are unknown, as opposed to being blank or zero. The key to understanding NULLs is to know that they are not equal to anything. Therefore, NULLs won’t work with the standard comparison operators, such as +, /, >, =, and so forth. Janice learned this the hard way when she wanted to identify employees who made a commission of less than 15% or no commission at all. Here is the query she used:

select employee_id "Emp ID", last_name "Name", commission_pct "Comm%" from employees where commission_pct < 0.15;     Emp ID Name                           Comm% ---------- ------------------------- ----------        164 Marvins                           .1        165 Lee                               .1        166 Ande                              .1        167 Banda                             .1        173 Kumar                             .1        179 Johnson                           .1 6 rows selected.

This list appears to be way too short. That is because the rows in the EMPLOYEES table with NULL values for the commission do not pass the criteria of being less than 0.15; they don’t compare to any value because they are unknown.

This is where the IS NULL and IS NOT NULL operators come to the rescue. These two operators are the only ones that can do a direct comparison to values that are NULL in a database row. For Janice to fix her query, she needs to add an IS NULL condition to her WHERE clause:

select employee_id "Emp ID",       last_name "Name", commission_pct "Comm%" from employees       where commission_pct < 0.15       or commission_pct is null;     Emp ID Name                           Comm% ---------- ------------------------- ----------        100 King        101 Kochhar        102 De Haan ...        164 Marvins                           .1        165 Lee                               .1        166 Ande                              .1        167 Banda                             .1        173 Kumar                             .1        179 Johnson                           .1        180 Taylor ...        205 Higgins        206 Gietz 78 rows selected.
Warning

Be careful when constructing queries that operate on columns that can contain NULL values. A NULL is not the same as FALSE; it is the absence of a known value. This is a by-product of three-valued logic, where we have not just TRUE and FALSE, but TRUE, FALSE, and UNKNOWN.

You’ll see in the section on GROUP BY how multirow functions handle NULL values in a reasonable and expected way.




Oracle9i DBA JumpStart
Oracle9i DBA JumpStart
ISBN: 0782141897
EAN: 2147483647
Year: 2006
Pages: 138
Authors: Bob Bryla

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net