Chapter 6: Advanced SQL Queries


In the previous chapter, you saw how you can write queries that retrieve information from multiple tables. This chapter looks at more advanced types of queries. We will begin with relatively simple subqueries, which allow you to put one query inside another, rather than running two individual queries. Subqueries can be tied even more closely to the main query using a correlated subquery, where columns in the WHERE clause of the subquery directly reference columns in the main query.

Sometimes, you need to get similar information from more than one query, and there is some overlap between the results. You might not want to see the duplicates, or you might want to see only the results that two queries have in common. As you’ll learn here, you can use UNION and INTERSECT to accomplish these tasks. You’ll also learn how to use ROLLUP and CUBE to summarize table information.

Subqueries

A subquery places one query inside another one. The second query resides somewhere within the WHERE clause of a SELECT statement. One or more values returned by the subquery are used by the main query to return the results to the user.

subquery

A query that is embedded in a main, or parent, query and used to assist in filtering the result set from a query.

The types of operators allowed in the WHERE clause depend on whether the subquery returns one row or more than one row. If only a single row is returned from a query, the comparison operators =, !=, <, >, >=, <=, and so forth are valid. If more than one row is returned from a subquery, operators such as IN, NOT IN, ANY, and ALL are valid.

Single-Row Subqueries

The boss, King, wants to do his quarterly salary analysis. He would like to see which employees in the IT department are earning more than the average salary across all employees. Janice, the database analyst and DBA, realizes that this could be written as two queries, and decides to take that approach first before using a subquery. The average salary for an employee in the company is retrieved by a query you’ve seen in previous chapters:

select avg(salary) from employees; AVG(SALARY) -----------  6461.68224 1 row selected.

Using this information as a starting point, Janice writes a second query to see which employees in the IT department (department 60) have a higher salary than the average. She must cut and paste the number returned from the previous query into this new query:

select employee_id, last_name, first_name, salary from employees where salary > 6461.68224 and department_id = 60; EMPLOYEE_ID LAST_NAME     FIRST_NAME            SALARY ----------- ------------- ----------------- ----------         103 Hunold        Alexander               9000 1 row selected.

The only employee in the IT department making more than the company average salary is Alexander Hunold, who happens to be the manager of that department.

Janice wants to streamline this reporting function for King. She realizes that this can easily be written as a single-row subquery. She will embed the query she used to calculate the average into the second query, replacing the constant value as follows:

select employee_id, last_name, first_name, salary from employees where salary > (select avg(salary) from employees) and department_id = 60; EMPLOYEE_ID LAST_NAME     FIRST_NAME                 SALARY ----------- ------------- ---------------------- ----------         103 Hunold        Alexander                    9000 1 row selected.

single-row subquery

A subquery that returns a single row and is compared to a single value in the parent query.

Not only is the query more readable and easier to maintain than the version with two queries, but it also will be processed much more efficiently by the Oracle server.

Tip

As a general rule, a query, enclosed in parentheses, can take the place of a table name in the FROM clause or a column name in the SELECT or WHERE clause of a query.

King is starting to realize that the IT department may need some pay increases in the next fiscal year.

Multiple-Row Subqueries

Sometimes, you want to compare a column in a table to a list of results from a subquery, not just a single result. This is where a multiple-row subquery comes in handy. For example, King is following up on his analysis of employee salaries in the IT department, and he wants to see who else in the company is making the same salary as anyone in the IT department.

multiple-row subquery

A subquery that can return more than one row for comparison to the main, or parent, query using operators such as IN.

Janice starts out with the subquery to make sure that she starts with the right set of results to use for the main query. She wants to get the salaries for the employees in the IT department (department 60):

select salary from employees where department_id = 60; SALARY ----------       9000       6000       4800       4800       4200 5 rows selected.

So far, so good. She takes this query and makes is a subquery in the query that compares the salaries of all employees to this list by using the IN clause:

select employee_id, last_name, first_name, salary from employees where salary in (select salary from employees                  where department_id = 60); EMPLOYEE_ID LAST_NAME     FIRST_NAME             SALARY ----------- ------------- ------------------ ----------         158 McEwen        Allan                    9000         152 Hall          Peter                    9000         109 Faviet        Daniel                   9000         103 Hunold        Alexander                9000         202 Fay           Pat                      6000         104 Ernst         Janice                   6000         106 Pataballa     Valli                    4800         105 Austin        David                    4800         184 Sarchand      Nandita                  4200         107 Lorentz       Diana                    4200 10 rows selected.

But wait, something is not quite right here. King did not want to see the IT employees in this list; he wanted to include everyone but the IT employees. So Janice makes a slight change as follows, removing employees whose job title is not an IT job title:

select employee_id, last_name, first_name, salary from employees where salary in (select salary from employees                  where department_id = 60)       and job_id not like ‘IT_%’; EMPLOYEE_ID LAST_NAME     FIRST_NAME             SALARY ----------- ------------- ------------------ ----------         158 McEwen        Allan                    9000         152 Hall          Peter                    9000         109 Faviet        Daniel                   9000         202 Fay           Pat                      6000         184 Sarchand      Nandita                  4200 5 rows selected.

Note that Janice also could have checked for a department ID other than 60, as you have seen in previous queries.

Correlated Subqueries

A correlated subquery looks very much like a garden-variety subquery, with one important difference: The correlated subquery references a column in the main query as part of the qualification process to see if a given row will be returned by the query. For each row in the parent query, the subquery is evaluated to see if the row will be returned. In Janice’s situation, the salary of each individual employee is compared to the average salary for that employee’s department. The checkmarked rows in the parent query are returned.

click to expand

correlated subquery

A subquery that contains a reference to a column in the main, or parent, query.

Janice knows that King will be asking for more queries regarding salaries, so she comes up with a fairly generic query that will identify employees who are making more than the average salary for their department. As a first step, she builds the subquery that retrieves the average salary for a department:

select avg(salary) from employees     where department_id = 60; AVG(SALARY) -----------        5760 1 row selected. 

That query returns the average salary for department 60. In the correlated subquery, she will need to generalize it so that it will correlate with any department in the parent query. Next, she builds the parent query that compares a given employee’s salary to the average she just calculated:

select employee_id, last_name, salary   from employees   where department_id = 60 and     salary > 5760; EMPLOYEE_ID LAST_NAME              SALARY ----------- ------------------ ----------         103 Hunold                   9000         104 Ernst                    6000 2 rows selected.

Notice that there are two queries that can now be linked together into a correlated subquery to return all employees that earn more than the average for their department across all departments. If you’re not sure how to link these two queries, the hint is in the column names. Janice joins the two queries using the DEPARTMENT_ID column:

select employee_id, last_name, department_id, salary   from employees emp   where     salary > (select avg(salary) from employees               where department_id = emp.department_id); EMPLOYEE_ID LAST_NAME          DEPARTMENT_ID     SALARY ----------- ------------------ ------------- ----------         100 King                          90      24000         103 Hunold                        60       9000         104 Ernst                         60       6000         108 Greenberg                    100      12000         109 Faviet                       100       9000 ...                193 Everett                       50       3900         201 Hartstein                     20      13000         205 Higgins                      110      12000 38 rows selected.

As Janice expected, this query still shows that Hunold and Ernst make more than the average salary for department 60.

Multiple-Column Subqueries

There are times when you need to use a subquery that compares more than just one column between the parent query and the subquery. This is known as a multiple-column subquery. Typically, the IN clause is used to compare the outer query’s columns to the columns of the subquery.

multiple-column subquery

A subquery in which more than one column is selected for comparison to the main query using the same number of columns.

Note

Multiple-column subqueries can be rewritten as a compound WHERE clause with multiple logical operators. However, this approach is not as readable or maintainable as a multiple-column subquery.

The boss, King, wants to be able to identify employees that make the same salaries as other employees with the same job. He wants to specify an employee number and have the query return the other employees that have the same job title and make the same salary. Janice immediately realizes that this could be written as a multiple-column subquery. She decides to try out the query on one of the stock clerks, Hazel Philtanker, who has an employee number of 136:

select employee_id, last_name, job_id, salary   from employees   where (job_id, salary) in         (select job_id, salary from employees          where employee_id = 136); EMPLOYEE_ID LAST_NAME       JOB_ID         SALARY ----------- --------------- ---------- ----------         128 Markle          ST_CLERK         2200         136 Philtanker      ST_CLERK         2200 2 rows selected.

The query looks good, except that Hazel is included in the results. If King decides he doesn’t want to see the selected employee in the results, Janice can modify the query slightly and change it into a correlated multiple-column subquery:

select employee_id, last_name, job_id, salary   from employees emp   where (job_id, salary) in         (select job_id, salary from employees          where employee_id = 136          and employee_id != emp.employee_id); EMPLOYEE_ID LAST_NAME       JOB_ID         SALARY ----------- --------------- ---------- ----------         128 Markle          ST_CLERK         2200 1 row selected.




Oracle9i DBA JumpStart
Oracle9i DBA JumpStart
ISBN: 0782141897
EAN: 2147483647
Year: 2006
Pages: 138
Authors: Bob Bryla

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net