Recipe12.11.Returning Non-GROUP BY Columns


Recipe 12.11. Returning Non-GROUP BY Columns

Problem

You are executing a GROUP BY query, and you wish to return columns in your select list that are not also listed in your GROUP BY clause. This is not normally possible, as such ungrouped columns would not represent a single value per row.

Say that you want to find the employees who earn the highest and lowest salaries in each department, as well as the employees who earn the highest and lowest salaries in each job. You want to see each employee's name, the department he works in, his job title, and his salary. You want to return the following result set:

 DEPTNO ENAME  JOB         SAL DEPT_STATUS     JOB_STATUS ------ ------ --------- ----- --------------- --------------     10 MILLER CLERK      1300 LOW SAL IN DEPT TOP SAL IN JOB     10 CLARK  MANAGER    2450                 LOW SAL IN JOB     10 KING   PRESIDENT  5000 TOP SAL IN DEPT TOP SAL IN JOB     20 SCOTT  ANALYST    3000 TOP SAL IN DEPT TOP SAL IN JOB     20 FORD   ANALYST    3000 TOP SAL IN DEPT TOP SAL IN JOB     20 SMITH  CLERK       800 LOW SAL IN DEPT LOW SAL IN JOB     20 JONES  MANAGER    2975                 TOP SAL IN JOB     30 JAMES  CLERK       950 LOW SAL IN DEPT     30 MARTIN SALESMAN   1250                 LOW SAL IN JOB     30 WARD   SALESMAN   1250                 LOW SAL IN JOB     30 ALLEN  SALESMAN   1600                 TOP SAL IN JOB     30 BLAKE  MANAGER    2850 TOP SAL IN DEPT 

Unfortunately, including all these columns in the SELECT clause will ruin the grouping. Consider the following example. Employee "KING" earns the highest salary. You want to verify this with the following query:

 Select ename,max(sal)   from emp  group by ename 

Instead of seeing "KING" and KING's salary, the above query will return all 14 rows from table EMP. The reason is because of the grouping: the MAX(SAL) is applied to each ENAME. So, it would seem the above query can be stated as "find the employee with the highest salary" but in fact what it is doing is "find the highest salary for each ENAME in table EMP." This recipe explains a technique for including ENAME without the need to GROUP BY that column.

Solution

Use an inline view to find the high and low salaries by DEPTNO and JOB. Then keep only the employees who make those salaries.

DB2, Oracle, and SQL Server

Use the window functions MAX OVER and MIN OVER to find the highest and lowest salaries by DEPTNO and JOB. Then keep the rows where the salaries are those that are highest or lowest by DEPTNO or JOB:

  1 select deptno,ename,job,sal,  2        case when sal = max_by_dept  3             then 'TOP SAL IN DEPT'  4             when sal = min_by_dept  5             then 'LOW SAL IN DEPT'  6        end dept_status,  7        case when sal = max_by_job  8             then 'TOP SAL IN JOB'  9             when sal = min_by_job 10             then 'LOW SAL IN JOB' 11        end job_status 12   from ( 13 select deptno,ename,job,sal, 14        max(sal)over(partition by deptno) max_by_dept, 15        max(sal)over(partition by job)   max_by_job, 16        min(sal)over(partition by deptno) min_by_dept, 17        min(sal)over(partition by job)   min_by_job 18   from emp 19        ) emp_sals 20  where sal in (max_by_dept,max_by_job, 21                min_by_dept,min_by_job) 

PostgreSQL and MySQL

Use scalar subqueries to find the highest and lowest salaries by DEPTNO and JOB. Then keep only those employees who match those salaries:

  1 select deptno,ename,job,sal,  2        case when sal = max_by_dept  3             then 'TOP SAL IN DEPT'  4             when sal = min_by_dept  5             then 'LOW SAL IN DEPT'  6        end as dept_status,  7        case when sal = max_by_job  8             then 'TOP SAL IN JOB'  9             when sal = min_by_job 10             then 'LOW SAL IN JOB' 11        end as job_status 12   from ( 13 select e.deptno,e.ename,e.job,e.sal, 14        (select max(sal) from emp d 15          where d.deptno = e.deptno) as max_by_dept, 16        (select max(sal) from emp d 17          where d.job = e.job) as max_by_job, 18        (select min(sal) from emp d 19          where d.deptno = e.deptno) as min_by_dept, 20        (select min(sal) from emp d 21          where d.job = e.job) as min_by_job 22   from emp e 23        ) x 24  where sal in (max_by_dept,max_by_job, 25                min_by_dept,min_by_job) 

Discussion

DB2, Oracle, and SQL Server

The first step is to use the window functions MAX OVER and MIN OVER to find the highest and lowest salaries by DEPTNO and JOB:

  select deptno,ename,job,sal,        max(sal)over(partition by deptno) maxDEPT,        max(sal)over(partition by job) maxJOB,        min(sal)over(partition by deptno) minDEPT,        min(sal)over(partition by job) minJOB   from emp DEPTNO ENAME  JOB         SAL MAXDEPT MAXJOB MINDEPT MINJOB ------ ------ --------- ----- ------- ------ ------- ------     10 MILLER CLERK      1300    5000   1300    1300    800     10 CLARK  MANAGER    2450    5000   2975    1300   2450     10 KING   PRESIDENT  5000    5000   5000    1300   5000     20 SCOTT  ANALYST    3000    3000   3000     800   3000     20 FORD   ANALYST    3000    3000   3000     800   3000     20 SMITH  CLERK       800    3000   1300     800    800     20 JONES  MANAGER    2975    3000   2975     800   2450     20 ADAMS  CLERK      1100    3000   1300     800    800     30 JAMES  CLERK       950    2850   1300     950    800     30 MARTIN SALESMAN   1250    2850   1600     950   1250     30 TURNER SALESMAN   1500    2850   1600     950   1250     30 WARD   SALESMAN   1250    2850   1600     950   1250     30 ALLEN  SALESMAN   1600    2850   1600     950   1250     30 BLAKE  MANAGER    2850    2850   2975     950   2450 

At this point, every salary can be compared with the highest and lowest salaries by DEPTNO and JOB. Notice that the grouping (the inclusion of multiple columns in the SELECT clause) does not affect the values returned by MIN OVER and MAX OVER. This is the beauty of window functions: the aggregate is computed over a defined "group" or partition and returns multiple rows for each group. The last step is to simply wrap the window functions in an inline view and keep only those rows that match the values returned by the window functions. Use a simple CASE expression to display the "status" of each employee in the final result set:

  select deptno,ename,job,sal,        case when sal = max_by_dept             then 'TOP SAL IN DEPT'             when sal = min_by_dept             then 'LOW SAL IN DEPT'        end dept_status,        case when sal = max_by_job             then 'TOP SAL IN JOB'             when sal = min_by_job             then 'LOW SAL IN JOB'        end job_status   from ( select deptno,ename,job,sal,        max(sal)over(partition by deptno) max_by_dept,        max(sal)over(partition by job) max_by_job,        min(sal)over(partition by deptno) min_by_dept,        min(sal)over(partition by job) min_by_job   from emp        ) x  where sal in (max_by_dept,max_by_job,                min_by_dept,min_by_job) DEPTNO ENAME  JOB         SAL DEPT_STATUS     JOB_STATUS ------ ------ --------- ----- --------------- --------------     10 MILLER CLERK      1300 LOW SAL IN DEPT TOP SAL IN JOB     10 CLARK  MANAGER    2450                 LOW SAL IN JOB     10 KING   PRESIDENT  5000 TOP SAL IN DEPT TOP SAL IN JOB     20 SCOTT  ANALYST    3000 TOP SAL IN DEPT TOP SAL IN JOB     20 FORD   ANALYST    3000 TOP SAL IN DEPT TOP SAL IN JOB     20 SMITH  CLERK       800 LOW SAL IN DEPT LOW SAL IN JOB     20 JONES  MANAGER    2975                 TOP SAL IN JOB     30 JAMES  CLERK       950 LOW SAL IN DEPT     30 MARTIN SALESMAN   1250                 LOW SAL IN JOB     30 WARD   SALESMAN   1250                 LOW SAL IN JOB     30 ALLEN  SALESMAN   1600                 TOP SAL IN JOB     30 BLAKE  MANAGER    2850 TOP SAL IN DEPT 

PostgreSQL and MySQL

The first step is to use scalar subqueries to find the highest and lowest salaries by DEPTNO and JOB:

  select e.deptno,e.ename,e.job,e.sal,        (select max(sal) from emp d          where d.deptno = e.deptno) as maxDEPT,        (select max(sal) from emp d          where d.job = e.job) as maxJOB,        (select min(sal) from emp d          where d.deptno = e.deptno) as minDEPT,        (select min(sal) from emp d          where d.job = e.job) as minJOB   from emp e DEPTNO ENAME  JOB         SAL MAXDEPT MAXJOB MINDEPT MINJOB ------ ------ --------- ----- ------- ------ ------- ------     20 SMITH  CLERK       800    3000 1300       800    800     30 ALLEN  SALESMAN   1600    2850 1600       950   1250     30 WARD   SALESMAN   1250    2850 1600       950   1250     20 JONES  MANAGER    2975    3000 2975       800   2450     30 MARTIN SALESMAN   1250    2850 1600       950   1250     30 BLAKE  MANAGER    2850    2850 2975       950   2450     10 CLARK  MANAGER    2450    5000 2975      1300   2450     20 SCOTT  ANALYST    3000    3000 3000       800   3000     10 KING   PRESIDENT  5000    5000 5000      1300   5000     30 TURNER SALESMAN   1500    2850 1600       950   1250     20 ADAMS  CLERK      1100    3000 1300       800    800     30 JAMES  CLERK       950    2850 1300       950    800     20 FORD   ANALYST    3000    3000 3000       800   3000     10 MILLER CLERK      1300    5000 1300      1300    800 

The highest and lowest salaries by DEPTNO and JOB can now be compared with all other salaries in table EMP. The final step is to wrap the scalar subqueries in an inline view and simply keep the employees whose salaries match one of the scalar subqueries. Use a CASE expression to display each employee's status in the final result set:

  select deptno,ename,job,sal,        case when sal = max_by_dept             then 'TOP SAL IN DEPT'             when sal = min_by_dept             then 'LOW SAL IN DEPT'        end as dept_status,        case when sal = max_by_job             then 'TOP SAL IN JOB'             when sal = min_by_job             then 'LOW SAL IN JOB'        end as job_status   from ( select e.deptno,e.ename,e.job,e.sal,        (select max(sal) from emp d          where d.deptno = e.deptno) as max_by_dept,        (select max(sal) from emp d          where d.job = e.job) as max_by_job,        (select min(sal) from emp d          where d.deptno = e.deptno) as min_by_dept,        (select min(sal) from emp d          where d.job = e.job) as min_by_job   from emp e        ) x  where sal in (max_by_dept,max_by_job,                min_by_dept,min_by_job) DEPTNO ENAME  JOB         SAL DEPT_STATUS     JOB_STATUS ------ ------ --------- ----- --------------- --------------     10 CLARK  MANAGER    2450                 LOW SAL IN JOB     10 KING   PRESIDENT  5000 TOP SAL IN DEPT TOP SAL IN JOB     10 MILLER CLERK      1300 LOW SAL IN DEPT TOP SAL IN JOB     20 SMITH  CLERK       800 LOW SAL IN DEPT LOW SAL IN JOB     20 FORD   ANALYST    3000 TOP SAL IN DEPT TOP SAL IN JOB     20 SCOTT  ANALYST    3000 TOP SAL IN DEPT TOP SAL IN JOB     20 JONES  MANAGER    2975                 TOP SAL IN JOB     30 ALLEN  SALESMAN   1600                 TOP SAL IN JOB     30 BLAKE  MANAGER    2850 TOP SAL IN DEPT     30 MARTIN SALESMAN   1250                 LOW SAL IN JOB     30 JAMES  CLERK       950 LOW SAL IN DEPT     30 WARD   SALESMAN   1250                 LOW SAL IN JOB 




SQL Cookbook
SQL Cookbook (Cookbooks (OReilly))
ISBN: 0596009763
EAN: 2147483647
Year: 2005
Pages: 235

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net