Recipe11.11.Finding Knight Values


Recipe 11.11. Finding Knight Values

Problem

You want return a result set that contains each employee's name, the department they work in, their salary, the date they were hired, and the salary of the last employee hired, in each department. You want to return the following result set:

 DEPTNO ENAME             SAL HIREDATE    LATEST_SAL ------ ---------- ---------- ----------- ----------     10 MILLER            1300 23-JAN-1982      1300     10 KING              5000 17-NOV-1981      1300     10 CLARK             2450 09-JUN-1981      1300     20 ADAMS             1100 12-JAN-1983      1100     20 SCOTT             3000 09-DEC-1982      1100     20 FORD              3000 03-DEC-1981      1100     20 JONES             2975 02-APR-1981      1100     20 SMITH              800 17-DEC-1980      1100     30 JAMES              950 03-DEC-1981       950     30 MARTIN            1250 28-SEP-1981       950     30 TURNER            1500 08-SEP-1981       950     30 BLAKE             2850 01-MAY-1981       950     30 WARD              1250 22-FEB-1981       950     30 ALLEN             1600 20-FEB-1981       950 

The values in LATEST_SAL are the "Knight values" because the path to find them is analogous to a knight's path in the game of chess. You determine the result the way a knight determines a new location: by jumping to a row then turning and jumping to a different column (see Figure 11-1). To find the correct values for LATEST_SAL, you must first locate (jump to) the row with the latest HIREDATE in each DEPTNO, and then you select (jump to) the SAL column of that row.

Figure 11-1. A knight value comes from "up and over"


The term "Knight value" was coined by a very clever coworker of mine, Kay Young. After having him review the recipes for correctness I admitted to him that I was stumped and could not come up with a good title. Because you need to initially evaluate one row then "jump" and take a value from another, he came up with the term "Knight value."


Solution

DB2 and SQL Server

Use a CASE expression in a subquery to return the SAL of the last employee hired in each DEPTNO; for all other salaries, return zero. Use the window function MAX OVER in the outer query to return the non-zero SAL for each employee's department:

  1  select deptno,  2         ename,  3         sal,  4         hiredate,  5         max(latest_sal)over(partition by deptno) latest_sal  6    from (  7  select deptno,  8         ename,  9         sal, 10         hiredate, 11         case 12           when hiredate = max(hiredate)over(partition by deptno) 13           then sal else 0 14         end latest_sal 15    from emp 16         ) x 17   order by 1, 4 desc 

MySQL and PostgreSQL

Use a scalar subquery nested two levels deep. First, find the HIREDATE of the last employee in each DEPTO. Then use the aggregate function MAX (in case there are duplicates) to find the SAL of the last employee hired in each DEPTNO:

  1  select e.deptno,  2         e.ename,  3         e.sal,  4         e.hiredate,  5         (select max(d.sal)  6           from emp d  7          where d.deptno = e.deptno  8            and d.hiredate =  9                (select max(f.hiredate) 10                   from emp f 11                  where f.deptno = e.deptno)) as latest_sal 12    from emp e 13   order by 1, 4 desc 

Oracle

Use the window function MAX OVER to return the highest SAL for each DEPTNO. Use the functions DENSE_RANK and LAST, while ordering by HIREDATE, in the KEEP clause to return the highest SAL for the latest HIREDATE in a given DEPTNO:

 1  select deptno, 2          ename, 3          sal, 4          hiredate, 5           max(sal) 6             keep(dense_rank last order by hiredate) 7             over(partition by deptno) latest_sal 8    from emp 9  order by 1, 4 desc 

Discussion

DB2 and SQL Server

The first step is to use the window function MAX OVER in a CASE expression to find the employee hired last, or most recently, in each DEPTNO. If an employee's HIREDATE matches the value returned by MAX OVER, then use a CASE expression to return that employee's SAL; otherwise return 0. The results of this are shown below:

  select deptno,        ename,        sal,        hiredate,        case            when hiredate = max(hiredate)over(partition by deptno)            then sal else 0        end latest_sal   from emp  DEPTNO ENAME             SAL HIREDATE    LATEST_SAL ------ --------- ----------- ----------- ----------     10 CLARK            2450 09-JUN-1981          0     10 KING             5000 17-NOV-1981          0     10 MILLER           1300 23-JAN-1982       1300     20 SMITH             800 17-DEC-1980          0     20 ADAMS            1100 12-JAN-1983       1100     20 FORD             3000 03-DEC-1981          0     20 SCOTT            3000 09-DEC-1982          0     20 JONES            2975 02-APR-1981          0     30 ALLEN            1600 20-FEB-1981          0     30 BLAKE            2850 01-MAY-1981          0     30 MARTIN           1250 28-SEP-1981          0     30 JAMES             950 03-DEC-1981        950     30 TURNER           1500 08-SEP-1981          0     30 WARD             1250 22-FEB-1981          0 

Because the value for LATEST_SAL will be either 0 or the SAL of the employee(s) hired most recently, you can wrap the above query in an inline view and use MAX OVER again, but this time to return the greatest non-zero LATEST_SAL for each DEPTNO:

  select deptno,        ename,        sal,        hiredate,        max(latest_sal)over(partition by deptno) latest_sal   from ( select deptno,        ename,        sal,        hiredate,        case            when hiredate = max(hiredate)over(partition by deptno)            then sal else 0        end latest_sal   from emp        ) x  order by 1, 4 desc  DEPTNO  ENAME            SAL HIREDATE    LATEST_SAL ------- --------- ---------- ----------- ----------     10  MILLER          1300 23-JAN-1982       1300     10  KING            5000 17-NOV-1981       1300     10  CLARK           2450 09-JUN-1981       1300     20  ADAMS           1100 12-JAN-1983       1100     20  SCOTT           3000 09-DEC-1982       1100     20  FORD            3000 03-DEC-1981       1100     20  JONES           2975 02-APR-1981       1100     20  SMITH            800 17-DEC-1980       1100     30  JAMES            950 03-DEC-1981        950     30  MARTIN          1250 28-SEP-1981        950     30  TURNER          1500 08-SEP-1981        950     30  BLAKE           2850 01-MAY-1981        950     30  WARD            1250 22-FEB-1981        950     30  ALLEN           1600 20-FEB-1981        950 

MySQL and PostgreSQL

The first step is to use a scalar subquery to find the HIREDATE of the last employee hired in each DEPTNO:

  select e.deptno,        e.ename,        e.sal,        e.hiredate,         (select max(f.hiredate)            from emp f           where f.deptno = e.deptno) as last_hire    from emp e   order by 1, 4 desc  DEPTNO ENAME             SAL HIREDATE    LAST_HIRE ------ ---------- ---------- ----------- -----------     10 MILLER           1300 23-JAN-1982 23-JAN-1982     10 KING             5000 17-NOV-1981 23-JAN-1982     10 CLARK            2450 09-JUN-1981 23-JAN-1982     20 ADAMS            1100 12-JAN-1983 12-JAN-1983     20 SCOTT            3000 09-DEC-1982 12-JAN-1983     20 FORD             3000 03-DEC-1981 12-JAN-1983     20 JONES            2975 02-APR-1981 12-JAN-1983     20 SMITH             800 17-DEC-1980 12-JAN-1983     30 JAMES             950 03-DEC-1981 03-DEC-1981     30 MARTIN           1250 28-SEP-1981 03-DEC-1981     30 TURNER           1500 08-SEP-1981 03-DEC-1981     30 BLAKE            2850 01-MAY-1981 03-DEC-1981     30 WARD             1250 22-FEB-1981 03-DEC-1981     30 ALLEN            1600 20-FEB-1981 03-DEC-1981 

The next step is to find the SAL for the employee(s) in each DEPTNO hired on LAST_HIRE. Use the aggregate function MAX to keep the highest (if there are multiple employees hired on the same day):

  select e.deptno,        e.ename,        e.sal,        e.hiredate,         (select max(d.sal)            from emp d           where d.deptno   = e.deptno             and d.hiredate =                 (select max(f.hiredate)                    from emp f                   where f.deptno = e.deptno)) as latest_sal   from emp e  order by 1, 4 desc  DEPTNO ENAME             SAL HIREDATE     LATEST_SAL ------ ---------- ---------- -----------  ----------     10 MILLER           1300 23-JAN-1982        1300     10 KING             5000 17-NOV-1981        1300     10 CLARK            2450 09-JUN-1981        1300     20 ADAMS            1100 12-JAN-1983        1100     20 SCOTT            3000 09-DEC-1982        1100     20 FORD             3000 03-DEC-1981        1100     20 JONES            2975 02-APR-1981        1100     20 SMITH             800 17-DEC-1980        1100     30 JAMES             950 03-DEC-1981         950     30 MARTIN           1250 28-SEP-1981         950     30 TURNER           1500 08-SEP-1981         950     30 BLAKE            2850 01-MAY-1981         950     30 WARD             1250 22-FEB-1981         950     30 ALLEN            1600 20-FEB-1981         950 

Oracle

Users on Oracle8i Database can use the DB2 solution. For users on Oracle9i Database and later, you can use the solution presented below. The key to the Oracle solution is to take advantage of the KEEP clause. The KEEP clause allows you to rank the rows returned by a group/partition and work with the first or last row in the group. Consider what the solution looks like without KEEP:

  select deptno,        ename,        sal,        hiredate,        max(sal) over(partition by deptno) latest_sal   from emp  order by 1, 4 desc  DEPTNO ENAME             SAL HIREDATE    LATEST_SAL ------ ---------- ---------- ----------- ----------     10 MILLER           1300 23-JAN-1982       5000     10 KING             5000 17-NOV-1981       5000     10 CLARK            2450 09-JUN-1981       5000     20 ADAMS            1100 12-JAN-1983       3000     20 SCOTT            3000 09-DEC-1982       3000     20 FORD             3000 03-DEC-1981       3000     20 JONES            2975 02-APR-1981       3000     20 SMITH             800 17-DEC-1980       3000     30 JAMES             950 03-DEC-1981       2850     30 MARTIN           1250 28-SEP-1981       2850     30 TURNER           1500 08-SEP-1981       2850     30 BLAKE            2850 01-MAY-1981       2850     30 WARD             1250 22-FEB-1981       2850     30 ALLEN            1600 20-FEB-1981       2850 

Rather than returning the SAL of the latest employee hired, MAX OVER without KEEP simply returns the highest salary in each DEPTNO. KEEP, in this recipe, allows you to order the salaries by HIREDATE in each DEPTNO by specifying ORDER BY HIREDATE. Then, the function DENSE_RANK assigns a rank to each HIREDATE in ascending order. Finally, the function LAST determines which row to apply the aggregate function to: the "last" row based on the ranking of DENSE_ RANK. In this case, the aggregate function MAX is applied to the SAL column for the row with the "last" HIREDATE. In essence, keep the SAL of the HIREDATE ranked last in each DEPTNO.

You are ranking the rows in each DEPTNO based on one column (HIREDATE), but then applying the aggregation (MAX) on another column (SAL). This ability to rank in one dimension and aggregate over another is convenient as it allows you to avoid extra joins and inline views as are used in the other solutions. Finally, by adding the OVER clause after the KEEP clause you can return the SAL "kept" by KEEP for each row in the partition.

Alternatively, you can order by HIREDATE in descending order and "keep" the first SAL. Compare the two queries below, which return the same result set:

  select deptno,        ename,        sal,        hiredate,        max(sal)          keep(dense_rank last order by hiredate)          over(partition by deptno) latest_sal   from emp  order by 1, 4 desc  DEPTNO ENAME             SAL HIREDATE    LATEST_SAL ------ ---------- ---------- ----------- ----------     10 MILLER           1300 23-JAN-1982       1300     10 KING             5000 17-NOV-1981       1300     10 CLARK            2450 09-JUN-1981       1300     20 ADAMS            1100 12-JAN-1983       1100     20 SCOTT            3000 09-DEC-1982       1100     20 FORD             3000 03-DEC-1981       1100     20 JONES            2975 02-APR-1981       1100     20 SMITH             800 17-DEC-1980       1100     30 JAMES             950 03-DEC-1981        950     30 MARTIN           1250 28-SEP-1981        950     30 TURNER           1500 08-SEP-1981        950     30 BLAKE            2850 01-MAY-1981        950     30 WARD             1250 22-FEB-1981        950     30 ALLEN            1600 20-FEB-1981        950  select deptno,        ename,        sal,        hiredate,        max(sal)          keep(dense_rank first order by hiredate desc)          over(partition by deptno) latest_sal   from emp  order by 1, 4 desc  DEPTNO ENAME             SAL HIREDATE    LATEST_SAL ------ ---------- ---------- ----------- ----------     10 MILLER           1300 23-JAN-1982       1300     10 KING             5000 17-NOV-1981       1300     10 CLARK            2450 09-JUN-1981       1300     20 ADAMS            1100 12-JAN-1983       1100     20 SCOTT            3000 09-DEC-1982       1100     20 FORD             3000 03-DEC-1981       1100     20 JONES            2975 02-APR-1981       1100     20 SMITH             800 17-DEC-1980       1100     30 JAMES             950 03-DEC-1981        950     30 MARTIN           1250 28-SEP-1981        950     30 TURNER           1500 08-SEP-1981        950     30 BLAKE            2850 01-MAY-1981        950     30 WARD             1250 22-FEB-1981        950     30 ALLEN            1600 20-FEB-1981        950 




SQL Cookbook
SQL Cookbook (Cookbooks (OReilly))
ISBN: 0596009763
EAN: 2147483647
Year: 2005
Pages: 235

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net