Recipe 12.2. Pivoting a Result Set into Multiple RowsProblemYou want to turn rows into columns by creating a column corresponding to each of the values in a single given column. However, unlike in the previous recipe, you need multiple rows of output. For example, you want to return each employee and their position (JOB), and you currently use a query that returns the following result set: JOB ENAME --------- ---------- ANALYST SCOTT ANALYST FORD CLERK SMITH CLERK ADAMS CLERK MILLER CLERK JAMES MANAGER JONES MANAGER CLARK MANAGER BLAKE PRESIDENT KING SALESMAN ALLEN SALESMAN MARTIN SALESMAN TURNER SALESMAN WARD You would like to format the result set such that each job gets its own column: CLERKS ANALYSTS MGRS PREZ SALES ------ -------- ----- ---- ------ MILLER FORD CLARK KING TURNER JAMES SCOTT BLAKE MARTIN ADAMS JONES WARD SMITH ALLEN SolutionUnlike the first recipe in this chapter, the result set for this recipe consists of more than one row. Using the previous recipe's technique will not work for this recipe, as the MAX(ENAME) for each JOB would be returned, which would result in one ENAME for each JOB (i.e., one row will be returned as in the first recipe). To solve this problem, you must make each JOB/ENAME combination unique. Then, when you apply an aggregate function to remove NULLs, you don't lose any ENAMEs. DB2, Oracle, and SQL ServerUse the window function ROW_NUMBER OVER to make each JOB/ENAME combination unique. Pivot the result set using a CASE expression and the aggregate function MAX while grouping on the value returned by the window function: 1 select max(case when job='CLERK' 2 then ename else null end) as clerks, 3 max(case when job='ANALYST' 4 then ename else null end) as analysts, 5 max(case when job='MANAGER' 6 then ename else null end) as mgrs, 7 max(case when job='PRESIDENT' 8 then ename else null end) as prez, 9 max(case when job='SALESMAN' 10 then ename else null end) as sales 11 from ( 12 select job, 13 ename, 14 row_number()over(partition by job order by ename) rn 15 from emp 16 ) x 17 group by rn PostgreSQL and MySQLUse a scalar subquery to rank each employee by EMPNO. Pivot the result set using a CASE expression and the aggregate function MAX while grouping on the value returned by the scalar subquery: 1 select max(case when job='CLERK' 2 then ename else null end) as clerks, 3 max(case when job='ANALYST' 4 then ename else null end) as analysts, 5 max(case when job='MANAGER' 6 then ename else null end) as mgrs, 7 max(case when job='PRESIDENT' 8 then ename else null end) as prez, 9 max(case when job='SALESMAN' 10 then ename else null end) as sales 11 from ( 12 select e.job, 13 e.ename, 14 (select count(*) from emp d 15 where e.job=d.job and e.empno < d.empno) as rnk 16 from emp e 17 ) x 18 group by rnk DiscussionDB2, Oracle, and SQL ServerThe first step is to use the window function ROW_NUMBER OVER to help make each JOB/ENAME combination unique: select job, ename, row_number()over(partition by job order by ename) rn from emp JOB ENAME RN --------- ---------- ---------- ANALYST FORD 1 ANALYST SCOTT 2 CLERK ADAMS 1 CLERK JAMES 2 CLERK MILLER 3 CLERK SMITH 4 MANAGER BLAKE 1 MANAGER CLARK 2 MANAGER JONES 3 PRESIDENT KING 1 SALESMAN ALLEN 1 SALESMAN MARTIN 2 SALESMAN TURNER 3 SALESMAN WARD 4 Giving each ENAME a unique "row number" within a given job prevents any problems that might otherwise result from two employees having the same name and job. The goal here is to be able to group on row number (on RN) without dropping any employees from the result set due to the use of MAX. This step is the most important step in solving the problem. Without this first step, the aggregation in the outer query will remove necessary rows. Consider what the result set would look like without using ROW_NUMBER OVER, using the same technique as seen in the first recipe: select max(case when job='CLERK' then ename else null end) as clerks, max(case when job='ANALYST' then ename else null end) as analysts, max(case when job='MANAGER' then ename else null end) as mgrs, max(case when job='PRESIDENT' then ename else null end) as prez, max(case when job='SALESMAN' then ename else null end) as sales from emp CLERKS ANALYSTS MGRS PREZ SALES ---------- ---------- ---------- ---------- ---------- SMITH SCOTT JONES KING WARD Unfortunately, only one row is returned for each JOB: the employee with the MAX ENAME. When it comes time to pivot the result set, using MIN or MAX should serve as a means to remove NULLs from the result set, not restrict the ENAMEs returned. How this works will be come clearer as you continue through the explanation. The next step uses a CASE expression to organize the ENAMEs into their proper column (JOB): select rn, case when job='CLERK' then ename else null end as clerks, case when job='ANALYST' then ename else null end as analysts, case when job='MANAGER' then ename else null end as mgrs, case when job='PRESIDENT' then ename else null end as prez, case when job='SALESMAN' then ename else null end as sales from ( Select job, ename, row_number()over(partition by job order by ename) rn from emp ) x RN CLERKS ANALYSTS MGRS PREZ SALES -- ---------- ---------- ---------- ---------- ---------- 1 FORD 2 SCOTT 1 ADAMS 2 JAMES 3 MILLER 4 SMITH 1 BLAKE 2 CLARK 3 JONES 1 KING 1 ALLEN 2 MARTIN 3 TURNER 4 WARD At this point, the rows are transposed into columns and the last step is to remove the NULLs to make the result set more readable. To remove the NULLs use the aggregate function MAX and group by RN. (You can use the function MIN as well. The choice to use MAX is arbitrary, as you will only ever be aggregating one value per group.) There is only one value for each RN/JOB/ENAME combination. Grouping by RN in conjunction with the CASE expressions embedded within the calls to MAX ensures that each call to MAX results in picking only one name from a group of otherwise NULL values: select max(case when job='CLERK' then ename else null end) as clerks, max(case when job='ANALYST' then ename else null end) as analysts, max(case when job='MANAGER' then ename else null end) as mgrs, max(case when job='PRESIDENT' then ename else null end) as prez, max(case when job='SALESMAN' then ename else null end) as sales from ( Select job, ename, row_number()over(partition by job order by ename) rn from emp ) x group by rn CLERKS ANALYSTS MGRS PREZ SALES ------ -------- ----- ---- ------ MILLER FORD CLARK KING TURNER JAMES SCOTT BLAKE MARTIN ADAMS JONES WARD SMITH ALLEN The technique of using ROW_NUMBER OVER to create unique combinations of rows is extremely useful for formatting query results. Consider the query below that creates a sparse report showing employees by DEPTNO and JOB: select deptno dno, job, max(case when deptno=10 then ename else null end) as d10, max(case when deptno=20 then ename else null end) as d20, max(case when deptno=30 then ename else null end) as d30, max(case when job='CLERK' then ename else null end) as clerks, max(case when job='ANALYST' then ename else null end) as anals, max(case when job='MANAGER' then ename else null end) as mgrs, max(case when job='PRESIDENT' then ename else null end) as prez, max(case when job='SALESMAN' then ename else null end) as sales from ( Select deptno, job, ename, row_number()over(partition by job order by ename) rn_job, row_number()over(partition by job order by ename) rn_deptno from emp ) x group by deptno, job, rn_deptno, rn_job order by 1 DNO JOB D10 D20 D30 CLERKS ANALS MGRS PREZ SALES --- --------- ------ ----- ------ ------ ----- ----- ---- ------ 10 CLERK MILLER MILLER 10 MANAGER CLARK CLARK 10 PRESIDENT KING KING 20 ANALYST FORD FORD 20 ANALYST SCOTT SCOTT 20 CLERK ADAMS ADAMS 20 CLERK SMITH SMITH 20 MANAGER JONES JONES 30 CLERK JAMES JAMES 30 MANAGER BLAKE BLAKE 30 SALESMAN ALLEN ALLEN 30 SALESMAN MARTIN MARTIN 30 SALESMAN TURNER TURNER 30 SALESMAN WARD WARD By simply modifying what you group by (hence the nonaggregate items in the SELECT list above), you can produce reports with different formats. It is worth the time of changing things around to understand how these formats change based on what you include in your GROUP BY clause. PostgreSQL and MySQLThe technique for these RDBMSs is the same as for the others once a method of creating unique JOB/ENAME combinations is established. The first step is to use a scalar subquery to provide a "row number" or "rank" for each JOB/ENAME combination: select e.job, e.ename, (select count(*) from emp d where e.job=d.job and e.empno < d.empno) as rnk from emp e JOB ENAME RNK --------- ---------- ---------- CLERK SMITH 3 SALESMAN ALLEN 3 SALESMAN WARD 2 MANAGER JONES 2 SALESMAN MARTIN 1 MANAGER BLAKE 1 MANAGER CLARK 0 ANALYST SCOTT 1 PRESIDENT KING 0 SALESMAN TURNER 0 CLERK ADAMS 2 CLERK JAMES 1 ANALYST FORD 0 CLERK MILLER 0 Giving each JOB/ENAME combination a unique "rank" makes each row unique. Even if there are employees with the same name working the same job, no two employees will share the same rank within a job. This step is the most important step in solving the problem. Without this first step, the aggregation in the outer query will remove necessary rows. Consider what the result set would look like without applying a rank to each JOB/ENAME combination, using the same technique as seen in the first recipe: select max(case when job='CLERK' then ename else null end) as clerks, max(case when job='ANALYST' then ename else null end) as analysts, max(case when job='MANAGER' then ename else null end) as mgrs, max(case when job='PRESIDENT' then ename else null end) as prez, max(case when job='SALESMAN' then ename else null end) as sales from emp CLERKS ANALYSTS MGRS PREZ SALES ---------- ---------- ---------- ---------- ---------- SMITH SCOTT JONES KING WARD Unfortunately, only one row is returned for each JOB: the employee with the MAX ENAME. When it comes time to pivot the result set, using MIN or MAX should serve as a means to remove NULLs from the result set, not to restrict the ENAMEs returned. Now, that you see the purpose of applying a rank, you can move on to the next step. The next step uses a CASE expression to organize the ENAMEs into their proper column (JOB): select rnk, case when job='CLERK' then ename else null end as clerks, case when job='ANALYST' then ename else null end as analysts, case when job='MANAGER' then ename else null end as mgrs, case when job='PRESIDENT' then ename else null end as prez, case when job='SALESMAN' then ename else null end as sales from ( Select e.job, e.ename, (select count(*) from emp d where e.job=d.job and e.empno < d.empno) as rnk from emp e ) x RNK CLERKS ANALYSTS MGRS PREZ SALES --- ------ -------- ----- ---- ---------- 3 SMITH 3 ALLEN 2 WARD 2 JONES 1 MARTIN 1 BLAKE 0 CLARK 1 SCOTT 0 KING 0 TURNER 2 ADAMS 1 JAMES 0 FORD 0 MILLER At this point, the rows are transposed into columns and the last step is to remove the NULLs to make the result set more readable. To remove the NULLs use the aggregate function MAX and group by RNK. (MAX is an arbitrary choice. You can use the function MIN as well.) There is only one value for each RN/JOB/ENAME combination, so the application of the aggregate function is simply to remove NULLs: select max(case when job='CLERK' then ename else null end) as clerks, max(case when job='ANALYST' then ename else null end) as analysts, max(case when job='MANAGER' then ename else null end) as mgrs, max(case when job='PRESIDENT' then ename else null end) as prez, max(case when job='SALESMAN' then ename else null end) as sales from ( Select e.job, e.ename, (select count(*) from emp d where e.job=d.job and e.empno < d.empno) as rnk from emp e ) x group by rnk CLERKS ANALYSTS MGRS PREZ SALES ------ -------- ----- ---- ------ MILLER FORD CLARK KING TURNER JAMES SCOTT BLAKE MARTIN ADAMS JONES WARD SMITH ALLEN |