Recipe14.9.Adding a Column Header into a Double Pivoted Result Set

Recipe 14.9. Adding a Column Header into a Double Pivoted Result Set

Problem

You want to stack two result sets, and then pivot them into two columns. Additionally, you want to add a "header" for each group of rows in each column. For example, you have two tables containing information about employees working in different areas of development in your company (say, in research and applications):

  select * from it_research DEPTNO ENAME ------ --------------------    100 HOPKINS    100 JONES    100 TONEY    200 MORALES    200 P.WHITAKER    200 MARCIANO    200 ROBINSON    300 LACY    300 WRIGHT    300 J.TAYLOR  select * from it_apps DEPTNO ENAME ------ -----------------    400 CORRALES    400 MAYWEATHER    400 CASTILLO    400 MARQUEZ    400 MOSLEY    500 GATTI    500 CALZAGHE    600 LAMOTTA    600 HAGLER    600 HEARNS    600 FRAZIER    700 GUINN    700 JUDAH    700 MARGARITO

You would like to create a report listing the employees from each table in two columns. You want to return the DEPTNO followed by ENAME for each. Ultimately you want to return the following result set:

 RESEARCH             APPS -------------------- --------------- 100                  400   JONES                MAYWEATHER   TONEY                CASTILLO   HOPKINS              MARQUEZ 200                    MOSLEY   P.WHITAKER           CORRALES   MARCIANO           500   ROBINSON             CALZAGHE   MORALES              GATTI 300                  600   WRIGHT               HAGLER   J.TAYLOR             HEARNS   LACY                 FRAZIER                        LAMOTTA                      700                        JUDAH                        MARGARITO                        GUINN

Solution

For the most part, this solution requires nothing more than a simple stack-n-pivot (union then pivot) with an added twist: the DEPTNO must precede the ENAME for each employee returned. The technique here uses a Cartesian product to generate an extra row for each DEPTNO, so you have the required rows necessary to show all employees, plus room for the DEPTNO. The solution uses Oracle syntax, but since DB2 supports window functions that can compute moving windows (the framing clause), converting this solution to work for DB2 is trivial. Because the IT_ RESEARCH and IT_APPS tables exist only for this recipe, their table creation statements are shown along with this solution:

 create table IT_research (deptno number, ename varchar2(20)) insert into IT_research values (100,'HOPKINS') insert into IT_research values (100,'JONES') insert into IT_research values (100,'TONEY') insert into IT_research values (200,'MORALES') insert into IT_research values (200,'P.WHITAKER') insert into IT_research values (200,'MARCIANO') insert into IT_research values (200,'ROBINSON') insert into IT_research values (300,'LACY') insert into IT_research values (300,'WRIGHT') insert into IT_research values (300,'J.TAYLOR') create table IT_apps (deptno number, ename varchar2(20)) insert into IT_apps values (400,'CORRALES') insert into IT_apps values (400,'MAYWEATHER') insert into IT_apps values (400,'CASTILLO') insert into IT_apps values (400,'MARQUEZ') insert into IT_apps values (400,'MOSLEY') insert into IT_apps values (500,'GATTI') insert into IT_apps values (500,'CALZAGHE') insert into IT_apps values (600,'LAMOTTA') insert into IT_apps values (600,'HAGLER') insert into IT_apps values (600,'HEARNS') insert into IT_apps values (600,'FRAZIER') insert into IT_apps values (700,'GUINN') insert into IT_apps values (700,'JUDAH') insert into IT_apps values (700,'MARGARITO')  1 select max(decode(flag2,0,it_dept)) research,  2        max(decode(flag2,1,it_dept)) apps  3   from (  4 select sum(flag1)over(partition by flag2  5                           order by flag1,rownum) flag,  6        it_dept, flag2  7   from (  8 select 1 flag1, 0 flag2,  9        decode(rn,1,to_char(deptno),' '||ename) it_dept 10   from ( 11 select x.*, y.id, 12        row_number( )over(partition by x.deptno order by y.id) rn 13   from ( 14 select deptno, 15        ename, 16        count(*)over(partition by deptno) cnt 17   from it_research 18        ) x, 19        (select level id from dual connect by level <= 2) y 20        ) 21  where rn <= cnt+1 22 union all 23 select 1 flag1, 1 flag2, 24        decode(rn,1,to_char(deptno),' '||ename) it_dept 25   from ( 26 select x.*, y.id, 27        row_number( )over(partition by x.deptno order by y.id) rn 28   from ( 29 select deptno, 30        ename, 31        count(*)over(partition by deptno) cnt 32   from it_apps 33        ) x, 34        (select level id from dual connect by level <= 2) y 35        ) 36  where rn <= cnt+1 37        ) tmp1 38        ) tmp2 39  group by flag

Discussion

Like many of the other warehousing/report type queries, the solution presented looks quite convoluted but once broken down you'll seen it's nothing more than a stack-n-pivot with a Cartesian twist (on the rocks, with a little umbrella). The way to break this query down is to work on each part of the UNION ALL first, then bring it together for the pivot. Let's start with the lower portion of the UNION ALL:

  select 1 flag1, 1 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps        ) x,        (select level id from dual connect by level <= 2) y        ) z  where rn <= cnt+1 FLAG1      FLAG2 IT_DEPT ----- ---------- --------------------------     1          1 400     1          1   MAYWEATHER     1          1   CASTILLO     1          1   MARQUEZ     1          1   MOSLEY     1          1   CORRALES     1          1 500     1          1   CALZAGHE     1          1   GATTI     1          1 600     1          1   HAGLER     1          1   HEARNS     1          1   FRAZIER     1          1   LAMOTTA     1          1 700     1          1   JUDAH     1          1   MARGARITO     1          1   GUINN

Let's examine exactly how that result set is put together. Breaking down the above query to its simplest components, you have inline view X, which simply returns each ENAME and DEPTNO and the number of employees in each DEPTNO from table IT_APPS. The results are as follows:

  select deptno deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps DEPTNO ENAME                       CNT ------ -------------------- ----------    400 CORRALES                      5    400 MAYWEATHER                    5    400 CASTILLO                      5    400 MARQUEZ                       5    400 MOSLEY                        5    500 GATTI                         2    500 CALZAGHE                      2    600 LAMOTTA                       4    600 HAGLER                        4    600 HEARNS                        4    600 FRAZIER                       4    700 GUINN                         3    700 JUDAH                         3    700 MARGARITO                     3

The next step is to create a Cartesian product between the rows returned from inline view X and two rows generated from DUAL using CONNECT BY. The results of this operation are as follows:

  select *   from ( select deptno deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps        ) x,        (select level id from dual connect by level <= 2) y  order by 2 DEPTNO ENAME      CNT  ID ------ ---------- --- ---    500 CALZAGHE     2   1    500 CALZAGHE     2   2    400 CASTILLO     5   1    400 CASTILLO     5   2    400 CORRALES     5   1    400 CORRALES     5   2    600 FRAZIER      4   1    600 FRAZIER      4   2    500 GATTI        2   1    500 GATTI        2   2    700 GUINN        3   1    700 GUINN        3   2    600 HAGLER       4   1    600 HAGLER       4   2    600 HEARNS       4   1    600 HEARNS       4   2    700 JUDAH        3   1    700 JUDAH        3   2    600 LAMOTTA      4   1    600 LAMOTTA      4   2    700 MARGARITO    3   1    700 MARGARITO    3   2    400 MARQUEZ      5   1    400 MARQUEZ      5   2    400 MAYWEATHER   5   1    400 MAYWEATHER   5   2    400 MOSLEY       5   1    400 MOSLEY       5   2

As you can see from these results, each row from inline view X is now returned twice due to the Cartesian product with inline view Y. The reason a Cartesian is needed will become clear shortly. The next step is to take the current result set and rank each employee within his DEPTNO by ID (ID has a value of 1 or 2 as was returned by the Cartesian product). The result of this ranking is shown in the output from the following query:

  select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps        ) x,        (select level id from dual connect by level <= 2) y DEPTNO ENAME      CNT ID          RN ------ ---------- --- --- ----------    400 CORRALES     5   1          1    400 MAYWEATHER   5   1          2    400 CASTILLO     5   1          3    400 MARQUEZ      5   1          4    400 MOSLEY       5   1          5    400 CORRALES     5   2          6    400 MOSLEY       5   2          7    400 MAYWEATHER   5   2          8    400 CASTILLO     5   2          9    400 MARQUEZ      5   2         10    500 GATTI        2   1          1    500 CALZAGHE     2   1          2    500 GATTI        2   2          3    500 CALZAGHE     2   2          4    600 LAMOTTA      4   1          1    600 HAGLER       4   1          2    600 HEARNS       4   1          3    600 FRAZIER      4   1          4    600 LAMOTTA      4   2          5    600 HAGLER       4   2          6    600 FRAZIER      4   2          7    600 HEARNS       4   2          8    700 GUINN        3   1          1    700 JUDAH        3   1          2    700 MARGARITO    3   1          3    700 GUINN        3   2          4    700 JUDAH        3   2          5    700 MARGARITO    3   2          6

Each employee is ranked; then his duplicate is ranked. The result set contains duplicates for all employees in table IT_APP, along with their ranking within their DEPTNO. The reason you need to generate these extra rows is because you need a slot in the result set to slip in the DEPTNO in the ENAME column. If you Cartesian-join IT_APPS with a one-row table, you get no extra rows (because cardinality of any table x1 = cardinality of that table).

The next step is to take the results returned thus far and pivot the result set such that all the ENAMES are returned in one column but are preceded by the DEPTNO they are in. The following query shows how this happens:

  select 1 flag1, 1 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps        ) x,        (select level id from dual connect by level <= 2) y        ) z  where rn <= cnt+1 FLAG1      FLAG2 IT_DEPT ----- ---------- -------------------------     1          1 400     1          1   MAYWEATHER     1          1   CASTILLO     1          1   MARQUEZ     1          1   MOSLEY     1          1   CORRALES     1          1 500     1          1   CALZAGHE     1          1   GATTI     1          1 600     1          1   HAGLER     1          1   HEARNS     1          1   FRAZIER     1          1   LAMOTTA     1          1 700     1          1   JUDAH     1          1   MARGARITO     1          1   GUINN

FLAG1 and FLAG2 come into play later and can be ignored for the moment. Focus your attention on the rows in IT_DEPT. The number of rows returned for each DEPTNO is CNT*2, but all that is needed is CNT+1, which is the filter in the WHERE clause. RN is the ranking for each employee. The rows kept are all those ranked less than or equal to CNT+1; i.e., all employees in each DEPTNO plus one more (this extra employee is the employee who is ranked first in their DEPTNO). This extra row is where the DEPTNO will slide in. By using DECODE (an older Oracle function that gives more or less the equivalent of a CASE expression) to evaluate the value of RN, you can slide the value of DEPTNO into the result set. The employee who was at position 1 (based on the value of RN) is still shown in the result set, but is now last in each DEPTNO (because the order is irrelevant, this is not a problem). That pretty much covers the lower part of the UNION ALL.

The upper part of the UNION ALL is processed in the same way as the lower part so there's no need to explain how that works. Instead, let's examine the result set returned when stacking the queries:

  select 1 flag1, 0 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno,        ename,        count(*)over(partition by deptno) cnt   from it_research        ) x,        (select level id from dual connect by level <= 2) y        )  where rn <= cnt+1 union all select 1 flag1, 1 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps        ) x,        (select level id from dual connect by level <= 2) y        )  where rn <= cnt+1 FLAG1      FLAG2 IT_DEPT ----- ---------- -----------------------     1          0 100     1          0   JONES     1          0   TONEY     1          0   HOPKINS     1          0 200     1          0   P.WHITAKER     1          0   MARCIANO     1          0   ROBINSON     1          0   MORALES     1          0 300     1          0   WRIGHT     1          0   J.TAYLOR     1          0   LACY     1          1 400     1          1   MAYWEATHER     1          1   CASTILLO     1          1   MARQUEZ     1          1   MOSLEY     1          1   CORRALES     1          1 500     1          1   CALZAGHE     1          1   GATTI     1          1 600     1          1   HAGLER     1          1   HEARNS     1          1   FRAZIER     1          1   LAMOTTA     1          1 700     1          1   JUDAH     1          1   MARGARITO     1          1   GUINN

At this point, it isn't clear what FLAG1's purpose is, but you can see that FLAG2 identifies which rows come from which part of the UNION ALL (0 for the upper part, 1 for the lower part).

The next step is to wrap the stacked result set in an inline view and create a running total on FLAG1 (finally, its purpose is revealed!), which will act as a ranking for each row in each stack. The results of the ranking (running total) are shown below:

  select sum(flag1)over(partition by flag2                       order by flag1,rownum) flag,        it_dept, flag2   from ( select 1 flag1, 0 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number()over(partition by x.deptno order by y.id) rn   from ( select deptno,        ename,        count(*)over(partition by deptno) cnt   from it_research        ) x,        (select level id from dual connect by level <= 2) y        )  where rn <= cnt+1 union all select 1 flag1, 1 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps        ) x,        (select level id from dual connect by level <= 2) y        )  where rn <= cnt+1        ) tmp1 FLAG IT_DEPT              FLAG2 ---- --------------- ----------    1   100                    0    2     JONES                0    3     TONEY                0    4     HOPKINS              0    5   200                    0    6     P.WHITAKER           0    7     MARCIANO             0    8     ROBINSON             0    9     MORALES              0   10    300                   0   11     WRIGHT               0   12     J.TAYLOR             0   13     LACY                 0    1   400                    1    2     MAYWEATHER           1    3     CASTILLO             1    4     MARQUEZ              1    5     MOSLEY               1    6     CORRALES             1    7   500                    1    8     CALZAGHEe            1    9     GATTI                1   10   600                    1   11     HAGLER               1   12     HEARNS               1   13     FRAZIER              1   14     LAMOTTA              1   15   700                    1   16     JUDAH                1   17     MARGARITO            1   18     GUINN                1

The last remaining step (finally!) is to pivot the value returned by TMP1 on FLAG2 while grouping by FLAG (the running total generated in TMP1). The results from TMP1 are wrapped in an inline view and pivoted (wrapped in a final inline view called TMP2). The final solution and result set is shown below:

  select max(decode(flag2,0,it_dept)) research,        max(decode(flag2,1,it_dept)) apps   from ( select sum(flag1)over(partition by flag2                           order by flag1,rownum) flag,        it_dept, flag2   from ( select 1 flag1, 0 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno,        ename,        count(*)over(partition by deptno) cnt   from it_research        ) x,        (select level id from dual connect by level <= 2) y        )  where rn <= cnt+1 union all select 1 flag1, 1 flag2,        decode(rn,1,to_char(deptno),' '||ename) it_dept   from ( select x.*, y.id,        row_number( )over(partition by x.deptno order by y.id) rn   from ( select deptno deptno,        ename,        count(*)over(partition by deptno) cnt   from it_apps        ) x,        (select level id from dual connect by level <= 2) y        )  where rn <= cnt+1        ) tmp1        ) tmp2  group by flag RESEARCH             APPS -------------------- --------------- 100                  400   JONES                MAYWEATHER   TONEY                CASTILLO   HOPKINS              MARQUEZ 200                    MOSLEY   P.WHITAKER           CORRALES   MARCIANO           500   ROBINSON             CALZAGHE   MORALES              GATTI 300                  600   WRIGHT               HAGLER   J.TAYLOR             HEARNS   LACY                 FRAZIER                        LAMOTTA                      700                        JUDAH                        MARGARITO                        GUINN