Recipe10.4.Filling in Missing Values in a Range of Values

Recipe 10.4. Filling in Missing Values in a Range of Values

Problem

You want to return the number of employees hired each year for the entire decade of the 1980s, but there are some years in which no employees were hired. You would like to return the following result set:

 YR          CNT ---- ---------- 1980          1 1981         10 1982          2 1983          1 1984          0 1985          0 1986          0 1987          0 1988          0 1989          0

Solution

The trick to this solution is returning zeros for years that saw no employees hired. If no employee was hired in a given year, then no rows for that year will exist in table EMP. If the year does not exist in the table, how can you return a count, any count, even zero? The solution requires you to outer join. You must supply a result set that returns all the years you want to see, and then perform a count against table EMP to see if there were any employees hired in each of those years.

DB2

Use table EMP as a pivot table (because it has 14 rows) and the built-in function YEAR to generate one row for each year in the decade of 1980. Outer join to table EMP and count how many employees were hired each year:

  1 select x.yr, coalesce(y.cnt,0) cnt  2   from (  3 select year(min(hiredate)over( )) -  4        mod(year(min(hiredate)over( )),10) +  5        row_number( )over( )-1 yr  6   from emp fetch first 10 rows only  7        ) x  8   left join  9        ( 10 select year(hiredate) yr1, count(*) cnt 11   from emp 12  group by year(hiredate) 13        ) y 14     on ( x.yr = y.yr1 )

Oracle

Use table EMP as a pivot table (because it has 14 rows) and the built-in functions TO_NUMBER and TO_CHAR to generate one row for each year in the decade of 1980. Outer join to table EMP and count how many employees were hired each year:

  1 select x.yr, coalesce(cnt,0) cnt  2   from (  3 select extract(year from min(hiredate)over( )) -  4        mod(extract(year from min(hiredate)over( )),10) +  5        rownum-1 yr  6   from emp  7  where rownum <= 10  8        ) x,  9        ( 10 select to_number(to_char(hiredate,'YYYY')) yr, count(*) cnt 11   from emp 12  group by to_number(to_char(hiredate,'YYYY')) 13        ) y 14  where x.yr = y.yr(+)

If you're using Oracle9 i Database or later, you can implement the solution using the newly supported JOIN clause:

  1 select x.yr, coalesce(cnt,0) cnt  2   from (  3 select extract(year from min(hiredate)over( )) -  4        mod(extract(year from min(hiredate)over( )),10) +  5        rownum-1 yr  6   from emp  7  where rownum <= 10  8        ) x  9   left join 10        ( 11 select to_number(to_char(hiredate,'YYYY')) yr, count(*) cnt 12   from emp 13  group by to_number(to_char(hiredate,'YYYY')) 14        ) y 15     on ( x.yr = y.yr )

PostgreSQL and MySQL

Use table T10 as a pivot table (because it has 10 rows) and the built-in function EXTRACT to generate one row for each year in the decade of 1980. Outer join to table EMP and count how many employees were hired each year:

  1 select y.yr, coalesce(x.cnt,0) as cnt  2   from (  3 select min_year-mod(cast(min_year as int),10)+rn as yr  4   from (  5 select (select min(extract(year from hiredate))  6           from emp) as min_year,  7        id-1 as rn  8   from t10  9        ) a 10        ) y 11   left join 12        ( 13 select extract(year from hiredate) as yr, count(*) as cnt 14   from emp 15  group by extract(year from hiredate) 16        ) x 17     on ( y.yr = x.yr )

SQL Server

  1 select x.yr, coalesce(y.cnt,0) cnt  2   from (  3 select top (10)  4        (year(min(hiredate)over( )) -  5         year(min(hiredate)over( ))%10)+  6         row_number( )over(order by hiredate)-1 yr  7   from emp  8        ) x  9   left join 10        ( 11 select year(hiredate) yr, count(*) cnt 12   from emp 13  group by year(hiredate) 14        ) y 15     on ( x.yr = y.yr )

Discussion

Despite the difference in syntax, the approach is the same for all solutions. Inline view X returns each year in the decade of the '80s by first finding the year of the earliest HIREDATE. The next step is to add RN1 to the difference between the earliest year and the earliest year modulus ten. To see how this works, simply execute inline view X and return each of the values involved separately. Listed below is the result set for inline view X using the window function MIN OVER (DB2, Oracle, SQL Server) and a scalar subquery (MySQL, PostgreSQL):

  select year(min(hiredate)over( )) -        mod(year(min(hiredate)over( )),10) +        row_number( )over( )-1 yr,        year(min(hiredate)over( )) min_year,        mod(year(min(hiredate)over( )),10) mod_yr,        row_number( )over( )-1 rn   from emp fetch first 10 rows only   YR   MIN_YEAR     MOD_YR         RN ---- ---------- ---------- ---------- 1980       1980          0          0 1981       1980          0          1 1982       1980          0          2 1983       1980          0          3 1984       1980          0          4 1985       1980          0          5 1986       1980          0          6 1987       1980          0          7 1988       1980          0          8 1989       1980          0          9  select min_year-mod(min_year,10)+rn as yr,        min_year,        mod(min_year,10) as mod_yr        rn   from ( select (select min(extract(year from hiredate))           from emp) as min_year,         id-1 as rn   from t10        ) x   YR   MIN_YEAR     MOD_YR         RN ---- ---------- ---------- ---------- 1980       1980          0          0 1981       1980          0          1 1982       1980          0          2 1983       1980          0          3 1984       1980          0          4 1985       1980          0          5 1986       1980          0          6 1987       1980          0          7 1988       1980          0          8 1989       1980          0          9

Inline view Y returns the year for each HIREDATE and the number of employees hired during that year:

  select year(hiredate) yr, count(*) cnt   from emp  group by year(hiredate)    YR        CNT ----- ----------  1980          1  1981         10  1982          2  1983          1

For the final solution, outer join inline view Y to inline view X so that every year is returned even if there are no employees hired.