Recipe 7.7. Generating a Running ProductProblemYou want to compute a running product on a numeric column. The operation is similar to "Calculating a Running Total," but using multiplication instead of addition. SolutionBy way of example, the solutions all compute running products of employee salaries. While a running product of salaries may not be all that useful, the technique can easily be applied to other, more useful domains. DB2 and OracleUse the windowing function SUM OVER and take advantage of the fact that you can simulate multiplication by adding logarithms: 1 select empno,ename,sal, 2 exp(sum(ln(sal))over(order by sal,empno)) as running_prod 3 from emp 4 where deptno = 10 EMPNO ENAME SAL RUNNING_PROD ----- ---------- ---- -------------------- 7934 MILLER 1300 1300 7782 CLARK 2450 3185000 7839 KING 5000 15925000000 It is not valid in SQL to compute logarithms of values less than or equal to zero. If you have such values in your tables you need to avoid passing those invalid values to SQL's LN function. Precautions against invalid values and NULLs are not provided in this solution for the sake of readability, but you should consider whether to place such precautions in production code that you write. If you absolutely must work with negative and zero values, then this solution may not work for you. An alternative, Oracle-only solution is to use the MODEL clause that became available in Oracle Database 10g. In the following example, each SAL is returned as a negative number to show that negative values will not cause a problem for the running product: 1 select empno, ename, sal, tmp as running_prod 2 from ( 3 select empno,ename,-sal as sal 4 from emp 5 where deptno=10 6 ) 7 model 8 dimension by(row_number()over(order by sal desc) rn ) 9 measures(sal, 0 tmp, empno, ename) 10 rules ( 11 tmp[any] = case when sal[cv()-1] is null then sal[cv()] 12 else tmp[cv()-1]*sal[cv()] 13 end 14 ) EMPNO ENAME SAL RUNNING_PROD ----- ---------- ---- -------------------- 7934 MILLER -1300 -1300 7782 CLARK -2450 3185000 7839 KING -5000 -15925000000 MySQL, PostgreSQL, and SQL ServerYou still use the approach of summing logarithms, but these platforms do not support windowing functions, so use a scalar subquery instead: 1 select e.empno,e.ename,e.sal, 2 (select exp(sum(ln(d.sal))) 3 from emp d 4 where d.empno <= e.empno 5 and e.deptno=d.deptno) as running_prod 6 from emp e 7 where e.deptno=10 EMPNO ENAME SAL RUNNING_PROD ----- ---------- ---- -------------------- 7782 CLARK 2450 2450 7839 KING 5000 12250000 7934 MILLER 1300 15925000000 SQL Server users use LOG instead of LN. DiscussionExcept for the MODEL clause solution, which is only usable with Oracle Database 10g or later, all the solutions take advantage of the fact that you can sum two numbers by:
The one caveat when using this approach is that it doesn't work for summing zero or negative values, because any value less than or equal to zero is out of range for an SQL logarithm. DB2 and OracleFor an explanation of how the window function SUM OVER works, see the previous recipe "Generating a Running Total." In Oracle Database 10g and later, you can generate running products via the MODEL clause. Using the MODEL clause along with the window function ROW_NUMBER allows you to easily access prior rows. Each item in the MEASURES list can be accessed like an array. The arrays can then be searched by using the items in the DIMENSIONS list (which are the values returned by ROW_NUMBER, alias RN): select empno, ename, sal, tmp as running_prod,rn from ( select empno,ename,-sal as sal from emp where deptno=10 ) model dimension by(row_number()over(order by sal desc) rn ) measures(sal, 0 tmp, empno, ename) rules () EMPNO ENAME SAL RUNNING_PROD RN ----- ---------- ---------- ------------ ---------- 7934 MILLER -1300 0 1 7782 CLARK -2450 0 2 7839 KING -5000 0 3 Observe that SAL[1] has a value of1300. Because the numbers are increasing by one with no gaps, you can reference prior rows by subtracting one. The RULES clause: rules ( tmp[any] = case when sal[cv()-1] is null then sal[cv()] else tmp[cv()-1]*sal[cv()] end ) uses the built-in operator, ANY, to work through each row without hard-coding. ANY in this case will be the values 1, 2, and 3. TMP[n] is initialized to zero. A value is assigned to TMP[n] by evaluating the current value (the function CV returns the current value) of the corresponding SAL row. TMP[1] is initially zero and SAL[1] is1300. There is no value for SAL[0] so TMP[1] is set to SAL[1]. After TMP[1] is set, the next row is TMP[2]. First SAL[1] is evaluated (SAL[CV( )1] is SAL[1] because the current value of ANY is now 2). SAL[1] is not null, it is1300, so TMP[2] is set to the product of TMP[1] and SAL[2]. This is continued for all the rows. MySQL, PostgreSQL, and SQL ServerSee "Generating a Running Total" earlier in this chapter for an explanation of the subquery approach used for the MySQL, PostgreSQL, and SQL Server solutions. Be aware that the output of the subquery-based solution is slightly different from that of the Oracle and DB2 solutions due to the EMPNO comparison (the running product is computed in a different order). Like a running total, the summation is driven by the predicate of the scalar subquery; the ordering of rows is by EMPNO for this solution whereas the Oracle/DB2 solution order is by SAL. |