Recipe7.6.Generating a Running Total


Recipe 7.6. Generating a Running Total

Problem

You want to calculate a running total of values in a column.

Solution

As an example, the following solutions show how to compute a running total of salaries for all employees. For readability, results are ordered by SAL whenever possible so that you can easily eyeball the progression of the running total.

DB2 and Oracle

Use the windowing version of the function SUM to compute a running total:

  1 select ename, sal, 2        sum(sal) over (order by sal,empno) as running_total 3   from emp 4   order by 2 ENAME              SAL  RUNNING_TOTAL ----------  ----------  ------------- SMITH              800            800 JAMES              950           1750 ADAMS             1100           2850 WARD              1250           4100 MARTIN            1250           5350 MILLER            1300           6650 TURNER            1500           8150 ALLEN             1600           9750 CLARK             2450          12200 BLAKE             2850          15050 JONES             2975          18025 SCOTT             3000          21025 FORD              3000          24025 KING              5000          29025 

MySQL, PostgreSQL, and SQL Server

Use a scalar subquery to compute a running total (without the use of a window function such as SUM OVER, you cannot easily order the result set by SAL as in the DB2 and Oracle solution). Ultimately, the running total is correct (the final value is the same as the above recipe), but the intermediate values differ due to the lack of ordering:

  1 select e.ename, e.sal, 2        (select sum(d.sal) from emp d 3          where d.empno <= e.empno) as running_total 4   from emp e 5  order by 3 ENAME              SAL  RUNNING_TOTAL ----------  ----------  ------------- SMITH              800            800 ALLEN             1600           2400 WARD              1250           3650 JONES             2975           6625 MARTIN            1250           7875 BLAKE             2850          10725 CLARK             2450          13175 SCOTT             3000          16175 KING              5000          21175 TURNER            1500          22675 ADAMS             1100          23775 JAMES              950          24725 FORD              3000          27725 MILLER            1300          29025 

Discussion

Generating a running total is one of the tasks made simple by the new ANSI windowing functions. For DBMSs that do not yet support these windowing functions, a scalar subquery (joining on a field with unique values) is required.

DB2 and Oracle

The windowing function SUM OVER makes generating a running total a simple task. The ORDER BY clause in the solution includes not only the SAL column, but also the EMPNO column (which is the primary key) to avoid duplicate values in the running total. The column RUNNING_TOTAL2 in the following example illustrates the problem that you might otherwise have with duplicates:

  select empno, sal,        sum(sal)over(order by sal,empno) as running_total1,        sum(sal)over(order by sal) as running_total2   from emp  order by 2 ENAME              SAL  RUNNING_TOTAL1  RUNNING_TOTAL2 ----------  ----------  --------------  -------------- SMITH              800             800             800 JAMES              950            1750            1750 ADAMS             1100            2850            2850 WARD              1250            4100            5350 MARTIN            1250            5350            5350 MILLER            1300            6650            6650 TURNER            1500            8150            8150 ALLEN             1600            9750            9750 CLARK             2450           12200           12200 BLAKE             2850           15050           15050 JONES             2975           18025           18025 SCOTT             3000           21025           24025 FORD              3000           24025           24025 KING              5000           29025           29025 

The values in RUNNING_TOTAL2 for WARD, MARTIN, SCOTT, and FORD are incorrect. Their salaries occur more than once, and those duplicates are summed together and added to the running total. This is why EMPNO (which is unique) is needed to produce the (correct) results that you see in RUNNING_TOTAL1. Consider this: for ADAMS you see 2850 for RUNNING_TOTAL1 and RUNNING_TOTAL2. Add WARD's salary of 1250 to 2850 and you get 4100, yet RUNNING_TOTAL2 returns 5350. Why? Since WARD and MARTIN have the same SAL, their two 1250 salaries are added together to yield 2500, which is then added to 2850 to arrive at 5350 for both WARD and MARTIN. By specifying a combination of columns to order by that cannot result in duplicate values (e.g., any combination of SAL and EMPNO is unique), you ensure the correct progression of the running total.

MySQL, PostgreSQL, and SQL Server

Until windowing functions are fully supported for these DBMSs, you can use a scalar subquery to compute a running total. You must join on a column with unique values; otherwise the running total will have incorrect values in the event that duplicate salaries exist. The key to this recipe's solution is the join on D.EMPNO to E. EMPNO, which returns (sums) every D.SAL where D.EMPNO is less than or equal E.EMPNO. This can be understood easily by rewriting the scalar subquery as a join for a handful of the employees:

  select e.ename as ename1, e.empno as empno1, e.sal as sal1,          d.ename as ename2, d.empno as empno2, d.sal as sal2   from emp e, emp d  where d.empno <= e.empno    and e.empno = 7566 ENAME           EMPNO1        SAL1  ENAME           EMPNO2        SAL2 ----------  ----------  ----------  ----------  ----------  ---------- JONES             7566        2975  SMITH             7369         800 JONES             7566        2975  ALLEN             7499        1600 JONES             7566        2975  WARD              7521        1250 JONES             7566        2975  JONES             7566        2975 

Every value in EMPNO2 is compared against every value in EMPNO1. For every row where the value in EMPNO2 is less than or equal to the value in EMPNO1, the value in SAL2 is included in the sum. In this snippet, the EMPNO values for employees Smith, Allen, Ward, and Jones are compared against the EMPNO of Jones. Since all four employees' EMPNOs meet the condition of being less than or equal to Jones' EMPNO, those salaries are summed. Any employee whose EMPNO is greater than Jones' is not included in the SUM (in this snippet). The way the full query works is by summing all the salaries where the corresponding EMPNO is less than or equal to 7934 (Miller's EMPNO), which is the highest in the table.




SQL Cookbook
SQL Cookbook (Cookbooks (OReilly))
ISBN: 0596009763
EAN: 2147483647
Year: 2005
Pages: 235

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net