Recipe 7.6. Generating a Running TotalProblemYou want to calculate a running total of values in a column. SolutionAs an example, the following solutions show how to compute a running total of salaries for all employees. For readability, results are ordered by SAL whenever possible so that you can easily eyeball the progression of the running total. DB2 and OracleUse the windowing version of the function SUM to compute a running total: 1 select ename, sal, 2 sum(sal) over (order by sal,empno) as running_total 3 from emp 4 order by 2 ENAME SAL RUNNING_TOTAL ---------- ---------- ------------- SMITH 800 800 JAMES 950 1750 ADAMS 1100 2850 WARD 1250 4100 MARTIN 1250 5350 MILLER 1300 6650 TURNER 1500 8150 ALLEN 1600 9750 CLARK 2450 12200 BLAKE 2850 15050 JONES 2975 18025 SCOTT 3000 21025 FORD 3000 24025 KING 5000 29025 MySQL, PostgreSQL, and SQL ServerUse a scalar subquery to compute a running total (without the use of a window function such as SUM OVER, you cannot easily order the result set by SAL as in the DB2 and Oracle solution). Ultimately, the running total is correct (the final value is the same as the above recipe), but the intermediate values differ due to the lack of ordering: 1 select e.ename, e.sal, 2 (select sum(d.sal) from emp d 3 where d.empno <= e.empno) as running_total 4 from emp e 5 order by 3 ENAME SAL RUNNING_TOTAL ---------- ---------- ------------- SMITH 800 800 ALLEN 1600 2400 WARD 1250 3650 JONES 2975 6625 MARTIN 1250 7875 BLAKE 2850 10725 CLARK 2450 13175 SCOTT 3000 16175 KING 5000 21175 TURNER 1500 22675 ADAMS 1100 23775 JAMES 950 24725 FORD 3000 27725 MILLER 1300 29025 DiscussionGenerating a running total is one of the tasks made simple by the new ANSI windowing functions. For DBMSs that do not yet support these windowing functions, a scalar subquery (joining on a field with unique values) is required. DB2 and OracleThe windowing function SUM OVER makes generating a running total a simple task. The ORDER BY clause in the solution includes not only the SAL column, but also the EMPNO column (which is the primary key) to avoid duplicate values in the running total. The column RUNNING_TOTAL2 in the following example illustrates the problem that you might otherwise have with duplicates: select empno, sal, sum(sal)over(order by sal,empno) as running_total1, sum(sal)over(order by sal) as running_total2 from emp order by 2 ENAME SAL RUNNING_TOTAL1 RUNNING_TOTAL2 ---------- ---------- -------------- -------------- SMITH 800 800 800 JAMES 950 1750 1750 ADAMS 1100 2850 2850 WARD 1250 4100 5350 MARTIN 1250 5350 5350 MILLER 1300 6650 6650 TURNER 1500 8150 8150 ALLEN 1600 9750 9750 CLARK 2450 12200 12200 BLAKE 2850 15050 15050 JONES 2975 18025 18025 SCOTT 3000 21025 24025 FORD 3000 24025 24025 KING 5000 29025 29025 The values in RUNNING_TOTAL2 for WARD, MARTIN, SCOTT, and FORD are incorrect. Their salaries occur more than once, and those duplicates are summed together and added to the running total. This is why EMPNO (which is unique) is needed to produce the (correct) results that you see in RUNNING_TOTAL1. Consider this: for ADAMS you see 2850 for RUNNING_TOTAL1 and RUNNING_TOTAL2. Add WARD's salary of 1250 to 2850 and you get 4100, yet RUNNING_TOTAL2 returns 5350. Why? Since WARD and MARTIN have the same SAL, their two 1250 salaries are added together to yield 2500, which is then added to 2850 to arrive at 5350 for both WARD and MARTIN. By specifying a combination of columns to order by that cannot result in duplicate values (e.g., any combination of SAL and EMPNO is unique), you ensure the correct progression of the running total. MySQL, PostgreSQL, and SQL ServerUntil windowing functions are fully supported for these DBMSs, you can use a scalar subquery to compute a running total. You must join on a column with unique values; otherwise the running total will have incorrect values in the event that duplicate salaries exist. The key to this recipe's solution is the join on D.EMPNO to E. EMPNO, which returns (sums) every D.SAL where D.EMPNO is less than or equal E.EMPNO. This can be understood easily by rewriting the scalar subquery as a join for a handful of the employees: select e.ename as ename1, e.empno as empno1, e.sal as sal1, d.ename as ename2, d.empno as empno2, d.sal as sal2 from emp e, emp d where d.empno <= e.empno and e.empno = 7566 ENAME EMPNO1 SAL1 ENAME EMPNO2 SAL2 ---------- ---------- ---------- ---------- ---------- ---------- JONES 7566 2975 SMITH 7369 800 JONES 7566 2975 ALLEN 7499 1600 JONES 7566 2975 WARD 7521 1250 JONES 7566 2975 JONES 7566 2975 Every value in EMPNO2 is compared against every value in EMPNO1. For every row where the value in EMPNO2 is less than or equal to the value in EMPNO1, the value in SAL2 is included in the sum. In this snippet, the EMPNO values for employees Smith, Allen, Ward, and Jones are compared against the EMPNO of Jones. Since all four employees' EMPNOs meet the condition of being less than or equal to Jones' EMPNO, those salaries are summed. Any employee whose EMPNO is greater than Jones' is not included in the SUM (in this snippet). The way the full query works is by summing all the salaries where the corresponding EMPNO is less than or equal to 7934 (Miller's EMPNO), which is the highest in the table. |