Recipe 7.9. Calculating a ModeProblemYou want to find the mode (for those of you who don't recall, the mode in mathematics is the element that appears most frequently for a given set of data) of the values in a column. For example, you wish to find mode of the salaries in DEPTNO 20. Based on the following salaries: select sal from emp where deptno = 20 order by sal SAL ---------- 800 1100 2975 3000 3000 the mode is 3000. SolutionDB2 and SQL ServerUse the window function DENSE_RANK to rank the counts of the salaries to facilitate extracting the mode: 1 select sal 2 from ( 3 select sal, 4 dense_rank()over( order by cnt desc) as rnk 5 from ( 6 select sal, count(*) as cnt 8 from emp 9 where deptno = 20 10 group by sal 11 ) x 12 ) y 13 where rnk = 1 OracleUsers on Oracle8i Database can use the solution provided for DB2. If you are on Oracle9i Database and later, you can use the KEEP extension to the aggregate function MAX to find the mode SAL. One important note is that if there are ties, i.e., multiple rows that are the mode, the solution using KEEP will only keep one, and that is the one with the highest salary. If you want to see all modes (if more than one exists), you must modify this solution or simply use the DB2 solution presented above. In this case, since 3000 is the mode SAL in DEPTNO 20 and is also the highest SAL, this solution is sufficient: 1 select max(sal) 2 keep(dense_rank first order by cnt desc) sal 3 from ( 4 select sal, count(*) cnt 5 from emp 6 where deptno=20 7 group by sal 8 ) MySQL and PostgreSQLUse a subquery to find the mode: 1 select sal 2 from emp 3 where deptno = 20 4 group by sal 5 having count(*) >= all ( select count(*) 6 from emp 7 where deptno = 20 8 group by sal ) DiscussionDB2 and SQL ServerThe inline view X returns each SAL and the number of times it occurs. Inline view Y uses the window function DENSE_RANK (which allows for ties) to sort the results. The results are ranked based on the number of times each SAL occurs as is seen below: 1 select sal, 2 dense_rank()over(order by cnt desc) as rnk 3 from ( 4 select sal,count(*) as cnt 5 from emp 6 where deptno = 20 7 group by sal 8 ) x SAL RNK ----- ---------- 3000 1 800 2 1100 2 2975 2 The outermost portion of query simply keeps the row(s) where RNK is 1. OracleThe inline view returns each SAL and the number of times it occurs and is shown below: select sal, count(*) cnt from emp where deptno=20 group by sal SAL CNT ----- ---------- 800 1 1100 1 2975 1 3000 2 The next step is to use the KEEP extension of the aggregate function MAX to find the mode. If you analyze the KEEP clause shown below you will notice three subclauses, DENSE_RANK, FIRST, and ORDER BY CNT DESC: keep(dense_rank first order by cnt desc) What this does is extremely convenient for finding the mode. The KEEP clause determines which SAL will be returned by MAX by looking at the value of CNT returned by the inline view. Working from right to left, the values for CNT are ordered in descending order, then the first is kept of all the values for CNT returned in DENSE_RANK order. Looking at the result set from the inline view, you can see that 3000 has the highest CNT of 2. The MAX(SAL) returned is the greatest SAL that has the greatest CNT, in this case 3000. See AlsoChapter 11, the section on "Finding Knight Values," for a deeper discussion of Oracle's KEEP extension of aggregate functions. MySQL and PostgreSQLThe subquery returns the number of times each SAL occurs. The outer query returns any SAL that has a number of occurrences greater than or equal to all of the counts returned by the subquery (or to put it another way, the outer query returns the most common salaries in DEPTNO 20). |