Recipe7.9.Calculating a Mode


Recipe 7.9. Calculating a Mode

Problem

You want to find the mode (for those of you who don't recall, the mode in mathematics is the element that appears most frequently for a given set of data) of the values in a column. For example, you wish to find mode of the salaries in DEPTNO 20. Based on the following salaries:

  select sal   from emp  where deptno = 20  order by sal        SAL ----------        800       1100       2975       3000       3000 

the mode is 3000.

Solution

DB2 and SQL Server

Use the window function DENSE_RANK to rank the counts of the salaries to facilitate extracting the mode:

  1 select sal  2   from (  3 select sal,  4        dense_rank()over( order by cnt desc) as rnk  5   from (  6 select sal, count(*) as cnt  8   from emp  9  where deptno = 20 10  group by sal 11        ) x 12        ) y 13  where rnk = 1 

Oracle

Users on Oracle8i Database can use the solution provided for DB2. If you are on Oracle9i Database and later, you can use the KEEP extension to the aggregate function MAX to find the mode SAL. One important note is that if there are ties, i.e., multiple rows that are the mode, the solution using KEEP will only keep one, and that is the one with the highest salary. If you want to see all modes (if more than one exists), you must modify this solution or simply use the DB2 solution presented above. In this case, since 3000 is the mode SAL in DEPTNO 20 and is also the highest SAL, this solution is sufficient:

 1 select max(sal) 2        keep(dense_rank first order by cnt desc) sal 3   from ( 4 select sal, count(*) cnt 5   from emp 6  where deptno=20 7  group by sal 8        ) 

MySQL and PostgreSQL

Use a subquery to find the mode:

 1 select sal 2   from emp 3  where deptno = 20 4  group by sal 5 having count(*) >= all ( select count(*) 6                               from emp 7                              where deptno = 20 8                              group by sal ) 

Discussion

DB2 and SQL Server

The inline view X returns each SAL and the number of times it occurs. Inline view Y uses the window function DENSE_RANK (which allows for ties) to sort the results.

The results are ranked based on the number of times each SAL occurs as is seen below:

 1 select sal, 2        dense_rank()over(order by cnt desc) as rnk 3   from ( 4 select sal,count(*) as cnt 5   from emp 6  where deptno = 20 7  group by sal 8        ) x   SAL          RNK -----   ----------  3000            1   800            2  1100            2  2975            2 

The outermost portion of query simply keeps the row(s) where RNK is 1.

Oracle

The inline view returns each SAL and the number of times it occurs and is shown below:

 select sal, count(*) cnt   from emp  where deptno=20  group by sal   SAL          CNT -----   ----------   800            1  1100            1  2975            1  3000            2 

The next step is to use the KEEP extension of the aggregate function MAX to find the mode. If you analyze the KEEP clause shown below you will notice three subclauses, DENSE_RANK, FIRST, and ORDER BY CNT DESC:

 keep(dense_rank first order by cnt desc) 

What this does is extremely convenient for finding the mode. The KEEP clause determines which SAL will be returned by MAX by looking at the value of CNT returned by the inline view. Working from right to left, the values for CNT are ordered in descending order, then the first is kept of all the values for CNT returned in DENSE_RANK order. Looking at the result set from the inline view, you can see that 3000 has the highest CNT of 2. The MAX(SAL) returned is the greatest SAL that has the greatest CNT, in this case 3000.

See Also

Chapter 11, the section on "Finding Knight Values," for a deeper discussion of Oracle's KEEP extension of aggregate functions.

MySQL and PostgreSQL

The subquery returns the number of times each SAL occurs. The outer query returns any SAL that has a number of occurrences greater than or equal to all of the counts returned by the subquery (or to put it another way, the outer query returns the most common salaries in DEPTNO 20).




SQL Cookbook
SQL Cookbook (Cookbooks (OReilly))
ISBN: 0596009763
EAN: 2147483647
Year: 2005
Pages: 235

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net