Recipe7.13.Computing Averages Without High and Low Values


Recipe 7.13. Computing Averages Without High and Low Values

Problem

You want to compute an average, but you wish to exclude the highest and lowest values in order to (hopefully) reduce the effect of skew. For example, you want to compute the average salary of all employees excluding the highest and lowest salaries.

Solution

MySQL and PostgreSQL

Use subqueries to exclude high and low values:

 1 select avg(sal) 2   from emp 3  where sal not in ( 4     (select min(sal) from emp), 5     (select max(sal) from emp) 6  ) 

DB2, Oracle, and SQL Server

Use an inline view with the windowing functions MAX OVER and MIN OVER to generate a result set from which you can easily eliminate the high and low values:

 1 select avg(sal) 2   from ( 3 select sal, min(sal)over() min_sal, max(sal)over( ) max_sal 4   from emp 5        ) x 6  where sal not in (min_sal,max_sal) 

Discussion

MySQL and PostgreSQL

The subqueries return the highest and lowest salaries in the table. By using NOT IN against the values returned, you exclude the highest and lowest salaries from the average. Keep in mind that if there are duplicates (if multiple employees have the highest or lowest salaries), they will all be excluded from the average. If your goal is to exclude only a single instance of the high and low values, simply subtract them from the SUM and then divide:

 select (sum(sal)-min(sal)-max(sal))/(count(*)-2)   from emp 

DB2, Oracle, and SQL Server

Inline view X returns each salary along with the highest and lowest salary:

  select sal, min(sal)over() min_sal, max(sal)over( ) max_sal   from emp       SAL   MIN_SAL   MAX_SAL --------- --------- ---------       800       800      5000      1600       800      5000      1250       800      5000      2975       800      5000      1250       800      5000      2850       800      5000      2450       800      5000      3000       800      5000      5000       800      5000      1500       800      5000      1100       800      5000       950       800      5000      3000       800      5000      1300       800      5000 

You can access the high and low salary at every row, so finding which salaries are highest and/or lowest is trivial. The outer query filters the rows returned from inline view X such that any salary that matches either MIN_SAL or MAX_SAL is excluded from the average.




SQL Cookbook
SQL Cookbook (Cookbooks (OReilly))
ISBN: 0596009763
EAN: 2147483647
Year: 2005
Pages: 235

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net