Section 2.9. Aggregating Aggregates

   

2.9 Aggregating Aggregates

2.9.1 Problem

You want to select some data, aggregate it, and then aggregate it again. Here's an example of this type of problem: the administration at the university is preparing an internal report for the dean, who wants to compare grading habits of professors. One of the measures the dean wants to look at is the spread between the average term paper Score for each student in a given course. The spread is the difference between the best and the worst student Score per course. Your job is to find the best and the worst average Score in each course and calculate the difference.

2.9.2 Solution

Use a nested SELECT statement in the FROM clause of your query to generate the first aggregation. Then write your enclosing SQL statement so that it takes those aggregated results and aggregates them again. For example:

 SELECT CourseId, MAX(l.s) Best ,MIN(l.s) Worst,     MAX(l.s)-MIN(l.s) Spread  FROM (    SELECT CourseId, AVG(Score) AS s     FROM Students     GROUP BY CourseId, StudentName) AS l GROUP BY CourseId 

The query will return the following result:

 CourseId  Best   Worst  Spread                                    --------- ------ ------ ---------  ACCN101   14.15  12.33  1.81 MGMT120   21.66  14.16  7.50 

2.9.3 Discussion

SQL does not allow you to directly enclose one aggregate function within another. In other words, a query written in the following manner would not execute:

 SELECT CourseId, MAX(AVG(stock)), MIN(AVG(stock)) FROM Students GROUP BY CourseId, studentsName 

However, you can work around this by using a result from one query as the source for a second query. This trick can be used as a general solution for problems where you have to aggregate already aggregated data. In the case of the solution shown in this recipe, the innermost query takes all the term paper Scores for each student in each course and returns the average of those Scores. The intermediate results look like this:

 CourseId             s         -------------------- --------- ACCN101              12.333333 MGMT120              21.666666 ACCN101              12.533333 ACCN101              14.150000 MGMT120              14.166666 

The intermediate results are summarized by course and student. The outermost query then summarizes those results, but this time at the course level. This allows the use of the MAX function to compute the highest average student Score in a course, while the MIN function is used to compute the lowest average student Score in a course.

Please note that the query shown in this solution is expensive, because it requires two scans of the table ” one for the inner query and one for the outer query. In addition, the ability to nest a SELECT statement in the FROM clause is a recent addition to the SQL standard. Older versions of SQL Server may not support it.

   


Transact-SQL Cookbook
Transact-SQL Cookbook
ISBN: 1565927567
EAN: 2147483647
Year: 2005
Pages: 152

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net