Aggregate Functions


The essence of reporting is typically to distill a population of data into a value or values representing a trend or summary. This is what aggregation is all about. Aggregate functions answer the questions asked by the consumers of data:

  • "What were the total sales of chicken gizzard by-products for last month?"

  • "What is the average price paid for food condiments by male Brazilians between the ages of 19 and 24?"

  • "What was the longest order-to-shipping time of all orders last quarter?"

  • "Who is the oldest employee still working in the mail room?"

Aggregate functions return a scalar value (a single value) applying a specific aggregate operation. The return data type is comparable to that of the column or value passed to the function. Aggregates are often used along with grouping, rollup, and pivoting operations to produce results for data analysis. This is covered in greater detail in Chapter 7. The focus here is on some of the more common functions in simple SELECT queries.

Aggregate functions can be used with scalar input values, rather than in a SELECT query, but what's the point? I can pass the value 15 to each of these aggregate functions and each will return the same result:

 SELECT AVG(15) SELECT SUM(15) SELECT MIN(15) SELECT MAX(15) 

They all return 15. After all, the average, sum, smallest, and largest value in a range of one value is that value. What happens if I count one value?

 SELECT COUNT(15) 

I get 1. I counted one value.

All right, now let's do something useful. Aggregate functions are really only valuable when used with a range of values in a result set. Each function performs its magic on all non-null values of a column. Unless you are applying grouping (which you will see in Chapter 7) you cannot return both aggregated values and regular column values in the same SELECT statement.

AVG()

The AVG() function returns the average for a range of numeric values, for all non-null values. For example, a table contains the following gymnastics meet scores:

Gymnast

Event

Score

Sara

Vault

9.25

Cassie

Vault

8.75

Delaney

Vault

9.25

Sammi

Vault

8.05

Erika

Vault

8.60

Sara

Beam

9.70

Cassie

Beam

9.00

Delaney

Beam

9.25

Sammi

Beam

8.95

Erika

Beam

8.85

The following query is executed with these values:

 SELECT AVG(Score) 

The result would be 8.965.

If three girls didn't compete in some events and the table had some missing scores, these might be represented as NULLs:

Gymnast

Event

Score

Sara

Vault

9.25

Cassie

Vault

8.75

Delaney

Vault

NULL

Sammi

Vault

8.05

Erika

Vault

8.60

Sara

Beam

9.70

Cassie

Beam

NULL

Delaney

Beam

9.25

Sammi

Beam

NULL

Erika

Beam

8.85

In this case, the NULL values are not considered, and the average is calculated based on the existing numerical values. The result would be 8.921429.

However, if the missing scores were counted against the team, and the column contained zero values instead, this would seriously affect the overall score (6.245) and their chances of moving on to state competition.

COUNT()

The COUNT() function returns an integer value for the number of non-null values in the column range. For instance, if the gymnastics data in the previous example were in a table called GymEvent and I wanted to know the number of events that Sammi received a score on, I could execute the following query:

 SELECT COUNT(Score) FROM GymEvent WHERE Gymnast = 'Sammi' 

The result would be 1 since the score for Sammi's Beam event was NULL. If you need a count of all rows in a table, regardless of NULL values, use the following syntax:

 SELECT COUNT(*) FROM table 

Using the previous example with Sammi, a COUNT(*) query would look like this:

 SELECT COUNT(*) FROM GymEvent WHERE Gymnast = 'Sammi' 

Because the COUNT(*) function ignores NULL values, the result of this query would be 2.

MIN() and MAX()

The MIN() function returns the smallest (minimum) non-null value for a column range. The MAX() function returns the maximum or largest value. These functions can be used with most data types and work according to the sorting rules of the type. To make this point, suppose a table contains the following values stored in two different columns, one as an integer type and the other as a character type:

Column1 (Int type)

Column2 (VarChar type)

2

2

4

4

12

12

19

19

What will the MIN() and MAX() functions return? You may be surprised.

Column1 (Int type)

Column2 (VarChar type)

MIN()

2

12

MAX()

19

4

Because values in Column2 are stored as characters rather than numbers, it is sorted according to the ASCII value of each character, from left to right. This is why 12 is less than any other value and 4 is greater than any other value.

SUM()

The SUM() function is one of the most commonly used aggregates and is fairly self-explanatory. Like the AVG() function, it works with numeric data types and returns the additive sum of all non-null values in a column range.

You'll learn to use all of the aggregate functions in Chapter 7, including statistical functions. You'll also see how to create user-defined aggregates with SQL Server 2005.




Beginning Transact-SQL with SQL Server 2000 and 2005
Beginning Transact-SQL With SQL Server 2000 and 2005
ISBN: 076457955X
EAN: 2147483647
Year: 2006
Pages: 131
Authors: Paul Turley

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net