Recipe 13.2. Per-Group Descriptive Statistics


Problem

You want to produce descriptive statistics for each subgroup of a set of observations.

Solution

Use aggregate functions, but employ a GROUP BY clause to arrange observations into the appropriate groups.

Discussion

Recipe 13.1 shows how to compute descriptive statistics for the entire set of scores in the testscore table. To be more specific, use GROUP BY to divide the observations into groups and calculate statistics for each group. For example, the subjects in the testscore table are listed by age and sex, so you can calculate the same statistics by age or sex (or both) by adding the appropriate GROUP BY clause.

Here's how to calculate by age:

mysql> SELECT age, COUNT(score) AS n,
    -> SUM(score) AS sum,
    -> MIN(score) AS minimum,
    -> MAX(score) AS maximum,
    -> AVG(score) AS mean,
    -> STDDEV_SAMP(score) AS 'std. dev.',
    -> VAR_SAMP(score) AS 'variance'
    -> FROM testscore
    -> GROUP BY age;
+-----+---+------+---------+---------+--------+-----------+----------+
| age | n | sum  | minimum | maximum | mean   | std. dev. | variance |
+-----+---+------+---------+---------+--------+-----------+----------+
|   5 | 4 |   22 |       4 |       7 | 5.5000 |    1.2910 |   1.6667 |
|   6 | 4 |   27 |       4 |       9 | 6.7500 |    2.2174 |   4.9167 |
|   7 | 4 |   30 |       6 |       9 | 7.5000 |    1.2910 |   1.6667 |
|   8 | 4 |   32 |       6 |      10 | 8.0000 |    1.8257 |   3.3333 |
|   9 | 4 |   35 |       7 |      10 | 8.7500 |    1.2583 |   1.5833 |
+-----+---+------+---------+---------+--------+-----------+----------+
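To make it concrete what each aggregate computes inside a group, the following Python sketch recomputes the per-age figures by hand. It is an illustration, not part of the recipe: the ROWS list is an assumption, inferred from the age-and-sex output later in this recipe (each of those groups has two rows, so its minimum and maximum are the two scores), and stats_by is a hypothetical helper name. Python's statistics.stdev and statistics.variance use the sample (n - 1) formulas, which is what STDDEV_SAMP() and VAR_SAMP() compute.

```python
from collections import defaultdict
from statistics import mean, stdev, variance

# (age, sex, score) rows. Assumption: inferred from the age-and-sex
# output in this recipe, not copied from the real testscore table.
ROWS = [
    (5, "M", 4), (5, "M", 5), (5, "F", 6), (5, "F", 7),
    (6, "M", 8), (6, "M", 9), (6, "F", 4), (6, "F", 6),
    (7, "M", 6), (7, "M", 8), (7, "F", 7), (7, "F", 9),
    (8, "M", 6), (8, "M", 9), (8, "F", 7), (8, "F", 10),
    (9, "M", 7), (9, "M", 9), (9, "F", 9), (9, "F", 10),
]


def stats_by(rows, key):
    """Group scores by key(row), then summarize each group (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])  # row[2] is the score
    return {
        k: {
            "n": len(scores),
            "sum": sum(scores),
            "minimum": min(scores),
            "maximum": max(scores),
            "mean": mean(scores),
            "std. dev.": stdev(scores),    # sample standard deviation
            "variance": variance(scores),  # sample variance
        }
        for k, scores in sorted(groups.items())
    }


# Equivalent in spirit to GROUP BY age
by_age = stats_by(ROWS, key=lambda row: row[0])
for age, s in by_age.items():
    print(age, s["n"], s["sum"], s["minimum"], s["maximum"],
          f'{s["mean"]:.4f}', f'{s["std. dev."]:.4f}', f'{s["variance"]:.4f}')
```

Passing key=lambda row: row[1] groups by sex instead, and key=lambda row: (row[0], row[1]) groups by both columns, which mirrors changing the GROUP BY clause.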

By sex:

mysql> SELECT sex, COUNT(score) AS n,
    -> SUM(score) AS sum,
    -> MIN(score) AS minimum,
    -> MAX(score) AS maximum,
    -> AVG(score) AS mean,
    -> STDDEV_SAMP(score) AS 'std. dev.',
    -> VAR_SAMP(score) AS 'variance'
    -> FROM testscore
    -> GROUP BY sex;
+-----+----+------+---------+---------+--------+-----------+----------+
| sex | n  | sum  | minimum | maximum | mean   | std. dev. | variance |
+-----+----+------+---------+---------+--------+-----------+----------+
| M   | 10 |   71 |       4 |       9 | 7.1000 |    1.7920 |   3.2111 |
| F   | 10 |   75 |       4 |      10 | 7.5000 |    1.9579 |   3.8333 |
+-----+----+------+---------+---------+--------+-----------+----------+

By age and sex:

mysql> SELECT age, sex, COUNT(score) AS n,
    -> SUM(score) AS sum,
    -> MIN(score) AS minimum,
    -> MAX(score) AS maximum,
    -> AVG(score) AS mean,
    -> STDDEV_SAMP(score) AS 'std. dev.',
    -> VAR_SAMP(score) AS 'variance'
    -> FROM testscore
    -> GROUP BY age, sex;
+-----+-----+---+------+---------+---------+--------+-----------+----------+
| age | sex | n | sum  | minimum | maximum | mean   | std. dev. | variance |
+-----+-----+---+------+---------+---------+--------+-----------+----------+
|   5 | M   | 2 |    9 |       4 |       5 | 4.5000 |    0.7071 |   0.5000 |
|   5 | F   | 2 |   13 |       6 |       7 | 6.5000 |    0.7071 |   0.5000 |
|   6 | M   | 2 |   17 |       8 |       9 | 8.5000 |    0.7071 |   0.5000 |
|   6 | F   | 2 |   10 |       4 |       6 | 5.0000 |    1.4142 |   2.0000 |
|   7 | M   | 2 |   14 |       6 |       8 | 7.0000 |    1.4142 |   2.0000 |
|   7 | F   | 2 |   16 |       7 |       9 | 8.0000 |    1.4142 |   2.0000 |
|   8 | M   | 2 |   15 |       6 |       9 | 7.5000 |    2.1213 |   4.5000 |
|   8 | F   | 2 |   17 |       7 |      10 | 8.5000 |    2.1213 |   4.5000 |
|   9 | M   | 2 |   16 |       7 |       9 | 8.0000 |    1.4142 |   2.0000 |
|   9 | F   | 2 |   19 |       9 |      10 | 9.5000 |    0.7071 |   0.5000 |
+-----+-----+---+------+---------+---------+--------+-----------+----------+
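The grouping itself is standard SQL, so the two-column case can be tried without a MySQL server. The sketch below is an assumption-laden stand-in rather than the recipe's own setup: it uses Python's built-in sqlite3 module in place of MySQL, loads rows inferred from the output above (not the real testscore data), and leaves out STDDEV_SAMP() and VAR_SAMP() because SQLite does not provide them. The ORDER BY clause is added only to make the row order deterministic and match the listing above.

```python
import sqlite3

# (age, sex, score) rows. Assumption: inferred from the age-and-sex
# output in this recipe, not copied from the real testscore table.
ROWS = [
    (5, "M", 4), (5, "M", 5), (5, "F", 6), (5, "F", 7),
    (6, "M", 8), (6, "M", 9), (6, "F", 4), (6, "F", 6),
    (7, "M", 6), (7, "M", 8), (7, "F", 7), (7, "F", 9),
    (8, "M", 6), (8, "M", 9), (8, "F", 7), (8, "F", 10),
    (9, "M", 7), (9, "M", 9), (9, "F", 9), (9, "F", 10),
]

# In-memory SQLite database standing in for the MySQL server
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testscore (age INTEGER, sex TEXT, score INTEGER)")
conn.executemany("INSERT INTO testscore VALUES (?, ?, ?)", ROWS)

# Same GROUP BY as the recipe, minus the sample std. dev. and variance
QUERY = """
SELECT age, sex,
       COUNT(score) AS n,
       SUM(score)   AS "sum",
       MIN(score)   AS minimum,
       MAX(score)   AS maximum,
       AVG(score)   AS mean
FROM testscore
GROUP BY age, sex
ORDER BY age, sex DESC
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Each distinct (age, sex) pair becomes one output row, so five ages and two sexes yield ten groups of two scores each.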




MySQL Cookbook
ISBN: 059652708X
Year: 2004
Pages: 375
Authors: Paul DuBois
