7.16.1 Problem
You want to produce a summary based on date or time values.
7.16.2 Solution
Use GROUP BY to categorize temporal values into bins of the appropriate duration. Often this will involve using expressions to extract the significant parts of dates or times.
7.16.3 Discussion
To put records in time order, you use an ORDER BY clause to sort a column that has a temporal type. If instead you want to summarize records based on groupings into time intervals, you need to determine how to categorize each record into the proper interval and use GROUP BY to group them accordingly.
Sometimes you can use temporal values directly if they group naturally into the desired categories. This is quite likely if a table represents date or time parts using separate columns. For example, the baseball1.com master ballplayer table represents birth dates using separate year, month, and day columns. To see how many ballplayers were born on each day of the year, perform a calendar date summary that uses the month and day values but ignores the year:
mysql> SELECT birthmonth, birthday, COUNT(*)
    -> FROM master
    -> WHERE birthmonth IS NOT NULL AND birthday IS NOT NULL
    -> GROUP BY birthmonth, birthday;
+------------+----------+----------+
| birthmonth | birthday | COUNT(*) |
+------------+----------+----------+
|          1 |        1 |       47 |
|          1 |        2 |       40 |
|          1 |        3 |       50 |
|          1 |        4 |       38 |
...
|         12 |       28 |       33 |
|         12 |       29 |       32 |
|         12 |       30 |       32 |
|         12 |       31 |       27 |
+------------+----------+----------+
A less fine-grained summary can be obtained by using only the month values:
mysql> SELECT birthmonth, COUNT(*)
    -> FROM master
    -> WHERE birthmonth IS NOT NULL
    -> GROUP BY birthmonth;
+------------+----------+
| birthmonth | COUNT(*) |
+------------+----------+
|          1 |     1311 |
|          2 |     1144 |
|          3 |     1243 |
|          4 |     1179 |
|          5 |     1118 |
|          6 |     1105 |
|          7 |     1244 |
|          8 |     1438 |
|          9 |     1314 |
|         10 |     1438 |
|         11 |     1314 |
|         12 |     1269 |
+------------+----------+
Sometimes temporal values can be used directly, even when not represented as separate columns. To determine how many drivers were on the road and how many miles were driven each day, group the records in the driver_log table by date:
mysql> SELECT trav_date,
    -> COUNT(*) AS 'number of drivers', SUM(miles) AS 'miles logged'
    -> FROM driver_log GROUP BY trav_date;
+------------+-------------------+--------------+
| trav_date  | number of drivers | miles logged |
+------------+-------------------+--------------+
| 2001-11-26 |                 1 |          115 |
| 2001-11-27 |                 1 |           96 |
| 2001-11-29 |                 3 |          822 |
| 2001-11-30 |                 2 |          355 |
| 2001-12-01 |                 1 |          197 |
| 2001-12-02 |                 2 |          581 |
+------------+-------------------+--------------+
However, this summary will grow lengthier as you add more records to the table. At some point, the number of distinct dates likely will become so large that the summary fails to be useful, and you'd probably decide to change the category size from daily to weekly or monthly.
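A coarser version of the driver_log summary might look like the following sketch. It assumes only the trav_date and miles columns shown above; the aliases trav_year, trav_month, and trav_week are illustrative names, and the count is labeled as log entries because one driver can account for several rows in a month or week. The week numbering that YEARWEEK( ) produces depends on the server's week mode, so treat the weekly labels as approximate until you have checked that setting.

```sql
-- Monthly summary: one row per calendar month.
SELECT YEAR(trav_date) AS trav_year,
       MONTH(trav_date) AS trav_month,
       COUNT(*) AS 'number of log entries',
       SUM(miles) AS 'miles logged'
FROM driver_log
GROUP BY YEAR(trav_date), MONTH(trav_date)
ORDER BY trav_year, trav_month;

-- Weekly summary: YEARWEEK() combines year and week number into one value.
SELECT YEARWEEK(trav_date) AS trav_week,
       COUNT(*) AS 'number of log entries',
       SUM(miles) AS 'miles logged'
FROM driver_log
GROUP BY YEARWEEK(trav_date)
ORDER BY trav_week;
```

Grouping on the year as well as the month keeps, say, November 2001 separate from November 2002 once the table spans more than one year.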
When a temporal column contains so many distinct values that it fails to categorize well, it's typical for a summary to group records using expressions that map the relevant parts of the date or time values onto a smaller set of categories. For example, to produce a time-of-day summary for records in the mail table, do this:[1]
mysql> SELECT HOUR(t) AS hour,
    -> COUNT(*) AS 'number of messages',
    -> SUM(size) AS 'number of bytes sent'
    -> FROM mail
    -> GROUP BY hour;
+------+--------------------+----------------------+
| hour | number of messages | number of bytes sent |
+------+--------------------+----------------------+
|    7 |                  1 |                 3824 |
|    8 |                  1 |                  978 |
|    9 |                  2 |                 2904 |
|   10 |                  2 |              1056806 |
|   11 |                  1 |                 5781 |
|   12 |                  2 |               195798 |
|   13 |                  1 |                  271 |
|   14 |                  1 |                98151 |
|   15 |                  1 |                 1048 |
|   17 |                  2 |              2398338 |
|   22 |                  1 |                23992 |
|   23 |                  1 |                10294 |
+------+--------------------+----------------------+
[1] Note that the result includes an entry only for hours of the day actually represented in the data. To generate a summary with an entry for every hour, use a join to fill in the "missing" values. See Recipe 12.10.
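One way to fill in the hours that the footnote calls "missing" is sketched below. It is not necessarily the approach taken in Recipe 12.10. The hours table and its hour_of_day column are hypothetical helpers introduced only for this illustration, and the sketch assumes that t is never NULL in the mail table.

```sql
-- Hypothetical reference table with one row per hour of the day (0 through 23).
CREATE TABLE hours (hour_of_day TINYINT UNSIGNED NOT NULL PRIMARY KEY);
INSERT INTO hours (hour_of_day) VALUES
  (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
  (12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23);

-- LEFT JOIN keeps every hour, even those with no matching mail rows.
-- COUNT(mail.t) counts only matched rows, so empty hours report 0;
-- IFNULL() turns the NULL sum for empty hours into 0 as well.
SELECT hours.hour_of_day AS hour,
       COUNT(mail.t) AS 'number of messages',
       IFNULL(SUM(mail.size), 0) AS 'number of bytes sent'
FROM hours LEFT JOIN mail ON HOUR(mail.t) = hours.hour_of_day
GROUP BY hours.hour_of_day
ORDER BY hours.hour_of_day;
```

Counting mail.t rather than using COUNT(*) matters here: COUNT(*) would report 1 for an empty hour, because the outer join still produces one row for it.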
To produce a day-of-week summary instead, use the DAYOFWEEK( ) function:
mysql> SELECT DAYOFWEEK(t) AS weekday,
    -> COUNT(*) AS 'number of messages',
    -> SUM(size) AS 'number of bytes sent'
    -> FROM mail
    -> GROUP BY weekday;
+---------+--------------------+----------------------+
| weekday | number of messages | number of bytes sent |
+---------+--------------------+----------------------+
|       1 |                  1 |                  271 |
|       2 |                  4 |              2500705 |
|       3 |                  4 |              1007190 |
|       4 |                  2 |                10907 |
|       5 |                  1 |                  873 |
|       6 |                  1 |                58274 |
|       7 |                  3 |               219965 |
+---------+--------------------+----------------------+
To make the output more meaningful, you might want to use DAYNAME( ) to display weekday names instead. However, because day names sort lexically (for example, "Tuesday" sorts after "Friday"), use DAYNAME( ) only for display purposes. Continue to group on the numeric day values so that output rows sort that way:
mysql> SELECT DAYNAME(t) AS weekday,
    -> COUNT(*) AS 'number of messages',
    -> SUM(size) AS 'number of bytes sent'
    -> FROM mail
    -> GROUP BY DAYOFWEEK(t);
+-----------+--------------------+----------------------+
| weekday   | number of messages | number of bytes sent |
+-----------+--------------------+----------------------+
| Sunday    |                  1 |                  271 |
| Monday    |                  4 |              2500705 |
| Tuesday   |                  4 |              1007190 |
| Wednesday |                  2 |                10907 |
| Thursday  |                  1 |                  873 |
| Friday    |                  1 |                58274 |
| Saturday  |                  3 |               219965 |
+-----------+--------------------+----------------------+
A similar technique can be used for summarizing month-of-year categories that are sorted by numeric value but displayed by month name.
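A month-of-year version of the same idea might be written as in the sketch below, which reuses the t and size columns of the mail table. The alias month_name is an illustrative choice. The query lists both MONTH(t) and MONTHNAME(t) in the GROUP BY clause and adds an explicit ORDER BY. Neither is required for the grouping itself, but newer MySQL servers enable ONLY_FULL_GROUP_BY by default and no longer sort GROUP BY output implicitly, so this form should behave the same way across versions.

```sql
-- Display month names, but group and sort on the numeric month value.
SELECT MONTHNAME(t) AS month_name,
       COUNT(*) AS 'number of messages',
       SUM(size) AS 'number of bytes sent'
FROM mail
GROUP BY MONTH(t), MONTHNAME(t)
ORDER BY MONTH(t);
```

Because each numeric month corresponds to exactly one month name, adding MONTHNAME(t) to the GROUP BY clause does not change which rows fall into each group.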
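Uses for temporal categorizations are plentiful. For example, values that include a time-of-day part can be collapsed to calendar days for a daily summary by using any of the following GROUP BY clauses:
GROUP BY FROM_DAYS(TO_DAYS(col_name))
GROUP BY YEAR(col_name), MONTH(col_name), DAYOFMONTH(col_name)
GROUP BY DATE_FORMAT(col_name,'%Y-%m-%e')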
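As an illustration of the first clause, the sketch below summarizes the mail table by calendar day. It assumes that t holds both a date and a time, as the HOUR( ) and DAYOFWEEK( ) examples earlier suggest; the alias msg_date is an illustrative name.

```sql
-- TO_DAYS() discards the time of day; FROM_DAYS() converts back to a DATE.
SELECT FROM_DAYS(TO_DAYS(t)) AS msg_date,
       COUNT(*) AS 'number of messages',
       SUM(size) AS 'number of bytes sent'
FROM mail
GROUP BY FROM_DAYS(TO_DAYS(t))
ORDER BY msg_date;
```

Every message sent on a given day maps to the same msg_date value, so the result has one row per day that appears in the data.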