The GROUP BY Clause

Sams Teach Yourself SQL in 24 Hours, Third Edition
By Ronald R. Plew, Ryan K. Stephens
Hour  10.  Sorting and Grouping Data


The GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups. The GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.

The position of the GROUP BY clause in a query is as follows:

SELECT
FROM
WHERE
GROUP BY
ORDER BY

The GROUP BY clause must follow the conditions in the WHERE clause and must precede the ORDER BY clause if one is used.

The following is the SELECT statement's syntax, including the GROUP BY clause:

SELECT COLUMN1, COLUMN2
FROM TABLE1, TABLE2
WHERE CONDITIONS
GROUP BY COLUMN1, COLUMN2
ORDER BY COLUMN1, COLUMN2
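To see the clause order in a runnable form, the following sketch uses Python's built-in sqlite3 module with an in-memory database. It is only an illustration: SQLite stands in for whatever implementation you use, and the EMP_ID values are made up rather than taken from the book's EMPLOYEE_TBL.

```python
import sqlite3

# In-memory SQLite database; EMP_ID values are hypothetical placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (EMP_ID TEXT, CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [("E1", "GREENWOOD"), ("E2", "INDIANAPOLIS"), ("E3", "WHITELAND"),
     ("E4", "INDIANAPOLIS"), ("E5", "INDIANAPOLIS"), ("E6", "INDIANAPOLIS")],
)

# WHERE filters rows, GROUP BY forms the groups, ORDER BY sorts the grouped result.
rows = conn.execute(
    """
    SELECT CITY, COUNT(*)
    FROM EMPLOYEE_TBL
    WHERE CITY <> 'GREENWOOD'
    GROUP BY CITY
    ORDER BY CITY
    """
).fetchall()
print(rows)
```

Moving the GROUP BY clause ahead of WHERE, or after ORDER BY, makes the statement fail with a syntax error.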

The following sections give examples and explanations of the GROUP BY clause's use in a variety of situations.

Grouping Selected Data

Grouping data is a simple process. In this hour, the columns referenced in the GROUP BY clause are the selected columns (the column list following the SELECT keyword in a query). Most implementations will accept a GROUP BY column that is not in the SELECT list, but the result is hard to read: how can you tell the groups apart on a report if the grouping data is not displayed?

If the column name has been qualified, the qualified name must go into the GROUP BY clause. The column name can also be represented by a number, which is discussed later in this hour. When grouping the data, the order of columns grouped does not have to match the column order in the SELECT clause.

Group Functions

Typical group functions (those that are used with the GROUP BY clause to arrange data in groups) include AVG, MAX, MIN, SUM, and COUNT. These are the aggregate functions that you learned about during Hour 9, "Summarizing Data Results from a Query." In Hour 9, each aggregate function returned a single value for the whole query; now, you use the aggregate functions to return one value for each group.

Creating Groups and Using Aggregate Functions

When you use GROUP BY, the SELECT clause must meet certain conditions. Specifically, every selected column must appear in the GROUP BY clause, except for aggregate values. The columns in the GROUP BY clause do not have to be in the same order as they appear in the SELECT clause. If the columns in the SELECT clause are qualified, the qualified names must be used in the GROUP BY clause. The following are some examples of syntax for the GROUP BY clause:

Example

SELECT EMP_ID, CITY
FROM EMPLOYEE_TBL
GROUP BY CITY, EMP_ID;

The SQL statement selects the EMP_ID and the CITY from the EMPLOYEE_TBL and groups the data returned by the CITY and then EMP_ID.

Note the order of the columns selected, versus the order of the columns in the GROUP BY clause.
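The point about column order can be checked with a small sketch. It assumes Python's built-in sqlite3 module and a made-up three-row EMPLOYEE_TBL (the EMP_ID values are hypothetical), and compares the groups produced by the two possible orderings of the GROUP BY columns.

```python
import sqlite3

# Illustrative data only; not the book's actual table contents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (EMP_ID TEXT, CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [("E1", "GREENWOOD"), ("E2", "INDIANAPOLIS"), ("E3", "INDIANAPOLIS")],
)

by_city_first = conn.execute(
    "SELECT EMP_ID, CITY FROM EMPLOYEE_TBL GROUP BY CITY, EMP_ID"
).fetchall()
by_id_first = conn.execute(
    "SELECT EMP_ID, CITY FROM EMPLOYEE_TBL GROUP BY EMP_ID, CITY"
).fetchall()

# Row order is not guaranteed without ORDER BY, so compare the groups as sets.
same_groups = set(by_city_first) == set(by_id_first)
print(same_groups)
```

Both statements form one group per distinct (CITY, EMP_ID) combination; only the order in which rows happen to come back may differ.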


Example

SELECT EMP_ID, SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY SALARY, EMP_ID;

This SQL statement returns each EMP_ID together with the total salary for its group. One group is formed for each distinct combination of SALARY and EMP_ID.

Example

SELECT SUM(SALARY)
FROM EMPLOYEE_PAY_TBL;

This SQL statement returns the total of all the salaries from the EMPLOYEE_PAY_TBL.

Example

SELECT SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY SALARY;

This SQL statement returns the totals for the different groups of salaries.
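The difference between the last two statements is easier to see with data. The following sketch assumes Python's built-in sqlite3 module and four made-up salary rows (not the book's EMPLOYEE_PAY_TBL contents); the SALARY column is added to the second SELECT list only to label each group.

```python
import sqlite3

# Hypothetical salary data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_PAY_TBL (EMP_ID TEXT, SALARY INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?)",
    [("E1", 20000), ("E2", 20000), ("E3", 30000), ("E4", 45000)],
)

# No GROUP BY: the whole table is treated as one group, so one row comes back.
grand_total = conn.execute(
    "SELECT SUM(SALARY) FROM EMPLOYEE_PAY_TBL"
).fetchall()

# GROUP BY SALARY: one row for each distinct salary value.
per_salary = conn.execute(
    "SELECT SALARY, SUM(SALARY) FROM EMPLOYEE_PAY_TBL "
    "GROUP BY SALARY ORDER BY SALARY"
).fetchall()

print(grand_total)
print(per_salary)
```

The two employees who earn 20000 fall into the same group, so that group's total is 40000, while the ungrouped query adds all four rows into a single total.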

Practical examples using real data follow. In this first example, you can see that there are three distinct cities in the EMPLOYEE_TBL table.

SELECT CITY
FROM EMPLOYEE_TBL;

CITY
-------------
GREENWOOD
INDIANAPOLIS
WHITELAND
INDIANAPOLIS
INDIANAPOLIS
INDIANAPOLIS

6 rows selected.

In the following example, you select the city and a count of all records for each city. You see a count on each of the three distinct cities because you are using a GROUP BY clause.

SELECT CITY, COUNT(*)
FROM EMPLOYEE_TBL
GROUP BY CITY;

CITY           COUNT(*)
-------------- --------
GREENWOOD             1
INDIANAPOLIS          4
WHITELAND             1

3 rows selected.

The following query selects from a temporary table, EMP_PAY_TMP, that was created from EMPLOYEE_TBL and EMPLOYEE_PAY_TBL. You will soon learn how to join two tables for a query.

SELECT *
FROM EMP_PAY_TMP;

CITY         LAST_NAM FIRST_NA       PAY_RATE SALARY
------------ -------- ---------- ------------ ------
GREENWOOD    STEPHENS TINA                     30000
INDIANAPOLIS PLEW     LINDA             14.75
WHITELAND    GLASS    BRANDON                  40000
INDIANAPOLIS GLASS    JACOB                    20000
INDIANAPOLIS WALLACE  MARIAH               11
INDIANAPOLIS SPURGEON TIFFANY              15

6 rows selected.

In the following example, you retrieve the average pay rate and salary for each distinct city using the aggregate function AVG. There is no average pay rate for GREENWOOD or WHITELAND because no employees living in those cities are paid hourly.

SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
GROUP BY CITY;

CITY         AVG(PAY_RATE) AVG(SALARY)
------------ ------------- -----------
GREENWOOD                        30000
INDIANAPOLIS    13.5833333       20000
WHITELAND                        40000

3 rows selected.
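The averages in this output depend on how AVG treats NULL values: they are skipped, not counted as zero. The sketch below reproduces the query with Python's built-in sqlite3 module. It loads only the CITY, PAY_RATE, and SALARY values shown in the EMP_PAY_TMP listing; the name columns are omitted, and SQLite is an assumption standing in for the book's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_PAY_TMP (CITY TEXT, PAY_RATE REAL, SALARY INTEGER)")
conn.executemany(
    "INSERT INTO EMP_PAY_TMP VALUES (?, ?, ?)",
    [("GREENWOOD", None, 30000), ("INDIANAPOLIS", 14.75, None),
     ("WHITELAND", None, 40000), ("INDIANAPOLIS", None, 20000),
     ("INDIANAPOLIS", 11, None), ("INDIANAPOLIS", 15, None)],
)

rows = conn.execute(
    "SELECT CITY, AVG(PAY_RATE), AVG(SALARY) FROM EMP_PAY_TMP "
    "GROUP BY CITY ORDER BY CITY"
).fetchall()

# Map each city to its (average pay rate, average salary) pair.
averages = {city: (rate, salary) for city, rate, salary in rows}
for city, pair in averages.items():
    print(city, pair)
```

For INDIANAPOLIS, the average pay rate is (14.75 + 11 + 15) / 3, using only the three hourly employees, and the average salary is taken from the single salaried employee. A group with no non-NULL values, such as GREENWOOD's pay rates, yields NULL, which Python shows as None.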

In the next example, you combine multiple components in a query to return grouped data. You still want to see the average pay rate and salary, but only for INDIANAPOLIS and WHITELAND. You must group the data by CITY because it is the only selected column that is not an aggregate. Lastly, you want to order the report by columns 2 and then 3, which are the average pay rate and the average salary, respectively. Study the following details and output:

SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
WHERE CITY IN ('INDIANAPOLIS','WHITELAND')
GROUP BY CITY
ORDER BY 2,3;

CITY         AVG(PAY_RATE) AVG(SALARY)
------------ ------------- -----------
INDIANAPOLIS    13.5833333       20000
WHITELAND                        40000

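In this implementation, values are sorted before NULL values; therefore, the record for INDIANAPOLIS was displayed first. (Where NULL values fall in a sort varies among implementations.) GREENWOOD was not selected, but if it were, its record would have been displayed before WHITELAND's record: both cities have a NULL average pay rate, so the tie is broken by the second sort in the ORDER BY clause, average salary, and GREENWOOD's average salary of $30,000 is lower than WHITELAND's $40,000.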
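Because the placement of NULL values in a sort differs among implementations, it can help to make the rule explicit. The following sketch assumes Python's built-in sqlite3 module, where NULL values sort first in ascending order by default, and adds a leading sort key, AVG(PAY_RATE) IS NULL, to push them to the end. This is one possible technique, not the book's query, and the table holds only the CITY, PAY_RATE, and SALARY values from the EMP_PAY_TMP listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_PAY_TMP (CITY TEXT, PAY_RATE REAL, SALARY INTEGER)")
conn.executemany(
    "INSERT INTO EMP_PAY_TMP VALUES (?, ?, ?)",
    [("GREENWOOD", None, 30000), ("INDIANAPOLIS", 14.75, None),
     ("WHITELAND", None, 40000), ("INDIANAPOLIS", None, 20000),
     ("INDIANAPOLIS", 11, None), ("INDIANAPOLIS", 15, None)],
)

# "expr IS NULL" evaluates to 0 for values and 1 for NULL,
# so sorting on it first places the NULL averages last.
rows = conn.execute(
    """
    SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
    FROM EMP_PAY_TMP
    GROUP BY CITY
    ORDER BY AVG(PAY_RATE) IS NULL, AVG(PAY_RATE), AVG(SALARY)
    """
).fetchall()

cities = [row[0] for row in rows]
print(cities)
```

With all three cities included, INDIANAPOLIS comes first, and GREENWOOD precedes WHITELAND because the two tie on a NULL average pay rate and GREENWOOD has the lower average salary.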

The last example in this section shows the use of the MAX and MIN aggregate functions with the GROUP BY clause.

SELECT CITY, MAX(PAY_RATE), MIN(SALARY)
FROM EMP_PAY_TMP
GROUP BY CITY;

CITY         MAX(PAY_RATE) MIN(SALARY)
------------ ------------- -----------
GREENWOOD                        30000
INDIANAPOLIS            15       20000
WHITELAND                        40000

3 rows selected.

Representing Column Names with Numbers

Unlike the ORDER BY clause, the GROUP BY clause in standard SQL does not accept an integer in place of a column name. Some implementations, MySQL among them, allow it as an extension; check your implementation's documentation before relying on it. The integer refers to the column's position in the SELECT list, so it can represent only a plain column, never an aggregate. The following is an example of representing column names with numbers:

SELECT EMP_ID, SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY 1
UNION
SELECT EMP_ID, SUM(PAY_RATE)
FROM EMPLOYEE_PAY_TBL
GROUP BY 1;

Each SELECT statement returns the employee ID and a group total, the first for salaries and the second for pay rates. In each GROUP BY clause, 1 represents EMP_ID. A GROUP BY clause applies only to the SELECT statement that contains it, so each half of the UNION needs its own; the UNION operator then merges the two grouped results into one result set.
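Support for position numbers in GROUP BY varies by implementation. As one concrete check, the sketch below uses Python's built-in sqlite3 module (SQLite accepts a position number in GROUP BY) to compare grouping by position 1 with grouping by the column name. The EMP_ID values are made up, and other databases may reject the positional form.

```python
import sqlite3

# Hypothetical EMP_ID values; the city distribution follows the hour's examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (EMP_ID TEXT, CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [("E1", "GREENWOOD"), ("E2", "INDIANAPOLIS"), ("E3", "WHITELAND"),
     ("E4", "INDIANAPOLIS"), ("E5", "INDIANAPOLIS"), ("E6", "INDIANAPOLIS")],
)

# Grouping and sorting by column name.
by_name = conn.execute(
    "SELECT CITY, COUNT(*) FROM EMPLOYEE_TBL GROUP BY CITY ORDER BY CITY"
).fetchall()

# The same query with position 1 standing in for CITY.
by_position = conn.execute(
    "SELECT CITY, COUNT(*) FROM EMPLOYEE_TBL GROUP BY 1 ORDER BY 1"
).fetchall()

print(by_name == by_position)
```

Naming the column is the more portable choice; a position number also silently changes meaning if someone later reorders the SELECT list.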




Sams Teach Yourself SQL in 24 Hours
Sams Teach Yourself SQL in 24 Hours (5th Edition) (Sams Teach Yourself -- Hours)
ISBN: 0672335417
EAN: 2147483647
Year: 2002
Pages: 275
