13.2 Pushing the GROUPING Envelope

   

The grouping examples you have seen so far represent simple ways of aggregating data using Oracle's extensions of the GROUP BY clause. These simple mechanisms were introduced in Oracle8i. In Oracle9i Database, Oracle enhanced this new functionality in some interesting and useful ways. Oracle now allows for:

  • Repeating column names in the GROUP BY clause

  • Grouping on composite columns

  • Concatenated groupings

13.2.1 Repeated Column Names in the GROUP BY Clause

In Oracle8i, repeating column names are not allowed in a GROUP BY clause. If the GROUP BY clause contains an extension (i.e., ROLLUP or CUBE), you cannot use the same column inside the extension as well as outside the extension. The following SQL is invalid in Oracle8i:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY o.year, ROLLUP (o.year, o.month, r.name);                          * ERROR at line 6: ORA-30490: Ambiguous expression in GROUP BY ROLLUP or CUBE list

However, the same query works in Oracle9i Database and later:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY o.year, ROLLUP (o.year, o.month, r.name);       YEAR MONTH     REGION                    TOTAL ---------- --------- -------------------- ----------       2000 January   Mid-Atlantic            1221394       2000 January   New England             1018430       2000 January   Southeast US             758042       2000 January                           2997866       2000 February  Mid-Atlantic             857352       2000 February  New England             1231492       2000 February  Southeast US            1236846       2000 February                          3325690       2000 March     Mid-Atlantic            1274062       2000 March     New England             1132966       2000 March     Southeast US            1311986       2000 March                             3719014       2001 January   Mid-Atlantic             610697       2001 January   New England              509215       2001 January   Southeast US             379021       2001 January                           1498933       2001 February  Mid-Atlantic             428676       2001 February  New England              615746       2001 February  Southeast US             618423       2001 February                          1662845       2001 March     Mid-Atlantic             637031       2001 March     New England              566483       2001 March     Southeast US             655993       2001 March                             1859507       2000                                  10042570       2001                                   5021285       2000                                  10042570       2001                                   5021285 28 rows selected.

Repetition of o.year in the GROUP BY clause as well as in the ROLLUP operation repeats the summary rows of each year in the output and suppresses the grand total. Repetition of column names in a GROUP BY clause isn't very useful, but it's worth knowing that such constructs are allowed in Oracle9i and later.

13.2.2 Grouping on Composite Columns

Oracle8i supports grouping on individual columns only. Oracle9i extends the grouping operations to include grouping on composite columns. A composite column is a collection of two or more columns, but their values are treated as one for the grouping computation. Oracle8i allows group operations of the form ROLLUP (a,b,c), while, Oracle9i allows group operations of the form ROLLUP (a,(b,c)) as well. In this case, (b,c) is treated as one column for the purpose of the grouping computation. For example:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP ((o.year, o.month),r.name);       YEAR MONTH     REGION                    TOTAL ---------- --------- -------------------- ----------       2000 January   Mid-Atlantic            1221394       2000 January   New England             1018430       2000 January   Southeast US             758042       2000 January                           2997866       2000 February  Mid-Atlantic             857352       2000 February  New England             1231492       2000 February  Southeast US            1236846       2000 February                          3325690       2000 March     Mid-Atlantic            1274062       2000 March     New England             1132966       2000 March     Southeast US            1311986       2000 March                             3719014       2001 January   Mid-Atlantic             610697       2001 January   New England              509215       2001 January   Southeast US             379021       2001 January                           1498933       2001 February  Mid-Atlantic             428676       2001 February  New England              615746       2001 February  Southeast US             618423       2001 February                          1662845       2001 March     Mid-Atlantic             637031       2001 March     New England              566483       2001 March     Southeast US             655993       2001 March                             1859507                                             15063855 25 rows selected.

In this example, two columns (o.year, o.month) are treated as one composite column. This causes Oracle to treat the combination of year and month as one dimension, and the summary rows are computed accordingly. Although this query is not allowed in Oracle8i, you can fake composite column groupings in Oracle8i by using the concatenation operator (||) to combine two columns and treat the result as one composite column. Oracle8i can then produce the same result as the previous query in Oracle 9i. For example:

SELECT TO_CHAR(o.year)||' '||TO_CHAR(TO_DATE(o.month,'MM'),'Month')         Year_Month,         r.name region, SUM(o.tot_sales)  FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY  ROLLUP (TO_CHAR(o.year)||' '||         TO_CHAR(TO_DATE(o.month,'MM'),'Month'), r.name); YEAR_MONTH           REGION               SUM(O.TOT_SALES) -------------------- -------------------- ---------------- 2000 February        Mid-Atlantic                   857352 2000 February        New England                   1231492 2000 February        Southeast US                  1236846 2000 February                                      3325690 2000 January         Mid-Atlantic                  1221394 2000 January         New England                   1018430 2000 January         Southeast US                   758042 2000 January                                       2997866 2000 March           Mid-Atlantic                  1274062 2000 March           New England                   1132966 2000 March           Southeast US                  1311986 2000 March                                         3719014 2001 February        Mid-Atlantic                   428676 2001 February        New England                    615746 2001 February        Southeast US                   618423 2001 February                                      1662845 2001 January         Mid-Atlantic                   610697 2001 January         New England                    509215 2001 January         Southeast US                   379021 2001 January                                       1498933 2001 March           Mid-Atlantic                   637031 2001 March           New England                    566483 2001 March           Southeast US                   655993 2001 March                                         1859507                                                   15063855 25 rows selected.

This query converts the numeric month into the string expression of the name of the month and concatenates it with the string representation of the year. The same expression has to be used in the SELECT list and the ROLLUP clause. The expression TO_CHAR(o.year)||' '||TO_CHAR(TO_DATE( o.month,'MM'),'Month') is treated as one composite column.

13.2.3 Concatenated Groupings

With Oracle9i and later, you can have multiple ROLLUP, CUBE, or GROUPING SETS operations, or a combination of these under the GROUP BY clause in a query. This is not allowed in Oracle8i. You will get an error message if you attempt the following query in Oracle8i:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP (o.year, o.month), ROLLUP(r.name);                                    * ERROR at line 6: ORA-30489: Cannot have more than one rollup/cube expression list

However, the same query works in Oracle9i and later:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP (o.year, o.month), ROLLUP(r.name);       YEAR MONTH     REGION                    TOTAL ---------- --------- -------------------- ----------       2000 January   Mid-Atlantic            1221394       2000 January   New England             1018430       2000 January   Southeast US             758042       2000 January                           2997866       2000 February  Mid-Atlantic             857352       2000 February  New England             1231492       2000 February  Southeast US            1236846       2000 February                          3325690       2000 March     Mid-Atlantic            1274062       2000 March     New England             1132966       2000 March     Southeast US            1311986       2000 March                             3719014       2000           Mid-Atlantic            3352808       2000           New England             3382888       2000           Southeast US            3306874       2000                                  10042570       2001 January   Mid-Atlantic             610697       2001 January   New England              509215       2001 January   Southeast US             379021       2001 January                           1498933       2001 February  Mid-Atlantic             428676       2001 February  New England              615746       2001 February  Southeast US             618423       2001 February                          1662845       2001 March     Mid-Atlantic             637031       2001 March     New England              566483       2001 March     Southeast US             655993       2001 March                             1859507       2001           Mid-Atlantic            1676404       2001           New England             1691444       2001           Southeast US            1653437       2001                                   5021285                      Mid-Atlantic            5029212                      New England             5074332                      Southeast US            4960311                                             15063855 36 rows selected.

When you have multiple grouping operations (ROLLUP, CUBE, or GROUPING SETS) in a GROUP BY clause, what you have is called a concatenated grouping. The result of the concatenated grouping is to produce a cross-product of groupings from each grouping operation. Therefore, the query:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY ROLLUP(o.year),  ROLLUP (o.month), ROLLUP (r.name);

behaves as a CUBE and produces the same result as the query:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year, o.month, r.name);

Since a CUBE contains aggregates for all possible combinations of the grouping columns, the concatenated grouping of CUBES is no different from a regular CUBE, and all the following queries return the same result as the query shown previously:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year, o.month), CUBE (r.name); SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year), CUBE (o.month, r.name); SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY CUBE (o.year), CUBE (o.month), CUBE (r.name);

13.2.3.1 Concatenated groupings with GROUPING SETS

Concatenated groupings come in handy while using GROUPING SETS. Since GROUPING SETS produces only the subtotal rows, you can specify just the aggregation levels you want in your output by using a concatenated grouping of GROUPING SETS. The concatenated grouping of GROUPING SETS (a,b) and GROUPING SETS (c,d) will produce aggregate rows for the aggregation levels (a,c), (a,d), (b,c), and (b,d). The concatenated grouping of GROUPING SETS (a,b) and GROUPING SETS (c) will produce aggregate rows for the aggregation levels (a,c) and (b,c). For example:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.year, o.month), GROUPING SETS (r.name);       YEAR MONTH     REGION                    TOTAL ---------- --------- -------------------- ----------       2000           Mid-Atlantic            3352808       2000           New England             3382888       2000           Southeast US            3306874       2001           Mid-Atlantic            1676404       2001           New England             1691444       2001           Southeast US            1653437            January   Mid-Atlantic            1832091            January   New England             1527645            January   Southeast US            1137063            February  Mid-Atlantic            1286028            February  New England             1847238            February  Southeast US            1855269            March     Mid-Atlantic            1911093            March     New England             1699449            March     Southeast US            1967979 15 rows selected.

The concatenated grouping GROUP BY GROUPING SETS (O.YEAR, O.MONTH), GROUPING SETS (R.NAME) in this example produces rows for aggregate levels (O.YEAR, R.NAME) and (O.MONTH, R.NAME). Therefore, you see aggregate rows for (Year, Region) and (Month, Region) combinations in the output. The following example extends the previous query:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.year, o.month), GROUPING SETS (o.year, r. name);           YEAR MONTH     REGION                    TOTAL     ---------- --------- -------------------- ----------  1:      2000                                  10042570  2:      2001                                   5021285  3:      2000 January                           2997866  4:      2000 February                          3325690  5:      2000 March                             3719014  6:      2001 January                           1498933  7:      2001 February                          1662845  8:      2001 March                             1859507  9:      2000           Mid-Atlantic            3352808 10:      2000           New England             3382888 11:      2000           Southeast US            3306874 12:      2001           Mid-Atlantic            1676404 13:      2001           New England             1691444 14:      2001           Southeast US            1653437 15:           January   Mid-Atlantic            1832091 16:           January   New England             1527645 17:           January   Southeast US            1137063 18:           February  Mid-Atlantic            1286028 19:           February  New England             1847238 20:           February  Southeast US            1855269 21:           March     Mid-Atlantic            1911093 22:           March     New England             1699449 23:           March     Southeast US            1967979 23 rows selected.

This example produces four grouping combinations. Table 13-1 describes the various grouping combinations produced by this query and references their corresponding row numbers in the output.

Table 13-1. Grouping combinations

Grouping combination

Corresponding rows

(o.year, o.year)

1-2

(o.year, r.name)

9-14

(o.month, o.year)

3-8

(o.month, r.name)

15-23


The GROUPING SETS operation is independent of the order of columns. Therefore, the following two queries will produce the same results as shown previously:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.year, r.name), GROUPING SETS (o.year, o.month); SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.month, o.year), GROUPING SETS (r.name, o.year);

It is permissible to have a combination of ROLLUP, CUBE, and GROUPING SETS in a single GROUP BY clause, as in the following example:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (o.month, o.year), ROLLUP(r.name), CUBE (o.year);

However, the output from such queries seldom makes any sense. You should carefully evaluate the need for such a query if you intend to write one.

13.2.3.2 ROLLUP and CUBE as arguments to GROUPING SETS

Unlike the ROLLUP and CUBE operations, the GROUPING SETS operation can take a ROLLUP or a CUBE as its argument. As you have seen earlier, GROUPING SETS produces only subtotal rows. However, there are times when you may need to print the grand total along with the subtotals. In such situations, you can perform the GROUPING SETS operation on ROLLUP operations, as in the following example:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (ROLLUP (o.year),                          ROLLUP (o.month),                          ROLLUP (r. name));       YEAR MONTH     REGION                    TOTAL ---------- --------- -------------------- ----------                      Mid-Atlantic            5029212                      New England             5074332                      Southeast US            4960311            January                           4496799            February                          4988535            March                             5578521       2000                                  10042570       2001                                   5021285                                             15063855                                             15063855                                             15063855 11 rows selected.

This example produces the subtotals for each dimension, as expected from the regular GROUPING SETS operations. Also, it produces the grand total across all the dimensions. However, you get three identical grand-total rows. The grand-total rows are repeated because they are produced by each ROLLUP operation inside the GROUPING SETS. If you insist on only one grand-total row, you may use the DISTINCT keyword in the SELECT clause:

SELECT Distinct o.year,                  TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month,                 r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (ROLLUP (o.year), ROLLUP (o.month),                          ROLLUP (r. name));  YEAR MONTH     REGION                    TOTAL ----- --------- -------------------- ----------  2000                                  10042570  2001                                   5021285        February                         4988535        January                          4496799        March                            5578521                 Mid-Atlantic            5029212                 New England             5074332                 Southeast US            4960311                                        15063855 9 rows selected.

In this example, the DISTINCT keyword eliminated the duplicate grand-total rows. You can also eliminate duplicate rows by using the GROUP_ID function, as discussed later in this chapter.

If you are interested in subtotals and totals on composite dimensions, you can use composite or concatenated ROLLUP operations within GROUPING SETS, as in the following example:

SELECT o.year, TO_CHAR(TO_DATE(o.month, 'MM'), 'Month') month, r.name region, SUM(o.tot_sales) total FROM all_orders o JOIN region r ON r.region_id = o.region_id WHERE o.month BETWEEN 1 AND 3 GROUP BY GROUPING SETS (ROLLUP (o.year, o.month), ROLLUP(r.name));      YEAR MONTH     REGION                    TOTAL --------- --------- -------------------- ----------                     Mid-Atlantic            5029212                     New England             5074332                     Southeast US            4960311      2000 January                           2997866      2000 February                          3325690      2000 March                             3719014      2000                                  10042570      2001 January                           1498933      2001 February                          1662845      2001 March                             1859507      2001                                   5021285                                            15063855                                            15063855 13 rows selected.

This query generates subtotals for (year, month) combinations, subtotals for the region, subtotals for the year, and the grand total. Note that there are duplicate grand-total rows because of the multiple ROLLUP operations within the GROUPING SETS operation.



Mastering Oracle SQL
Mastering Oracle SQL, 2nd Edition
ISBN: 0596006322
EAN: 2147483647
Year: 2003
Pages: 154

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net