4.3 ANSI SQL Window Functions

   

SQL2003 allows for a window_clause in aggregate function calls, the addition of which makes those functions into window functions. Both Oracle and DB2 support this window function syntax. This section describes how to use the window_clause within Oracle and DB2.

Oracle tends to refer to window functions as analytic functions .


Window, or analytic, functions are similar to standard aggregate functions in that they operate on multiple rows, or groups of rows, within the result set returned from a query. However, the groups of rows that a window function operates on are defined not by a GROUP BY clause, but by partitioning and windowing clauses. Furthermore, the order within these groups is defined by an ordering clause, but that order only affects function evaluation, and has no effect on the order in which rows are returned by the query.

Window functions are the last items in a query to be evaluated except for the ORDER BY clause. Because of this late evaluation, window functions cannot be used within the WHERE , GROUP BY , or HAVING clauses.


4.3.1 SQL2003's Window Syntax

SQL2003 specifies the following syntax for window functions:

   FUNCTION_NAME   (   expr   ) OVER {   window_name   (   window_specification   )}   window_specification   ::= [   window_name   ][   partitioning   ][   ordering   ][   framing   ]   partitioning   ::= PARTITION BY   value   [,   value   ...] [COLLATE   collation_name   ]   ordering   ::= ORDER [SIBLINGS] BY   rule   [,   rule   ...]   rule   ::= {   value     position     alias   } [ASCDESC] [NULLS {FIRSTLAST}]   framing   ::= {ROWSRANGE} {   start     between   } [   exclusion   ]   start   ::= {UNBOUNDED PRECEDING   unsigned-integer   PRECEDINGCURRENT ROW}   between   ::= BETWEEN   bound   AND   bound     bound   ::= {   start   UNBOUNDED FOLLOWING   unsigned-integer   FOLLOWING}   exclusion   ::= {EXCLUDE CURRENT ROWEXCLUDE GROUP               EXCLUDE TIESEXCLUDE NO OTHERS} 

4.3.2 Oracle's Window Syntax

Oracle's window function syntax is as follows :

   FUNCTION_NAME   (   expr   ) OVER (   window_clause   )   window_clause   ::= [   partitioning   ] [   ordering   [   framing   ]]   partitioning   ::= PARTITION BY   value   [,   value   ...]   ordering   ::= ORDER [SIBLINGS] BY   rule   [,   rule   ...]   rule   ::= {   value     position     alias   } [ASCDESC]           [NULLS {FIRSTLAST}]   framing   ::= {ROWSRANGE} {   not_range     begin   AND   end   }   not_range   ::= {UNBOUNDED PRECEDING               CURRENT ROW   value   PRECEDING}   begin   ::= {UNBOUNDED PRECEDING           CURRENT ROW   value   {PRECEDINGFOLLOWING}}   end   ::= {UNBOUNDED FOLLOWING         CURRENT ROW   value   {PRECEDINGFOLLOWING}} 

4.3.3 DB2's Window Syntax

DB2's syntax is similar to Oracle's. For OLAP, ranking, and numbering functions, DB2 allows the following syntax:

   FUNCTION_NAME   (   expr   ) OVER (   window_clause   )   window_clause   ::= [   partitioning   ] [   ordering   ]   partitioning   ::= PARTITION BY (   value   [,   value   ...])   ordering   ::= {ORDER BY   rule   [,   rule   ...]  ORDER OF   table_name   }   rule   ::= {   value     position     alias   } [ASCDESC [NULLS {FIRSTLAST}]] 

When aggregate functions (e.g. AVG ) are used as window functions, DB2 allows the addition of a framing clause:

   FUNCTION_NAME   (   expr   ) OVER (   window_clause   )   window_clause   ::= [   partitioning   ] [   ordering   [   framing   ]] [   all     framing   ]   all   ::= RANGE UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING   partitioning   ::= PARTITION BY (   value   [,   value   ...])   ordering   ::= {ORDER BY   rule   [,   rule   ...]  ORDER OF   table_name   }   rule   ::= {   value     position     alias   } [ASCDESC [NULLS {FIRSTLAST}]]   framing   ::= {ROWSRANGE} {   group_start     group_between     group_end   }   group_start   ::= {UNBOUNDED PRECEDING   unsigned-integer   PRECEDING                 CURRENT ROW}   group_between   ::= BETWEEN {UNBOUNDED PRECEDING   unsigned_integer   PRECEDING   unsigned_integer   FOLLOWINGCURRENT ROW}                   AND {UNBOUNDED FOLLOWING   unsigned_integer   PRECEDING   unsigned_integer   FOLLOWINGCURRENT ROW}   group_end   ::= UNBOUNDED FOLLOWING   unsigned-integer   FOLLOWING} 

4.3.4 Partitioning

Partitioning the rows operated on by the partition clause is similar to using the GROUP BY expression on a standard SELECT statement. The partitioning clause takes a list of expressions that will be used to divide the result set into groups. We'll use the following table as the basis for some examples:

  SELECT * FROM odd_nums;  NUM        ODD ---------- ----------          0          0          1          1          2          0          3          1 

The following results illustrate the effects of partitioning by ODD . The sum of the even numbers is 2 (0+2), and the sum of the odd numbers is 4 (1+3). The second column of the result set reports the sum of all values in the partition to which that row belongs. Yet all the detail rows are returned. The query provides summary results in the context of detail rows:

  SELECT NUM, SUM(NUM) OVER (PARTITION BY ODD) S   FROM ODD_NUMS;  NUM            S ---------      ---------- 0              2 2              2 1              4 3              4 

Not using a partitioning clause at all will sum all of the numbers in the NUM column for each row returned by the query. In effect, the entire result set is treated as a single, large partition:

  SELECT NUM, SUM(NUM) OVER ( ) S FROM ODD_NUMS;  NUM            S ---------      ---------- 0              6 1              6 2              6 3              6 

4.3.5 Ordering

You specify the order of the rows on which an analytic function operates using the ordering clause. However, this analytic ordering clause does not define the result set ordering. To define the overall result set ordering, you must use the query's ORDER BY clause. The following use of Oracle's FIRST_VALUE function illustrates the effect of different orderings of the partitions:

  SELECT NUM,   SUM(NUM) OVER (PARTITION BY ODD) S,   FIRST_VALUE(NUM) OVER (PARTITION BY ODD ORDER BY NUM ASC) first_asc,   FIRST_VALUE(NUM) OVER (PARTITION BY ODD ORDER BY NUM DESC) first_desc   FROM ODD_NUMS;  NUM          S  FIRST_ASC FIRST_DESC ---------- ---------- ---------- ----------          0          2          0          2          2          2          0          2          1          4          1          3          3          4          1          3 

As you can see, the ORDER BY clauses in the window function invocations affect the ordering of the rows in the respective partitions when those functions are evaluated. Thus, ORDER BY NUM ASC orders partitions in ascending order, resulting in for the first value in the even-number partition and 1 for the first value in the odd-number partition. ORDER BY NUM DESC has the opposite effect.

The preceding query also illustrates an important point: using window functions, you can summarize and order many different ways in the same query.


4.3.6 Grouping or Windowing

Many analytic functions also allow you to specify a virtual, moving window surrounding a row within a partition. You do this using the framing clause. Such moving windows are useful for running calculations such as a running total.

The following, Oracle-based example uses the framing clause on the analytic variant of SUM to calculate a running sum of the values in the first column. No partitioning clause is used, so each invocation of SUM operates over the entire result set. However, the ORDER BY clause sorts the rows for SUM in ascending order of NUM 's value, and the BETWEEN clause (which is the windowing clause) causes each invocation of SUM to include values for NUM only up through the current row. Each successive invocation of SUM includes yet another value for NUM , in order, from the lowest value of NUM to the greatest:

  SELECT NUM, SUM(NUM) OVER (ORDER BY NUM ROWS   BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) S FROM ODD_NUMS;  NUM            S ---------      ---------- 0              0 1              1 2              3 3              6 

This example's a bit too easy, as the order of the final result set happens to match the order of the running total. That doesn't need to be the case. The following example generates the same results, but in a different order. You can see that the running total values are appropriate for each value of NUM , but the rows are presented in a different order than before. The result set ordering is completely independent of the ordering used for window function calculations:

  SELECT NUM, SUM(NUM) OVER (ORDER BY NUM ROWS   BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) S FROM ODD_NUMS   ORDER BY NUM DESC;  NUM          S ---------- ----------          3          6          2          3          1          1          0          0 

4.3.7 List of Window Functions

SQL2003 specifies that any aggregate function may also be used as a window function. Both Oracle and DB2 largely follow the standard in that respect, so you'll find that you can take just about any aggregate function ( certainly the standard ones) and apply to it the window function syntax described in the preceding sections.

In addition to the aggregate functions, SQL2003 defines the window functions described in the following sections. Only Oracle and DB2 currently implement these functions. All examples use the following table and data, which is a variation on the ODD_NUMS table used earlier to illustrate the concepts of partitioning, ordering, and grouping:

  SELECT * FROM test4;  NUM        ODD ---------- ----------          0          0          1          1          2          0          3          1          3          1          4          0          5          1 

Platform-specific window functions for Oracle (there are none for DB2) are included in the lists found under Section 4.5 later in this chapter.

CUME_DIST( )


Calculates the cumulative distribution, or relative rank, of the current row to other rows in the same partition. The calculation for a given row is as follows:

 number of peer or preceding rows / number of rows in partition 

Because the result for a given row depends on the number of rows preceding that row in the same partition, it's important to always specify an ORDER BY clause when invoking this function.

SQL2003 Syntax

 CUME_DIST( ) OVER {   window_name   (   window_specification   )} 

DB2

DB2 does not support the CUME_DIST( ) window function.

Oracle

Oracle does not allow the framing portion of the windowing syntax. Oracle requires the ordering clause:

 CUME_DIST( ) OVER ([   partitioning   ]   ordering   ) 

Example

The following Oracle-based example uses CUME_DIST( ) to generate a relative rank for each row, ordering by NUM , after partitioning the data by ODD :

  SELECT NUM, ODD, CUME_DIST( ) OVER   (PARTITION BY ODD ORDER BY NUM) cumedist   FROM test4;  NUM        ODD   CUMEDIST ---------- ---------- ----------          0          0 .333333333          2          0 .666666667          4          0          1          1          1        .25          3          1        .75          3          1        .75          5          1          1 

Following is an explanation of the calculation behind the rank for the row in which NUM =0:

  1. Because of the ORDER BY clause, the rows in the partition are ordered as follows:

NUM=0
NUM=2
NUM=4
  1. There are no rows preceding NUM =0.

  2. There is one row that is a peer of NUM=0, and that is the NUM =0 row itself. Thus, the divisor is 1.

  3. There are three rows in the partition as a whole, making the dividend 3.

  4. The result of 1/3 is.33 repeating, as shown in the example output.

DENSE_RANK( )

Assigns a rank to each row in a partition, which should be ordered in some manner. The rank for a given row is computed by counting the number of rows preceding the row in question, and then adding 1 to the result. Rows with duplicate ORDER BY values will rank the same. Unlike the case with RANK( ) , gaps in rank numbers will not result from two rows sharing the same rank.

SQL2003 Syntax

 DENSE_RANK( ) OVER {   window_name   (   window_specification   )} 

DB2

DB2 requires the ordering clause and does not allow the framing clause:

 DENSE_RANK( ) OVER ([   partitioning   ]   ordering   ) 

Oracle

Oracle also requires the ordering clause and does not allow the framing clause:

 DENSE_RANK( ) OVER ([   partitioning   ]   ordering   ) 

Example

Compare the results from the following Oracle-based example to those shown in the section on the RANK( ) function:

  SELECT NUM, DENSE_RANK( ) OVER (ORDER BY NUM) rank   FROM test4;  NUM       RANK ---------- ----------          0          1          1          2          2          3          3          4          3          4          4          5          5          6 

The two rows where NUM=3 are both ranked at #3. The next higher row is ranked at #4. Rank numbers are not skipped , hence the term "dense."

RANK( )

Assigns a rank to each row in a partition, which should be ordered in some manner. The rank for a given row is computed by counting the number of rows preceding the row in question, and then adding 1 to the result. Rows with duplicate ORDER BY values will rank the same, and will lead to subsequent gaps in rank numbers.

SQL2003 Syntax

 RANK( ) OVER {   window_name   (   window_specification   )} 

DB2

DB2 requires the ordering clause and does not allow the framing clause:

 RANK( ) OVER ([   partitioning   ]   ordering   ) 

Oracle

Oracle also requires the ordering clause and does not allow the framing clause:

 RANK( ) OVER ([   partitioning   ]   ordering   ) 

Example

The following Oracle-based example uses the NUM column to rank the rows in the test4 table:

  SELECT NUM, RANK( ) OVER (ORDER BY NUM) rank   FROM test4;  NUM       RANK ---------- ----------          0          1          1          2          2          3          3          4          3          4          4          6          5          7 

Because both rows where NUM =3 rank the same at #4, the next higher row will be ranked at #6. The #5 rank is skipped.

PERCENT_RANK

Computes the relative rank of a row by dividing that row's rank less 1 by the number of rows in the partition, also less 1:

 (rank - 1) / (rows - 1) 

Compare this calculation to that used for CUME_DIST .

SQL2003 Syntax

 PERCENT_RANK( ) OVER ({   window_name   (   window_specification   )} 

DB2

DB2 does not support the PERCENT_RANK( ) window function.

Oracle

Oracle requires the ordering clause and does not allow the framing clause:

 PERCENT_RANK( ) OVER ([   partitioning   ]   ordering   ) 

Example

The following, Oracle-based example assigns a relative rank to values of NUM , partitioning the data on the ODD column:

  SELECT NUM, ODD, PERCENT_RANK( ) OVER   (PARTITION BY ODD ORDER BY NUM) cumedist   FROM test4;  NUM        ODD   CUMEDIST ---------- ---------- ----------          0          0          0          2          0         .5          4          0          1          1          1          0          3          1 .333333333          3          1 .333333333          5          1          1 

Following is an explanation of the calculation behind the rank for the row in which NUM =2:

  1. Row NUM =2 is the second row in its partition; thus, it ranks #2.

  2. Subtract 1 from 2 to get a divisor of 1.

  3. The dividend is the total number of rows in the partition, or 3.

  4. Subtract 1 from 3 to get a dividend of 2.

  5. The result of 1/3 is.33 repeating, as shown in the example.

ROW_NUMBER

Assigns a unique number to each row in a partition.

SQL2003 Syntax

 ROW_NUMBER( ) OVER ({   window_name   (   window_specification   )} 

DB2

DB2 does not allow the framing clause, and it makes the ordering clause optional:

 ROW_NUMBER( ) OVER ([   partitioning   ]     [   ordering   ]) 

Oracle

Oracle requires the ordering clause and does not allow the framing clause:

 ROW_NUMBER( ) OVER ([   partitioning   ]   ordering   ) 

Example

  SELECT NUM, ODD, ROW_NUMBER( ) OVER   (PARTITION BY ODD ORDER BY NUM) cumedist   FROM test4;  NUM        ODD   CUMEDIST ---------- ---------- ----------          0          0          1          2          0          2          4          0          3          1          1          1          3          1          2          3          1          3 

 5          1          4 



SQL in a Nutshell
SQL in a Nutshell (In a Nutshell (OReilly))
ISBN: 0596518846
EAN: 2147483647
Year: 2003
Pages: 78

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net