Section 3.3. Parallelizing Table Functions

3.3. Parallelizing Table Functions

Moving functions into the paradigm of SELECT statements allows them to take advantage of another powerful facet of Oracleparallel query processing.

Oracle has long provided parallelism as a way to conquer large queries by splitting processing among multiple Parallel Query (PQ) servers, using each to calculate a portion of the results, then assembling a final result set at the end. Behind the scenes, Oracle decides how to distribute the work across the available PQ servers to get the best results. The DBA can influence Oracle's decision somewhat by setting degrees of parallelism for tables or implementing specific partitioning schemes, but, in the end, the database decides how best to execute the query.

3.3.1. Taking Advantage of Parallel Query

Parallelization helps all types of queries, whether or not they use table functions. Queries that do not use table functions take advantage of the extra throughput of PQ to assemble their final result set faster. Queries that do use table functions also benefit from the added throughput of PQ servers, but they take things a step further by assembling results sets along the way.

Let's illustrate using the following table, which holds one record for every single transaction at a large bank:

     SQL> DESC acct_transactions      Name                                      Null?    Type      ----------------------------------------- -------- ------------      AREA                                               VARCHAR2(10)      TRX_DATE                                           DATE      TRX_AMT                                            NUMBER

Suppose that I want to build a function to summarize the transactions by area. The first thing I considered implementing here was a query grouped by area. But that is negated by the need for a series of complex validation processes; for the sake of demonstration, I've hidden those in the function named super_complex_validation in the following examples.

Here are the results I'm looking for based on a demonstration data set, assuming that all transactions pass the complex validation.

     SQL> SELECT area,       2         SUM(trx_amt)       3    FROM acct_transactions       4  GROUP BY area;     AREA       SUM(TRX_AMT)     ---------- ------------     1                   460     10                  550     2                   470     3                   480     4                   490     5                   500     6                   510     7                   520     8                   530     9                   540

To achieve this result faster with a large number of records, I decide to use parallel processing. With table functions, this means that Oracle will run multiple instances of the table function in parallel and distribute results from the passed-in REF cursor among them, as shown in Figure 3-1.

Figure 3-1. Parallel table function

3.3.2. Distributing Records

One really cool thing about parallel table functions is that you can tell Oracle how to distribute records among the parallel function instances. Two different aspects of distribution can be specified. The first is partitioning how Oracle should decide which records to send to which function instance. The second is streaming how each function instance should order the records partitioned to it. The options for each are shown below.

Partitioning: Records can be partitioned based on ranges of one or more columns, hash values based on one or more columns, or simply as Oracle sees fit. The PARTITION BY clause specifies the specific partitioning option for the table function, as shown in the following examples.
Streaming: Records can be ordered or clustered by specific columns. The ORDER or CLUSTER keyword specifies the specific streaming option for the table function, as shown in the following examples.

I'll explain each of these components and how they interact while working through development of my account transaction summary function.

3.3.2.1. Random partitioning (PARTITION BY ANY)

First, I'll set up the function and tell Oracle to run it in parallel, but simply partition the records as it sees fit. The PARALLEL_ENABLE clause in the function header tells Oracle that this function has been coded with parallelism in mind and that it would really like to take advantage of this feature. The PARTITION BY ANY parameter supplied in this example states that the records returned by the REF cursor be partitioned randomly, in whatever order Oracle wants across the available PQ servers. In other words, this example is using the PQ servers only to gain throughput.

     /* File on web: pipelined.sql */     CREATE OR REPLACE FUNCTION area_summary ( p_cursor SYS_REFCURSOR )                                RETURN area_summary_t                                PIPELINED                                PARALLEL_ENABLE ( PARTITION p_cursor BY ANY ) AS       v_row   acct_transactions%ROWTYPE;       v_total NUMBER := NULL;       v_area  acct_transactions.area%TYPE;     BEGIN       -- for every transaction       FETCH p_cursor INTO v_row;       LOOP         EXIT WHEN p_cursor%NOTFOUND;         -- if we pass the extensive validation check         IF super_complex_validation(v_row.trx_date,v_row.trx_amt) THEN           -- set initial total or add to current area total           -- or return an area total as required           IF v_total IS NULL THEN             v_total := v_row.trx_amt;             v_area := v_row.area;           ELSIF v_row.area = v_area THEN                v_total := v_total + v_row.trx_amt;           ELSE             PIPE ROW(area_summary_o(v_area,v_total));             v_total := v_row.trx_amt;             v_area := v_row.area;           END IF;         END IF;  -- extensive validation         FETCH p_cursor INTO v_row;       END LOOP; -- every transaction       PIPE ROW(area_summary_o(v_area,v_total));     END;

The function is executed using a SELECT statement with a SELECT statement as a parameter. I still get a kick out of saying that!

     SELECT *       FROM TABLE(area_summary(CURSOR(SELECT *                                        FROM acct_transactions)));

The results will be random and unpredictable because the function relies on receiving records ordered by area, but Oracle has been left on its own to decide how to actually disperse the records. Thus, specifying PARTITION BY ANY will not help my function. (I do, however, get the incorrect results really fast!)

3.3.2.2. Range partitioning (PARTITION BY RANGE)

To take best advantage of parallel processing, I need to tell Oracle that all records for any particular area must be sent to the same function instance. For example, if three parallel function instances are run, then all records for area 7 must go to instance 1, 2, or 3. This method is specified using the RANGE clause, as shown in the new function header here:

     CREATE OR REPLACE FUNCTION area_summary ( p_cursor ref_cursors.acct_trx_curs )                                RETURN area_summary_t                                PIPELINED                                PARALLEL_ENABLE ( PARTITION p_cursor BY RANGE(area) ) AS

Notice that the passed REF cursor has been changed to a strongly typed version. That is, if Oracle is to correctly partition records, it has to know what their structure is beforehand. My ref_cursors package contains a single line of code:

     PACKAGE ref_cursors IS       TYPE acct_trx_curs IS REF CURSOR RETURN acct_transactions%ROWTYPE;     END;

Now, the results should be better.

     SQL> SELECT *       2    FROM TABLE(area_summary(CURSOR(SELECT *       3                                     FROM acct_transactions)));     AREA              AMT     ---------- ----------     6                  96     7                  97     8                  98     9                  99     6                 414     7                 423     8                 432     9                 441     1                 369     10                450     2                 378     3                 387     4                 396     5                 405     1                  91     2                  92     3                  93     4                  94     5                  95     10                100     20 rows selected.

3.3.2.3. Order streaming (ORDER)

That's still not quite what I'm looking for. The problem is that even though records are being partitioned across function instances by area, they aren't being ordered within each function instance. For example, the records for area 6 all went to the same function instance, but another area was mingled in, which caused redundant, incomplete summaries for area 6. I need to advise Oracle that entries must be ordered by area within each function instance, as well. That's done using the ORDER clause shown below.

     CREATE OR REPLACE FUNCTION area_summary ( p_cursor ref_cursors.acct_trx_curs )                                RETURN area_summary_t                                PIPELINED                                PARALLEL_ENABLE ( PARTITION p_cursor BY RANGE(area) )                                ORDER p_cursor BY (area) AS

Now let's see what I get.

     SQL> SELECT *       2    FROM TABLE(area_summary(CURSOR(SELECT *       3                                     FROM acct_transactions)));     AREA              AMT     ---------- ----------     1                 460     10                550     2                 470     3                 480     4                 490     5                 500     6                 510     7                 520     8                 530     9                 540     10 rows selected.

Success! The totals are now correct because the records returned by the cursor were partitioned across multiple parallel function instances by area and ordered by area within them, as shown in Figure 3-2.

3.3.2.4. Hash partitioning (PARTITION BY HASH)

The preceding examples demonstrated random (ANY) partitioning and range partitioning. If you select the third partitioning option, known as hash partitioning , a hash value is calculated based on specified columns, and that value determines the function instance to which a record should be sent. Duplicate values will produce the same hash, so it's safe to assume that record distribution will work for my account

Figure 3-2. Parallel table function partitioned and ordered by area

transaction summary function. As a matter of fact, the PARTITION BY HASH option is usually a little bit faster than the PARTITION BY RANGE option.

Here's the syntax:

     CREATE OR REPLACE FUNCTION area_summary ( p_cursor ref_cursors.acct_trx_curs )                                RETURN area_summary_t                                PIPELINED                                PARALLEL_ENABLE ( PARTITION p_cursor BY HASH(area) )                                ORDER p_cursor BY (area) AS

3.3.2.5. Cluster streaming (CLUSTER)

In an earlier version of my summary function, I specified the ORDER clause because that function relies on records being ordered by area within each function instance. Another option, CLUSTER, guarantees that records with the same values for the specified columns will be clustered within the function instance. Note, however, that this option does not order the values.

Because my function cares only about a single column, I can easily switch to clustering. Here's the syntax for that.

     CREATE OR REPLACE FUNCTION area_summary ( p_cursor ref_cursors.acct_trx_curs )                                RETURN area_summary_t                                PIPELINED                                PARALLEL_ENABLE ( PARTITION p_cursor BY HASH(area) )                                CLUSTER p_cursor BY (area) AS

I have found the CLUSTER option for single columns to be faster than the ORDER option.

3.3.3. Which Options Should You Choose?

With all of these partitioning and streaming choices, which one makes sense for your particular situation?

PARTITION BY ANY (random)

Use this option if you just want extra throughput and the ordering of records is not important at all. For example, if you need to perform calculations on every record from the REF cursor and each calculation is independent, then this is the way to go because Oracle will distribute the records without worrying about their order.

I have yet to come across a situation requiring random partitioning in conjunction with the ORDER or CLUSTER options, but these combinations are syntactically allowed.

PARTITION BY RANGE

Specify this option if your function relies on certain records being processed together and the records are spread evenly among the values. This will ensure that each parallel function instance gets a similar amount of work to do.

Combining this partitioning option with either ORDER BY or CLUSTER BY is valid. Note that the clustering approach will be slightly faster.

PARTITION BY HASH

Specify this option if your function relies on certain records being processed together and the spread is somewhat skewed. The hash algorithm provides a better chance of each parallel function instance's getting a similar amount of work to do.

Combining this partitioning option with either ORDER BY or CLUSTER BY is valid. Note that the clustering approach will be slightly faster.

3.3.4. What Does Oracle Do?

At this point, you might be wondering how the database partitions the query results. Does Oracle execute one query per parallel function instance or does it execute a single query and partition its results? Let's find out by looking in the shared pool.

     SQL> SELECT sql_text,       2         parse_calls,       3         executions       4    FROM v$sql       5   WHERE INSTR(UPPER(sql_text),'ACCT_TRANSACTIONS')  > 0       6     AND INSTR(UPPER(sql_text),'SQL_TEXT') = 0       7     AND command_type = 3;     SQL_TEXT                  PARSE_CALLS EXECUTIONS     ------------------------- ----------- ----------     SELECT *   FROM TABLE(are           1          1     a_summary(CURSOR(SELECT *     FROM acct_transactions)))     SELECT "A3"."AREA" "AREA"           1          1     ,"A3"."TRX_DATE" "TRX_DAT     E","A3"."TRX_AMT" "TRX_AM     T" FROM "ACCT_TRANSACTION     S" "A3" ORDER BY "A3"."AR     EA"

There are only two cursors here: the one I executed and the one executed within the table function, each with a parse and execute count of 1. That means that a single cursor is used and that Oracle partitions the returned rows as requested.

3.3.5. Number of PQ Servers to Use

For most parallel query operations, it is possible to limit (or at least influence) the number of parallel query servers that may be used for a particular operation. For example, the degree of parallelism can be specified for a table, and Oracle will do its best to provide the specified number of PQ servers for queries of that table. It's also possible to specify the degree of parallelism (and hence the number of PQ servers used) for any particular query by specifying the appropriate optimizer hints.

Unfortunately, no such option exists for parallelized table functions.

I wish something like the following syntax were available:

     CREATE OR REPLACE FUNCTION area_summary ( p_cursor ref_cursors.acct_trx_curs )                                RETURN area_summary_t                                PIPELINED                                PARALLEL_ENABLE ( PARTITION p_cursor BY HASH(area) )                                DEGREE 3 - invalid - DRH dream syntax                                ORDER p_cursor BY (area) AS...

That way, I could define the maximum number of PQ servers allocated to a single execution of the table function. Despite this minor quibble, I do have to admit that Oracle does a pretty good job of determining the number of PQ servers to utilize.