3.2. Cursors, Pipelining, and NestingSo far, table functions may strike you as a performance booster that might help you out occasionally when you're in a tight spot. The features described in this section, howeverthe use of cursors, pipelining, and nesting with table functionsare powerful enough that they might persuade you to actually tailor your code to make use of them.
I think the best way to demonstrate all of these capabilities is with an example from a Data Warehouse ETL process that extracts information about work orders. The specific function I'll show extracts components of the creation, assignment, and close date of the orders. These components are then passed on for further processing in the ETL. 3.2.1. CursorsWe looked carefully at cursors in Chapter 2, and here they are again! How do cursors and table functions interact? Let's start by looking at a non-pipelined version of a table function. /* File on web: date_parser.sql */ CREATE OR REPLACE FUNCTION date_parse ( p_curs SYS_REFCURSOR ) RETURN order_date_t AS v_order_rec orders%ROWTYPE; v_ret_val order_date_t := order_date_t( ); BEGIN -- for every order in the cursor... LOOP FETCH p_curs INTO v_order_rec; EXIT WHEN p_curs%NOTFOUND; -- extend the array by 3 and populate with cmoponents of the -- orders creation, assignment and close date v_ret_val.EXTEND(3); v_ret_val(v_ret_val.LAST - 2) := order_date_o(v_order_rec.order_number, 'O', TO_CHAR(v_order_rec.create_date,'YYYY'), TO_CHAR(v_order_rec.create_date,'Q'), TO_CHAR(v_order_rec.create_date,'MM')); v_ret_val(v_ret_val.LAST - 1) := order_date_o(v_order_rec.order_number, 'A', TO_CHAR(v_order_rec.assign_date,'YYYY'), TO_CHAR(v_order_rec.assign_date,'Q'), TO_CHAR(v_order_rec.assign_date,'MM')); v_ret_val(v_ret_val.LAST) := order_date_o(v_order_rec.order_number, 'C', TO_CHAR(v_order_rec.close_date,'YYYY'), TO_CHAR(v_order_rec.close_date,'Q'), TO_CHAR(v_order_rec.close_date,'MM')); END LOOP; -- every order in ths cursor RETURN(v_ret_val); END; And here are the results when three orders are queried. ORDER_NUMBER D YEAR QUARTER MONTH ------------ - ---------- ---------- ---------- 1 O 2005 3 8 1 A 2005 3 8 1 C 2005 3 8 2 O 2005 4 10 2 A 2005 4 10 2 C 2005 4 10 3 O 2005 4 12 3 A 2005 4 12 3 C 2005 4 12 You'll notice that I said three orders were queried, but I was careful not to say where they were queried from. The answer is that they are queried from the cursor passed to the function as shown here. SELECT * FROM TABLE(date_parse(CURSOR(SELECT * FROM orders))); The TABLE keyword indicates the call to the date_parse function. The CURSOR keyword indicates that the text that follows (in parentheses) is to be used as the cursor in the table function. It is implicitly opened, and the function fetches records from it. It is implicitly closed when the table function goes out of scope. The ability to pass cursor text to the function is extremely powerful because all it requires is valid SQL. It can be called (and hence reused) in many places. For example, if I want to process only today's orders, I would call it like this: SELECT * FROM TABLE(date_parse(CURSOR(SELECT * FROM orders WHERE create > TRUNC(SYSDATE)))); If I want to further restrict it to orders from a specific region, I would add that restriction to the SELECT statement as follows: SELECT * FROM TABLE(date_parse(CURSOR(SELECT * FROM orders WHERE create > TRUNC(SYSDATE) AND region_id = 33))); The only restriction on the SELECT statement here is it has to select every column from the ORDERS table because within the table function, the record variable it selects into is declared as such (orders%ROWTYPE). You don't always have to be this restrictive, however, and there are plenty of options for matching up SELECT statements, cursor parameter types, and local variables. I'll discuss those options later in this chapter. 3.2.2. Pipelined Table FunctionsA pipelined table function is one that returns a result set as a collection, but it does so iteratively. In other words, Oracle does not wait for the function to run to completion, storing all the rows it computes in the PL/SQL collection before returning it. Instead, as each row is ready to be assigned to the collection, it is "piped out" of the function. Let's see a pipelined table function in action. /* File on web: date_parser_pipelined.sql */ CREATE OR REPLACE FUNCTION date_parse ( p_curs SYS_REFCURSOR ) RETURN order_date_t PIPELINED AS v_order_rec orders%ROWTYPE; BEGIN -- for every order in the cursor... LOOP FETCH p_curs INTO v_order_rec; EXIT WHEN p_curs%NOTFOUND; -- pipe out the components of the orders open data PIPE ROW(order_date_o(v_order_rec.order_number, 'O', TO_CHAR(v_order_rec.create_date,'YYYY'), TO_CHAR(v_order_rec.create_date,'Q'), TO_CHAR(v_order_rec.create_date,'MM'))); -- pipe out the components of the orders assign date PIPE ROW(order_date_o(v_order_rec.order_number, 'A', TO_CHAR(v_order_rec.assign_date,'YYYY'), TO_CHAR(v_order_rec.assign_date,'Q'), TO_CHAR(v_order_rec.assign_date,'MM'))); -- pipe out the components of the orders close date PIPE ROW(order_date_o(v_order_rec.order_number, 'C', TO_CHAR(v_order_rec.close_date,'YYYY'), TO_CHAR(v_order_rec.close_date,'Q'), TO_CHAR(v_order_rec.close_date,'MM'))); END LOOP; -- every order in the cursor RETURN; END; There are four main syntax changes between the non-pipelined version and the pipelined version:
Oracle won't allow just any old datatype to be piped back from any old table function. The collection specified as the return type of the function must be of the object type. The two datatypes declared for my example were declared like this: CREATE OR REPLACE TYPE order_date_o AS OBJECT ( order_number NUMBER, date_type VARCHAR2(1), year NUMBER, quarter NUMBER, month NUMBER ); / CREATE TYPE order_date_t AS TABLE OF order_date_o; / To demonstrate how much pipelining helps in this situation, I'll beef up the number of orders to 10,000 and run the following query against both the non-pipelined and the pipelined versions of the function. SELECT * FROM TABLE(date_parse(CURSOR(SELECT * FROM orders))) WHERE ROWNUM <= 10; This query is perfect for this situation because it will show exactly how long the function took to send back its 10th result. And the winner is...
The pipelined version runs 2.66 seconds or almost 93% faster. More fulfilling from a DBA point of view is that behind the scenes the database had to spend 93% less time maintaining a read-consistent copy of the ORDERS table to satisfy the query, and there is 93% less chance of requiring excess physical reads because the query ran so long. I could go on and on. Even more pleasing from a DBA point of view is the reduction in the amount of Oracle session memory required to execute the pipelined version. For the sake of simplicity here, I'll restrict the concept of session memory to the User Global Area (UGA) and the Process Global Area (PGA) . Table 3-1 shows a comparison between the UGA and the PGA required to execute both versions of the function. These stats were gathered after signing onto the database and executing the pipelined and non-pipelined functions once each.
That's a 98% reduction in both session UGA and session PGAthe database has less to keep track of, so it can spend more time being fast. 3.2.3. Nested Table FunctionsThe nesting of table functions refers to the execution of several table functions in a nested manner: the results of one are sent to the next for further processing, and those results are sent on to the next, and so on. This process is sometimes known as daisy-chaining . Combined with pipelining, the nesting of table functions provides a particularly powerful technique for ETL processing. I'll demonstrate by coding a function that accepts the results of my date_parse table function via a cursor and performs an operation on them. To avoid the clutter of explaining a complex ETL transformation, we'll stick to a simple one by adding up the order_number, year, quarter, and month values queried. Here's the function. /* File on web: next_in_line.sql */ CREATE OR REPLACE FUNCTION next_in_line ( p_curs SYS_REFCURSOR ) RETURN next_t PIPELINED IS v_ret_val next_t := next_t( ); -- local variables for cursor results v_on NUMBER; v_dt VARCHAR2(1); v_yr NUMBER; v_qt NUMBER; v_mt NUMBER; BEGIN -- for all date components from the cursor... LOOP FETCH p_curs INTO v_on, v_dt, v_yr, v_qt, v_mt; EXIT WHEN p_curs%NOTFOUND; -- pipe out the sum of the components PIPE ROW(next_o(v_on + v_yr + v_qt + v_mt)); END LOOP; -- every date component RETURN; END; We've already covered most of the syntax in this example. Only the syntax used to nest the two functions together in a query is new: SELECT * FROM TABLE(next_in_line (CURSOR (SELECT * FROM TABLE(date_parse (CURSOR (SELECT * FROM orders)))))); With my nifty indentation, it even looks nested with the embedded TABLE and CURSOR keywords. Simply put, what this example does is start the date_parse function, which pipes results to the next_in_line function, which pipes them to the world. |