Section 11.2. Querying with a Variable in List


11.2. Querying with a Variable in List

There is another, and rather important, use of pivot tables that I must now mention. In previous chapters I have underlined the importance of binding variables , in other words of passing parameters to SQL queries. Variable binding allows the DBMS kernel to skip the parsing phase (in other words, the compilation of the statement) after it has done it once. Keep in mind that parsing includes steps as potentially costly as the search for the best execution path. Even when SQL statements are dynamically constructed, it is quite possible, as you have seen in Chapter 8, to pass variables to them. There is, however, one difficult case: when the end user can make multiple choices out of a combo box and pass a variable number of parameters for use in an in list. The selection of multiple values raises several issues:

  • Dynamically binding a variable number of parameters may not be possible with all languages (often you must bind all variables at once, not one by one) and will, in any case, be rather difficult to code.

  • If the number of parameters is different for almost every call, two statements that only differ by the number of bind variables will be considered to be different statements by the DBMS, and we shall lose the benefit of variable binding.

The ability provided by pivot tables to split a string allows us to pass a list of values as a single string to the statement, irrespective of the actual number of values. This is what I am going to demonstrate with Oracle in this section.

The following example shows how most developers would approach the problem of passing a list of values to an in list when that list of values is contained within a single string. In our case the string is v_list, and most developers would concatenate several strings together, including v_list, to produce a complete select statement:

 v_statement := 'select count(order_id)'                    || ' from order_detail'                    || ' where article_id in ('                    || v_list || ')'; execute immediate v_statement into n_count; 

This example looks dynamic, but for the DBMS it's in fact all hardcoded. Two successive executions will each be different statements, both of which will have to be parsed before execution. Can we pass v_list as a parameter to the statement, instead of concatenating it into the statement? We can, by applying exactly the same techniques to the comma-separated value stored in variable v_list as we have applied to the comma-separated value stored in column actors in the example of on-the-fly normalization. A pivot table allows us to write the following somewhat wilder SQL statement:

 select count(od.order_id) into n_count from order_detail od,      (  -- Return at many rows as we have items in the list         -- and use character functions to return the nth item         -- on the nth row       select to_number(substr(v_list,                               case row_num                                 when 1 then 1                                 else 1 + instr(v_list, ',', 1, row_num - 1)                                end,                                case instr(v_list, ',', 1, row_num)                                  when 0 then length(v_list)                                  else                                    case row_num                                      when 1  then instr(v_list, ',',                                                         1, row_num) - 1                                      else instr(v_list, ',', 1, row_num) - 1                                           - instr(v_list, ',',                                                   1, row_num - 1)                                    end                                  end)) article_id       from pivot       where instr(v_list||',', ',', 1, row_num) > 0         and row_num <= 250) x where od.article_id = x.article_id; 

You may need, if you are really motivated, to study this query a bit to figure out how it all works. The mechanism is all based on repeated use of the Oracle function instr( ). Let me just say that this function instr(haystack, needle, from_pos, count) returns the countth occurrence of needle in haystack starting at position from_pos (0 is returned when nothing is found), but the logic is exactly the same as with the previous examples.

I have run the pivot and hardcoded versions of the query successively 1, 10, 100, 1,000, 10,000, and 100,000 times. Each time, I randomly generated a list of from 1 to 250 v_list values. The results are shown in Figure 11-5, and they are telling: the "pivoted" list is 30% faster as soon as the query is repeatedly executed.

Figure 11-5. Performance of a hardcoded list versus a list transformed with a pivot table


Remember that the execution of a hardcoded query requires parsing and then execution, while a query that takes parameters (bind variables) can be re-executed subsequently for only a marginal cost of the first execution. Even if this later query is noticeably more complicated, as long as the execution is faster than execution plus parsing for the hardcoded query, the later query wins hands-down in terms of performance.

There are actually two other benefits that don't show up in Figure 11-5:

  • Parsing is a very CPU-intensive operation. If CPU happens to be the bottleneck, hardcoded queries can be extremely detrimental to other queries.

  • SQL statements are cached whether they contain parameters or whether they are totally hardcoded, because you can imagine having hardcoded statements that are repeatedly executed by different users, and it makes sense for the SQL engine to anticipate such a situation. To take, once again, the movie database example, even if the names of actors are hardcoded, a query referring to a very popular actor or actress could be executed a large number of times.[*] The SQL engine will therefore cache hardcoded statements like the others. Unfortunately, a repeatedly executed hardcoded statement is the exception rather than the rule. As a result, a succession of dynamically built hardcoded statements that may each be executed only once or a very few times will all accumulate in the cache before being overwritten as a result of the normal cache management activity. This cache management will require more work and is therefore an additional price to pay.

    [*] Actually, the best optimization tactic in this particular case would be to cache the result of the query rather than the query.




The Art of SQL
The Art of SQL
ISBN: 0596008945
EAN: 2147483647
Year: N/A
Pages: 143

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net