Section 2.15. Careful Use of User-Written Functions


2.15. Careful Use of User-Written Functions

When a user-written function is embedded in a statement, the function may be called a large number of times. If the function appears within the select list, it is called for each returned row. If it appears within the where clause, it is called for each and every row that has successfully passed the filtering criteria previously evaluated. This may be a considerable number of times if the other criteria are not very selective.

Consider what happens if that same function executes a query. The query is executed each time the function is called; in practice, the result is exactly the same as a correlated subquery, except that the function is an excellent way to prevent the cost-based optimizer from executing the main query more intelligently! Precisely because the subquery is hidden within the function, the database optimizer cannot take any account of this query. Moreover, the stored procedure is not as close to the SQL execution engine as is a correlated subquery, and it will consequently be even less efficient.

Now I shall present an example demonstrating the dangers of hiding SQL code away inside a user-written function. Consider a table flights that describes commercial flights, with columns for flight number, departure time, arrival time, and the usual three-letter IATA[*] codes for airports. The translation of those codes (over 9,000 of them) is stored in a reference table that contains the name of the city (or of the particular airport when there are several located in one city), and of course the name of the country, and so on. Quite obviously any display of flight information should include the name of the destination city airport rather than the rather austere IATA code.

[*] International Air Transport Association.

Here we come to one of the contradictions in modern software engineering. What is often regarded as "good practice" in programming is modularity, with many insulated software layers. That principle is fine in the general case, but in the context of database programming, in which code is a shared activity between the developer and the database engine itself, the desirability of code modularity is less clear. For example, we can follow the principle of modularity by building a small function to look up IATA codes and present the full airport name whenever the function is cited in a query:

     create or replace function airport_city(iata_code in char)     return varchar2     is       city_name  varchar2(50);     begin       select city       into city_name       from iata_airport_codes       where code = iata_code;       return(city_name);     end;     /

For readers unfamiliar with Oracle syntax, trunc(sysdate) in the following query refers to today at 00:00 a.m., and date arithmetic is based on days; the condition on departure times therefore refers to times between 8:30 a.m. and 4:00 p.m. today. Queries using the airport_city function might be very simple. For example:

     select flight_number,            to_char(departure_time, 'HH24:MI') DEPARTURE,            airport_city(arrival) "TO"     from flights     where departure_time between trunc(sysdate) + 17/48                              and trunc(sysdate) + 16/24     order by departure_time     /

This query executes with satisfactory speed; on a random sample on my machine, 77 rows were returned in 0.18 seconds (the average of several runs), the kind of time that leaves users happy (statistics indicate that 303 database blocks were accessed, 53 read from diskand there is one recursive call per row).

As an alternative to using a look-up function we could simply write a join, which of course looks slightly more complicated:

     select f.flight_number,            to_char(f.departure_time, 'HH24:MI') DEPARTURE,            a.city "TO"     from flights f,          iata_airport_codes a     where a.code = f.arrival       and departure_time between trunc(sysdate) + 17/48                              and trunc(sysdate) + 16/24     order by departure_time     /

This query runs in only 0.05 seconds (the same statistics, but there are no recursive calls). It may seem petty and futile to be more than three times as fast for a query that runs for less than a fifth of a second. However, it is quite common in large systems (particularly in the airline world) to have extremely fast queries running several hundred thousand times in one day. Let's say that a query such as the one above runs only 50,000 times per day. Using the query with the lookup function, the query time will amount to a total of 2:30 hours. Without the lookup function, it will be under 42 minutes. This maintains an improvement ratio of well over 300%, which in a high traffic environment represents real and tangible savings that may ultimately translate into a financial saving. Very often, the use of lookup functions makes the performance of batch programs dreadful. Moreover, they increase the "service time" of queries for no benefitwhich means that fewer concurrent users can use the same box, as you shall see in Chapter 9.

The code of user-written functions is beyond the examination of the optimizer.




The Art of SQL
The Art of SQL
ISBN: 0596008945
EAN: 2147483647
Year: N/A
Pages: 143

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net