Adding New SQL Functions


A powerful feature of the SQLite library is the ability to add user-defined functions to the SQL language. Because this feature is specific to SQLite, it is not part of the Perl DBI.

Instead, Perl provides the facility to create user-defined functions through two private methods in the SQLite DBD.

Creating Functions

You can register a new function using the private method create_function, called as follows:

 $dbh->func( $name, $argc, $func_ref, "create_function" ) 

The three arguments required are the function name, the number of arguments the function will take, and a reference to the Perl function that is to be used whenever the SQL function is called.

The simplest way to see how a user-defined function is used is to register a built-in Perl function in the SQL language. The following statement uses a minimal inline function, such as $func_ref, which registers the Perl function rand() as the SQL function rand(). No arguments are required for the function, so $argc is zero.

 $dbh->func( "rand", 0, sub { return rand() }, "create_function" ); 

SQLite's built-in random() function returns a random signed 32-bit integer, whereas Perl's rand() function returns a decimal between 0 and 1. You could see the difference by preparing and executing the following statement from a Perl script after the rand() function has been registered in SQL:

 SELECT rand(), random(); 

User-defined functions have to be registered for each database connection where they are to be used. Such functions are available in SQL only for the duration of that connectionthey are not saved to the database itself or made available to other connection objects within the same script. Of course, user-defined functions are designed to allow you to add custom functions to SQL, so let's look at a more complex example (see Listing 7.11).

Listing 7.11. Creating a User-Defined Function in Perl
 #!/usr/bin/perl -w sub altcaps {   my $str = $_[0];   my $newstr = "";   for ($i=0; $i<length($str); $i++) {     $char = substr($str, $i, 1);     if ($i%2) {       $newstr .= uc($char);     }     else {       $newstr .= lc($char);     }   }   return $newstr; } use strict; use DBI; my $dbh = DBI->connect("DBI:SQLite2:dbname=perldb", "", "")        or die("Cannot connect: " . DBI::errstr() ); $dbh->func( "altcaps", 1, "altcaps", "create_function" ); my $sql = "SELECT altcaps(last_name) FROM contacts"; my $sth = $dbh->prepare($sql)           or die("Cannot prepare: " . DBI::errstr() ); $sth->execute() or die("Cannot execute: ". DBI::errstr() ); while ( my $first_name = $sth->fetchrow_array() ) {   print "$first_name \n"; } 

Listing 7.11 creates a new function in Perl called altcaps(), which converts the case of a string so that it contains alternating upper- and lowercase characters. The function is then registered as an SQL function with the same name and is used to select a modified version of the last_name column from contacts.

The output from running this script is as follows:

 nEwMaN o'bRiEn wIlLiAmS 

In this example we specified a $argc value of 1 as only a single string argument is required. SQLite will return an error code if the function is passed too many or too few arguments.

For example, if the SQL statement is changed to

 $sql = "SELECT altcaps(first_name, last_name) FROM contacts"; 

The following error is produced:

 Cannot execute: wrong number of arguments to function altcaps() at ./func.pl line 30. 

However, you can call the user-defined function as many times as you like in the same query. For example, this is a valid usage of the custom function:

 $sql = "SELECT altcaps(first_name), altcaps(last_name) FROM contacts"; 

Note

If $argc is 1, any number of arguments can be passed to the function. This simply disables validation of the number of arguments in the SQL parser; you must still deal with the passed-in arguments in a sensible way in your Perl function.


Creating Aggregating Functions

SQLite also allows you to create new aggregating functions through the Perl interface. An aggregating function is one that applies a calculation across all the rows returned by a query or across similar rows indicated by a GROUP BY clause.

The private method in DBD::SQLite2 that is used to create an aggregating function is called create_aggregate.

 $dbh->func( $name, $argc, $pkg, "create_aggregate" ); 

This is called in much the same way as create_function, but the $pkg argument is a reference to a package containing the steps necessary to implement an aggregating function.

Three steps are required:

  • The new() method is called once to initialize an aggregator object upon which the step() and finalize() methods will be called.

  • The step() method is called once for each row in the aggregate.

  • The finalize() method is called once after all the rows in the aggregate have been processed and returns the computed value.

In Listing 7.12 we create a new aggregating function, median(), to calculate the median of a set of numbers. The median is a value such that half the values in the set are above the median and half are below it. The SQLite built-in avg() function computes the mean averagethe sum of all values divided by the number of values.

Listing 7.12. Creating an Aggregating Function Using the Perl Interface
 #!/usr/bin/perl -w package median; sub new {   bless []; } sub step {   my ( $context, $value ) = @_;   push @$context, $value; } sub finalize {   my $context = $_[0];   if (!@$context) {     return undef;   }   my @sorted = sort { $a <=> $b } @$context;   my $count = @sorted;   my $median;   if ($count%2 == 1) { # Odd number of elements     $median = $sorted[int($count/2)];   }   else {               # Even number of elements     $median = ($sorted[$count/2] + $sorted[($count/2)-1] ) / 2;   }   return $median; } use strict; use DBI; my $dbh = DBI->connect("DBI:SQLite2:dbname=perldb", "", "")           or die("Cannot connect: " . DBI::errstr() ); $dbh->func( "median", 1, "median", "create_aggregate" ); my $sql = "SELECT median(num), AVG(num) FROM numbers"; my $sth = $dbh->prepare($sql)           or die("Cannot prepare: " . DBI::errstr() ); $sth->execute() or die("Cannot execute: ". DBI::errstr() ); my ($median, $mean) = $sth->fetchrow_array(); print "Median: $median \n"; print "Mean: $mean \n"; 

Let's look at the three methods that make up the median package, beginning with new().

 sub new {   bless []; } 

This simple function is all that's needed to initialize an aggregator.

The step() method is also quite simple. Each $value that is to be aggregated is read in turn and pushed onto context. We do all the hard work in the finalize() function, so step() is just used to gather the required data into a workable format.

 sub step {   my ( $context, $value ) = @_;   push @$context, $value; } 

The finalize() method begins by reading context and making sure it contains at least one element.

 my $context = $_[0]; if (!@$context) {   return undef; } 

If there are no elements from which to find the median, undef is returnedtranslated to a NULL result in SQL.

To find the median value we need to look at the elements in numerical order, so a sort is performed on context.

 @sorted = sort { $a <=> $b } @context; 

The rule to find the median where there is an odd number of values is simply to find the value of the middle element in the list.

 $count = @sorted; if ($count%2 == 1) { # Odd number of elements   $median = $sorted[int($count/2)]; } 

For an even number of values, the median is the midpoint between the two middle values.

 else {               # Even number of elements   $median = ($sorted[$count/2] + $sorted[($count/2)-1] ) / 2; } 

Finally we return $median, which will become the result of the SQL aggregating function.

 return $median; 

We will use the data added by Listing 7.10 to test this new function, so we need some data to work from. This script created a table named numbers and inserted a sequence of square numbers into the num column.

The square numbers that were inserted, in order, are

 1 4 9 16 25 36 49 64 81 

We can see from looking at this sequence that the median will be 25, as there are four numbers on either side. The mean average, on the other hand, will be a bigger value as the later numbers are relatively large numbers compared to the earlier numbers in the sequence.

Listing 7.12 fetches both the median and the mean average of the rows using the new user-defined function and the SQLite built-in:

 $sql = "SELECT median(num), AVG(num) FROM numbers"; 

Running the script shows the calculated values as follows:

 $ ./listing7.12.pl Median: 25 Mean: 31.6666666666667 



    SQLite
    SQLite
    ISBN: 067232685X
    EAN: 2147483647
    Year: 2004
    Pages: 118
    Authors: Chris Newman

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net