Perl | PostgreSQL Developer's Handbook2001

Today, Perl is one of the most widespread programming languages, and millions of people all around the globe rely on this powerful programming language. This section is dedicated to all Perl freaks among you who want to connect your applications to PostgreSQL.

Overview

"It's the Magic that counts," Larry Wall on Perl's apparent ugliness. Well, I guess Larry, the father of Practical Extraction and Report Language (Perl), knows what he is talking about. It's the magic that counts, but I guess that's the same for every available programming language. In some cases, it is simply a matter of personal philosophy which language is best, but that's not all. Speed, stability, portability, and availability can be weighty arguments. Perl is a scripting language, so it's a little slower than native C code in most cases; it has to be because Perl's internals are actually C code. People say that there are some situations where Perl can even outperform a native C program. Personally, I must say that I can hardly imagine anything faster than a highly-optimized C program ”anything but hand-coded Assembler, of course. One thing can be said for sure, Perl is one of the fastest and most flexible scripting languages available. Some people are still discussing whether Perl is a "real" interpreter language. We want to answer this question with a quote taken from a Perl tutorial, "Perl is a compiler that thinks he is an interpreter."

Perl is one of the most powerful and widespread languages available. Perl is used by man people, especially for parsing purposes, Web applications, and network programming. The number of available modules seems to be endless, and their number is constantly increasing. Before we get to some programming stuff concerning PostgreSQL, we want to tell you something about the history of Perl. Perl was posted to the Usenet group comp.sources on October 18, 1987.

In 1988, Perl 2.0 was released and offered a lot of additional features. Larry Wall has stated,

"Randal can write one-liners again.

Everyone is happy,

and peace spreads over the whole Earth."

I think this sentence shows exactly what Perl hacking was all about. In 1988, Randal invented the very famous "Just another Perl hacker" signature that is often included in emails today.

Perl 3 was released in 1989, and Perl 4 followed on March 21, 1991. Perl 5 was born in 1994, and Tim Bunce introduced DBI, an abstract Perl interface to databases and a DBI driver for Oracle. Today, DBI is the most widespread Perl module for dealing with databases.

Since 1987, many improvements and modules have been implemented, and Perl has become one of the most flexible, fastest, and powerful language available.

In the summer of 1995, Andreas K nig started the Perl module repository. Later, K nig's archive turned into Comprehensive Perl Archive Network (CPAN), which is the standard archive for Perl modules.

Developers all around the globe are working hard to make Perl even more powerful. Currently, Perl works on nearly all hardware platforms and operating systems, because the core of the Perl language is entirely written in ANSI C so it's very portable. In a documentation about PostgreSQL by Alavoor Vasudevan, in the section about Perl it says, "Perl will be in use for thousands of years in the future!" Well, we will see whether Alavoor is right.

Currently, two major Perl interfaces for PostgreSQL are available. The Pg module is distributed with PostgreSQL and provides access to all functions in the libpq interface for PostgreSQL. The target of the module is to provide an interface that is as close as possible to the libpq interface. The Pg module is said to be a little faster than the second important interface ”the so-called DBI interface. We will take a close look at both interfaces in the following sections. It is up to you to decide which module you prefer and what is best for your application.

The Pg Module

The DBI module is not the only Perl module for PostgreSQL. The Pg module is PostgreSQL's onboard library and is the Perl version of the libpg interface. This section will guide you through the Pg module.

Overview

The Pg module is not related to the DBI interface in any way. The Pg module is an independent piece of software and, in addition to PostgreSQL and Perl, does not need additional Perl modules to work correctly.

Two different styles are supported by the Pg module.

The new style has some advantages over the old style. Destructures are automatically called when the last reference to an object has gone away. This is some sort of internal garbage collection that is very comfortable because the programmer does not have to take care about deleting connections and result structures. When a new object is created, the appropriate libpq function is used as a virtual method.

Connecting

We will start with a simple example in which we want to connect to a database called performance as user postgres . In the following example, user postgres has no password:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres") or         die "can't open: $!"; print $conn->status."\n";

To establish the connection, we use a function called connectdb that can be found in the Pg module. We have to pass a list of arguments to the function that define all necessary values. If not all parameters the functions accepts are provided by the programmer, the database uses the default value. We compiled a list of all available parameters and their default value in Table 9.2.

Table 9.2. Environment Variables

Parameter	Environment	Default Value
`Hosthost`	`PGHOST`	`localhost`
`pPort`	`PGPORT`	`5432`
`oOptions`	`PGOPTIONS`	`""`
`Ttty`	`PGTTY`	`""`
`dDbname`	`PGDATABASE`	`current userid`
`uUser`	`PGUSER`	`current userid`
`pPassword`	`PGPASSWD`	`""`

In the previous example, we have not defined the host storing the database, so localhost will be used by default.

After establishing the connection, we use a function called status to find whether the connection has successfully been created. If is returned, the connection has successfully been established; otherwise , an error occurred.

Note

In the previous example, we have included die to end the program when the connection cannot be established. This does not work with the Pg module. If we define the wrong database, the program won't stop. We can only determine an error when checking the result of the status function.

If we want to see the error message when no connection can be established, we can use a function called errorMessage :

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=xy user=postgres"); print $conn->errorMessage."\n";

We have no database called xy on the system, so an error will be displayed:

 [hs@duron code]$  ./pg.pl  FATAL 1:  Database "xy" does not exist in the system catalog.

The structure returned by the CONNECT function is called PGconn and consists of various parameters that can easily be accessed:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres") or         die "can't open: $!"; print $conn->port."\n"; print $conn->user."\n";

If we try to execute the program, we will see the following results printed onscreen:

 [hs@duron code]$  ./pg.pl  5432 postgres

The names of the fields are slightly different than the parameters that have to be passed to the connectdb function. They are listed in Table 9.3.

Table 9.3. Field Names

Field	Key
Name of the database	`Db`
Name of the user	`User`
Password of the user	`Pass`
Host where the database is stored	`Host`
Port to which the database is listening to	`Port`
Options passed to the database	`Options`

Another option for establishing a connection to a PostgreSQL server is to use Pg::setdb . The syntax is easy:

 $conn = Pg::setdb($pghost, $pgport, $pgoptions, $pgtty, $dbname);

The setdb method should be used when establishing a connection where a username and password are not needed.

If you want to find out about all possible options of the connectdb function, you can use Pg::conndefaults . The following is a short script that shows how this functions works:

 #!/usr/bin/perl use Pg; $Option_ref = Pg::conndefaults(); while(($key, $value) = each %$Option_ref) {         print "$key: $value\n"; }

If we execute the script, we will get the following result:

 [hs@duron code]$  ./pg.pl  service: port: 5432 hostaddr: host: password: options: user: hs authtype: tty: dbname: hs

The values shown in the previous listing are the current default values.

You may sometimes want to reset the connection to your database and reconnect . The Pg module offers a simple function called reset to perform the operation. In the following code, we show a simple example where we reconnect to the database after establishing a connection:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {  print "Connection established ...\n"; } $conn->reset; if      ($conn->status eq 0) {  print "Connection re-established ...\n"; }

If both connections are being created successfully, the script should produce two lines of output such as the following:

 [hs@duron code]$  ./pg.pl  Connection established ... Connection re-established ...

Disconnects will not be used in any of the scripts in this section. As mentioned before, this is done automatically. Up to now, we have already connected to the database several times, but there are no "old" backends running. We can check that with a simple grep command:

 [hs@duron code]$  ps ax  grep post  1872 ?        S      0:00 /usr/bin/postmaster -D /var/lib/pgsql/data  2710 pts/1    S      0:00 grep post

Only the supervisor daemon is running; no backend processes are in memory because the processes have been deleted by Perl automatically.

Collecting Information about the Backend

Sometimes it is useful to collect information about the backend process related to a certain query. The Pg module offers some very comfortable functions to get information about the backend. If we want to know the descriptor number of the backend connection socket, we can use a function called socket :

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); print $conn->socket."\n";

If we execute this script, we will see one number printed onscreen (in our case, it is 3 ). If no backend process is open, -1 will be returned.

Every process on a system is assigned to a unique process ID when being started, just like backend processes. To find out which PID has been assigned to the current backend process, the backendPID function can be used:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); print $conn->backendPID."\n";

We received the following result when executing the previous sample code shown:

 [hs@duron code]$  ./pg.pl  15533

For debugging purposes, or simply because of curiosity , the messages exchanged between the backend and frontend can be sent to a debugging port. In most cases, you won't need this in your applications, but it can be a very interesting information for PostgreSQL developers. Turning messaging on can be done with trace ; turning messaging off can be done with untrace :

 $conn->trace(debug_port); $conn->untrace;

Running Queries

The Pg module is not only used to establish connections. The most important thing when working with databases is retrieving data with the help of queries. The following example shows how we can perform a very simple query with the help of the Pg module:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT 1+1";         $result=$conn->exec($query);         if       ($result->resultStatus eq PGRES_TUPLES_OK)         {                 print "ok ...\n";         }         else         {                 print "error ...\n";         } }

First, we connect to the database and check whether the connection has been established successfully. If status returns , the connection works and we can execute the query. With the help of a function called exec , we perform a simple SELECT statement. Before we can safely use $result , we have to check whether the database encountered any problems with the SQL-statement, so we use resultStatus . resultStatus can return values from 0 to 7 but, as you can see in the previous sample code, it is possible to check more than the number being returned by status. Table 9.4 provides an overview which strings are related to which numbers .

Table 9.4. Overview of Query Status

Value	String
	`PGRES_EMPTY_QUERY`
`1`	`PGRES_COMMAND_OK`
`2`	`PGRES_TUPLES_OK`
`3`	`PGRES_COPY_OUT`
`4`	`PGRES_COPY_IN`
`5`	`PGRES_BAD_RESPONSE`
`6`	`PGRES_NONFATAL_ERROR`
`7`	`PGRES_FATAL_ERROR`

We do not just want to execute queries, we also want to use the result of a query. In the following example, we perform the SELECT statement shown previously and print the result onscreen:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT 1+1";         $result=$conn->exec($query);         $value=$result->fetchrow;         print "value: $value\n"; }

We can get the result of the query by calling the fetchrow function. To display the result onscreen, we use a simple print command.

The SELECT query just shown only returns one field and one row, but how can multiple rows with multiple columns be processed ? The solution is very easy. If one row has many columns , the database returns an array. To get the complete result, we simply go through all rows of the result using a WHILE loop:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT id, id*2 AS double FROM perftest WHERE id=4 or id=9";         $result=$conn->exec($query);         while   (@value=$result->fetchrow)         {                 print "@value\n";         } }

When executing the program, we will get two lines with two columns each:

 [hs@duron code]$  ./pg.pl  4 8 9 18

Extracting the result is sometimes not enough, so the Pg module offers additional features to provide a little more information about the query and the result of the query for the user.

Retrieving Information about the Result

In the following example, we will retrieve some important information about the query:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT id, id*2 AS double FROM perftest WHERE id<4";         $result=$conn->exec($query);         print "tuples: ".$result->ntuples."\n";         print "fields: ".$result->nfields."\n";         print "fieldname: ".$result->fnumber(id)."\n"; }

ntuples returns the number of tuples in the query result. nfields returns the number of fields in the result, and fnumber tells us where in the array a certain column is stored. When executing the script, we will get the following result:

 [hs@duron code]$  ./pg.pl  tuples: 3 fields: 2 fieldname: 0

You can see that the column called id can be found in position number 0, which is actually the first column.

Some programmers prefer binary data instead of ASCII. The advantage of binary data is that it can be processed faster, because the database stores the data in a binary format. The binaryTuples function can be used to see whether the result is binary. In the following code, you can see how the function has to be used:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT id, id*2 AS double FROM perftest WHERE id<4";         $result=$conn->exec($query);         print "tuples: ".$result->binaryTuples."\n"; }

The function returns 1 if the data returned is binary; othwerwise, will be returned.

In many cases ,fields have a fixed length. If you need the size of a field to define a data structure, you can use fmod . fmod returns the size of the field or a -1 if the field has no fixed length.

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT id FROM perftest WHERE id<4";         $result=$conn->exec($query);         print "length: ".$result->fmod("id")."\n";         print "length: ".$result->fmod(0)."\n"; }

Let's execute the script:

 [hs@duron code]$  ./pg.pl  length: -1 length: -1

You can see that Perl supports two ways of using commands such as fmod . On the one hand, you can see that we use the name of the column in the result to access the data. On the other hand, we can use the ID. In general, we recommend the use of the name to make sure that you get the right column.

In most cases, a result of -1 looks as if something has gone wrong during the process. Everything worked just fine in the previous example, but we recommend to that you test every statement handle before using it to make 100 percent sure that really everything has been working.

We may want to know more than the length of a field. In the next example, we will show how the data type of a field can be found:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT id FROM perftest WHERE id<4";         $result=$conn->exec($query);         print "type: ".$result->ftype("id")."\n"; }

If we execute the previous script, we see that 23 is returned. Now the problem is, what is meant by 23 ? The problem can be solved with the help of PostgreSQL's system tables. If we query pg_type by using

 SELECT * FROM pg_description WHERE objoid=23;

The result of the query is not surprising:

 objoid                    description --------+-------------------------------------------------      23  -2 billion to 2 billion integer, 4-byte storage (1 row)

The column is an integer value. PostgreSQL always returns the ID instead of the description of an object, because the object is also used in the system tables. It is much shorter and not confusing at all. The behavior of objects, operators, or aggregates supported by PostgreSQL can be changed by the superuser. For that reason, names can also be slightly confusing. You simply have to get used to that.

Imagine a situation where you want to find the status of the query used in the previous example. cmdStatus can be applied to do the job for you:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $query="SELECT id FROM perftest WHERE id<4";         $result=$conn->exec($query);         print "status: ".$result->cmdStatus."\n";         $query="SEL id FROM perftest WHERE id<4";         $result=$conn->exec($query);         print "status: ".$result->cmdStatus."\n"; }

You may have noticed that the second SELECT statement has a syntax error; the first SQL statements works perfectly well. If we execute the script, we will receive the following result shown:

 [hs@duron code]$  ./pg.pl  status: SELECT status:

The first SQL statement produces a useful status, while the second one produces an empty string.

Working with `BLOBs`

Working with BLOB s by just using a command-line tool, such as psql , might not fit your demands. Sometimes, it is more comfortable to use a Perl script to get the job done. Because the Pg module is the Perl version of libpq , it also offers functions to work with BLOB s. The following is a Perl script that creates a table to store the names of songs and the binary file containing the songs:

 #!/usr/bin/perl use Pg; $conn = Pg::connectdb("dbname=performance user=postgres"); if      ($conn->status eq 0) {         $create="CREATE TABLE songs(name text, mp3file OID)";         $result=$conn->exec($create);         print $conn->errorMessage."\n";         $path="/tmp/data/";         $upload="Falco-Rock_me_Amadeus.mp3";         $sql="INSERT INTO songs VALUES('Falco - Rock me Amadeus',                 lo_import('$path$upload')) ";         $ret=$conn->exec($sql);         print $conn->errorMessage."\n"; }

We loaded one of the highlights of Austrian music into the database. We have done this with the help of a simple SQL command. The Pg module supports a function called lo_import , but we think that it is a lot easier to use the SQL version.

Although the Pg module offers a number of functions for treating BLOBs , many people still have some troubles with it. We recommend that you use simple SQL commands, such as we have shown in the "Working with BLOBs" section in Chapter 3, "An Introduction to SQL." In the case of BLOB s, using Pg can easily become a little bit painful.

Final Word

Pg is a very powerful programming interface for your PostgreSQL database. You should keep in mind that the interface is designed to be a 100 percent counterpart of the libpg module (C). Some real Perl hackers among you will recognize that the Pg module is not a real Perl-ish interface. If you want a little more Perl feeling try the DBI module.

The DBI Interface and DBD for PostgreSQL

Let's get to the DBI interface for PostgreSQL. As you have already seen, two Perl modules for working with PostgreSQL are available. In this section, you will learn how to use the standard DBI module and the DBD driver for PostgreSQL.

Overview

Initially, the DBI interface was written by Tim Bunce, who also started writing the Oracle driver for DBI. The DBI interface is the most widespread database interface available for Perl today, and a lot of people around the globe build database applications relying on the tremendous flexibility of this module. DBI provides a huge set of functions that you can use to make your applications support databases.

When dealing with DBI, the most important thing to mention is that the DBI module has nothing to do with PostgreSQL. DBI is only a layer between the driver and the application, used to access the database, and your application.

In the case of PostgreSQL, DBD::Pg does the job. If an application wants to communicate with a database (it can even be a simple ASCII file), the DBI interface passes the data to the DBD driver of the database. Then the DBD driver does all the communication with the database engine, so DBI can be seen as the glue between the database driver and your applications.

Currently, a huge number of drivers is available for nearly all databases. If you want to find a list of all official modules available, simply check out CPAN at http://search.cpan.org.

At the time of writing, current version of DBI was 1.15, released on March 30, 2001. We strongly recommend that you use the most recent version of DBI because the specification of DBI is evolving constantly.

Installation

At the moment, the DBI module and the DBD driver for PostgreSQL are not included in PostgreSQL's source distribution, so the modules must be installed on your system. In general, installing Perl modules is a very easy task, just like DBI and DBD.

The first module you have to install on your system is DBI. Get the module from a local FTP site and unpack the archive:

  tar xvfz DBI-1.15.tar.gz

If the process terminates without displaying any error messages, you can switch to the directory containing the sources of your DBI module. The content of the directory will look something like the following:

 [root@duron DBI-1.15]#  ls  blib       DBI.o        dbish      Driver.xst   Perl.c      README Changes    DBI.pm       dbish.PL   lib          Perl.o      t dbd_xsh.h  dbipport.h   dbi_sql.h  Makefile     Perl.xs     test.pl DBI.bs     dbiproxy     DBI.xs     Makefile.PL  Perl.xsi    ToDo DBI.c      dbiproxy.PL  DBIXS.h    MANIFEST     pm_to_blib

After reading the README file, you can call perl Makefile.PL to generate a new Makefile that you'll need to continue installing the module:

 [root@duron DBI-1.15]#  perl Makefile.PL  *** Note:     The optional PlRPC-modules (RPC::PlServer etc) are not installed.     If you want to use the DBD::Proxy driver and DBI::ProxyServer     modules, then you'll need to install the RPC::PlServer, RPC::PlClient,     Storable and Net::Daemon modules. The CPAN Bundle::DBI may help you.     You can install them any time after installing the DBI.     You do *not* need these modules for typical DBI usage. Optional modules are available from any CPAN mirror, in particular     http://www.perl.com/CPAN/modules/by-module     http://www.perl.org/CPAN/modules/by-module     ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/by-module Writing Makefile for DBI     Remember to actually *read* the README file!     Use  'make' to build the software (dmake or nmake on Windows).     Then 'make test' to execute self tests.     Then 'make install' to install the DBI and then delete this working     directory before unpacking and building any DBD::* drivers.

As you can see, Perl displays some information about the installation process and tells us how to continue. Let's try Perl's suggestions and use make . Perl now compiles the sources and we can test the module using make test :

 [root@duron DBI-1.15]# make test PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib -I/usr/lib/perl5/5.6.0/i386-linux -I/usr/lib/perl5/5.6.0 -e 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;' t/*.t t/basics............ok t/dbidrv............ok t/examp.............ok t/meta..............ok t/proxy.............skipped test on this platform t/shell.............ok t/subclass..........ok All tests successful, 1 test skipped. Files=7, Tests=183,  4 wallclock secs ( 1.32 cusr +  0.25 csys =  1.57 CPU) PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib -I/usr/lib/perl5/5.6.0/i386-linux -I/usr/lib/perl5/5.6.0 test.pl test.pl DBI test application $Revision: 10.5 $ Using /mnt/data/perl/DBI-1.15/blib Switch: DBI 1.15 by Tim Bunce, 1.15 Available Drivers: ADO, ExampleP, Multiplex, Pg, Proxy, mysql dbi:ExampleP:: testing 5 sets of 20 connections: Connecting... 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Disconnecting... Connecting... 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Disconnecting... Connecting... 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Disconnecting... Connecting... 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Disconnecting... Connecting... 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Disconnecting... Made 100 connections in  0 wallclock secs ( 0.06 usr +  0.00 sys =  0.06 CPU) Testing handle creation speed... 5000 NullP statement handles cycled in 0.7 cpu+sys seconds (6756 per sec) test.pl done

It seems that the test has been completed successfully and we can finish the installation process. This can be done with the help of make install . If make install terminates successfully, the module has been installed correctly on the system.

Because the DBI module is ready, we can add the DBD driver for PostgreSQL to the system. First, we have to unpack the sources:

  tar xvfz DBD-Pg-0.98.tar.gz

We have to use perl Makefile.PL , as we did for the DBI module, to generate the Makefile we need for the rest of the installation process:

 [root@duron DBD-Pg-0.98]#  perl Makefile.PL  Configuring Pg Remember to actually read the README file ! please set environment variables POSTGRES_INCLUDE and POSTGRES_LIB !

Oh dear, what happened ? Some of you might think that something has gone wrong, but keep calm ”everything is still on track.

The installation process can't find PostgreSQL, and we have to define the right location for the software with the help of two environment variables. In the case of RedHat 7.1, this can easily be done with the following two commands:

 [root@duron DBD-Pg-0.98]#  export POSTGRES_INCLUDE=/usr/include/pgsql/  [root@duron DBD-Pg-0.98]#  export POSTGRES_LIB=/usr/lib/pgsql/

If we have set the two variables correctly, we can try perl Makefile.PL again:

 [root@duron DBD-Pg-0.98]#  perl Makefile.PL  Configuring Pg Remember to actually read the README file ! OS: linux Using DBI 1.15 installed in /usr/lib/perl5/site_perl/5.6.0/i386-linux/auto/DBI Writing Makefile for DBD::Pg

You can see that it works now. To complete the installation process, use make and make install now. If no errors are displayed, you have successfully installed the module.

Connecting

We now try to establish a connection to the PostgreSQL server. Connecting means that we create a connection handle:

 #!/usr/bin/perl use DBI; $user="hs"; $password="anypasswd"; $dbname="performance"; $dsn="dbi:Pg:dbname=$dbname;port=5432"; $dbh=DBI->connect("$dsn", "$user", "$password") or         die "can't connect: $!\n";

In this example, we establish a connection to the PostgreSQL server running at localhost listening to port 5432. The database we want to connect to is called performance . We try to establish the connection as user hs whose password is anypasswd . In the source code, you can see that the first thing happening is including the DBI module. We do not have to explicitly include the DBD module for PostgreSQl because this is done automatically by the DBI module.

When establishing a connection, PostgreSQL checks if a value is defined in the connect string. If a certain value is not defined, PostgreSQL checks whether environment variables are set. If a value still can't be found, the database uses the hard-coded default values.

Table 9.5 shows all parameters supported by PostgreSQL.

Table 9.5. Environment variables

Parameter	Environment	Default Value
`host`	`PGHOST`	`localhost`
`port`	`PGPORT`	`5432`
`options`	`PGOPTIONS`	`""`
`tty`	`PGTTY`	`""`
`dbname`	`PGDATABASE`	`current userid`
`user`	`PGUSER`	`current userid`
`password`	`PGPASSWD`	`""`

After we have seen how establishing a connection works, we will present a very simple, but sometimes very useful, command to count the number of databases available:

 #!/usr/bin/perl use DBI; print DBI->data_sources("Pg")."\n";

When executing the script on my PostgreSQL server, the result will be 5 , because I have 5 PostgreSQL databases on my system.

If you want to check whether the database connection you are working on is still in business, the ping method would be a good choice for testing.

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; $rc=$dbh->ping or         die "can't ping database\n"; print "rc: $rc\n";

This program does nothing else but ping the database to which we are connected. A ping 's return value is critical. Check the module you are using to connect to your PostgreSQL server to find which values are returned in case of which scenarios.

The DBI specification supports some additional features that can be quite useful. Currently, the following function is not supported completely by PostgreSQL's DBD driver, but it works fine and it can be used to enhance the speed of your application when porting it to other databases:

 #!/usr/bin/perl use DBI; $one=DBI->connect("dbi:Pg:dbname=performance; port=5432", "hs", "anypasswd") or         die "can't connect: $!\n"; $two=DBI->connect_cached("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or die "can't connect: $!\n";

The connect_cached can be used to save authentification overhead, especially in performance critical environments. connect_cached is like connect but returns a cached database handle if it is still valid. If you have to connect to the database very often, this can speed up the connection process significantly.

Working with DBI

Working with the DBI module is extremely comfortable because DBI really feels like something very Perl-ish, and it is very easy for a programmer to understand the basic concept and handling of a system using DBI. Because DBI is the standard Perl interface for database programming, applications written for the DBI interface can be easily ported to other databases. If the SQL code used inside the application is 100 percent ANSI SQL 92-compliant and the application is written properly, the only thing that has to be modified is the connect string, which should be a rather easy task.

Let's start with a very simple program that does nothing but display the result of a simple SELECT statement:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432", "hs", "anypasswd") or         die "can't connect: $!\n"; $sql="SELECT 1+1"; $sth=$dbh->prepare($sql) or         die "can't prepare: $sql\n"; $sth->execute or         die "can't execute: $sql\n"; while   (@row=$sth->fetchrow_array) {         print "row: @row\n"; }

This may look a little bit confusing, but let's go through it line by line.

After connecting to the database, as shown in the previous section, we define a simple SQL statement ( $sql ). After that, we prepare the execution of the statement. Many databases support some information after preparing the statement. PostgreSQL does not support a concept like that, so prepare does nothing but store the statement after checking for placeholders.

Note

The DBI specification also defines prepare_cached , but PostgreSQL supports no interaction with the statement.

If prepare fails, we want to display an error and quit. In the previous example, this is done by using a simple Perl command for exception handling called die . When the application terminates because of an error, $! will contain some information about the origin of the error.

After preparing the statement, we want to execute it by using the execute command. Again, if an error occurs, we want the application to quit and raise an error.

Finally, we read the result of the query and write it to standard output. With the help of @row=$sth->fetchrow_array , we can read one line of data and assign the result to an array called @row . All lines are displayed onscreen with the help of a simple print command. print is repeated as long as WHILE extracts data from the query.

In many cases, it is interesting to find out how many lines are returned by the database. This can very easily be done by checking the return value of the execute statement. In the following example, we select the first 100 lines of table perftest (from the database we have used in Chapter 8, "Performance Tuning"):

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432", "hs", "anypasswd") or         die "can't connect: $!\n"; $sql="SELECT * FROM perftest LIMIT 100"; $sth=$dbh->prepare($sql) or         die "can't prepare: $sql\n"; $ret=$sth->execute or         die "can't execute: $sql\n"; print "lines: $ret\n";

$ret contains the number of lines returned by the statement. If we execute the script, we will get a result such as the following:

 [hs@duron code]$  ./dbi1.pl  lines: 100

The query used in this program shown returns exactly 100 rows; I guess that is what we expected.

In most cases, we want to use the result created by one query in another query. In the next example, we select the first 100 records and add one to the last column by using an UPDATE query:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432", "hs", "anypasswd") or         die "can't connect: $!\n"; $sql="SELECT * FROM perftest WHERE id < 101"; $sth=$dbh->prepare($sql) or         die "can't prepare: $sql\n"; $ret=$sth->execute or         die "can't execute: $sql\n"; while   (@row=$sth->fetchrow_array) {         $rv=$dbh->do("UPDATE perftest SET anumber=anumber+1 WHERE id=@row[0]") or                 die "can't update: $id\n"; }

After connecting and retrieving the desired records in the database, we go through the result line-by-line . For every line, we perform an UPDATE operation by using the do command provided by the DBI module and its DBD driver.

do is prepare and execute combined in a single statement. You might wonder why we have included prepare and execute here if all wonderful things can simply be done with do . Well, some things, such as retrieving data, cannot be done with do , because do only returns the number of lines affected by an SQL statement and not the statement handle that we need to extract the data from the result. Usually, do is used for non SELECT statements that are used only once or cannot be prepared because of driver restrictions.

Sometimes you want to execute the same SQL statements more than once using different sets of parameters. In this scenario, you don't have to prepare the SQL statement every time you want to execute it:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; $sql="SELECT * FROM perftest WHERE id < 2"; $stha=$dbh->prepare($sql) or die "can't prepare: $sql\n"; $ret=$stha->execute or die "can't execute: $sql\n"; $sthb=$dbh->prepare("UPDATE perftest SET anumber=anumber+? WHERE id=?") or         die "can't prepare\n"; while   (@row=$stha->fetchrow_array) {         $sthb->execute(@row[0], @row[0]) or                 die "can't update: $id\n"; }

We prepare and execute a SELECT statement. Before processing the WHILE loop, we prepare a second statement. Note, that we have included two question marks instead of real values in the statement. In the WHILE loop, we execute the statement and pass two parameters to it. Every parameter will be used to substitute one question mark so the statement can be executed correctly. Substitution is done with the help of so-called bind variables.

If you use bind variables and execute statements more often, your application will speed up significantly, because preparing the statement has to be done only once.

Many queries are very small and are designed to return only one line. To make the code easier to understand, the DBI interface offers some very useful commands. Look at the following sample code:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; $sql="SELECT * FROM perftest WHERE id < 4 LIMIT 1"; @row=$dbh->selectrow_array($sql); print "row: @row\n";

We perform a query that returns only one line of data. In this case, it would be much too long to use prepare , execute , and a WHILE loop to extract the data. Real Perl hackers want to write entire operating systems with just one line of code (that nobody understands if the line is written by a real Perl hacker), so the DBI interface has to offer a command for those people, too.

To prepare and execute the statement and for extracting the first line of data from the result, we can use the selectrow_array function. Don't confuse this with the fetchrow_array function we have used in the previous examples.

The DBI supports some more functions for retrieving data from a query, but covering them all is beyond the scope of in this book. If you want complete information about all functions, check out the man pages of the DBI interface.

Working with Arrays

You may sometimes want to use arrays in your SQL statements. In this section, we will show you how you can access data in arrays that are returned by your PostgreSQL database server. Let's create a table and insert some data first:

 CREATE TABLE children (     name        text,     children    text ); INSERT INTO children     VALUES ('Paul',     '{ Josef, Pat, Amy} '); INSERT INTO children     VALUES ('Carol',     '{ Epi, Gerwin} ');

A table called children contains two records now. We will retrieve and display both records using the following short Perl script:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432", "hs", "anypasswd",         { PrintError => 0, RaiseError => 1}  ) or         die "can't connect: $!\n"; $sql="SELECT * FROM children"; $sth=$dbh->prepare($sql); $ret=$sth->execute; while   (@row=$sth->fetchrow_array) {         print "row: @row[0] - @row[1]\n"; }

If we execute the program, we will get the following result:

 [hs@duron code]$  ./db3.pl  row: Paul - { Josef, Pat, Amy} row: Carol - { Epi, Gerwin}

You can see that the values of the array are returned by the database as one string. Depending on the way you want to work with the data, you can convert the string to any data structure you want.

Transactions

When building complex applications, it can be useful to influence the way PostgreSQL treats transactions. To build a robust system, many atomic database operations can be combined into one unit to avoid errors. The DBI interface offers a flag to influence the way transactions are treated. By default, AutoCommit is set and every operation is treated as a separate transaction.

In the following example, we perform two simple operations in one transaction.

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; $dbh->{ AutoCommit}  = 0;         # enable transactions eval {         $rv=$dbh->do("UPDATE perftest SET anumber=anumber+1 WHERE id < 10");         $rv=$dbh->do("DELETE FROM perftest WHERE id < 5");         $dbh->commit; } ; if      ($@) {         warn "An error occurred: $@\n";         $dbh->rollback; }

We first enable the transaction by turning AutoCommit off. Then we perform all operations with the help of the eval command. If eval fails, an error is displayed and the rollback function is called.

Errors and Exception Handling

No programming language can live without some sort of error and exception handling. Because the DBI module interacts closely with the database, some additional functions for exception handling have been implemented in Perl. We have already used some Perl functions to display warnings or to quit applications. Now let's look at the functions provided by the DBI interface now:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; $sql=" * FROM perftest WHERE id < 10 LIMIT 1"; @row=$dbh->selectrow_array($sql); print "error: ".$dbh->err."\n"; print "row: @row\n";

A function called err is defined for every handle and can be used to display errors. In the code, we have included an SQL statement where we simply omitted SELECT from the query string. This leads to a syntax error. We will execute the script now and see what happens:

 [hs@duron code]$  ./db2.pl  DBD::Pg::db selectrow_array failed: ERROR:  parser: parse error at or near at ./db2.pl line 10. error: 7 row:

An error is displayed, and that's what we expected Perl to do. The first error displayed comes directly from the selectrow_array function because the statement cannot be executed. After that, err returns 7 as error code. You will see how to receive more comfortable output in the next example.

Finally, we try to display the result of the query, but the array that should contain the result is empty.

The next example actually does actually the same thing, but the code looks a little more beautiful. We use the errstr function instead of err to get an error message that can be more easily understood .

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; $sql=" * FROM perftest WHERE id < 10 LIMIT 1"; @row=$dbh->selectrow_array($sql) or         warn "error: ".$dbh->errstr."\n"; if      (@row[0]) {         print "row: @row\n"; }

We only want the result of the function to be displayed when the query terminated successfully.

 [hs@duron code]$  ./db2.pl  DBD::Pg::db selectrow_array failed: ERROR:  parser: parse error at or near at ./db2.pl line 10. error: ERROR:  parser: parse error at or near

You can see that a real error message is being displayed now.

Debugging and Monitoring

After a connection has been established and a few lines of code have been written, you will soon find yourself doing some debugging work. The most important thing when debugging is to find out what is going on in the program. The Perl debugger or some simple print statements can satisfy your demands in some but not all cases. The DBI interface supports a method to trace the things going on inside the database. Look at the following example:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; DBI->trace(2, '/tmp/pg_trace.txt'); $sql="SELECT * FROM perftest WHERE id < 2"; $sth=$dbh->prepare($sql) or         die "can't prepare: $sql\n"; $ret=$sth->execute or         die "can't execute: $sql\n"; while   (@row=$sth->fetchrow_array) {         $rv=$dbh->do("UPDATE perftest SET anumber=anumber+1 ".                 "WHERE id=@row[0]") or                         die "can't update: $id\n"; }

We have slightly modified one of the previous examples to show you how trace can be used. Six levels of messaging are defined in Table 9.6.

Table 9.6. Status Codes

Value	Action
	No tracing
1	Traces DBI method calls with results and errors
2	Traces method entries with parameters and results
3	Adds some additional internal information about DBI
4	Displays more detailed information from the driver and displays DBI mutex information the in case of multithreaded Perl

trace can be called with up to two parameters. The first parameter defines the trace level as described in Table 9.6. The second parameter can be used to define a file to which to redirect the output. This is not only useful for debugging purposes, but it also gives you a good insight into what is happening inside your applications. We have included the output of the trace function that we used in the previous program to show you how debugging information produced by trace may look.

 DBI 1.15-nothread dispatch trace level set to 2     Note: perl is running without the recommended perl -w option     -> prepare for DBD::Pg::db (DBI::db=HASH(0x81de338)~0x81de2fc 'SELECT * FROM perftest WHERE id < 2') dbd_st_prepare: statement = >SELECT * FROM perftest WHERE id < 2< dbd_st_preparse: statement = >SELECT * FROM perftest WHERE id < 2<     <- prepare= DBI::st=HASH(0x81de440) at dbi1.pl line 12.     -> execute for DBD::Pg::st (DBI::st=HASH(0x81de440)~0x8102d78) dbd_st_execute dbd_st_execute: statement = >SELECT * FROM perftest WHERE id < 2<     <- execute= 1 at dbi1.pl line 14.     -> fetchrow_array for DBD::Pg::st (DBI::st=HASH(0x81de440)~0x8102d78) dbd_st_fetch     <- fetchrow_array= ( '1' 'WV3SGAmq' '34800423' ) [3 items] at dbi1.pl line 16.     -> do in DBD::_::db for DBD::Pg::db (DBI::db=HASH(0x81de338)~0x81de2fc 'UPDATE perftest SET anumber=anumber+1 WHERE id=1') 2   -> prepare for DBD::Pg::db (DBI::db=HASH(0x81de2fc)~INNER 'UPDATE perftest SET anumber=anumber+1 WHERE id=1' undef) dbd_st_prepare: statement = >UPDATE perftest SET anumber=anumber+1 WHERE id=1< dbd_st_preparse: statement = >UPDATE perftest SET anumber=anumber+1 WHERE id=1< 2   <- prepare= DBI::st=HASH(0x81de4d0) at DBI.pm line 939.     -> execute for DBD::Pg::st (DBI::st=HASH(0x81de4d0)~0x8174320) dbd_st_execute dbd_st_execute: statement = >UPDATE perftest SET anumber=anumber+1 WHERE id=1<     <- execute= 1 at DBI.pm line 940.     -> rows for DBD::Pg::st (DBI::st=HASH(0x81de4d0)~0x8174320) dbd_st_rows     <- rows= 1 at DBI.pm line 941.     <- do= 1 at dbi1.pl line 18.     -> DESTROY for DBD::Pg::st (DBI::st=HASH(0x8174320)~INNER) dbd_st_destroy     <- DESTROY= undef at dbi1.pl line 18.     -> fetchrow_array for DBD::Pg::st (DBI::st=HASH(0x81de440)~0x8102d78) dbd_st_fetch     <- fetchrow_array= ( ) [0 items] at dbi1.pl line 16.     -- DBI::END     -> disconnect_all for DBD::Pg::dr (DBI::dr=HASH(0x8170fc0)~0x81de380) dbd_discon_all     <- disconnect_all= '' at DBI.pm line 454.     -> DESTROY for DBD::Pg::st (DBI::st=HASH(0x8102d78)~INNER) dbd_st_finish dbd_st_destroy     <- DESTROY= undef during global destruction.     -> DESTROY for DBD::Pg::db (DBI::db=HASH(0x81de2fc)~INNER) dbd_db_disconnect dbd_db_destroy     <- DESTROY= undef during global destruction.     -> DESTROY in DBD::_::common for DBD::Pg::dr ( DBI::dr=HASH(0x81de380)~INNER)     <- DESTROY= undef during global destruction.

You can see that the output is already quite long, although we have not used level two yet.

If we don't want to rely on the messages provided by the system, we can add our own messages to the logfile. This can be done by using the trace_msg function:

 #!/usr/bin/perl use DBI; $dbh=DBI->connect("dbi:Pg:dbname=performance; port=5432",         "hs", "anypasswd") or                 die "can't connect: $!\n"; DBI->trace(1, '/tmp/pg_trace.txt'); $sql="SELECT * FROM perftest WHERE id < 9 LIMIT 1"; $sth=$dbh->prepare($sql) or die "can't prepare: $sql\n"; $ret=$sth->execute or die "can't execute: $sql\n"; while   (@row=$sth->fetchrow_array) {         DBI->trace_msg("Using @row[0]\ n",1);         $rv=$dbh->do("UPDATE perftest SET anumber=anumber+1 ".                 "WHERE id=@row[0]") or                         die "can't update: $id\ n"; }

pg_trace in /tmp contains one additional line for every time the WHILE loop is processed. It is also possible to define the trace level of an individual statement. If a second parameter is defined, the message is printed, but only if the trace level is equal to or greater than that level. The default value is 1 .

Although we have already set the trace level to 1 and only two SQL commands have to be processed, the logfile is still very long:

 DBI 1.15-nothread dispatch trace level set to 1     Note: perl is running without the recommended perl -w option dbd_st_prepare: statement = >SELECT * FROM perftest WHERE id < 9 LIMIT 1< dbd_st_preparse: statement = >SELECT * FROM perftest WHERE id < 9 LIMIT 1<     <- prepare('SELECT * FROM perftest WHERE id < 9 LIMIT 1' CODE)= DBI::st=HASH(0x81de858) at dbi1.pl line 12. dbd_st_execute     <- execute(CODE)= 1 at dbi1.pl line 13. dbd_st_fetch     <- fetchrow_array= ( '5' 'Kq6NJr5Z' '34864453' ) [3 items] at dbi1.pl line 14. Using 5 dbd_st_prepare: statement = >UPDATE perftest SET anumber=anumber+1 WHERE id=5< dbd_s