Writing Perl Scripts | MySQL and Perl for the Web

only for RuBoard - do not distribute or recompile

Perl scripts are written for a couple of purposes in this book: as programs you invoke yourself from the command line, or as programs that you intend to run from a Web server in response to requests received over the network. (We ll also write some scheduled scripts that are invoked as cron jobs, but those are essentially command-line scripts run automatically by the system.) This section shows how to write both kinds of scripts, but instead of starting out by writing something complicated, we ll use the following progression of steps:

Verify that Perl works by writing a simple do nothing script.
Verify that you can access the CGI.pm and DBI modules.
Verify that you can get Apache to execute a Perl script to generate a Web page.
Verify that you can to connect to MySQL from within a Perl script.
Verify that Apache Web scripts can access MySQL.

If you ve never written Perl scripts before, you should go through each of the steps. Otherwise, you might want to skim through or skip the material describing the first couple of scripts. Command-line scripts do not need to be put in any special place. You might find it most convenient to create a separate directory under your home directory to use for trying out command-line scripts you write for this book. By contrast, Web scripts need to be installed in your Web server s directory hierarchy. For the scripts in this chapter, that s the cgi-bin directory described earlier in the Configuring Apache section.

Save Yourself Some Typing

If you haven t ready done so, you should obtain the software distribution that accompanies this book. It contains the source code for the scripts developed here, as well as any CREATE TABLE statements and sample data for the database tables that they use. The distribution is available at the following Web site:

http://www.kitebird.com/mysql-perl /

Write a Simple Perl Script

To write Perl scripts, you must know where Perl itself is located. Try the following command, which tells your shell (command interpreter) to display the full path name of your Perl program:

 % which perl  /usr/bin/perl

The output you see when you run the which command might be different, such as /usr/local/bin/perl or /opt/bin/perl. Whatever the path name is, make a note of it, because you ll need to substitute that path name wherever you see /usr/bin/perl in this book. If which doesn t tell you where Perl is, ask your system administrator.

Next, check which version of Perl you have by executing it with the -v option:

 % /usr/bin/perl -v  This is perl, version 5.005_03

Perl s version number should be at least 5.005. If you re running Perl 4, the scripts in this book won t work. If you have Perl 5, but it s older than 5.005, some of the scripts may fail.

Now write a simple script using the path name to Perl that you just determined, by using a text editor to create a file named intro1.pl that contains the following lines:

 #! /usr/bin/perl  # intro1.pl - simple do-nothing Perl script  print "I am a Perl script\n";  exit (0);

Be sure to put the correct path to Perl for your system on the first line, and be sure to use double quotation marks (not single) on the line containing the print statement. Then make intro1.pl executable using this command:

 % chmod +x intro1.pl

After making intro1.pl executable, try running it using one of the following commands. You should see the following output:

 % intro1.pl  I am a Perl script  %./intro1.pl  I am a Perl script

The first form should work if your shell searches your current directory to find commands. If it doesn t, use the second form, where the leading ./ tells the shell explicitly that the script is located in the current directory.^[1]

^[1] I ll assume throughout the rest of the book that the ./ is necessary to run any script located in your current directory.

Assuming that you see the proper output for at least one of the commands just shown, you ve accomplished a lot. Don t believe me? Then let s discuss a few of the things that could go wrong at this point, even for such a simple script. If you get the Command Not Found error even when you specify the leading ./ on the command name, it s likely that the path name to Perl on the first line of the script is not exactly correct. Check the script and fix the path if necessary.

A different error that might occur is this one:

 %./intro1.pl  intro1.pl: Permission denied.

If this happens, you forgot to run the chmod +x command. Go back and run it to make the intro1.pl script executable. Of course, it s not very likely that you ll have overlooked this for intro1.pl because I just told you to use it. However, you ll also need to use chmod +x for every other Perl script you write using this book (whether you run it from the command line or from your Web server), and it s easy to forget. In fact, one of the first things you should do if a script refuses to run at all is check its file mode to see whether it s executable.You can do that using ls -l (long listing):

 % ls -l intro1.pl  -rwxrwxr-x    1 paul    mtp      47 Oct 17 17:00 intro1.pl

The x (executable) permissions should be enabled. If they re not, use chmod +x to turn them on.

Another possible problem is that the script s output might look like this, with \n at end of the line:

 %./intro1.pl  I am a Perl script\n

If that s what you see, it means you ignored my earlier admonition to use double quotation marks in the print statement and used single quotation marks instead. I admire your explorer s spirit in desiring to find out what happens when you do things a different way, but you didn t really follow instructions, so change the single to double quotation marks so that the \n prints a newline and not a literal \n when you run intro1.pl. (This problem occurs, by the way, because Perl treats special characters in single-quoted and double-quoted strings differently. Within double quotes, sequences such as \n , \r , and \t print as newline, carriage return, and tab. Within single quotes, no such interpretation is done and Perl prints the sequences literally.)

In the course of writing intro1.pl, you established a few important things that you should keep in mind as we go along:

The path to your Perl program
How to make scripts executable
Whether you need to invoke a script as script.pl or as . /script.pl when you run it at the command line

Running Perl Scripts Under Windows

Some aspects of the directions in this section for running Perl scripts under UNIX don t apply if you re using Windows. For example, the initial #! line that lists the path name of the Perl interpreter is irrelevant (although harmless). Also, you ll want to set up a file association that tells Windows to use Perl to run programs having names that end in .pl. If Perl is in your search path, that association enables you to run a Perl script just by typing its name. Therefore, you should be able to run intro1.pl from the command prompt like this:

 C:\> intro.pl

These details may be taken care of for you when you install Perl on your system. If you use ActiveState Perl, for instance, its installer should change your path for you and help you establish the filename extension association.

Verify That You Can Access CGI.pm and DBI

intro1. pl serves to ensure that your Perl program is present and can execute scripts. Our second script checks the presence and accessibility of the CGI.pm and DBI modules. Create a script called intro2.pl that contains the following lines:

 #! /usr/bin/perl  # intro2.pl - verify availability of CGI.pm and DBI modules  use CGI;  use DBI;  $cgi = new CGI;  print "The CGI object was created successfully\n";  @driver_names = DBI->available_drivers ();  print "These DBI drivers are available: @driver_names\n";  exit (0);

Make intro2.pl executable, and then try to run it to verify that CGI.pm and DBI can be accessed properly:

 % chmod +x intro2.pl  %./intro2.pl  The CGI object was created successfully  These DBI drivers are available: ADO ExampleP Multiplex Proxy mysql

The list of driver names may be different on your system, but if you see mysql in the list of driver names, you re in good shape. If mysql isn t one of the driver names listed, you need to install DBD::mysql, the MySQL-specific DBI driver. If you see error messages to the effect that Perl can t locate either or both of the CGI.pm or DBI modules, most likely they haven t been installed. See Appendix A for instructions on getting and installing any missing software, or ask your system administrator to do so. (Show the administrator your script and the error output to illustrate the problem.)

It s possible when you run intro2.pl that CGI.pm will display a prompt and ask for input parameters. If this happens, just type control-D and the script will proceed:

 %./intro2.pl  (offline mode: enter name=value pairs on standard input)  (press control-D here)  The CGI object was created successfully  These DBI drivers are available: ADO ExampleP Multiplex Proxy mysql

Under Windows, type control-Z, followed by Return. To suppress this prompt, specify an empty parameter list directly on the command line, as follows:

 %./intro2.pl ""

Write a Script to Generate a Web Page

In this section, we ll write a script for use from within Apache to generate a Web page. The previous scripts were run from the command line, so you could put them in just about any directory and run them from there. For Web scripts, that s not true; you need to put them in your cgi-bin directory because that s where Apache expects programs to be. (At least, that s how we configured it earlier!)

Before writing the script, let s consider what happens when a Web server sends a page to a client s browser, because writing Web scripts differs in some important ways from writing static HTML pages. If you re writing a file containing static HTML, you put some content in the file and install the file in the Web document tree. When a client requests the page, the Web server opens the file, reads it, and writes its contents over the network to the client. But that s actually only part of the process. What really happens is that the Web server sends header lines first information that precedes the file s contents and that lets the browser know what s coming. Suppose you have a file mypage.html that looks like this:

 <html>  <head><title>My Page Title</title></head>  <body><p>My page body</p></body>  </html>

What the Web server sends to the browser may actually look something like this:

 % lynx -mime_header http://www.snake.net/mypage.html  HTTP/1.1 200 OK  Date: Thu, 26 Oct 2000 20:30:44 GMT  Server: Apache/1.3.17 (Unix) mod_perl/1.25 PHP/4.0.4pl1  Content-Length: 90  Content-Type: text/html  <html>  <head><title>My Page Title</title></head>  <body><p>My page body</p></body>  </html>

The example uses lynx, a text-based Web browser, to request mypage.html from www.snake.net The -mime_header option tells lynx to display everything it receives from the Web server, not just the contents of the requested page. Notice that the contents of mypage.html are preceded by a set of headers and a blank line that signals where they end. The headers tell the browser various things about the server and about what to expect following the blank line. In particular, the Content-Type: header tells the browser to expect HTML data. Normally, you don t see these lines because your browser interprets them to see what it can learn from them and then discards them.

What does this mean for you as a script developer? Just this: When you write a static HTML page, the server supplies the headers for you. But when you write a script that generates pages, you have to generate the headers yourself (or at least some of them, such as the Content-Type: header). You also write the blank line that signals where the headers end and the page content begins.

Most header lines consist of a header name, a colon, and some additional information. However, the very first header is special. That s the HTTP response header we don t need to generate it because Apache will do it for us. (There are ways to tell Apache that you want to generate the HTTP header yourself, but we won t do so.)

A simple Perl script that generates the same page as mypage.html is shown here. Observe that it generates a Content-Type: header and a blank line before putting out the page content:

 #! /usr/bin/perl  # intro3.pl - script to generate a Web page  print "Content-Type: text/html\n\n";  print <<END;  <html>  <head><title>My Page Title</title></head>  <body><p>My page body</p></body>  </html>  END  exit (0);

Put this script in your cgi-bin directory, call it intro3.pl, make it executable, and then request it using your browser. (Remember to substitute your own Web server s name for www.snake.net):

http://www.snake.net/cgi-bin/intro3.pl

You should see the same page title and text that you d see for mypage.html. If you request the page using lynx -mime_header, you should see the page contents preceded by a set of headers similar to those that were sent with mypage.html. If intro3.pl doesn t work, there are several things you can check:

Make sure the script is executable. If it s not, you ll likely see an error message in your browser window:

You don't have permission to access intro3.pl on this server.

Use chmod +x to make intro3.pl executable if it isn t already.
Examine the Apache error log to see whether it contains a message about your script. The log often provides useful clues about causes of brokenness in scripts.
Use the -wc flags to tell Perl to perform a syntax check on the script:
```
 % perl -wc intro3.pl  intro3.pl syntax OK 
```
If there s a problem, you ll see some kind of diagnostic message rather than the Syntax OK message.

intro3.pl generates a Web page by hand that is, the Content-Type: header and HTML markup are written out literally in the script. The next script, intro4.pl, produces the same page content as intro3.pl but does so using the CGI.pm module to generate the header and markup. It looks like this:

 #! /usr/bin/perl -w  # intro4.pl - generate a Web page using the CGI.pm object-based interface  use strict;  use CGI;  my $cgi = new CGI;  print $cgi->header (),      $cgi->start_html ("My Page Title"),      $cgi->p ("My page body"),      $cgi->end_html ();  exit (0);

Install this script in the cgi-bin directory, request it from your browser, and you should see a page that looks the same as the ones produced by mypage.html and intro3.pl. (Actually, if you use your browser s View Source command or use lynx -mime_header to see the raw HTML produced by intro4.pl,you ll notice that CGI.pm doesn t write many newline characters. You may also see capitalized tags, and perhaps a <!DOCTYPE> tag at the beginning of the output. However, none of these things make a difference in the formatted display produced in the browser window.)

As before, we begin the script with a #! line specifying the path to Perl. The -w option is something we haven t used yet. It turns on warning mode, telling Perl to complain about dubious constructs in the script. It s a worthwhile option and we ll use it from now on.

The use strict line instructs Perl to be strict about variable references. Strict mode is valuable because it forces us to declare any variables before we use them. (This actually makes scripts a little more difficult to write initially because they require more programmer discipline, but it helps in the long run because we re less likely to make variable-related errors.)

The intro4.pl script also includes a use statement indicating that we want to use the CGI.pm module. Then it creates a CGI object $cgi that provides access to a bunch of HTML-generating methods (functions). The variable is declared using a my statement, which tells Perl, I m going to use this variable, don t complain when I do. (The use strict statement would cause Perl to issue an error had we used the variable without declaring it first.) The print statement strings together calls to several CGI.pm methods, each of which produces a different part of the Web page:

header() takes care of generating the Content-Type: header. The default type is text/html, but you can pass an argument specifying another type explicitly if necessary. If your program is sending a JPEG image rather than HTML text, for example, you d specify a type of image/jpeg. Techniques for serving images are described in Chapter 5, Writing Form-Based Applications.
The start_html() method sends out the introductory HTML up through the initial <body> tag; the argument to the method indicates the title that should appear within the <title> and </title> tags.
p() generates a paragraph in the page body. (It produces text surrounded by <p> and </p> tags.)
The end_html() method produces the closing </body> and </html> tags.

The HTML-generating calls used by intro4.pl are just a few of the methods that CGI.pm provides. Basically, there is a method corresponding to each HTML tag you d want to write they are listed in the installed CGI.pm documentation, which you can read using this command:

 % perldoc CGI

For other CGI.pm resources, see Appendix B, References and Further Reading. I recommend especially Official Guide to Programming with CGI.pm by Lincoln Stein, the author of CGI.pm.

intro4.pl uses the object-oriented interface to CGI.pm, where you create a CGI object and then call its methods to generate HTML. CGI.pm also provides a functionbased interface. To use it, import the method names into your script s namespace, and then call the methods directly as functions without naming an object. The next script, intro5.pl, looks much like intro4.pl but uses the function-based interface rather than the object-oriented interface:

 #! /usr/bin/perl -w  # intro5.pl - generate a Web page using the CGI.pm function-based interface  use strict;  use CGI qw(:standard);  print header (),      start_html ("My Page Title"),      p ("My page body"),      end_html ();  exit (0);

intro5.pl produces exactly the same output as intro4.pl, but differs from it in the following ways:

The use CGI line is different. It names the set of methods to be imported into the script s namespace. :standard grabs the most common ones. Another useful set is :html3, which includes the table-generating methods that are part of the HTML 3.2 standard. You can name both sets if you like:
```
 use CGI qw(:standard :html3); 
```
If you just want to import all methods, use the :all name set:
```
 use CGI qw(:all); 
```
qw() is a Perl construct for writing single-quoted words without writing the quotes. The previous use statement could have been written like this:
```
 use CGI ':all'; 
```
Use whichever format you prefer.
There s no need to instantiate (create) a CGI object.
Some CGI.pm method names clash with Perl operators; when these are invoked as functions, you need to use a different letter case. For example, Perl has a tr transliteration operator, so the CGI.pm table row method tr() must be invoked as Tr() or TR() when you call it as a function. This problem does not occur with the object-oriented interface because methods are invoked through the CGI object (for example, $cgi->tr()), and there is no ambiguity.
You need to be careful not to name your own functions using names that already are used by CGI.pm. For example, you shouldn t write functions with names that correspond to HTML tags, such as table(), br(), h1(), and so forth. CGI.pm already defines functions with those names.
Function-based code can be easier to read (and write) because it s less cluttered with instances of $cgi-> . For short scripts such as intro4.pl and intro5.pl, the difference is minimal, but large scripts written using the function-call interface can be significantly easier to read than equivalent object-based scripts.

I ll use the function-based CGI.pm interface from this point on, primarily because I find code written that way easier to read.

Write a Script to Access MySQL

Now we re going to forget about CGI.pm and Web scripts for a while and turn our attention to MySQL instead. (I assume that your server is configured to allow you to access it, as described in the earlier section Configuring MySQL. ) The next script will connect to the MySQL server, perform some simple operations to pull records from a table, and display them. That means we ll need a table to work with, of course. Use the mysql program to create a table called teams as shown here (substitute the appropriate connection parameters as necessary):

 % mysql -h localhost -p -u webdev webdb  Enter password: webdevpass  mysql> CREATE TABLE teams (name CHAR(20), wins INT, losses INT);  Query OK, 0 rows affected (0.00 sec)  mysql> INSERT INTO teams VALUES('Fargo-Moorhead Twins',36,16);  Query OK, 1 row affected (0.00 sec)  mysql> INSERT INTO teams VALUES('Winnipeg Maroons',24,26);  Query OK, 1 row affected (0.00 sec)  mysql> INSERT INTO teams VALUES('Minot Why Nots',19,23);  Query OK, 1 row affected (0.00 sec)  mysql> INSERT INTO teams VALUES("Warren Wanderers",16,30);  Query OK, 1 row affected (0.00 sec)  mysql> SELECT name, wins, losses FROM teams;  +----------------------+------+---------+  | name                  | wins | losses |  +----------------------+------+---------+  | Fargo-Moorhead Twins  |   36 |     16 |  | Winnipeg Maroons      |   24 |     26 |  | Minot Why Nots        |   19 |     23 |  | Warren Wanderers      |   16 |     30 |  +----------------------+------+---------+  4 rows in set (0.00 sec)  mysql> QUIT

After executing the CREATE TABLE and INSERT statements, you should have a table containing four rows of data, which you display using the SELECT statement. The table happens to represent the Northern League baseball team records for 1917, a year in which the league had only four teams. (The table is deliberately simple to make it easy to use.) To access the teams table from within Perl, create a script intro6.pl that looks like this:

 #! /usr/bin/perl -w  # intro6.pl - connect to MySQL, retrieve data, write plain text output  use strict;  use DBI;  my ($dbh, $sth, $count);  $dbh = DBI->connect ("DBI:mysql:host=localhost;database=webdb",                      "webdev", "webdevpass",                      {PrintError => 0, RaiseError => 1});  $sth = $dbh->prepare ("SELECT name, wins, losses FROM teams");  $sth->execute ();  $count = 0;  while (my @val = $sth->fetchrow_array ())  {     printf "name = %s, wins = %d, losses = %d\n",                          $val[0], $val[1], $val[2];      ++$count;  }  print "count rows total\n";  $sth->finish ();  $dbh->disconnect ();  exit (0);

When you run the script, you should see something like this:

 %./intro6.pl  name = Fargo-Moorhead Twins, wins = 36, losses = 16  name = Winnipeg Maroons, wins = 24, losses = 26  name = Minot Why Nots, wins = 19, losses = 23  name = Warren Wanderers, wins = 16, losses = 30  4 rows total

When Perl sees the use DBI statement, it reads in the DBI code to make it available to the script. If DBI isn t installed on your machine, you ll see all kinds of complaints from Perl when it tries to execute this line. That shouldn t happen, however, because we already checked DBI availability using an earlier script (intro2.pl).

Assuming the script makes it past the use DBI line, we re ready to connect to the database server and run a query. After declaring the variables we ll need using a my statement, we establish a connection to the MySQL server by calling the connect() method:

 $dbh = DBI->connect ("DBI:mysql:host=localhost;database=webdb",                      "webdev", "webdevpass",                      {PrintError => 0, RaiseError => 1});

The connect() call returns a database handle that points to a data structure containing information that DBI uses to keep track of the connection. connect() takes several arguments: a data source name (DSN), a username and password, and any error-handling attributes you want to specify. The data source name has the following format:

 DBI:mysql:options

The first segment is always DBI (not case sensitive). The second segment is mysql (case sensitive) for MySQL. This segment indicates the particular database driver to be used. If you were using a different driver, this segment would be different, such as pg for PostgreSQL or InterBase for InterBase. The syntax of the third segment depends on the particular driver that you re using. For MySQL, it consists of a list of semicolon-separated options, each written in name=value format. Therefore, to specify a host name and database name of localhost and webdb, respectively, the DSN looks like this:

 DBI:mysql:host=localhost;database=webdb

The order of the options in the third segment doesn t matter, and if they re left out, default values may be used. The default host value is localhost. (Therefore, we could have omitted this option from the preceding DSN string with no effect.) If the database option is omitted, no database is selected when the connection is established.

The final argument to connect() enables you to control DBI error-handling behavior. Enabling RaiseError rather than PrintError tells DBI not just to print a message when a database-related error occurs, but to print a message and terminate the script. This is discussed further in the section Error Handling and Debugging later in the chapter.

Following the connect() call, the script retrieves and displays records from the teams table. The process for doing this involves several steps:

Tell DBI what query we re going to issue, using the prepare() method. prepare() returns a statement handle $sth to use for all further operations involving the query. Note that query strings are written in DBI without any special terminator character. This is unlike the mysql program, where you indicate the end of each query using a semicolon or \g .
Call execute() to send the query to the server and generate a result set (the set of rows selected by the query).
Run a loop to fetch each row in the result set and display its contents. DBI has several row-fetching methods. intro6.pl uses fetchrow_array(), and we ll see others shortly. fetchrow_array() returns the next row of the result set as an array, or an empty list when there are no more rows.The SELECT query retrieves three columns from the teams table, so after fetching a row into @val, the elements can be accessed as $val[0], $val[1], and $val[2]. Column values are present in the order they are named in the SELECT statement; therefore, successive array elements represent name, wins, and losses.
Call the finish() method to close the result set. This allows DBI to release any resources associated with it. In practice, it s not strictly necessary to invoke finish() if you fetch all the rows, because DBI notices when you reach the end of the result set and closes it automatically. If you don t read an entire result set, however, you should invoke finish() explicitly. If you execute a query and then read only the first row, for example, you may see a warning message that looks like this when you disconnect from the MySQL server:

 DBI::db=HASH(0x10150bc0)->disconnect invalidates 1 active statement handle (either  destroy statement handles or call finish on them before disconnecting)

That s a sign that you need to call finish() explicitly.

After fetching and printing the rows in the result set, intro6.pl prints a row count. Notice that the script counts the rows itself while looping through the result set. DBI does provide a rows() statement handle method for this purpose. However, its use is problematic, particularly if you plan to use your script with other database engines, because the behavior of rows() varies between drivers. Counting the rows yourself works for any driver. (Another approach is to invoke a method that returns the entire result set all at once, and then check the number of rows in the data structure you get back. These calls are described later in High-Level Retrieval Methods. )

The last thing intro6.pl does is shut down the connection to the server by calling disconnect(). If you forget to do this, DBI disconnects automatically when a script terminates, but it also prints a message:

 Database handle destroyed without explicit disconnect.

Therefore, it s a good idea to disconnect explicitly. In addition to avoiding the warning message, that allows the MySQL server to perform an orderly shutdown on its end sooner, especially if your script runs for a while after it s done accessing the database. Scripts that are good citizens refrain from holding open connections to MySQL longer than necessary.

It s best to get in the habit of calling finish() and disconnect() as necessary, because the warnings that result otherwise are not likely to provide any useful information to people who run your scripts. These warnings are particularly confusing when produced by scripts that are run by a Web server on behalf of remote clients. Such users may know only that they re sending a request for a page to your Web server, not that the request triggers execution of a script that performs a database query. A little care in programming leads to less confusion on the part of visitors to your site.

Write a Script to Access MySQL over the Web

Our next script, intro7.pl, uses MySQL over the Web. It s really just an adaptation of intro6.pl, modified slightly to use CGI.pm. It includes a use CGI statement and generates HTML rather than plain text by invoking CGI.pm output methods:

 #! /usr/bin/perl -w  # intro7.pl - connect to MySQL, retrieve data, write HTML output  use strict;  use DBI;  use CGI qw(:standard);  my ($dbh, $sth, $count);  $dbh = DBI->connect ("DBI:mysql:host=localhost;database=webdb",                      "webdev", "webdevpass",                      {PrintError => 0, RaiseError => 1});  $sth = $dbh->prepare ("SELECT name, wins, losses FROM teams");  $sth->execute ();  print header(), start_html ("team data");  $count = 0;  while (my @val = $sth->fetchrow_array ())  {     print p (sprintf ("name = %s, wins = %d, losses = %d\n",                          $val[0], $val[1], $val[2]));      ++$count;  }  print p ("$count rows total"), end_html ();  $sth->finish ();  $dbh->disconnect ();  exit (0);

Put this script in your cgi-bin directory and request it from your browser as follows:

 http://www.snake.net/cgi-bin/intro7.pl

At this point, you ve put your database on the Web, at least in a modest way. It s not a very exciting Web presence at this point, but it s a start. (Actually, if you ve never done this before, just getting any database information to appear in your browser can be kind of exhilarating.)

More About Retrieving Data from MySQL

The intro6.pl and intro7.pl scripts use the fetchrow_array() function to return result set rows as an array, but that s not the only function that DBI provides for fetching rows. There are other row-fetching methods, as well as higher-level methods that perform the entire cycle of result-set processing from prepare() through finish().

Other Row-Retrieval Methods

fetchrow_arrayref() is very similar to fetchrow_array(), but it returns a reference to an array, not the array itself, so you refer to elements of the array as $ref->[i ], where i is a column index. Here s an example:

 $sth = $dbh->prepare ("SELECT name, wins, losses FROM teams");  $sth->execute ();  while (my $ref = $sth->fetchrow_arrayref ())  {     printf "name = %s, wins = %d, losses = %d\n",                      $ref->[0], $ref->[1], $ref->[2];  }  $sth->finish ();

The third row-fetching method is fetchrow_hashref(). It returns a reference to a hash, and elements of the row are accessed as $ref->{col_name }, where col_name is a column name:

 $sth = $dbh->prepare ("SELECT name, wins, losses FROM teams");  $sth->execute ();  while (my $ref = $sth->fetchrow_hashref ())  {     printf "name = %s, wins = %d, losses = %d\n",                      $ref->{name}, $ref->{wins}, $ref->{losses};  }  $sth->finish ();

When there are no more rows available, fetchrow_array() returns an empty list; fetchrow_arrayref() and fetchrow_hashref() return undef.

fetchrow_array() and fetchrow_arrayref() are more efficient than fetchrow_hashref() because it s slower to set up a hash than an array. On the other hand, when a row is stored in an array, you must know the positions of the columns in the row in order to access them properly. The query used in the preceding examples names the columns explicitly:

 $sth = $dbh->prepare ("SELECT name, wins, losses FROM teams");

This enables us to know that names, wins, and losses are in positions 0, 1, and 2. If we rewrote the query to use SELECT * instead, we wouldn t know how to access the columns properly because we wouldn t know for sure the order in which MySQL would return them. (If you re thinking, I can just run the query manually in mysql to find out the order, and then write my script accordingly, beware that the column order might change if at any point you use ALTER TABLE to modify the structure of the table.)

If you want to use SELECT * queries, fetchrow_hashref() is more suitable because column position doesn t matter; you access column values in the resulting hash by name, not position. If you do that, however, take care not to get bitten by hash element letter-case issues. Perl hash elements are case sensitive, but in MySQL, column names are not. The following queries are all equivalent as far as MySQL is concerned, but each would generate a Perl hash containing a different set of element names than the others:

 SELECT name, wins, losses FROM teams  SELECT Name, Wins, Losses FROM teams  SELECT NAME, WINS, LOSSES FROM teams

To guard against letter-case variations in the way that column names are written in queries, you can force the element names to a particular case. Specify lowercase or uppercase by passing an argument of NAME_lc or NAME_uc to fetchrow_hashref(). The following example uses uppercase:

 $sth = $dbh->prepare ("SELECT name, wins, losses FROM teams");  $sth->execute ();  while (my $ref = $sth->fetchrow_hashref ('NAME_uc'))  {     printf "name = %s, wins = %d, losses = %d\n",                      $ref->{NAME}, $ref->{WINS}, $ref->{LOSSES};  }  $sth->finish ();

By forcing column names to a specific letter case, you can access elements of the hash without regard to the letter case used in the query string itself.

High-Level Retrieval Methods

DBI provides a number of methods that perform the entire process of executing a retrieval query, including prepare(), execute(), the fetch loop, and finish(). These methods are selectrow_array(), selectcol_arrayref(), and selectall_arrayref(). (Note: Three new calls have been added to recent versions of DBI: selectrow_arrayref(), selectrow_hashref(), and selectall_hashref().)

selectrow_array() is useful when you need to retrieve a single row or a single column value. Invoked in a list context, it returns the first row of a result set as an array:

 @row = $dbh->selectrow_array (                 "SELECT wins, losses FROM teams WHERE name LIKE 'Fargo%'"              );  printf "Fargo-Moorhead: %d wins, %d losses\n", $row[0], $row[1] if @@row;

If you assign the result to a scalar, selectrow_array() returns only the first column of the first row. This is especially useful for queries where you re interested only in getting a count:

 $count = $dbh->selectrow_array ("SELECT COUNT(*) FROM teams");  print "The teams table has $count rows\n";

If the query returns no rows, selectrow_array() returns an empty list or undef, depending on the context in which you invoke it.

selectcol_arrayref() returns the first column of a result set, as a reference to an array of values. This array will be empty if the result set is empty. Otherwise, it contains one element per row of the result set:

 $ref = $dbh->selectcol_arrayref ("SELECT name FROM teams");  print "Teams names are @{$ref}\n" if defined (($ref);

selectall_arrayref() retrieves the entire result set as a matrix and returns a reference to it. The matrix is empty if the result set is empty. Otherwise, it contains one element per row of the result set, and each of these elements is itself a reference to an array of column values:

 $ref = $dbh->selectall_arrayref ("SELECT name, wins, losses FROM teams");  if (defined ($ref))  {     foreach my $row_ref (@{$ref})      {         printf "Name = %s, wins = %d, losses = %d\n",                  $row_ref->[0], $row_ref->[1], $row_ref->[2];      }  }

Each of these methods dies when an error occurs if RaiseError is enabled. They behave somewhat differently if RaiseError is disabled. selectrow_array() returns an empty list or undef, depending on whether it was invoked in list or scalar context. (Note that returning undef in scalar context may also occur if the value retrieved is a NULL value, so be careful.) selectcol_arrayref() and selectall_arrayref() both return undef if an error occurs prior to fetching the result set. If an error occurs during the fetch stage, they return whatever data were fetched up to the point of the error.

Queries That Modify the Database

The queries shown so far have been SELECT statements, and it s quite probable that the most common operation for which you ll use your database is to retrieve records. But DBI can help you with more than that. This section shows how to issue queries that modify the database in some way. The examples here demonstrate how to empty and repopulate the teams table using DELETE and INSERT statements. In a later section, we ll modify existing records using the UPDATE statement.

To issue queries that don t return rows, use the do() method. It executes a statement and returns the number of rows affected, or undef to indicate an error.^[2]

^[2] For MySQL, affected by means changed. For example, the following query returns a rows-affected value of 0, because it doesn t actually change anything:

 $dbh->do (qq{ UPDATE teams SET wins = wins });

For other databases, the number of rows affected by an update generally means the number of rows selected for updating, whether or not the update actually changes any values in the rows. (Therefore, the previous query would return an affected-rows value equal to the number of rows in the table.) If you want MySQL to behave that way, you can add an option to the DSN that enables the mysql_client_found_rows attribute when you connect:

 $dsn .= ";mysql_client_found_rows=1";

(If you have RaiseError enabled, you don t have to check for an undef return value, of course.) The following example shows you how to use the do() method to delete and insert records:

 $count = $dbh->do ("DELETE FROM teams");  print "$count rows were deleted\n";  $count = 0;  $count += $dbh->do ("INSERT INTO teams (name,wins,losses)                      VALUES("Fargo-Moorhead Twins",36,16)");  $count += $dbh->do ("INSERT INTO teams (name,wins,losses)                      VALUES("Winnipeg Maroons",24,26)");  $count += $dbh->do ("INSERT INTO teams (name,wins,losses)                      VALUES("Minot Why Nots",19,23)");  $count += $dbh->do ("INSERT INTO teams (name,wins,losses)                      VALUES("Warren Wanderers",16,30)");  print "$count rows were inserted\n";

If you run the example (it s part of intro8.pl), you should see the following output:

 % ./intro8.pl  0E0 rows were deleted  4 rows were inserted

Notice anything odd about that output? The 4 rows were inserted line makes sense because we inserted four rows. But that initial 0E0 rows were deleted line is strange. There were four rows in the teams table before; why doesn t it say 4 rows were deleted ? To understand what s going on here, you need to know one thing about MySQL and one thing about DBI:

The thing to know about MySQL is that when you issue a DELETE FROM tbl_name query to delete all the rows of a table, MySQL simply clobbers the table s data file and reinitializes it. This makes for really fast performance, but it s achieved at the cost of not knowing how many rows were in the table originally. Consequently, MySQL returns a row count of zero. If you want to know the number of rows deleted, use a DELETE statement that includes a trivially true WHERE clause like this:
```
 DELETE FROM tbl_name WHERE 1>0 
```
In this case, MySQL evaluates the WHERE clause for each row, forcing it to delete rows one by one. You ll get the true row count, at some cost in performance.

The thing to know about DBI is that do() needs to return distinct values to distinguish the occurrence of an error from the case that no rows were affected, so it uses undef and " 0E0". The value undef indicates an error; it evaluates to false in a Boolean context. The string " 0E0 " indicates that no rows were affected; it evaluates to true in Boolean contexts but is treated as zero in numeric contexts. (It represents zero expressed in scientific or exponential notation). If do() returned 0 to indicate no rows were affected, that evaluates to false in a Boolean context (just like undef) and would be difficult to distinguish from an error. If you want to convert the " 0E0 " value to a true zero, do this:

 $count += 0;

Alternatively, if you just want to print the value, use printf and a %d format specifier:

 printf "%d rows were deleted\n", $count;

The INSERT statements in the previous example used single quotes around the team names because SQL statements that use single-quoted string data values are more portable to other database systems. For MySQL itself, however, it makes no difference whether you use single quotes or double quotes around string values; MySQL understands them both. For example, the first INSERT statement could have been written using double quotes around the name column value like this:

 $count += $dbh->do ("INSERT INTO teams (name,wins,losses)                      VALUES(\"Fargo-Moorhead Twins\",36,16)");

In this case, the double quotes surrounding the team name are escaped with a leading backslash so that Perl doesn t interpret them as the end of the query string. Another alternative is to use single quotes around the entire query string, in which case the double quotes within the string wouldn t need any escaping:

 $count += $dbh->do ("INSERT INTO teams (name,wins,losses)                      VALUES("Fargo-Moorhead Twins",36,16)");

That s a less suitable method for quoting query strings if you want to embed variable references or special character sequences, such as \n , \r , or \t , within the string. Those are interpreted within double-quoted strings, but not within single-quoted strings, as you can see by running the following program:

 #! /usr/bin/perl -w  # quotes.pl - demonstrate differing quote behaviors  $var = "I am a string";  print "$var\n";     # print double-quoted string  print "$var\n";     # print single-quoted string

The output looks like this:

 % ./quotes.pl  I am a string  $var\n

Yet other ways to quote strings are to use Perl s q{} and qq{} operators, which treat everything between the { and } characters as single-quoted or double-quoted strings, respectively. These are convenient because you can use whatever kind of quotes you want within the string itself without worrying about escaping them:

 $count += $dbh->do (q{INSERT INTO teams (name,wins,losses)                      VALUES('Fargo-Moorhead Twins',36,16)});  $count += $dbh->do (qq{INSERT INTO teams (name,wins,losses)                      VALUES('Fargo-Moorhead Twins',36,16)});

q{} and qq{} are just like using single or double quotes with respect to treatment of embedded variable references and special character sequences.

Using Placeholders

The easiest way to avoid concerns about internal quote characters within queries is to use placeholders.With this mechanism, you put a ? character in your query string wherever you want to insert a data value, and then supply the value separately. DBI takes the data values and interpolates them into the query string, adding any quotes or backslashes that might be necessary. Using placeholders, the INSERT statements from the previous script could be written like this:

 $count = 0;  $count += $dbh->do ("INSERT INTO teams (name,wins,losses) VALUES(?,?,?)",                      undef, "Fargo-Moorhead Twins",36,16);  $count += $dbh->do ("INSERT INTO teams (name,wins,losses) VALUES(?,?,?)",                      undef, "Winnipeg Maroons",24,26);  $count += $dbh->do ("INSERT INTO teams (name,wins,losses) VALUES(?,?,?)",                      undef, "Minot Why Nots",19,23);  $count += $dbh->do ("INSERT INTO teams (name,wins,losses) VALUES(?,?,?)",                      undef, "Warren Wanderers",16,30);  print "$count rows were inserted\n";

Notice that you don t put any quotes around the ? placeholder characters in the query, not even for the name column value, which is a string. That s because DBI supplies surrounding quotes for you as necessary. The placeholder mechanism also takes care of escaping any quote characters within data values, if there happen to be any, as well as other special characters such as backslashes or null bytes.

The second argument to do() is always undef when you use placeholders. This argument is intended for passing processing attributes to the do() method, but I ve never seen anyone actually do that; just pass undef. The arguments following undef are the data values to be bound to the placeholders. These can be numbers or strings. If you want to bind NULL to a placeholder, pass undef.

You may have observed that the query strings for the INSERT statements are all identical in the previous example. When you have a situation like that, you may want to split the do() call into a prepare() call followed by one or more execute() calls.To row you want to insert, pass the data values to $sth->execute(), which binds the valdo this, pass the query to prepare() to get a statement handle $sth. Then, for each ues to the query and sends it to the MySQL server. The preceding example, when rewritten to use that approach, looks like this (note that execute(), unlike do(), requires no undef preceding the data values):

 $sth = $dbh->prepare ("INSERT INTO teams (name,wins,losses)                         VALUES(?,?,?)");  $count = 0;  $count += $sth->execute ('Fargo-Moorhead Twins',36,16);  $count += $sth->execute ('Winnipeg Maroons',24,26);  $count += $sth->execute ('Minot Why Nots',19,23);  $count += $sth->execute ('Warren Wanderers',16,30);  print "$count rows were inserted\n";

A more common scenario for using prepare() in conjunction with execute() occurs when you re inserting rows within a loop. The following script shows an example of this. It assumes that the input is a file containing tab-delimited values. The script reads each line and trims the trailing newline, and then splits the line at tab characters and passes the resulting values to the execute() method:

 $sth = $dbh->prepare ("INSERT INTO teams (name,wins,losses)                         VALUES(?,?,?)");  $count = 0;  while (<>)  {     chomp;      $count += $sth->execute (split (/\t/, $_));  }  print "$count rows were inserted\n";

Some databases gain a performance benefit from splitting do() into prepare() plus execute() this way. If the database engine constructs a query plan for the statement that is passed by prepare(), it can reuse the plan each time execute() is executed within the loop. If you use do() instead, the query plan must be constructed and executed for each row inserted. None of this pertains to MySQL, which doesn t use query plans, but it may still be a good idea to write a script using prepare() and execute() in the first place rather than do(). If you plan to use a script with other databases at a later date, that may give your script a significant boost when you port it, without a bunch of rewriting.

Placeholders can be used with any kind of statement, not just INSERT statements. For example, you could prepare a SELECT statement, and then find matching records by prompting the user for the value to be bound to the placeholder by the execute() call:

 $sth = $dbh->prepare (qq{             SELECT name, wins, losses FROM teams WHERE name = ?          });  while (1)  {     warn "Enter team name:\n"; # prompt user      $name = <>; # read response      chomp ($name);      last unless $name; # no name given; exit loop      $sth->execute ($name); # bind $name to placeholder      while (my $ref = $sth->fetchrow_hashref ())      {         printf "name = %s, wins = %d, losses = %d\n",              $ref->{name}, $ref->{wins}, $ref->{losses};      }      $sth->finish ();  }

Modifying Existing Records

We ve seen how to retrieve, delete, and insert records, but not how to change them. The following example shows how to modify existing records using UPDATE. The function takes a database handle argument specifying an open connection to the database server and two team names. It assumes the names represent the winner and loser of a game between the two, and updates the win/loss record of each team appropriately:

 sub update_wins_and_losses  { my ($dbh, $winner, $loser) = @_;      $dbh->do (qq{ UPDATE teams SET wins = wins + 1 WHERE name = ? },                  undef, $winner);      $dbh->do (qq{ UPDATE teams SET losses = losses + 1 WHERE name = ? },                  undef, $loser);  }

You might be tempted to use a single query string and employ a placeholder for the name of the column to be updated, like this:

 sub update_wins_and_losses  { my ($dbh, $winner, $loser) = @_;  my $sth;      $sth = $dbh->prepare (qq{ UPDATE teams SET ? = ? + 1 WHERE name = ? });      $sth->execute ("wins", "wins", $winner);      $sth->execute ("losses", "losses", $loser);  }

That won t work. Placeholders are for use only with data values, not SQL keywords or operators, or database, table, or column names.

The operation of updating two team records in tandem brings up a question: What if something happens between the two updates, such as the script getting killed or the database server going down? In that case, the win/loss records will be inconsistent. We ll return to this question later when we consider transaction processing in Chapter 10, E-Commerce Applications.

Specifying MySQL Connection Parameters Using a Library

We ve written several scripts now that access MySQL, and we re going to write quite a few more most of which will use the same connection parameters. Under these circumstances, there s an easier way to connect to the server than writing out the parameters literally in each script: Put the connection code in a library function. We ll do this by writing a module file WebDB.pm and putting it in the lib/perl directory under the Apache server root. (This directory was described earlier in Configuring Apache. )

The module will contain a connection function that uses the same parameters used in the previous two scripts. While we re at it, we ll also include a function that connects using another common connection mode: Get the name and password from the MySQL option file of the user running the script (the .my.cnf file in the user s home directory). That is useful for command-line scripts that can be run by different users when you want to require those users to supply their own MySQL name and password. A typical .my.cnf file specifies client program connection parameters as follows:

 [client]  user=user_name  password=user_password

As we go along through this book, we ll write several other functions and put them in WebDB.pm.These will be routines for operations that we ll need to perform from multiple scripts; putting them in the library enables us to avoid writing the code in each script. For now, the first version of WebDB.pm looks like this:

 package WebDB;  use strict;  use DBI;  my $host_name = "localhost";  my $db_name = "webdb";  my $dsn = "DBI:mysql:host=$host_name;database=$db_name";  # Connect to MySQL server, using hardwired name and password  sub connect  {     return (DBI->connect ($dsn, "webdev", "webdevpass",                            {PrintError => 0, RaiseError => 1}));  }  # Connect to MySQL server, using name and password from the current  # user's ~/.my.cnf option file. The mysql_read_default_file option,  # when added to the DSN, specifies which option file to read.  sub connect_with_option_file  {     $dsn .= ";mysql_read_default_file=$ENV{HOME}/.my.cnf";      return (DBI->connect ($dsn, undef, undef,                            {PrintError => 0, RaiseError => 1}));  }  1; # return true

Install the WebDB.pm file in the /usr/local/apache/lib/perl directory. Scripts can access it by including the following lines:

 use lib qw(/usr/local/apache/lib/perl);  use WebDB;

The first line adds the directory containing the module to the list of locations Perl searches when looking for module files, and the second line pulls in the code from the module. This allows a script to establish a connection just by invoking the appropriate routine from the library file. A script run from Apache would use the default parameters by connecting like this:

 $dbh = WebDB::connect ();

(You could, in fact, go back and modify the connect() call in our earlier scripts to use WebDB::connect() and they should still work properly.) If instead you want a script to use the name and password stored in the .my.cnf file of the user who runs it, the script should connect like this:

 $dbh = WebDB::connect_with_option_file ();

Here s an example script, select-user.pl, that connects that way. It reads the current user s option file, and then runs a SELECT USER() query and displays to the user the name actually found in the file:

 #! /usr/bin/perl -w  # select-user.pl - user current user's option file to connect  use strict;  use lib qw(/usr/local/apache/lib/perl);  use WebDB;  my($dbh, $user);  $dbh = WebDB::connect_with_option_file ();  $user = $dbh->selectrow_array ("SELECT USER()");  print "You connected as $user\n";  $dbh->disconnect ();  exit (0);

The select user. pl script doesn t have any use DBI statement, unlike the earlier scripts that access our database. That s because the WebDB module itself includes such a statement, so any script using WebDB automatically inherits the modules it uses.

The script uses selectrow_array() to combine prepare(), execute(), fetchrow_array(), and finish() into one operation to generate the first row of the result set. It is useful here because the SELECT query returns only one row anyway. In a scalar context, as we ve called it here, selectrow_array() returns just the first column of the row.

Putting connection routines in a library file has the following advantages:

MySQL scripts are simpler (and therefore easier to write) when you don t have to specify all the connection parameters in every script.
Scripts become more portable. If you decide to move your database to another host, for example, just change the host name in WebDB.pm, not in all the individual scripts that use the database. If you decide to use a driver for another database such as PostgreSQL, change mysql in the $dsn value to pg in the library file, not in individual scripts.
Encapsulating connection code is somewhat better for security. If you like to put your scripts in the document tree (rather than in cgi-bin), it s possible that if the server is reconfigured, an error will cause the Web server to start sending out your scripts as plain text rather than executing them. If the scripts contain the connection parameters, that s not such a good thing! Putting the connection code in a library file located outside the document tree prevents the Web server from showing it in plain text, even in the event of a configuration error.

Note that using a library file this way is a security improvement only against remote clients discovering the connection parameters. Other users who can log in on the Web server host and install their own Web scripts probably also have permission to read files in the server s library directory. See Chapter 9 for more information.

One disadvantage of libraries is that if you don t install them in a directory that Perl searches by default, your scripts must add the directory to Perl s search list. You ll run into this with scripts written for this book that use the WebDB module. If you don t have the WebDB.pm file installed in a standard Perl library directory or in /usr/local/apache/lib/perl, you ll need to modify the uselib statement to reflect where you put the module file.

only for RuBoard - do not distribute or recompile