Obtaining Database and Table Information | MySQL and Perl for the Web

only for RuBoard - do not distribute or recompile

You can ask MySQL for several kinds of information about the databases and tables that it supports. It s possible to get lists of databases and tables, and you can get descriptions about each of the columns in a table. Much of this information is available using various forms of the SHOW statement, a MySQL-specific query that we ll examine in some detail in the next few sections.

Listing Databases and Tables

To get a list of databases hosted by a MySQL server, use the SHOW DATABASES query. If you want to know what tables are in a particular database, use SHOW TABLES FROM db_name . If you re writing an interactive query builder, for example, you might let the user first pick a table from the current database, then specify query conditions for selecting records from that table. A full-blown application for doing that is more involved than I want to get into here, but the following short script, pick_table.pl, illustrates the concept of getting a list of table names from MySQL and converting it into a pop-up menu:

 #! /usr/bin/perl -w  # pick_table.pl - present a list of tables from the current database as  # a popup menu.  use strict;  use lib qw(/usr/local/apache/lib/perl);  use CGI qw(:standard escapeHTML);  use WebDB;  print header (), start_html (-title => "Pick a Table", -bgcolor => "white");  # Get reference to list of tables, convert it to an array  my $dbh = WebDB::connect ();  my @list = @{ $dbh->selectcol_arrayref ("SHOW TABLES") };  $dbh->disconnect ();  # Display table-selection form  unshift (@list, "Select a table");  # put a title at head of popup menu  print start_form (-action => url ()),      popup_menu (-name => "table", -values => \@list, -override => 1),      br (), br (),      submit (-name => "choice", -value => "Submit"),      end_form ();  # Display table selection from previous script invocation, if any  my $table = param ("table");  $table = "" if !defined ($table);  print hr (), p (escapeHTML (                     $table eq "" || $table eq "Select a table"                              ? "No table has been chosen."                              : "You chose this table: $table"              ));  print end_html ();  exit (0);

Note how the script puts a " Select a table " item at the head of the pop-up to serve as the default value, and then checks for that string later when determining whether the user actually picked a table. If you want to display a set of radio buttons instead, you might change the body of the script to something like this:

 # Display table-selection form  print start_form (-action => url ()),      radio_group (-name => "table",                      -values => \@list,                      -default => "[NO DEFAULT]",                      -linebreak => 1,                      -override => 1),      br (), br (),      submit (-name => "choice", -value => "Submit"),      end_form ();  # Display table selection from previous script invocation, if any  my $table = param ("table");  $table = "' if !defined ($table);  print hr (), p (escapeHTML (                     $table eq ""                              ? "No table has been chosen."                              : "You chose this table: $table"              ));

Listing Table Structure Information

In addition to simple lists of databases and tables, MySQL also can provide more detailed information about the structure of a table s columns. Suppose we have a table named coffee that is used for recording orders at a coffee shop:

 CREATE TABLE coffee  (     id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,  # record ID/timestamp      ts      TIMESTAMP,      qty     TINYINT UNSIGNED NOT NULL DEFAULT 1,      # quantity      size    ENUM('S','M','L') NOT NULL DEFAULT 'M',   # serving size      extras  SET('cream','sugar'), #                   extra ingredients      remark  VARCHAR(30) #                             special instructions  )

The table contains rows that look like this:

 +----+----------------+-----+------+-------------+---------------+  | id | ts             | qty | size | extras      | remark        |  +----+----------------+-----+------+-------------+---------------+  |  1 | 20010227060327 |   1 | L    | cream       | LOTS of cream |  |  2 | 20010227061749 |   1 | M    | sugar       | 2lumps        |  |  3 | 20010227064232 |   2 | M    | NULL        | NULL          |  |  4 | 20010227070751 |   1 | L    | cream       | NULL          |  |  5 | 20010227071006 |   3 | S    | cream,sugar | NULL          |  ...

If we re using a form-based application, coffee.pl, to enter these records, we might use the following function to generate the entry form, which includes text-input fields for the qty and remark values, a set of radio buttons for specifying the serving size, and a set of check boxes for the extras:

 sub display_form  {     print start_form (-action => url ()),          p ("Quantity:"),          textfield (-name => "qty"),          p ("Serving size:"),          radio_group (-name => "size",                          -values => [ "S", "M", "L" ],                          -default => "M"),          p ("Extras:"),          checkbox_group (-name => "extras", -values => [ "cream", "sugar" ]),          p ("Special instructions:"),          textfield (-name => "remark", -size => "30"),          br (), br (),          submit (-name => "choice", -value => "Submit"),          end_form ();  }

The function relies heavily on our knowledge about the structure of the coffee table and how form submissions will be processed:

We name the fields after the corresponding table columns.
We know the possible legal values for the size and extras fields. We also know which values should be selected by default (M for size, nothing for extras).
We know the size of the remark column 30 characters.
We didn t include any fields for the id or ts columns because we know that both of them have types for which MySQL will supply values automatically when we enter new records. (id gets the next AUTO_INCREMENT value, ts gets the current date and time.)

I haven t shown any code to validate submitted forms, but any such code would enforce certain constraints, and these also would be based on our knowledge of the table definition:

The qty field value must be numeric, but it shouldn t be allowed to be negative.
We could either require the quantity to be filled in, or use the default value (1) if it s left blank.
The size and extras fields should have values that are legal members of the corresponding table columns. (We d like to hope that will always be true because the only values that we re presenting to the user are those known to be valid in the first place. However, it s always possible that someone might attempt to submit a hacked form to see how bulletproof the application is, so to be sure that submitted values are legal, it s best to check them.)

As it happens, a lot of the information needed for generating the form and processing submissions is directly available from MySQL itself as the output from the SHOW COLUMNS statement. Let s see what that information is and how to use it.

The SHOW COLUMNS Statement

To find out about the columns in a given table, issue a SHOWCOLUMNS query.^[1] If you run a few instances of this statement from the mysql program, you ll see what kinds of things MySQL can tell you about your tables. The following example shows the output of SHOWCOLUMNS for our coffee table:

^[1] SHOW FIELDS is a synonym for SHOW COLUMNS.

 mysql> SHOW COLUMNS FROM coffee;  +--------+----------------------+------+-----+---------+----------------+  | Field  | Type                 | Null | Key | Default | Extra          |  +--------+----------------------+------+-----+---------+----------------+  | id     | int(11)              |      | PRI | NULL    | auto_increment |  | ts     | timestamp(14)        | YES  |     | NULL    |                |  | qty    | tinyint(3) unsigned  |      |     | 1       |                |  | size   | enum('S','M','L')    |      |     | M       |                |  | extras | set('cream','sugar') | YES  |     | NULL    |                |  | remark | varchar(30)          | YES  |     | NULL    |                |  +--------+----------------------+------+-----+---------+----------------+

If you want to know only about some of the columns, use a LIKE clause. Then SHOW COLUMNS will produce an output record only for each column with a name that matches the string following the LIKE keyword. The string is treated as a SQL pattern; for example, %r% matches any column name that contains an r character:

 mysql> SHOW COLUMNS FROM coffee LIKE '%r%';  +--------+----------------------+------+-----+-------------+-------+  | Field  | Type                 | Null | Key | Default     | Extra |  +--------+----------------------+------+-----+-------------+-------+  | extras | set('cream','sugar') | YES  |     | cream,sugar |       |  | remark | varchar(30)          | YES  |     | NULL        |       |  +--------+----------------------+------+-----+-------------+-------+

To display information about a single column, use a LIKE clause and specify the column s name literally as a string:

 mysql> SHOW COLUMNS FROM coffee LIKE 'size';  +-------+-------------------+------+-----+---------+-------+  | Field | Type              | Null | Key | Default | Extra |  +-------+-------------------+------+-----+---------+-------+  | size  | enum('S','M','L') |      |     | M       |       |  +-------+-------------------+------+-----+---------+-------+

You may find that SHOW COLUMNS produces more output columns than the six shown here, depending on your version of MySQL, but these six should always be present. You may also find that SHOW COLUMNS prints very long lines that wrap around and are difficult to read in a terminal window. You can deal with this by using \G at the end of the query (rather than ; or \ g ) to print output columns stacked vertically:

 mysql> SHOW COLUMNS FROM coffee LIKE 'size'\G  *************************** 1. row ***************************     Field: size      Type: enum('S','M','L')      Null:       Key:   Default: M     Extra:

You can probably see immediately how the output from SHOW COLUMNS might be useful for generating our entry form. For example, we can use it to get the names of the table s columns, from which the field names are determined. The information for the qty and remark columns tells us how long to make the corresponding text-input fields. The legal values for the ENUM and SET columns can be used to construct list elements in the form such as radio buttons and check boxes.

I ll give a general description of the types of information that SHOW COLUMNS provides, and then discuss how to use it with reference to the coffee table in particular. Note that although my convention is to write type and attribute keywords such as TINYTINT and AUTO_INCREMENT using uppercase letters, SHOW COLUMNS prints much of its output in lowercase. Be sure to take that into account when using output from this statement in your scripts. Note also that this discussion probably will seem abstract unless you run SHOW COLUMNS on some of your own tables to see for yourself what kind of result it produces for various column types. I encourage you to do that before reading further.

The output from SHOW COLUMNS includes information about several aspects of table columns:

Field

The name of the column. You can use this information by itself to find out what columns a table contains, or in conjunction with the other SHOW COLUMNS output to determine the characteristics of each column.
Type

This value always begins with the name of the column s type. It may also include additional information:
- For many types, the size or display width is indicated in parentheses. For example, char(10) indicates a 10-character string column, tinyint(3) signifies a tiny integer with a display width of 3 characters, and decimal(10,2) indicates a floating-point column that can have up to 10 digits, with 2 of them after the decimal point.
- ENUM and SET column Type values include the list of legal elements in parentheses as a set of single-quoted values separated by commas.
- Some types include additional attributes, such as zerofill for numeric types, unsigned for unsigned integer types, and binary for case-sensitive character columns.

Null

This value is YES if the column can have NULL values, empty otherwise.
Key

The Key value provides rudimentary information about the table s index struc-ture. The possible values are PRI (PRIMARY KEY), UNI (UNIQUE index), MUL (nonunique index), or empty (not indexed).
Default

Indicates the column s default value. For a string column, an empty Default value means that the default is the empty string.The word NULL means the column s default is the NULL value. Actually, for string columns, it could also mean that the default value is the literal word NULL, but how likely is that? In any case, the word NULL appears only when you run SHOW COLUMNS from the mysql program or when you use the mysqlshow command. When you read the output from within a DBI script, a NULL default value is returned as undef, which eliminates any ambiguity between the string NULL and a true NULL value.
Extra
This value provides miscellaneous extra information. At the moment, MySQL uses it for only one purpose: to display the word auto_increment for AUTO_INCREMENT columns.

There are a couple of instances where SHOW COLUMNS output is somewhat unintuitive. First, Null is always YES for a TIMESTAMP column, because you can set such a column to NULL no matter how it is declared (the result being that the column gets set to the current time and day). Second, integer columns with the AUTO_INCREMENT attribute and TIMESTAMP columns always have a Default value of NULL, because the way you get the special auto-generated values for these kinds of columns is to set them to NULL when you insert a new row.^[2]

^[2] Prior to MySQL 3.23, the Default value for AUTO_INCREMENT columns is displayed as zero.

Using SHOW COLUMNS Output to Build Forms

Let s look further at the output produced by SHOWC OLUMNS for the coffee table and see what we can determine based solely on that information. Here it is again:

 mysql> SHOW COLUMNS FROM coffee;  +--------+----------------------+------+-----+---------+----------------+  | Field  | Type                 | Null | Key | Default | Extra          |  +--------+----------------------+------+-----+---------+----------------+  | id     | int(11)              |      | PRI | NULL    | auto_increment |  | ts     | timestamp(14)        | YES  |     | NULL    |                |  | qty    | tinyint(3) unsigned  |      |     | 1       |                |  | size   | enum('S','M','L')    |      |     | M       |                |  | extras | set('cream','sugar') | YES  |     | NULL    |                |  | remark | varchar(30)          | YES  |     | NULL    |                |  +--------+----------------------+------+-----+---------+----------------+

Given this information, we can make the following deductions:

The id column contains auto_increment for the Extra value. This is a tipoff that the column s value will be provided automatically by MySQL when a new record is inserted, so we don t need to include it in the entry form at all. TIMESTAMP columns are typically initialized automatically as well, so we can omit the ts column from the form, too.
The qty column is an integer, so values would be gathered using a text-input field. According to the Type information, we d need a field three characters long to collect quantity values. The fact that the column is unsigned is not of consequence for generating the entry form. (It can be used for validating values submitted by users, but we re not at that point yet.) If we wanted to initialize the qty field in the entry form, we could do so by consulting the Default column and using the value found there (1).
The information for the size column tells us that it s an enumeration, which means we can display it as some kind of list element: a set of radio buttons, a pop-up menu, or a single-pick scrolling list. The Type information indicates the legal values to include in the list, and Default indicates that we should initialize the element with M as the default value.

The output for the extras column is similar to that for size. However, this column is a SET rather than an ENUM, so we d display it using a set of check boxes or a multiple-pick scrolling list.
The Type information for the remark column says varchar(30), so it s a string; we ll need a text-input field 30 characters long for the corresponding form element.

The preceding discursive narrative discussing the relevance of SHOW COLUMNS output to form generation makes sense to you as a human (I hope!), but it doesn t help a Perl script much. How do we actually exploit this information in a program, and to what extent should we do so? There are several possibilities. You could try to automate form generation as much as possible, or you could decide you re willing to make some assumptions about the fields and use the information from SHOW COLUMNS in a more limited way. What you ll generally find when dealing with this tradeoff is that the more you want your program to figure out for itself, the more difficult it will be to teach it how to do that. If you want to get the code written more quickly, make some assumptions on behalf of your script.

To generate the coffee table entry form, for example, you could write the script to figure out every field type, based on the corresponding table column types. It would have to determine that id and ts don t appear in the form at all, that the qty and remark columns should be represented by text-input fields, and that the size and extras columns should be represented by list fields of some sort. Or, you can begin by using SHOW COLUMNS output in a more modest way, and then expand your use of it later. That s what we ll do here. We ll start by just knowing what kind of field each column is represented by, and use only information about the size of the qty and remark text-input fields, and about the member lists for the size and extras fields for constructing the radio buttons and check boxes. After that, we ll get information about default values and use them to initialize form elements.

In each case where we use SHOW COLUMNS output to generate a field, we can begin by extracting into an array the information for the column in which we re interested. For example, the query to do that for qty looks like this:

 # get row describing the desired column  @val = $dbh->selectrow_array ("SHOW COLUMNS FROM coffee LIKE 'qty'");

For this column, we want to know the length of column values. That s contained in the Type information ($val[1]), which looks like this:

 tinyint(3) unsigned

To obtain the length (the number in parentheses) and use it to generate a text-input field, the following code suffices:

 ($len) = ($val[1] =~ /\((\d+)\)/);  print textfield (-name => "qty", -size => $len);

We may as well put the length-extraction code in a function to make it easy to use. We ll pass the database handle, and the table and column names, and expect to get back the column length in return. Before writing any code to extract the length, however, there s actually one additional detail to be aware of when retrieving information about a column. Remember that the string following the LIKE keyword is interpreted as a SQL pattern. Therefore, to match the column name literally, you need to turn off the special meaning of any SQL pattern characters ( _ and % ) that occur within the name by escaping them with a backslash.This is easy in Perl:

 $col_name =~ s/([_%])/\\$1/g;

The get_column_length() function returns the column length. It uses a helper function, get_column_info(), also shown:

 sub get_column_length  { my ($dbh, $tbl_name, $col_name) = @_;  my @val;      @val = get_column_info ($dbh, $tbl_name, $col_name);      ($val[1] =~ /\((\d+)\)/)              or die escapeHTML ("No length found for $tbl_name.$col_name\n");      return ($1);  }  sub get_column_info  { my ($dbh, $tbl_name, $col_name) = @_;  my ($esc_col_name, @val);      # escape SQL pattern characters to force literal column name match      ($esc_col_name = $col_name) =~ s/([_%])/\\$1/g;      @val = $dbh->selectrow_array (                 "SHOW COLUMNS FROM $tbl_name LIKE '$esc_col_name'"             );      @val or die escapeHTML ("No information found for $tbl_name.$col_name\n");      return (@val);  }

get_column_length() assumes that you ll call it only for columns that actually have a length present in the Type information, and dies an unceremonious death if that s not true. That assumption is true for our qty and remark columns, so we re safe. Whew.

Passing a Database Name to get_column_length()

There is no database name argument in the definition of get_column_length(). What if you want to get the length of a column from a table in a different database? No problem. Pass a value of $tbl_name that is fully qualified with the database name in db_name.tbl_name form:

 $len = get_column_length ($dbh, "other_db.my_tbl", "my_col");

This same principle applies to other functions written later in this chapter that take a table name argument.

Extracting the element lists for the size and extras columns is more work than getting a column length because the information is more complex than a string of digits. The Type values for these columns look like this:

 enum('S','M','L')  set('cream','sugar')

To process a value in this format, convert it to a set of individual member values b trimming off the leading word and the parentheses from the ends of the string, breaking it into individual values at the commas, and stripping the quotation marks from the ends of each value. The result is an array that we can use to produce a list element for the entry form. The following function, get_column_members(), takes a database handle, a table name, and a column name as arguments, and returns a list of legal column members:

 sub get_column_members  { my ($dbh, $tbl_name, $col_name) = @_;  my @val;      @val = get_column_info ($dbh, $tbl_name, $col_name);      # strip "enum(" or "set(" from beginning and ")" from end of "Type" value      $val[1] =~ s/^[^(]*\((.*)\)$/$1/;      # split on commas, then trim quotes from end of each word      @val = split (/,/, $val[1]);      s/^'(.*)'$/$1/ foreach (@val);      return (@val);  }

If we modify our original display_form() function in coffee.pl to use our new functions that look up column information for us from the database, it will automatically size the text-input fields and determine the correct choices for the list elements:

 sub display_form  { my ($dbh, $qty_len, $remark_len, @size, @extras);      $dbh = WebDB::connect ();      $qty_len = get_column_length ($dbh, "coffee", "qty");      $remark_len = get_column_length ($dbh, "coffee", "remark");      @size = get_column_members ($dbh, "coffee", "size");      @extras = get_column_members ($dbh, "coffee", "extras");      $dbh->disconnect ();      print start_form (-action => url ()),          p ("Quantity:"),          textfield (-name => "qty", -size => $qty_len),          p ("Serving size:"),          radio_group (-name => "size",                          -values => \@size,                          -default => "M"),          p ("Extras:"),          checkbox_group (-name => "extras", -values => \@extras),          p ("Special instructions:"),          textfield (-name => "remark", -size => $remark_len),          br (), br (),          submit (-name => "choice", -value => "Submit"),          end_form ();  }

You could use get_column_members() to generate other kinds of list elements. To generate a pop-up menu or a scrolling list rather than radio buttons, for example, call popup_menu() or scrolling_list() rather than radio_group().

We didn t specify any labels argument when we generated our radio buttons and check boxes, so CGI.pm uses the values for both the values and labels. Without additional external information, there s little else we can do, based just on the list of legal column members. One possibility is to present labels that have the initial letter capitalized. That s one more statement; the following example shows how to do this for the extras check boxes:

 @val = get_column_members ($dbh, "coffee", "extras");  %labels = map { $_ => ucfirst ($_) } @val;  print checkbox_group (-name => "extras",                        -values => \@val,                        -labels => \%labels);

The map function takes each element name and turns it into a key/value pair consisting of the name and the name with the first character capitalized. The result is a hash that is indexed on the element names. By specifying labels this way, the HTML produced by checkbox_group() looks like this:

 <input type="checkbox" name="extras" value="cream">Cream  <input type="checkbox" name="extras" value="sugar">Sugar

If you examine the modified display_form() function more closely, you may notice a couple of problems. First, we left the radio button default value (M) hardwired in. That s information we could have gotten from SHOW COLUMNS, but didn t. Second and more serious we re issuing four separate queries to the database each time we generate the form! Maybe we should grab the output for the entire table with a single query and save it in a data structure so that we can use the information as much as we want with no need for additional queries. And if we provide an object-oriented interface for that information and put it in its own module, we can use the same code in a variety of applications. If we name the module WebDB::TableInfo to place it in the WebDB namespace, for example, coffee.pl would be able to extract the table information for field sizes and member lists something like this:

 use WebDB::TableInfo;  $dbh = WebDB::connect ();                           # connect to database  $tbl_info = WebDB::TableInfo->get ($dbh, "coffee"); # get table information  $dbh->disconnect ();                                # done with database  $qty_len = $tbl_info->length ("qty");               # get column lengths  $remark_len = $tbl_info->length ("remark");         # for text fields  @size = $tbl_info->members ("size");                # get members for  @extras = $tbl_info->members ("extras");            # list fields

This approach also has the advantage of better encapsulation. The SHOW statement is a MySQL-specific query, and putting it inside the module is a step toward hiding database-specific details from our applications.

To create a module that can be accessed as just shown, several things are necessary:

The module file should begin with a package statement that specifies its name:
```
 package WebDB::TableInfo; 
```
The file should be placed in an appropriate location. The module name WebDB::TableInfo indicates that the module filename will be TableInfo.pm and it must be installed as WebDB/TableInfo.pm^[3] under some directory in the search path used by your scripts. The easiest way to do this is to move into the directory where you installed WebDB.pm, and then create a subdirectory, WebDB, and install TableInfo.pm in the subdirectory.

^[3] If you re using this module under Windows, the filename would be WebDB\TableInfo.pm rather than WebDB/TableInfo.pm.

If your scripts are set up to be able to find WebDB, the preceding strategy also allows them to find WebDB::TableInfo with no additional modification. If the module files are installed under /usr/local/apache/lib/perl, for example, you can reference both WebDB and WebDB::TableInfo like this:

 use lib qw(/usr/local/apache/lib/perl);  use WebDB;  use WebDB::TableInfo;

The general outline of the TableInfo.pm file is as follows. It includes a package line containing an appropriate package identifier, references the other modules that it requires, defines the methods that it provides, and ends with the conventional 1; statement that returns true to the Perl interpreter:

 package WebDB::TableInfo;  use strict;  use CGI qw(:standard escapeHTML);  use DBI;  sub get ...          # define get() method  sub length ...       # define length() method  sub members ...      # define members() method  1;                   # return true from module file

The first method any script will call from this module is get(), which saves the table name for use in error messages by other module methods, issues a SHOW COLUMNS query, and returns a reference to an object that contains the information returned by the query about the table s columns:

 sub get  { my ($class, $dbh, $tbl_name) = @_;  my $self = {};                      # table information hashref      $self->{table} = $tbl_name;     # save table name      $self->{data} = $dbh->selectall_arrayref ("SHOW COLUMNS FROM $tbl_name")          or die escapeHTML ("No information found for $tbl_name\n");      # Construct a row map that associates each column name with the      # row of SHOW COLUMNS output for that column, to make it easier      # to find information by column name later.      $self->{row} = {};      foreach my $row_ref (@{$self->{data}})      {         $self->{row}->{$row_ref->[0]} = $row_ref;      }      # bless object into class, return reference to it      return (bless ($self, $class));  }

The first argument that get() receives is the class name, even though you don t pass that name in the argument list when you call the method. That s because you invoke get() through the class name, like this:

 $tbl_info = WebDB::TableInfo->get ($dbh, "tbl_name");

Perl notices this and puts the class name at the head of the argument list, where we can access it and use it in the bless() call to place the object into the appropriate class.

get() constructs a lookup map allowing information for any given column to be accessed by name. That s the kind of thing we ll need to do fairly often, so we can write a helper method, _column_info(), that returns the array of SHOW COLUMNS information for a column, given its name:^[4]

^[4] The leading underscore in the name of _column_info() is a Perl convention indicating that the method should be considered private to other methods of the class. The convention is a loose one, however; there s nothing to stop your scripts from calling it. You could make the method really private by defining it within a closure in the definition of the get() method, but I don t see any particular reason to do that.

 sub _column_info  { my ($self, $col_name) = @_;      return (@{$self->{row}->{$col_name}})          if exists ($self->{row}->{$col_name});      die escapeHTML ("No information found for $self->{table}.$col_name\n");  }

To get the column length or the list of legal column members, invoke length() or members(). These are similar to our earlier non-object oriented functions get_column_length() and get_column_members() :

 sub length  { my ($self, $col_name) = @_;  my $len;      $len = ($self->_column_info ($col_name))[1];      ($len =~ /\((\d+)\)/)          or die escapeHTML ("No length found for $self->{table}.$col_name\n");      return ($1);  }  sub members  { my ($self, $col_name) = @_;  my @val;      @val = $self->_column_info ($col_name);      # strip "enum(" or "set(" from beginning and ")" from end of "Type" value      # split on commas, then trim quotes from end of each word      $val[1] =~ s/^[^(]*\((.*)\)$/$1/;      @val = split (/,/, $val[1]);      s/^'(.*)'$/$1/ foreach (@val);      return (@val);  }

For each of these methods, the first argument passed to them is the object itself. This is similar to the way that Perl passes the class name when you invoke the get() class method. The difference is that length(), members(), and _column_info() are invoked through an object that is an instance of the class, so Perl passes the object rather than the class name. This provides us with a convenient means of referring to the object in question so that we can access its data or call other object methods.

With this interface, only one query to the database is needed to obtain all the available information about a table s columns; you can use the information as little or as much as you want without incurring the cost of another query. At the moment, coffee.pl uses the information only to determine field lengths and member lists. However, there are other column attributes it could take advantage of, such as default values. Let s rectify that situation by writing a new method, named default(), that we can use to initialize fields in the form.

Writing default() is a little trickier than you might think at first. The value we want is in the Default column of the output produced by SHOW COLUMNS, so we could just return that value:

 sub default  { my ($self, $col_name) = @_;      return (($self->_column_info ($col_name))[4]);  }

In fact, that works perfectly well for most column types. For example, the default for the size column is a simple string, which we can extract like this:

 $size_default = $tbl_info->default ("size");

However, for a SET column, the version of default() just shown doesn t work so well. It s possible for a SET column to have a default value consisting of multiple set members. If we were to notice that most of our customers want cream and sugar added to their coffee, for example, we might change the extras column in the coffee table to reflect this by using an ALTER TABLE statement to set the default value to ' cream,sugar ':

 ALTER TABLE coffee ALTER extras SET DEFAULT 'cream,sugar'

In this case, we d also want coffee.pl to select both extras check boxes by default when presenting the entry form. But for that to work, we need to pass a default parameter to checkbox_group() that points to an array of values. (Passing the string ' cream,sugar ' wouldn t work, because CGI.pm has no idea that it should interpret the string as a comma-separated list of individual check box values.) Therefore, for SET columns, it would be more helpful for default() to split the default value string and return an array. Let s write default() to work either way, so that you can get a scalar or an array depending on how you call it:

 $size_default = $tbl_info->default ("size");      # get default as scalar  @extras_default = $tbl_info->default ("extras");  # get default as array

Perl s wantarray() function is what we need here, because it indicates whether a function is called in a context that requires an array. This allows default() to determine what kind of value to return:

 sub default  { my ($self, $col_name) = @_;  my $default;      $default = ($self->_column_info ($col_name))[4];      return ($default) unless wantarray ();      return (defined ($default) ? split (/,/, $default) : ());  }

default() is cognizant of the possibility that a SET column default might be NULL (undef); in this case, default() returns the empty list when called in an array context.

Now we can write a version of display_form() that uses the WebDB::TableInfo module. It s more comprehensive than earlier versions because it automatically initializes list element defaults in addition to specifying the set of possible values:

 sub display_form  { my ($dbh, $tbl_info, $qty_len, $remark_len, @size, @extras);  my ($qty_default, $size_default, @extras_default);      $dbh = WebDB::connect ();                           # connect to database      $tbl_info = WebDB::TableInfo->get ($dbh, "coffee"); # get table information      $dbh->disconnect ();                                # done with database      $qty_len = $tbl_info->length ("qty");               # get column lengths      $remark_len = $tbl_info->length ("remark");         # for text fields      @size = $tbl_info->members ("size");                # get members for      @extras = $tbl_info->members ("extras");            # list fields      $qty_default = $tbl_info->default ("qty");      $size_default = $tbl_info->default ("size');      @extras_default = $tbl_info->default ("extras");      print start_form (-action => url ()),          p ("Quantity:"),          textfield (-name => "qty",                      -size => $qty_len,                      -default => $qty_default),          p ("Serving size:"),          radio_group (-name => "size",                          -values => \@size,                          -default => $size_default),          p ("Extras:"),          checkbox_group (-name => "extras",                          -values => \@extras,                          -default => \@extras_default),          p ("Special instructions:"),          textfield (-name => "remark", -size => $remark_len),          br (), br (),          submit (-name => "choice", -value => "Submit"),          end_form ();  }

So what does all this messing around gain us? The primary advantages are that your applications become more flexible and adaptive to the structure of your database, and you reduce your workload:

When your applications determine for themselves what your tables look like, they are less dependent on your own knowledge about the characteristics of those tables. To the extent that an application uses the information available, it becomes adaptive with respect to changes in your database. Consider what happens to coffee.pl if your marketing people decide to engage in that scurrilous technique of trying to make servings seem larger by relabeling the sizes S, M, and L as M, L, and XL. You could implement the change in the database by modifying the definition of the size enumeration column:
```
 # first add the new element XL  ALTER TABLE coffee MODIFY size ENUM('S', 'M', 'L', 'XL') NOT NULL  # map existing values in the table to the next size up  UPDATE coffee SET size = ELT(FIELD(size, 'S', 'M', 'L'), 'M', 'L', 'XL')  # delete the now-extraneous element S and bump the default from M to L  ALTER TABLE coffee MODIFY size ENUM('M', 'L', 'XL') NOT NULL DEFAULT 'L' 
```
But now you d have a problem if you wrote the size choices literally into the function that generates the entry form. There is a mismatch between the new column definition and the values listed in the script. You must change your script or the form will include one value that is no longer valid for the column (S) and will be missing another value that users cannot select at all (XL). This difficulty can be avoided by using the information available from SHOW COLUMNS, because then the script can always determine the proper values to display. (In fact, one reason to use ENUM columns is precisely because you can convert the column descriptions from SHOW COLUMNS automatically to generate form elements that enable the user to select only correct values.)
Adding flexibility is more work initially; but if you write general library functions, you need to write them only once, and you can use them over and over from multiple applications. The end result is an overall reduction in programming effort.

These are not unalloyed benefits, of course. They come at a price, because you re pounding your database harder. If your site already has a lot of traffic, that may be a price you re not willing to pay. You ll have to decide how to balance the tradeoff between hard-coding form information for performance and running queries to gain flexibility.

One technique that can be useful in this situation is to implement some type of caching. When the script needs to present the form, check the cache. If the form isn t there, generate it and save it in the cache before displaying it to the user. If the form does exist in the cache, just display it. If you change the structure of the underlying table, remove the cache entry to force it to be regenerated the next time the script runs. This approach requires that you set up some kind of coordination between the administrator who manages the table and the Web scripts that use it. However, the performance benefits can be considerable because you greatly reduce the number of queries issued in connection with constructing the form.

Other Ways to Initialize Forms

In the versions of display_form() shown earlier in the chapter for displaying the coffee table entry form, we provided default values for form fields by passing a default parameter to field-generating calls. CGI.pm also allows field values to be initialized using the param() function. If you invoke param() to load values into your script s parameter environment, CGI.pm will find the values and use them to initialize the fields when you generate the form. We can take advantage of this behavior to write a method that examines a table description, and for each column extracts the default value and uses it to initialize the corresponding form parameter:

 sub load_column_defaults  { my $self = shift;  my ($col_name, $default);      foreach my $row_ref (@{$self->{data}})      {         $col_name = $row_ref->[0];          # for SET columns, treat default as a reference to an array,          # otherwise as a scalar          if ($self->typename ($col_name) eq "set")          {             $default = [ $$self->default ($col_name) ];          }          else          {             $default = $self->default ($col_name);          }          param (-name => $col_name, -value => $default);      }  }

With load_column_defaults(), you can set up field defaults with a single call, and then generate the form as usual, except that you don t need to pass a default parameter to any field-creation calls:

 $tbl_info->load_column_defaults ();  print start_form (-action => url () ),      p ("Quantity:"),      textfield (-name => "qty", -size => $qty_len),      p ("Serving size:"),      radio_group (-name => "size", -values => \@size),      p ("Extras:"),      checkbox_group (-name => "extras", -values => \@extras),      p ("Special instructions:"),      textfield (-name => "remark", -size => $remark_len),      br (), br (),      submit (-name => "choice", -value => "Submit"),      end_form ();

You ll notice that load_column_defaults() handles SET columns specially, because in that case, the default value might actually consist of multiple individual values. The helper method typename() used by load_column_defaults() pulls out a column s type name from the Type value returned by SHOW COLUMNS. The type is always the initial word of that value, so it can be determined like this:

 sub typename  { my ($self, $col_name) = @_;  my $type;      # strip Type information beginning at first non-word character      ($type = ($self->_column_info ($col_name))[1]) =~ s/\W.*//;      return ($type);  }

load_column_defaults() is most useful when you re presenting an entry form used to collect data that will be inserted into the database as a new record. If you want to use a form for updating existing records, a different approach is more appropriate; you typically present to the user a form that is initialized using the contents of the record to be modified. Then the column values in the record will be used as field initializers when you generate the form. To handle this situation, the following method, load_record(), stores the contents of a record into the parameter environment. You can call load_record() in either of two ways. First, you can pass the record as a hash reference, in which case load_record() determines the column names from the keys of the hash. Second, you can pass the record as a reference to an array of column values. In this case, the column names aren t implicit in the array, so you must pass a second reference that points to an array containing the names of the columns in the value array:

 sub load_record  { my ($self, $rec_ref, $name_ref) = @_;  my ($col_name, $val);      # Determine whether record was passed as hash or array reference.      # In the latter case, an array of column names must also be given.      if (ref ($rec_ref) eq "HASH")      {         foreach $col_name (keys (%{$rec_ref}))          {             $val = $rec_ref->{$col_name};           # column value              if ($self->typename ($col_name) eq "set")              {                 # convert SET values to array reference                  $val = (defined ($val) ? [ split (/,/, $val) ] : []);              }              param (-name => $col_name, -value => $val);          }      }      elsif (ref ($rec_ref) eq "ARRAY" &&& ref ($name_ref) eq "ARRAY")      {         for (my $i = 0; $i < @{$rec_ref}; $i++)     # for each column...          {             $val = $rec_ref->[$i];                  # get column value              $col_name = $name_ref->[$i];            # get column name              if ($self->typename ($col_name) eq "set")              {                 # convert SET values to array reference                  $val = (defined ($val) ? [ split (/,/, $val) ] : []);              }              param (-name => $col_name, -value => $val);          }      }      else      {         die escapeHTML ("load_record() was called with illegal arguments\n");      }  }

The following code shows how to load a record into a form if you fetch the record as a hash reference. After loading the record into the parameter environment, you can generate the editing form and display it to the user with the field values initialized from the column values:

 $sth = $dbh->prepare ("SELECT * FROM coffee WHERE id = 5");  $sth->execute ();  $tbl_info->load_record ($sth->fetchrow_hashref ());  $sth->finish ();  print start_form () ...    # generate editing form

Alternatively, you can fetch the record as an array reference and load it before you create the editing form. When you do this, you must also get a reference to the array of column names and pass it to load_record() as well:

 $sth = $dbh->prepare ("SELECT * FROM coffee WHERE id = 5");  $sth->execute ();  $name_ref = $sth->{NAME};   # get column names  $tbl_info->load_record ($sth->fetchrow_arrayref (), $name_ref);  $sth->finish ();  print start_form () ...    # generate editing form

Initializing Enumeration Fields to No Value

If you initialize a set of radio buttons from the value of an ENUM column, you may run into a problem when the value is NULL. Suppose you have a table that records sales for appliance items that can be purchased with service (maintenance) plans having a duration of 12, 24, or 60 months. The table includes a service_plan column declared as follows, where NULL is an allowable value, and also the default:

 service_plan ENUM('12 month', '24 month', '60 month') NULL DEFAULT NULL

The idea here is that NULL represents no plan. In this case, you may want to display three radio buttons corresponding to the service plan lengths, but have none of them selected by default. Unfortunately, if you pass undef to radio_group() as the value of the default parameter, CGI.pm thinks that you just haven t specified any default at all, so it goes ahead and uses the first button as the default anyway. One way to handle this is to call load_column_defaults() (or load_record()), and then override the setting for the enumeration field with a value that s not a legal ENUM member:

 $tbl_info->load_column_defaults ();      # or load_record ()  param (-name => "service_plan", -value => "[NO DEFAULT]");

That would work for radio buttons, but not if you map the ENUM to a pop-up menu, because some element of the menu must always be selected. In that case you may be better off to override both the enumeration values and the default. You can do this by prepending a "title" element to the head of the member list:

 $tbl_info->load_column_defaults ();      # or load_record  ()  @val = $tbl_info->members ("service_plan");  unshift (@val, "Please select a service plan");  param (-name => "service_plan", -value => \@val);

If you use this strategy, you must take care to treat the value " Please select a service plan " as a special case that means no plan when you process submitted forms.

Alternatively, you could avoid this whole mess by making None an explicit member of the enumeration, using it as the default, and disallowing NULL values in the column:

 service_plan ENUM('None', '12 month', '24 month', '60 month') NOT NULL

Limitations of Automatic Form Construction

Given that having an application look up table information for itself increases its form-generation flexibility, it s reasonable to ask how far you can carry this idea. Can you generate a complete form entirely on the basis of SHOW COLUMNS output? Yes and no. You may be able to, but it s not always clear that you really want to. There are some things MySQL cannot tell you. For example, it can t tell you what labels you want to display next to the fields in the form. You could use the column names as the labels, but column names often are insufficiently descriptive to make a form meaningful to the people who use it. There are also other limitations of automatic form generation that you have to deal with:

You can determine a column s length from the column description, but the length may not be useful for the purpose of sizing a text-input field. Suppose you want to present a 300-character text field. You can t store field values in a VARCHAR column, because that type has a limit of 255 characters. A TEXT field can be used instead; it has a maximum length of 65,535 characters, and thus is easily large enough to hold the values. However, a script that tries to autosize text-input fields on the basis of the column type knows nothing about your intent to allow only 300 characters. For a type of TEXT, it would deduce that the field should be 65,535 characters long!
For some column types, a script can t necessarily determine for itself what kind of field to use for the column. Should an ENUM column be presented as radio buttons, a pop-up menu, or a scrolling list? Should a SET column map to check boxes or a scrolling list? Sometimes you can use an adaptive approach based on the number of members in the column description. (For example, you could decide to represent an ENUM with radio buttons if it has five or fewer elements and with a pop-up menu otherwise.) But that s not a general solution that works in all cases.
Table structure information doesn t help you much with form layout issues. Do you want radio buttons to line up horizontally or vertically? Should text fields go next to the caption, or underneath? If you have a long string column, should it be represented by a multiple-row text box? If so, how long should each line be?
Dates are a problem. MySQL can t tell what format you intend to use in a date-valued field. Your application may enable users to enter date values in a format such as MM/DD/CCYY, and then perform a conversion to ISO format (CCYY-MM-DD) on the user s behalf before storing the record. Or you may prefer to use separate year, month, and day fields, and then combine them when entering the record. Those are things no script can determine just by looking at the table structure.

Despite these limitations, in some situations you can process forms in a completely automated fashion, and we ll see an instance of this in the section Automating Poll Processing later in the chapter. Generally, however, a middle-of-the-road approach is likely to be more useful. Write the application to use whatever table structure information is unambiguous, and give it the benefit of your own knowledge for the rest. coffee.pl is an illustration of this approach. It determines some aspects of the entry form for itself, but we supplied the field captions and decided for it what the field types would be.

One way to discover some of the limitations of automated form construction is to push the approach and see what barriers you encounter. Here s an example where we d run into problems with the coffee.pl script: The qty column of the coffee table is an unsigned integer type. If you want to constrain the range of possible quantities to a small number of values, you might decide to change it to an ENUM instead:

 ALTER TABLE coffee  MODIFY qty ENUM('1', '2', '3', '4', '5', '6', '7', '8', '9', '10') NOT NULL DEFAULT '1'

That would make it possible to use the column description to generate a radio button set or pop-up menu automatically that would contain only legal values for the column. As coffee.pl is presently written, however, it wouldn t actually notice whether you changed qty from a TINYINT to an ENUM. That s because we ve built in the assumption that quantities will be collected using a text-input field. For the script to automatically adapt to a change of column type, it would have to check those types and determine on the fly what kind of field to generate for each column. Unfortunately, then we run into one of the limitations described in the preceding list: Three different field types can be used to represent ENUM columns. How would the script figure out which is most appropriate? That s something you d have to decide. (Later in the chapter, we ll see how to use a lookup table to solve the problem of limiting the valid quantity values without changing the qty column from an integer to an enumeration.)

Using SHOW COLUMNS Output to Validate Form Submissions

SHOW COLUMNS output can help you do more than generate forms. It s also useful for validating the contents of a form after the user fills it in and submits it. This section describes some kinds of validation that can be performed automatically based on table descriptions. As is true for generating forms, you may need to add your own knowledge for validating them. For example, the qty column of the coffee table is an unsigned integer. A script can tell from this at validation time that it should disallow non-numeric values as well as numeric values that are negative. However, zero is a non-negative number that would pass such a test, even though it s highly unlikely you d want to allow zero as a valid quantity. That s something you d need to check for explicitly, based on the specific quantities you intend to allow. One general strategy that allows for both automatic and special-purpose validation tests is as follows:

Check all the fields, applying some battery of automatic tests.
If any errors are detected, notify the user and redisplay the form.
If all the fields pass the tests, run any special-purpose tests that may be necessary to handle conditions not covered by the automatic tests.

As a demonstration of this strategy, we ll write a set of automated tests, packaged as a validate() method for the WebDB::TableInfo module, and then apply this method to validation of entries obtained from the form presented by coffee.pl. To write validate(), let s consider what kinds of things it can check for based purely on the information produced by SHOW COLUMNS (as well as some things it should not check for):

If a column cannot contain NULL values, it s reasonable to assume that the corresponding field must be filled in. (If you want to provide default values, you could do so by calling load_column_defaults() or load_record() prior to generating and displaying the form.)
For integer columns, the field value must be a string of digits, optionally preceded by a sign character. If the column has the unsigned atrribute, negative values should be disallowed.
For integer, CHAR, and VARCHAR columns, the column length constrains the length of field values.
For ENUM columns, the field value must be one of the members listed in the column definition.
For SET columns, the corresponding field may consist of multiple values, each of which must be listed in the column definition.
When testing whether a field is empty, we can trim whitespace from the end of the value. Trimmed values can be put back in the parameter environment so that if additional tests need to be performed, the value can be extracted later without needing to be trimmed again. The exceptions to this strategy are fields that correspond to SET or ENUM columns. It s legal for members of those column types to be declared with leading or trailing whitespace; trimming would cause spurious results when checking for a match.
No checking should be done for fields that contain dates. When you store a value into a DATE column, it must be in ISO format (CCYY- MM-DD), but that s not a constraint that validate() can enforce in a general way, because you may well want users to enter values in some other format, such as MM/DD/CCYY.

Here s an implementation of the validate() method based on the preceding discussion. For each column in the table associated with a form, it uses the column name to extract a parameter value from the script environment, and then performs tests on the value based on the column description. If any errors are found, messages describing them are accumulated in an array, which becomes the return value. If validate() finds no problems, it returns an empty list.

 sub validate  { my $self = shift;  my @errors;      foreach my $col_name (keys (%{$self->{row}}))      {         my $typename = $self->typename ($col_name);          my ($val, $err);          # For a SET column, allow multiple values; this test is done BEFORE          # the empty field test because it's allowable for sets to be empty.          if ($typename eq "set")          {             my @val = param ($col_name); # use array to allow multiple values              push (@errors, $err)                  if $err = $self->check_set_value ($col_name, @val);              next;          }          # all other columns should have a scalar value          $val = param ($col_name);          # Check for blank fields.          # if the value is missing but the column is not nullable, it's          # an error (exception: for timestamp columns and auto_increment          # columns, it's assumed they won't even be represented in the form.)          if ((!defined ($val) || $val eq "") &&                  !$self->has_attr ($col_name, "nullable"))          {             push (@errors, "$col_name must have a value")                  if !$self->has_attr ($col_name, "auto_increment")                      && $typename ne "timestamp";              next;          }          if ($typename eq "enum")          {             push (@errors, $err)                  if $err = $self->check_enum_value ($col_name, $val);              next;          }          $val = WebDB::trim ($val);                  # trim whitespace          param (-name => $col_name, -value => $val); # return to environment          if ($typename =~ /int$/)        # one of the integer types          {             push (@errors, $err)                  if $err = $self->check_integer_value ($col_name, $val);          }          elsif ($typename =~ /char$/)    # char or varchar          {             push (@errors, "$col_name is too long")                  if CORE::length ($val) > $self->length ($col_name);          }          # Add any additional tests here ...      }      return (@errors);       # empty list if no errors  }

In the course of performing its tests, the validate() method invokes a number of subsidiary methods, several of which have not yet been described. The has_attr() method checks for various attributes in the column description:

 sub has_attr  { my ($self, $col_name, $attr_name) = @_;  my @val = $self->_column_info ($col_name);      # Look in Null value for YES to see if column is nullable      return ($val[2] eq "YES") if $attr_name eq "nullable";      # Look in Extra value for auto_increment      return ($val[5] eq "auto_increment") if $attr_name eq "auto_increment";      # Look for other attributes (unsigned, zerofill, binary) in Type value      # (split, then shift off type name to leave attribute words)      @val = split (" ", $val[1]);      shift (@val);      return (grep (/^$attr_name$/, @val));  }

Other methods check values for various column types. Each of them returns an empty string if the value is okay, or an error message otherwise. check_integer_value() verifies that a value is an integer, that it is non-negative if the column is unsigned, and that the value doesn t exceed the column s display width:^[5]

^[5] Note that validate() and check_integer_value() both invoke Perl s built-in length() function as CORE::length() to distinguish it from the length() method of the WebDB::TableInfo module itself.

 sub check_integer_value  { my ($self, $col_name, $val) = @_;      return ("$col_name must be an integer") if $val !~ /^([-+])?\d+$/;      return ("$col_name cannot be negative")          if $self->has_attr ($col_name, "unsigned") && $val < 0;      return ("$col_name is too long")          if CORE::length ($val) > $self->length ($col_name);      return ("");  }

check_enum_value() tests a scalar to make sure it s one of the valid members of an ENUM column. check_set_value() is similar, but more involved. It must check array values, each of which must be a valid member of a SET column. Both of these methods use the i modifier for the pattern match because ENUM and SET columns are not case sensitive:

 sub check_enum_value  { my ($self, $col_name, $val) = @_;  my @members = $self->members ($col_name);      # use //i test; ENUM is not case sensitive      return ("$col_name cannot have a value of $val")          if !grep (/^$val$/i, @members);      return ("");  }  sub check_set_value  { my ($self, $col_name, @val) = @_;  my @members = $self->members ($col_name);  my @illegal;      foreach my $val (@val)      {         # use //i test; SET is not case sensitive          push (@illegal, $val) if !grep (/^$val$/i, @members);      }      return ("$col_name cannot contain " . join (",", @illegal)) if @illegal;      return ("");  }

Here s how to use validate() in the context of the coffee.pl application, which invokes a process_form() function when the user submits an entry form. validate() runs the automatic tests, and if those succeed, we run any additional tests not handled automatically:

 sub process_form  { my ($dbh, $tbl_info, @errors);      $dbh = WebDB::connect ();      $tbl_info = WebDB::TableInfo->get ($dbh, "coffee");      # Perform automated tests.  If they succeed, perform the field-specific      # tests (only one such test is needed here).      @errors = $tbl_info->validate ();      if (!@errors)      {         # we know the qty value is numeric and non-negative          # at this point; make sure it isn't zero          push (@errors, "Quantity must be greater than zero")              if param ("qty") == 0;      }      # If any errors occurred, display messages, return failure.      if (@errors)      {         $dbh->disconnect ();          @errors = map { escapeHTML ($_) } @errors;          print p ("The following errors occurred:"),                  ul (li (\@errors));          return (0);      }      # Record is okay; insert into table, display confirmation, return true.      $tbl_info->insert_record ($dbh);      $dbh->disconnect ();      print p ("Record was inserted successfully"),          p (a ({-href => url ()}, "Enter next record"));      return (1);  }

The insert_record() method used near the end of process_form() constructs and executes an INSERT statement to create a new record, based on the table structure and the contents of the form. We ll discuss how it works in the next section, along with a method, update_record(), that can be used to issue UPDATE statements when you re editing existing records.

validate() works best if you define your tables with a couple of points in mind:

Declare your columns NOTNULL if the corresponding form fields must be filled in.
Specify an explicit display width for integer fields if you want validate() to enforce a specific length constraint on field values.

Even with these assumptions, however, validate() is subject to several limitations:

validate() doesn t know anything about the labels you display in a form, so error messages can refer only to column names, not descriptive field labels.
Form parameters are checked only if they have the same name as a column in the table. If your form contains fields that don t map onto table columns, validate() won t check them.
There are several column types that validate() doesn t check very well. For example, we could add tests to make sure that fields for DECIMAL, FLOAT, or DOUBLE columns contain numeric values. We could also check lengths for DECIMAL, FLOAT/DOUBLE, and BLOB/TEXT. (Although to do that, we d need to make the length() method smarter, because it too knows nothing about these types.)

The nature of these limitations may make validate() unsuitable for use with a given application. However, you may still find it useful to apply some of the individual value-checking techniques that it illustrates. Or you might choose to extend it. For example, as validate() is written, the error messages that it generates refer to column names. To get better messages, you might rewrite validate() to accept and use a hash argument that maps column names to descriptive field labels, something like this:

 @errors = $tbl_info->validate ($dbh, {             "qty" => "Quantity",              "size" => "Serving size",              "extras" => "Additional ingredients"  });

Using SHOW COLUMNS Output to Insert or Update Records

After you ve validated the contents of a submitted form, you store the information in the database. Table information from SHOW COLUMNS comes in handy for this phase of form processing (just as it does for earlier phases), although it s necessary to make a distinction between the issues involved in inserting new records and those for updating existing ones.

Inserting New Records

To gather information for the purpose of creating a new record, the process is as follows:

Generate the entry form, initialized with empty values. (Alternatively, to fill the form with the default values listed in the table description, call load_column_defaults() before creating the form.)
Display the form to the user.
Validate the form when the user submits it.
Insert the form contents as a new record, specifying in the INSERT statement only those fields that are non-empty.

We ve discussed the first stages of this process earlier in the chapter. Now we can consider how we might use table description information to help us construct the INSERT statement for storing the form contents into the table. As it turns out, table information doesn t actually tell us much. We can use it to determine whether a field maps onto a SET column, but that s about it. Nevertheless, that s important because a set field might contain multiple values that need to be converted to a single string.

Here is a method, insert_record(), that takes a database handle, looks through the columns in a table, extracts the corresponding form parameter value for each one, and adds it to the INSERT statement:

 sub insert_record  { my ($self, $dbh) = @_;  my $assign;  my @placeholder;  my (@val, $val);      # construct assignments for SET clause using all non-empty fields      foreach my $col_name (keys (%{$self->{row}}))      {         my $typename = $self->typename ($col_name);          @val = param ($col_name);       # field value for column          if ($typename eq "set")          {             $val = join (",", @val);    # convert multiple values to scalar          }          else          {             $val = shift (@val);          }          # don't bother with empty fields          next unless defined ($val) && $val ne "";          $assign .= "," if $assign;      # put commas between assignments          $assign .= $col_name . " = ?";  # add column name, placeholder          push (@placeholder, $val);      # save placeholder value      }      $dbh->do ("INSERT INTO $self->{table} SET $assign", undef, @placeholder);  }

insert_record() assumes that field values are stored in the parameter environment and can be accessed by calling param(). It doesn t bother adding any name=value assignment to the INSERT statement for any field that is empty. This is advantageous for two reasons. First, although you could provide a default value for an empty field by looking up the column s default value in the table description, MySQL will set the column to its default automatically if you just omit it from the INSERT statement (and it s more efficient to let MySQL do so than to list the default explicitly). Second, skipping assignments for columns with no corresponding parameter values gives insert_record() an easy way to pass over TIMESTAMP and AUTO_INCREMENT columns. That s generally what you want, because these are rarely represented in entry forms. They are special kinds of columns and it s much more common to let MySQL assign them values automatically.(TIMESTAMP columns end up with the current time and day, and AUTO_INCREMENT columns are assigned the next sequence number.)

A limitation of insert_record() is that it skips not only empty fields, but also fields that don t have the same name as a table column. In forms that collect a date value from users, for example, a common strategy for making dates easier to validate is to represent them as three separate integer-valued fields. Suppose you have a DATE column named entry_date in a table. You might represent it using three fields named day, month, and year. insert_record() would expect to find an entry_date field corresponding to the entry_date column, and would ignore the day, month, and year fields entirely. The result is that your entry_date column wouldn t be assigned any value in the INSERT statement.

Sometimes you can work around this lack of correspondence between table columns and form fields pretty easily, by manipulating the parameter environment before invoking insert_record(). If you have validated the date-part fields and know they form a legal date value, you could synthesize a parameter value for the entry_date column like this:

 param (-name => "entry_date",          -value => sprintf ("%04d-%02d-%02d",                      param ("year"), param ("month"), param ("day")));

If you do this before calling insert_record(), that method will find a field value named entry_date, properly written in ISO format, and will happily add it to the INSERT statement.

You can use this same technique if you have a DATE column that you want to set to the current date when the record is created, but that has no corresponding field in the form. Extract the current date from Perl and install it into the parameter environment like this:

 ($day,$month, $year) = (localtime (time ()))[3..5];  param (-name => "entry_date",          -value => sprintf ("%04d-%02d-%02d", $year + 1900, $month + 1, $day));

In both cases, the idea is the same: insert_record() is limited to using only those values it can find by calling param(), but there s nothing to stop you from putting your own values into the parameter space for insert_record() to discover.

Updating Existing Records

The process for editing an existing record differs somewhat from that for inserting new records. There are a couple of possible approaches. One way to perform record editing is to display a blank entry form and have the user fill in those values that need to be changed. Then when the form is submitted, construct an UPDATE statement that modifies the values only for those columns corresponding to non-empty fields in the form. This is easier for you as the application writer, but it s not very helpful for users:

The user doesn t see the record s current contents and can t tell which fields need to be changed, at least not without consulting some external source of information.
There is no way to delete a column value, because this approach treats an empty field as signifying leave this column alone rather than delete this column value.

A better method is to display the editing form with all the fields filled in from the record s current contents. Then the user can see what value each column contains easily and can tell which ones need to be revised. Also, a column s value can be deleted just by clearing the corresponding field in the editing form. To implement this approach, use the following general procedure:

Retrieve from the database the record that is to be modified.
Generate the editing form, using the contents of the record to initialize the fields. The form should include a hidden field that contains the value of some unique index to allow the record to be identified later when the user submits the form. (If the unique index comprises multiple columns, use multiple hidden fields.)
Display the form to the user.
Validate the form when the user submits it.
Use the form contents to update the record indicated by the value of the unique key specified by the hidden field (or fields). The UPDATE statement should assign a value to every column in the table corresponding to an editable form element.

It s more work to fill in the fields for an editing form, because you must look up the record first. (The effort required to load the record into the form can be reduced by using the load_record() method from the WebDB::TableInfo module.) It s also more work to generate an UPDATE statement than an INSERT statement. For one thing, you need to identify which record to update. (This can be done using a WHERE clause based on the unique key value.) For another, you must assign values to all the other columns, even those that are empty. (A column might have had a non-empty value that the user now wants cleared.) For an empty field, however, you can set the corresponding column to two possible values: the empty string or NULL. Which should you choose?

Here s where it s helpful to have information about the characteristics of the table s columns. It s generally better to use NULL if the table column allows NULL, because that corresponds better than the empty string to the no value case. Consider what happens if you have an INT or DATE column that can be NULL. If you set the column to NULL, that s what gets stored in the record. If you store an empty string, MySQL will perform type conversion and you ll get a 0 for an INT or 0000-00-00 for a DATE ; that s probably not what you want.

Using this information, we can write a method, update_record(), that takes arguments for a database handle and the name of the column (or columns) that make up the unique key. (The key values are those you will have represented using hidden fields.) If coffee.pl had a record-editing capability, you d update a record like this, because id is the unique key column:

 $tbl_info->update_record ($dbh, "id");

The implementation of update_record() is as follows:

 sub update_record  { my ($self, $dbh, @key) = @_;  my ($assign, $where);  my @placeholder;  my (@val, $val);      # Construct assignments for SET clause using all non-empty non-key columns      foreach my $col_name (keys (%{$self->{row}}))      {         next if grep (/^$col_name$/, @key); # skip key columns          my $typename = $self->typename ($col_name);          @val = param ($col_name);           # field value for column          if ($typename eq "set")          {             $val = join (",", @val);    # convert multiple values to scalar          }          else          {             $val = shift (@val);          }          # If the field has no value, determine whether to insert          # undef or an empty string into the column based on whether          # or not the column can take NULL values.          if (!defined ($val) || $val eq "")          {             $val = ($self->has_attr ($col_name, "nullable") ? undef : "");          }          $assign .= "," if $assign;      # put commas between assignments          $assign .= $col_name . " = ?";  # add column name, placeholder          push (@placeholder, $val);      # save placeholder value      }      # Construct WHERE clause using all key columns to identify record.      # All keys are assumed to be scalars and non-NULL.      foreach my $col_name (@key)      {         $val = param ($col_name);          $where .= " AND " if $where;    # put AND between assignments          $where .= $col_name . " = ?";   # add column name, placeholder          push (@placeholder, $val);      # save placeholder value      }      $dbh->do ("UPDATE $self->{table} SET $assign WHERE $where",                  undef, @placeholder);  }

only for RuBoard - do not distribute or recompile