An Online Document Sales Application | MySQL and Perl for the Web

only for RuBoard - do not distribute or recompile

A few areas of the PseudEcom site are functional now. Customers can visit the home page, and they can select the links that present policy statements or contact information. Let s proceed on to our next application, one that implements our first online shopping scenario. The situation we ll use for this application is that we are in possession of a wealth of wonderful information guaranteed to produce financial prosperity and security and that customers can have access to for a price, of course. This application, doc_shop.pl, delivers the information in the form of PDF documents. It works in several stages:

Initially, doc_shop.pl presents a catalog of product offerings a list of the documents we have for sale. The user chooses the documents that look interesting, and then selects the Check Out button.
The checkout page displays the items that were chosen so that the user can review the order. It also presents a billing information form containing fields for name, email address, and credit card information. (No shipping address is needed because we ll deliver the documents electronically.) The user fills in the billing form and selects Submit Order.
When doc_shop.pl receives the billing information, it charges the user s credit card (don t get excited; we re going to fake this part), and then displays a page containing an order confirmation number and hyperlinks to each of the purchased documents. These links enable the customer to download the documents, although each one will be tied to a password to discourage people from downloading documents without paying for them. We ll also send the customer an email message containing the order information as a more permanent record of the transaction.

What Information Should You Ask for, and When?

doc_shop.pl first determines what the customer wants to order, and then asks for billing information. There s no logical reason you couldn t ask for the billing information first, but there are a couple of practical reasons not to. First, if the visitor doesn t purchase anything, we won t need billing information and will have wasted the visitor s time by requiring it. Second, it s best not to make people identify themselves or provide sensitive information until you really need such information, because people don t like it when you do that. If you require a credit card number before allowing customers into the shopping area, for example, they ll become suspicious of your motives. (If I have to provide a card number, only to find that you don t have the goods in which I m interested, I may assume you re a fraudulent site set up solely for the purpose of capturing numbers. I will never visit you again.)

To avoid alienating your customers, here s a good principle to follow: Provide information to your customers as they request it, but require information from them only as you need it and not before. Think of it this way: You don t want to stop your customers at the door you want them to come in and look around.

doc_shop.pl begins with a preamble that implements its overall logic like this:

 #! /usr/bin/perl -w  # doc_shop.pl - Present document-shopping activity  use strict;  use lib qw(/usr/local/apache/lib/perl);  use CGI qw(:standard escape escapeHTML);  use Mail::Sendmail;  use Business::CreditCard;  use Digest::MD5;  use WebDB;  use WebDB::PseudEcom;  my $dbh = WebDB::connect ();  my $page = h3 ("The PseudEcom Doc Shop");  # Dispatch to proper action based on user selection  my $choice = lc (param ("choice")); # get choice, lowercased

 if ($choice eq "")                  # initial invocation  {     $page .= display_catalog ($dbh);  }  elsif ($choice eq "check out")      # user is ready to check out  {     $page .= display_billing_form ($dbh);  }  elsif ($choice eq "submit order")   # record the order, deliver goods  {     $page .= close_order ($dbh);  }  else  {     $page .= p (escapeHTML ("Logic error, unknown choice: $choice"));  }  $dbh->disconnect ();  print header (),          start_html (-title => "PseudEcom Corporation", -bgcolor => "white"),          add_boilerplate ($page),          end_html ();

That doesn t look very complicated, but of course it s just the tip of the iceberg. The application implementation involves several issues:

Product catalog presentation. For this application, the catalog is small and we can display the entire product line in a single page for the user to browse. Later, we ll build an application that uses a more extensive catalog for which we ll provide a search interface. When you allow browsing or searching over multiple pages, you need some way to remember what items have been ordered so far, and that means a shopping cart. The simpler catalog is convenient as a starting point because we can get away without writing a cart.
Collecting and processing billing information. We ll allow payment by credit card, so we ll need to solicit card information from the customer, and we ll also need to set up a secure connection.
Transactions. Orders will be recorded using two tables. One (doc_order) lists general order and customer information; the other (doc_item) lists each item in the order. It would not be good to update only one table if the update to the other failed for some reason. We want all updates for a given order to succeed as a unit, which means we need to use transactions to make sure that partial updates get rolled back.
Order delivery. doc_shop.pl uses electronic delivery, so we need not determine any shipping charges. Electronic delivery also eliminates any need to keep track of item quantities, at least for this scenario. After all, after a user has downloaded a PDF file, it can be printed multiple times at will. (For other types of downloads, you d probably want to associate quantities with items ordered. If you were selling software titles, you might allow multiple-unit purchases and issue separate serial numbers for each unit of a given title.)

Setting Up the Tables

doc_shop.pl requires three tables. One contains the catalog items, and the other two are used to record orders that we receive from customers. We ll also need to store the PDF files somewhere. We could store them in the database, the same way we managed image data in Chapter 5, Writing Form-Based Applications, and in Chapter 7, Performing Searches. Or we could store just the document names in the database and store the documents in the file system. To demonstrate this alternative, that s the approach doc_shop.pl takes.

The catalog table doc_catalog looks like this:

 CREATE TABLE doc_catalog  (     # document ID, title, price, and filename      doc_id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,      title       VARCHAR(60) NOT NULL,      price       DECIMAL(10,2) NOT NULL,      filename    VARCHAR(40) NOT NULL  )

doc_id is the unique document identifier. title provides a descriptive title that we ll display for customers in the catalog page to indicate what the document is about. price is the selling price. (Its type, DECIMAL(10,2), provides 10 digit values with 2 decimal places of accuracy well suited to monetary values.) filename indicates where to find the PDF document. We ll store all the documents in a single library directory, so this value need only be the basename of the file within that directory. An INSERT statement to create some sample records in the doc_catalog table looks like this:

 INSERT INTO doc_catalog (title,price,filename)      VALUES          ('Make Money Fast!',20.00,'fast.pdf'),          ('Double Your Money in 5 Minutes',15.00,'double5.pdf'),          ('The Secret of Financial Success',19.95,'secret.pdf'),          ('How To Be Tax Free',25.00,'tax-free.pdf')

That s only part of the job of initializing the catalog, because we also must install the PDF documents that the catalog refers to. For doc_shop.pl,we ll use the same library directory where we store the static HTML pages that are processed by static.pl (that is, /usr/local/apache/lib/htfiles). Sample documents corresponding to those named in the INSERT statement just shown can be found in the ecom/htfiles directory of the webdb distribution; move into that directory and copy the PDF files to the library directory where doc_shop.pl expects to find them. If you want to install the files in a different directory, change the path name near the beginning of doc_shop.pl. (You ll also need to change another script, serve_doc.pl, to match.)

If you want to add other documents to the catalog, add a record for each one to the doc_catalog table and copy the corresponding PDF file to the library directory. Similarly, to remove a document, delete the doc_catalog record and remove the file. Compared to the technique of storing documents in the database, this is more cumbersome because each action involves a database operation and a file system operation. On the other hand, storing the documents in the file system does have some advantages. To replace a document, for example, you can just flop a new one on top of it. Thus, each approach has its own pros and cons.

Note that although we stored images in the database in earlier chapters and PDF documents in the file system here, we could just as well have done things the other way around. There is no inherent property of images, documents, or, indeed, any other kind of data that constrains you to using one method or the other.

To store orders received from customers, doc_shop.pl uses two tables:

 CREATE TABLE doc_order  (     order_id    BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,      order_date  DATE NOT NULL,          # date of order      order_amt   DECIMAL(10,2) NOT NULL, # total cost      auth_id     VARCHAR(10) NOT NULL,   # transaction authorization      cust_name   VARCHAR(40) NOT NULL,   # customer name      cust_email  VARCHAR(60) NOT NULL    # customer email address  ) TYPE = BDB  CREATE TABLE doc_item  (     order_id    BIGINT UNSIGNED NOT NULL,  # order number      doc_id      BIGINT UNSIGNED NOT NULL,  # document number      price       DECIMAL(10,2) NOT NULL,    # price paid      password    CHAR(32) NOT NULL,         # item password (for downloading)      UNIQUE (order_id, doc_id)  ) TYPE = BDB

doc_order contains general information about each order; there will be one record in this table per order. doc_item will contain one row per order for each document purchased. The order_id and item_id columns indicate the order the record belongs to and which document was ordered. The price column shows how much the customer paid for it. (We record the price here in the doc_item table because otherwise we wouldn t be able to determine how much customers paid for each document if we happened to change the prices in the doc_catalog table.) The password column contains a value that the customer must present to download the document after paying for it. The column is a CHAR(32) because we ll generate passwords that are 32-character strings. (The passwords will be created using the Digest::MD5 module, which is why the preamble of doc_shop.pl includes a use Digest::MD5 statement.)

The CREATETABLE statements for the order storage tables include something we haven t used before a TYPE clause at the end. This indicates that the doc_order and doc_item tables should be created using a specific table type, rather than whatever the default type happens to be. BDB is a type that supports transactions, which we ll require later when updating both order tables at the same time. (I m using BDB tables here because, although other kinds of transaction-capable tables may be available, BDB tables were the first such to be included in MySQL, and therefore the most likely to be present in your version. To use a different transactional table type, substitute an appropriate clause for the TYPE = BDB part of the CREATETABLE statements. If your version of MySQL doesn t have transaction support, you should remove TYPE = BDB from the statements entirely.)

Presenting the Document Catalog

The first thing that doc_shop.pl does for the customer is present the list of documents we have for sale. The display_catalog() function handles this. It selects the items from the doc_catalog table and presents them as a table that shows the document ID, description, and price. It also displays a check box next to each item that the customer can use to select documents.

 sub display_catalog  { my $dbh = shift;  my ($sth, @row, $page, $url);      push (@row, Tr (             th ("Select"),              th ("Item No."),              th ("Document Title"),              th ("Price (USD)")          ));      $sth = $dbh->prepare ("SELECT * FROM doc_catalog ORDER BY doc_id");      $sth->execute ();      while (my $ref = $sth->fetchrow_hashref ())      {         my $item_no = "item_" .. $ref->{doc_id};          push (@row, Tr (                 # explicitly give checkbox no label                  td (checkbox (-name => $item_no, -label => "")),                  td (escapeHTML ($ref->{doc_id})),                  td (escapeHTML ($ref->{title})),                  td ({-align => "right"},                      escapeHTML (sprintf ("%.2f", $ref->{price})))                  ));      }      $sth->finish ();

     # present a secure link for the Check Out button      ($url = url ()) =~ s/^http:/https:/i;      $page .= p ("Please select the documents you wish to purchase,\n"                  . "then select Check Out.")              . start_form (-action => $url)              . table (@row)              . submit (-name => "choice", -value => "Check Out")              . end_form ();      return ($page);  }

The check boxes presented in the catalog page are individual fields. They do not form a group; each one corresponds to a specific document. The check boxes are given names of the form item_n , where n is the doc_id value for the associated document. This makes it easy to identify the relevant parameters when the customer submits the form. Note that the check box labels are explicitly set to the empty string. If we don t specify any labels at all, the check box names will be displayed as the labels, which is not what we want.

The document selection form includes an action parameter that begins with https so that when the customer submits the form, a secure connection will be set up for collecting the billing information in the next stage. This way, the customer can see that we re going to protect any information that we ask for. (If you don t have your server set up for secure connections yet, this https link won t work. For testing, you can modify the script to use the unaltered value of url(), but that s not a good idea for production applications.)

Collecting Billing Information

After the user indicates which document or documents to purchase and selects the Check Out button on the catalog page, doc_shop.pl proceeds to the checkout stage, implemented using the display_billing_form() function. This routine first determines which items were chosen by the customer. If nothing was selected, display_billing_form() says so and redisplays the catalog page. Otherwise, it presents a table showing the selected items and the total order cost, along with a form for collecting billing information. This form also contains the item IDs as hidden fields so that we ll be able to determine which documents to charge the user for when the billing form is submitted:

 sub display_billing_form  { my $dbh = shift;  my (@id, $order_info, $page);      # Find field names of the form item_n and extract n      # from each to determine which items were selected.  @id = grep (s/^item_//, param ());  if (!@id)  {     $page .= p ("You didn't choose any items.");      $page .= display_catalog ($dbh);      return ($page);  }  # fetch information for the selected documents from the catalog and  # format it into a table for display to the customer  $order_info = fetch_catalog_items ($dbh, @id);  $page .= p ("You chose these item(s):");  $page .= format_items_html ($order_info);  $page .= p ("Please provide your name, email address, and\n"              . "credit card information, then select Submit Order.");  $page .= start_form (-action => url ());  # add item IDs as hidden fields  foreach my $id (@id)  { my $item_no = "item_" .. $id;      $page .= hidden (-name => $item_no, -value => $id, -override => 1);  }  $page .= table (             Tr (                 td ("Your name (as listed on card):"),                  td (textfield (-name => "cust_name", -size => "40"))              ),              Tr (                 td ("Email address:"),                  td (textfield (-name => "cust_email", -size => "60"))              ),              Tr (                 td ("Credit card type:"),                  td (popup_menu (-name => "card_type",                                  -values => get_card_type_list ()))              ),              Tr (                 td ("Credit card number:"),                  td (textfield (-name => "card_number", -size => "30"))              ),              Tr (                 td ("Expiration date (month, year):"),                  td (popup_menu (-name => "card_exp_month",                                      -values => get_card_exp_month_list ())                          . popup_menu (-name => "card_exp_year",                                      -values => get_card_exp_year_list ()))                  )              )              . br ()              . submit (-name => "choice", -value => "Submit Order")              . end_form ();      return ($page);  }

To generate the first part of the page that displays the items that the customer selected, display_billing_form() needs to extract the item ID values from the order form the user submitted (the form generated by display_catalog()). These IDs were encoded using field names such as item_14 for document 14, so we can get the item numbers by looking for fields with names of that form and stripping off the initial item_ prefix:

 @id = grep (s/^item_//, param ());

After we know the item numbers, we can retrieve information for them from the doc_catalog table to use in formatting the order information. This information is obtained by calling fetch_catalog_items(), which takes a database handle and a list of document IDs. It returns a structure containing an array of item records and the total cost of the items:

 sub fetch_catalog_items  { my ($dbh, @id) = @_;  my ($sth, $info);      return undef unless @id;    # no ID values were passed      $info = {};      $info->{item} = []; # array for item hashrefs      $sth = $dbh->prepare (                 "SELECT * FROM doc_catalog WHERE doc_id IN("                  . join (",", ("?") x @id)                  . ") ORDER BY doc_id");      $sth->execute (@id);      $info->{total_price} = 0;      while (my $ref = $sth->fetchrow_hashref ())      {         push (@{$info->{item}}, $ref);          $info->{total_price} += $ref->{price};      }      $sth->finish ();      return undef unless @{$info->{item}};   # none of the IDs were valid      return ($info);  }

The fetch_catalog_items() routine uses MySQL s IN() function to avoid looking up records one at a time. For example, if the user has selected items 3, 14, and 22, the query to retrieve them looks like this:

 SELECT * FROM doc_catalog WHERE doc_id IN(3,14,22) ORDER BY doc_id

IN() takes a variable number of arguments, which is convenient because we might be looking for variable numbers of documents. However, we can t use a fixed query string because the string must contain one placeholder per document. (Placeholders represent single data values, so we can t use one placeholder to represent a list of values.) For example, to retrieve three records, we need a query string that looks like this:

 SELECT * FROM doc_catalog WHERE doc_id IN(?,?,?) ORDER BY doc_id

To create the proper query for the general case, we need to know how many IDs there are. That s available as the value of @id, which in a scalar context returns the number of elements in the array. Then we can use the value to construct a comma-separated list of placeholder characters like this:

 join (",", ("?") x @id)

The x n operator, when applied to a list, concatenates the list n times to produce another list, and join produces a string consisting of the list elements separated by commas. The resulting list of placeholders can be placed into the IN() call to construct a query string that expects the proper number of ID values, which we supply by passing @id to execute() after preparing the query.

After display_billing_form() calls fetch_catalog_items() to get the item information, it invokes format_order_html() to format that information for display to the customer:

 sub format_items_html  { my $order_info = shift;  my @row;      push (@row, Tr (             th ("Item No."),              th ("Document Title"),              th ("Price (USD)")          ));      foreach my $ref (@{$order_info->{item}})      {         push (@row, Tr (                 td (escapeHTML ($ref->{doc_id})),                  td (escapeHTML ($ref->{title})),                  td ({-align => "right"},                      escapeHTML (sprintf ("%.2f", $ref->{price})))                  ));      }      push (@row, Tr (             td (""),              td ("Total Price"),              td ({-align => "right"},                  escapeHTML (sprintf ("%.2f", $order_info->{total_price})))          ));      return (table (@row));  }

Following the formatted order information, display_billing_form() presents a form to solicit credit card information. The form includes several fields that are constructed using helper routines from the WebDB::PseudEcom module. These functions return array references pointing to the proper lists of values for the pop-up menus used to specify the credit card type and expiration month and year:

 sub get_card_type_list  {     return ([ "Visa", "MasterCard", "American Express", "Discover" ]);  }  sub get_card_exp_month_list  {     return ([ "01", "02", "03", "04", "05", "06",              "07", "08", "09", "10", "11", "12" ]);  }  sub get_card_exp_year_list  { my $year = (localtime (time ()))[5] + 1900;      # list 10 successive years, beginning with current year      return ([ map {{ $year + $_ } (0 .. 9) ]);  }

Closing the Order and Delivering the Documents

After the customer provides the credit card information and submits the billing information form, we have everything we need to close the order and to provide the documents that the customer selected. For doc_shop.pl, this process begins by extracting and checking the contents of the billing form. If the information looks reasonable, we issue an authorization request to charge the customer s credit card, store the order in the database, and provide the customer with a final copy of the order (both in a browser window and by email):

 sub close_order  { my $dbh = shift;  my ($billing_info, $order_info, $card_info, $auth_id);  my (@id, @error);  my $page;      ($billing_info, @error) = get_billing_info ();      # determine which items were ordered by checking the hidden      # "item_n" fields,, then retrieve the corresponding catalog entries      @id = grep (s/^item_//, param ());      if (!@id)      {         push (@error, "No items are present in the order");      }      else      {         $order_info = fetch_catalog_items ($dbh, @id)              or push (@error, "No valid items are present in the order");      }      # If validation errors occurred, say what they were and redisplay      # billing information page so user can correct them      if (@error)      {         @error = map { escapeHTML ($_) } @error;          $page .= p ("Please correct the following problem(s)")                  . ul (li (\@error));          $page .= display_billing_form ($dbh);          return ($page);      }      # Package credit card information into a hash      # for use in the authorization request      $card_info = {};      $card_info->{card_type} = $billing_info->{card_type};      $card_info->{card_number} = $billing_info->{card_number};      $card_info->{card_exp_month} = $billing_info->{card_exp_month};      $card_info->{card_exp_year} = $billing_info->{card_exp_year};      $card_info->{cust_name} = $billing_info->{cust_name};      $card_info->{total_price} = $order_info->{total_price};      $order_info->{auth_id} = issue_authorization_request ($card_info);      if (!$order_info->{auth_id})      {         $page .= p ("Sorry, credit card authorization failed.\n"                      . "Please make sure your credit card information\n"                      . "in the form below is correct.");          $page .= display_billing_form ($dbh); # redisplay page          return ($page);      }      # Once we have the authorization ID, we'll work with that;      # the credit card number is no longer needed.      delete ($card_info->{card_number});      # Store the order in the database (after loading into $order_info      # any remaining billing information that needs to be stored).      # Perform all statements involved in the database update as a      # transaction.      $order_info->{cust_name} = $billing_info->{cust_name};      $order_info->{cust_email} = $billing_info->{cust_email};      $dbh->{AutoCommit} = 0;     # disable auto-commit mode      eval      {         store_order ($dbh, $order_info) and $dbh->commit ();      };      if ($@)                     # transaction failed, abort and notify user      {         $dbh->rollback ();          $dbh->disconnect ();          fatal_error ("Sorry, could not store order in database");      }      $dbh->{AutoCommit} = 1;     # restore auto-commit mode      # Generate the confirmation page and email message      $page .= gen_confirmation_page ($order_info);      $page .= send_confirmation_email ($order_info);      return ($page);  }

Validation of the billing form is handled by get_billing_info(), which returns two kinds of information: a reference to a hash containing the various billing form fields, and an error message array. The array will be non-empty if any errors occurred:

 sub get_billing_info  { my $billing_info = {};  my @error;      $billing_info->{cust_name} = WebDB::trim (param ("cust_name"));      $billing_info->{cust_email} = WebDB::trim (param ("cust_email"));      $billing_info->{card_type} = WebDB::trim (param ("card_type"));      $billing_info->{card_number} = WebDB::trim (param ("card_number"));      $billing_info->{card_exp_month} = WebDB::trim (param ("card_exp_month"));      $billing_info->{card_exp_year} = WebDB::trim (param ("card_exp_year"));      $billing_info->{cust_name} ne ""          or push (@error, "Name must be filled in");      WebDB::looks_like_email ($billing_info->{cust_email})          or push (@error, "You must supply a valid email address");      # card type and expiration values must be in corresponding lists      # of legal values      grep (/^$billing_info->{card_type}$/, @{get_card_type_list ()})          or push (@error, "The credit card type is invalid");      grep (/^$billing_info->{card_exp_month}$/, @{get_card_exp_month_list ()})          or push (@error, "The credit card expiration month is invalid");      grep (/^$billing_info->{card_exp_year}$/, @{get_card_exp_year_list ()})          or push (@error, "The credit card expiration year is invalid");      # get rid of spaces, dashes, etc. from card number      $billing_info->{card_number} =              WebDB::strip_non_digits ($billing_info->{card_number});      validate ($billing_info->{card_number})          or push (@error, "The credit card number is invalid");      return ($billing_info, @error)  }

Most of the validation performed by get_billing_info() is fairly routine. The customer name must be non-blank and the email address must look like a valid address. The credit card type and expiration date values are checked to make sure that they re present in the value lists that were used to generate the fields in the first place. The expiration year value returned from the card_exp_year field will be four digits long. If your authorization service wants two digits only, modify the billing form or use the following expression to modify the value returned from it. The modulo operation (% 100) takes the two least significant digits and the sprintf() call makes sure there is a leading zero if the result is less than 10:

 $card_exp_year = sprintf ("%02d", $card_exp_year % 100);

To check the credit card number, we call validate(), which is a function from the Business::CreditCard module. It performs a checksum test on the number to make sure it s not bogus or mistyped.^[4] Before checking the number, we strip it of any non-digit characters. (The validate() function allows non-digits, but your authorization service may not.)

^[4] This test is described further in Credit Card Processing. If you need a number to use for testing the application, try 1000 0000 0000 0008.

After close_order() invokes get_billing_info(), it extracts the list of document ID numbers to determine which documents were selected. (These were placed in the form as hidden values by the display_billing_form() function.) Including the IDs as unprotected values in the form may seem to be insecure. Can the customer hack the form? Yes. Will that accomplish anything? No. When the billing information is submitted, we ll look up the prices of the IDs listed in the form s hidden fields. If the customer changed the ID list, all that will happen is that the bill will be generated based on the modified list. It certainly won t trick us into giving away any free merchandise. Try it yourself by hacking the form to see what happens. (Of course, if you re not convinced by this reasoning, there s nothing to stop you from protecting the list using checksum and encryption techniques such as were discussed in Chapter 9, Security and Privacy Issues. Far be it from me to discourage paranoia!)

The final section of close_order() calls several other functions to authorize the credit transaction, store the order, and send information to the customer. These routines are described in the next few sections.

Getting a Credit Authorization

The part of close_order() that issues the authorization request looks official in that it sets up information to be sent to the authorization service, issues the request, and takes action according to the result of the request:

 # Package credit card information into a hash  # for use in the authorization request  $card_info = {};  $card_info->{card_type} = $billing_info->{card_type};  $card_info->{card_number} = $billing_info->{card_number};  $card_info->{card_exp_month} = $billing_info->{card_exp_month};  $card_info->{card_exp_year} = $billing_info->{card_exp_year};  $card_info->{cust_name} = $billing_info->{cust_name};  $card_info->{total_price} = $order_info->{total_price};  $order_info->{auth_id} = issue_authorization_request ($card_info);  if (!$order_info->{auth_id})  {     $page .= p ("Sorry, credit card authorization failed.\n"                  . "Please make sure your credit card information\n"                  . "in the form below is correct.");      $page .= display_billing_form ($dbh);   # redisplay page      return ($page);  }

 # Once we have the authorization ID, we'll work with that;  # the credit card number is no longer needed.  delete ($card_info->{card_number});

But the version of issue_authorization_request() in doc_shop.pl is really nothing more than a fake stub routine that ignores its argument and returns the same transaction ID each time it s invoked:

 sub issue_authorization_request  { my $card_info = shift;      return ("123456");  }

The specific details involved in handling these requests will depend on your authorization service, so you d have to replace issue_authorization_request() with a routine that conforms to your service s conventions. See Collecting Payment Information later in the chapter for an overview of this process.

Performing Transactions

If the credit authorization request succeeds, we can store the order in the database. However, recall that doc_shop.pl is going to use two tables to store order information (doc_order and doc_item). We want them to be updated together, but it s possible that the updates will succeed for one table and fail for the other. This would leave the tables in an inconsistent state. The way to handle this is to use a transaction a set of statements that succeed or fail as a unit.

MySQL and DBI Transaction Syntax

To execute a group of statements as a transaction, you must manipulate MySQL s statement committal mode. By default, MySQL runs in auto-commit mode: changes produced by individual statements are committed to the database immediately as soon as they execute. (In other words, it s as if a COMMIT statement implicitly follows each statement.) DBI also runs in auto-commit mode by default.

MySQL s commit mode is affected by the following two SQL statements:

 SET AUTOCOMMIT = 0            # disable auto-commit  SET AUTOCOMMIT = 1            # enable auto-commit

While auto-commit is disabled, you begin a transaction with BEGIN and end it with COMMIT, like this:

 BEGIN  INSERT INTO table1...  INSERT INTO table2...  COMMIT

If an error occurs during the transaction, you can roll back the effect of all its constituent statements by issuing a ROLLBACK statement.

Unfortunately, the preceding syntax isn t necessarily portable to other transaction-capable databases, which may use somewhat different SQL statements from what MySQL uses. To deal with this problem, DBI provides a transaction abstraction for better portability. To turn auto-commit mode on or off from within a DBI script, set the AutoCommit attribute of your database handle:

 $dbh->{AutoCommit} = 0;       # disable auto-commit  $dbh->{AutoCommit} = 1;       # enable auto-commit

To execute a multiple-statement transaction with auto-commit disabled, issue the statements, and then invoke commit() to commit the statements as a group:

 $dbh->do ("INSERT INTO table1...");  $dbh->do ("INSERT INTO table2...");  $dbh->commit ();

(Another way to end a successful transaction is to set the AutoCommit attribute true again; this causes commit() to be called automatically for any pending transaction.) If an error occurs during the transaction, invoke the rollback() method to abort it.

The effect of closing a database handle while a transaction is pending is undefined in DBI, so be sure to commit transactions yourself before disconnecting from the database server.

Requirements for Transaction Support

Transaction support in DBI requires transaction capabilities in the underlying database. For MySQL, transactions appeared in version 3.23.17. However, the 3.23.xx series wasn t declared stable until version 3.23.28, so it s best to use a release at least that recent.

You also need to use table types that support transactions, and not all types do (for example, ISAM and MyISAM tables do not). The CREATE TABLE statements that were shown earlier set up the doc_order and doc_item table as BDB tables, one of the applicable types. BDB (Berkeley DB) tables are based on work contributed by Sleepycat. Depending on your version of MySQL, other transaction-capable table types may be available to you as well, such as InnoDB tables, contributed by InnoBase, and Gemini tables, contributed by NuSphere. (The Gemini work is in beta at the moment, but should be ready by the time you read this.) Consult the MySQL manual to see what table types are available for performing transactions in current releases.

Finally, you ll need a version of DBD::mysql (the MySQL-specific DBI driver) at least as recent as 1.2216 if you want to use the DBI transaction abstraction. Older versions of the driver cannot handle transactions using the standard DBI mechanism (setting the AutoCommit attribute has no effect). In this case, to control transactional behavior, you ll have to issue SET AUTOCOMMIT, BEGIN, COMMIT, and ROLLBACK statements directly.

Storing Document Orders Using Transactions

The store_order() routine takes the order information and stores it in the doc_order and doc_item tables. This routine simply issues the requisite INSERT statements necessary to store an order, blissfully unaware of anything having to do with transactions. The real work of setting up transactional behavior takes place in close_order(), from which store_order() is called:

 $dbh->{AutoCommit} = 0;     # disable auto-commit mode  eval  {     store_order ($dbh, $order_info) and $dbh->commit ();  };  if ($@)                     # transaction failed, abort and notify user  {     $dbh->rollback ();      $dbh->disconnect ();      fatal_error ("Sorry, could not store order in database");  }  $dbh->{AutoCommit} = 1;     # restore auto-commit mode

This section of code first disables auto-commit behavior. Then it calls store_order() within an eval block so that errors can be trapped. If store_order() executes successfully, the SQL statements issued by it are committed using commit(). If store_order() fails, an exception is raised and the error is returned in the $@ variable. In that case, we call rollback() to cancel the transaction and inform the user of the problem.

This code is written based on the assumption that DBI catches errors automatically and triggers an exception if one occurs. Otherwise, the commit() call will execute even if a statement fails which of course defeats the purpose of using a transaction in the first place! You can ensure the proper behavior by disabling PrintError and enabling RaiseError.

If your version of MySQL doesn t have transaction support and you can t upgrade, you can change the preceding section of code by substituting the following statements instead:

 if (!store_order ($dbh, $order_info))  {     $dbh->disconnect ();      fatal_error ("Sorry, could not store order in database");  }

However, if you do this and failure occurs in the middle of a transaction, you may end up with doc_order records that have no corresponding doc_item records. You can use the following query to test for this condition:

 SELECT doc_order.*  FROM doc_order LEFT JOIN doc_item USING (order_id)  WHERE doc_item.order_id IS NULL

If the query produces any rows, they represent failed attempts to store order information that didn t succeed in updating both tables.

The function that actually stores rows into the two order tables, store_order(), looks like this:

 sub store_order  { my ($dbh, $order_info) = @_;      # add one record to the doc_order table      $dbh->do ("INSERT INTO doc_order                  (order_date,order_amt,auth_id,cust_name,cust_email)                  VALUES(CURRENT_DATE,?,?,?,?)",                  undef,                  $order_info->{total_price},                  $order_info->{auth_id},                  $order_info->{cust_name},                  $order_info->{cust_email});      # order_id is the AUTO_INCREMENT value from the preceding INSERT statement      $order_info->{order_id} = $dbh->{mysql_insertid};      # add one record to the doc_item table for each document ordered; use      # the order_id value to tie the records to the corresponding doc_order      # record      foreach my $ref (@{$order_info->{item}})      {         # generate password for the document, store it in the item record,          # and use it to construct the URL for downloading the document          my $password = get_rand_pass ();          $ref->{url} = get_url_prefix () # need full URL path here!                          . sprintf ("serve_doc.pl?order=%s;doc=%s;password=%s",                                          escape ($order_info->{order_id}),                                          escape ($ref->{doc_id}),                                          escape ($password));          $dbh->do ("INSERT INTO doc_item                      (order_id,doc_id,price,password)                      VALUES(?,?,?,?)",                          undef,                          $order_info->{order_id},                          $ref->{doc_id},                          $ref->{price},                          $password);      }  }

As it stores the doc_item record for each document, store_order() creates the password that must be supplied by the customer to obtain the document. These passwords are random 32-character MD5-based strings. The gen_rand_pass() routine that creates passwords uses a random number, the current time, and the current process ID as sources of randomness:^[5]

^[5] The current process ID by itself is not necessarily unique if you re running in a mod_perl environment because a given script may be executed several times by the same httpd process.

 sub get_rand_pass  { my $md = Digest::MD5->new ();      $md->add (rand ());             # add random number      $md->add (localtime (time ())); # add current time      $md->add ($$);                  # add current process ID      return ($md->hexdigest ());  }

store_order() also generates the URL to be used for downloading each document and stores it in the order information structure. These URLs are used later when we tell the customer how to obtain the documents. (The serve_doc.pl script named in the URLs is described in the next section.) The URLs are absolute rather than relative. Although relative links would work for a browser page, the URLs will also be used in the mail message we send to the customer. Absolute URLs are needed in that context.

Providing the Documents for Download

We ve finally arrived at the part the customer is interested in: getting the documents. We provide these by presenting a browser page containing a link for each document the customer ordered. We ll also email a copy of the links because the customer might accidentally close the browser window thus causing the links to disappear!

close_order() invokes gen_confirmation_page() to generate the final order information in HTML format for browser display and send_confirmation_email() to create and send the corresponding email message in plain text format:

 sub gen_confirmation_page  { my $order_info = shift;  my $page;      $page .= p ("Thank you for ordering from PseudEcom.")              . p (escapeHTML ("Order number: $order_info->{order_id}"))              . p (escapeHTML ("Order authorization ID: $order_info->{auth_id}"))              . format_items_html ($order_info)              . p ("To download your documents, use the following links:");      foreach my $ref (@{$order_info->{item}})      {

         $page .= p ("Document $ref->{doc_id}: "                      . a ({-href => $ref->{url}}, escapeHTML ($ref->{title})));      }      $page .= p ("A copy of this information also will be mailed to you at\n"                  . escapeHTML ($order_info->{cust_email}) . ".");      return ($page);  }  sub send_confirmation_email  { my $order_info = shift;  my $formatted_items = format_items_text ($order_info);  my %mail = (         From    => "black-hole\@localhost", # YOU SHOULD CHANGE THIS!          To      => $order_info->{cust_email},          Subject => "Your PseudEcom document order",          Message => ""  );  my $page;          $mail{Message} = <<EOF;  Thank you for ordering from PseudEcom.  Order number: $order_info->{order_id}  Order authorization ID: $order_info->{auth_id}  $formatted_items  To download your documents, use the following links:  EOF      foreach my $ref (@{$order_info->{item}})      {         $mail{Message} .= "Document $ref->{doc_id} ($ref->{title}):\n"                          . "$ref->{url}\n";      }      sendmail (%mail)          or $page .= p (escapeHTML ("Oops, failure sending mail to $mail{To}"));      return (defined ($page) ? $page : "");  }

The format_order_html() function used by gen_confirmation_page() has already been shown. format_order_text(), used by send_confirmation_email, is much the same, so I won t show it here.

Each URL sent to the customer to be used for downloading a document includes the order and document numbers and the document-specific password:

 http://.../serve_doc.pl?order=order_id;doc=doc_id;password=xxx

Aside from some code to check the document password, the serve_doc.pl script that handles document downloading is fairly similar in principle to serve_image.pl,a script written for transferring images in Chapter 5. The only real difference is that we generate a different set of headers to send prior to the request data. These headers are

A Content-Type: header with a value of application/pdf.
A Content-Length: header; the value is the size of the file containing the PDF document.
A Content-Disposition: header to let the browser know the document filename. (Whether the browser actually saves the file or just displays it is subject to client user preferences that are beyond our control.)

 serve_doc.pl looks like this:     #! /usr/bin/perl -w     # serve_doc.pl - Serve a PDF file from the document shop     # The document should be requested using a URL of the form:     # .../serve_doc.pl?order=order_num;doc=doc_num;password=xxx     # order_num and doc_num are the order and document ID values, and     # xxx is the document password.     use strict;     use lib qw(/usr/local/apache/lib/perl);     use IO::File;     use CGI qw(:standard);     use WebDB;     use WebDB::PseudEcom;     # Change this pathname as necessary for your site     my $pdf_lib_dir = "/usr/local/apache/lib/htfiles";     my $order_id = param ("order");     my $doc_id = param ("doc");     my $password = param ("password");     (defined ($order_id) && defined ($doc_id) && defined ($password))         or fatal_error ("Invocation error: required parameters are missing");     my $dbh = WebDB::connect ();     # Make sure customer ordered this document and knows the password     $dbh->selectrow_array (                        "SELECT doc_id FROM doc_item                         WHERE order_id = ? AND doc_id = ? AND password = ?",                         undef, $order_id, $doc_id, $password)      or fatal_error ("Document password mismatch");  # Look up document's file name  my $file_name = $dbh->selectrow_array (                     "SELECT filename FROM doc_catalog WHERE doc_id = ?",                      undef, $doc_id);  fatal_error ("No file with ID $doc_id found") unless defined ($file_name);  $dbh->disconnect ();  # Open the document and transfer it as a PDF download.  # Print the request headers, then the file's content.  my $path_name = $pdf_lib_dir . "/" .. $file_name;  my $fh = new IO::File ($path_name)      or fatal_error ("Could not open $file_name");  print header (-type => "application/pdf",                  -Content_Length => (-s $fh),     # size of file                  -Content_Disposition => "attachment; filename = $file_name");  my $data;  print $data while read ($fh, $data, 1024);  undef $fh;          # close file  exit (0);

PDF Documents Need No Conversion

PDF is a platform-neutral format, so we can transfer PDF documents as is with no interpretation or conversion to a different format. I m pointing this out to make a plea: Don t take platform-neutral documents and convert them to platform-specific form. For example, PDF, Word, and Excel documents all can be opened perfectly well in native format on either Mac OS or Windows machines. But here in Wisconsin, some state government agencies do a strange thing with such documents: They package them inside executable .exe installer programs that can be run only under Windows. It s insane to turn a multiple-platform document into something that can be opened only on a single platform.

Suggested Modifications

Add a Revise Order button to the checkout page so that the user can return to the initial document selection page. This handles cases where the user notices upon reaching the checkout page that the order isn t correct.

serve_doc.pl isn t particularly helpful to the customer if an error occurs. Add some text to any error pages it prints to provide the customer with a URL or other contact information for reaching the customer service department. Alternatively, add an option so users can avoid download operations altogether by requesting that documents be delivered as email attachments.

The email address that we collect in the billing form is needed for sending order confirmation messages, but could be useful in other ways. Add a check box to the form that enables customers to indicate whether they d like to receive notices when new documents are available. Then use the doc_order table to create a mailing list so that you can issue such notices.

only for RuBoard - do not distribute or recompile