7.4 Rebase: Building Dynamic Web Pages


The simple examples in the previous sections showed how to load and use the CGI.pm module to display a very simple web page and how to examine the error logs of the web server to help debug a CGI program that doesn't display properly.

The real power of CGI comes from its ability to provide dynamic content: web pages that may display different information depending on such factors as when they're called (the date and time in the previous example, for instance). Dynamic content also responds to requests that users enter by typing in text fields, clicking on so-called "radio" buttons, selecting from lists, or using other input methods.

In this section, I'll show you how to use some of the modules from previous chapters, combined with the use of the CGI.pm module, to make an interactive, dynamic web page for displaying restriction maps. In this web page, the user will select which restriction enzyme or enzymes to search for and specify the sequence to search either by entering the sequence data into a text window or by browsing for the file that contains the sequence.

Here is the short CGI program, webrebase1, that accomplishes this. It is short mainly because I've already developed modules for reading sequence files, for accessing the Rebase database, for calculating restriction maps, and for displaying the maps with simple text graphics. I can simply reuse those modules here to accomplish my task:

 #!/usr/bin/perl
 # webrebase1 - a web interface to the Rebase modules

 # To install in web, make a directory to hold your Perl modules in web space
 use lib "/var/www/html/re";

 use Restrictionmap;
 use Rebase;
 use SeqFileIO;
 use CGI qw/:standard/;

 use strict;
 use warnings;

 print header,
     start_html('Restriction Maps on the Web'),
     h1('<font color=orange>Restriction Maps on the Web</font>'),
     hr,
     start_multipart_form,
     '<font color=blue>',
     h3("1) Restriction enzyme(s)?  "),
     textfield('enzyme'), p,
     h3("2) Sequence filename (fasta or raw format):  "),
     filefield(-name=>'fileseq',
         -default=>'starting value',
         -size=>50,
         -maxlength=>200,
     ), p,
     strong(em("or")),
     h3("Type sequence:  "),
     textarea(
         -name=>'typedseq',
         -rows=>10,
         -columns=>60,
         -maxlength=>1000,
     ), p,
     h3("3) Make restriction map:"),
     submit, p,
     '</font>',
     hr,
     end_form;

 if (param()) {

     my $sequence = '';

     # must have exactly one of the two sequence input methods specified
     if(param('typedseq') and param('fileseq')) {
         print "<font color=red>You have given a file AND typed in sequence: do only one!</font>", hr;
         exit;
     }elsif(not param('typedseq') and not param('fileseq')) {
         print "<font color=red>You must give a sequence file OR type in sequence!</font>", hr;
         exit;
     }elsif(param('typedseq')) {
         $sequence = param('typedseq');
     }elsif(param('fileseq')) {
         my $fh = upload('fileseq');
         while (<$fh>) {
             /^\s*>/ and next; # handles fasta file headers
             $sequence .= $_;
         }
     }

     # strip out non-sequence characters
     $sequence =~ s/\s//g;
     $sequence = uc $sequence;

     my $rebase = Rebase->new(
         #omit "bionetfile" attribute to avoid recalculating the DBM file
         dbmfile => 'BIONET',
         mode => '0444',
     );

     my $restrict = Restrictionmap->new(
         enzyme => param('enzyme'),
         rebase => $rebase,
         sequence => $sequence,
         graphictype => 'text',
     );

     print "Your requested enzyme(s): ",em(param('enzyme')),p,
     "<code><pre>\n";

     (my $paramenzyme = param('enzyme')) =~ s/,/ /g;
     foreach my $enzyme (split(" ", $paramenzyme)) {
         print "Locations for $enzyme: ",
         join(' ', $restrict->get_enzyme_map($enzyme)), "\n";
     }
     print "\n\n\n";

     print $restrict->get_graphic,
     "</pre></code>\n",
     hr;
 }

 print end_html;

7.4.1 Installing webrebase1

Installing webrebase1 is almost exactly the same as installing the scripts seen earlier in this chapter, such as cgiex1.cgi. However, because this program depends on several modules, you must make those modules available in a directory that the web server can find. It's possible to configure a web server to look in any directory, so you may prefer to leave your modules in one place and give your web server the necessary permissions to look there. If your code lives in one place, there won't be any problem with out-of-sync duplicate copies.

However, there are security problems associated with letting the world execute programs from your own directories. Also, if you try out a change that doesn't work while you're tinkering with your code, any users on the web site will find the programs broken as well. So it often makes sense to have a development area and a separate production area, where you try to ensure that only working, tested, and secure programs are placed for public consumption.

On my Red Hat Linux system, I created a directory /var/www/html/re and copied the modules Restrictionmap.pm, Rebase.pm, and SeqFileIO.pm there. On your system and web server, you may need to check that the directory and module files exist and that their ownership and permissions are suitable for your web server's configuration.
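One way to check the module directory before trying the program in a browser is a small script like the following sketch. The helper name check_modules is hypothetical and not part of webrebase1; the directory and module names are the ones used in this section. Note that it only reports what the user running the script can read, which may differ from what the web server's own user can read.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# check_modules is a hypothetical helper, not part of webrebase1.
# It returns the names of the modules that are missing or unreadable in $dir.
sub check_modules {
    my ($dir, @modules) = @_;
    my @problems;
    foreach my $module (@modules) {
        my $path = File::Spec->catfile($dir, $module);
        push @problems, $module unless (-f $path && -r $path);
    }
    return @problems;
}

# Directory and module names as used in this section; adjust for your system.
my $libdir  = '/var/www/html/re';
my @missing = check_modules($libdir, qw(Restrictionmap.pm Rebase.pm SeqFileIO.pm));

if (@missing) {
    print "Missing or unreadable in $libdir: @missing\n";
} else {
    print "All modules found in $libdir\n";
}
```

Running the script as the web server's user, where your system allows it, gives a more faithful picture of what the server will see.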

I then copied my CGI program webrebase1 into my CGI directory (on my system, /var/www/cgi-bin), and dealt with the same questions of ownership and file permissions as detailed in earlier sections of this chapter. (As they say down at the dealership, your mileage may vary depending on the operating system and web server that you are using.)

It's possible that the first line that invokes the Perl application, #!/usr/bin/perl, may have to be changed to run from your web space. Sometimes a web server is configured to restrict calling any programs from outside the web space, and a Perl application must be installed into the web space (usually with such extra security precautions as taint checks compiled into the application). If you plan to offer programs to the world from a computer that has sensitive information or is connected to other computers that have sensitive information, such precautions are often desirable. But just to get started, try using the same Perl application you've been using, and it's likely to work.

One consequence of running the Rebase.pm module is that the DBM file called BIONET is created in your web server's CGI directory (/var/www/cgi-bin on my Linux system running an Apache web server). So, another thing to check is whether you have enough space for any files your programs may create in your web space.

7.4.2 Inside webrebase1

webrebase1 first loads the required modules: the ones you've written and the standard CGI.pm module.

The program has two parts. The first part is always executed, and displays the form that asks the user to enter the required information to run the program. The second part executes only when the program is called with parameters set, which happens after the user has filled out the form and hit the Submit Query button.

Let's look at the first part of the code, which creates the form. Everything in this part of the program is one long print statement. The list of things to print is composed mostly of calls to various CGI.pm functions. For details on these functions, take a look at the CGI.pm documentation on www.perldoc.org or by typing perldoc CGI at a command prompt.

Here are the CGI functions called but not seen in the earlier programs:

h1('<font color=orange>Restriction Maps on the Web</font>')

This is a header, as seen previously; however, it includes a color directive for the font.

start_multipart_form

Makes the part of the form that handles file uploading work correctly.

hr

Draws a horizontal line across the screen.

'<font color=blue>'

Makes everything in the form blue, up to the closing directive '</font>'.

h3("1) Restriction enzyme(s)? ")

This header, like similar headers in the form, labels the following textfield so the user knows what information is requested.

textfield('enzyme')

textfield creates a place for the user to type in a line of text. The string the user types in is accessible by means of the parameter named enzyme when the form is submitted. The user can type in the names of enzymes such as EcoRI and HindIII to find more information.

p

This starts a new paragraph in the form.

filefield(-name=>'fileseq', -default=>'starting value', -size=>50, -maxlength=>200,)

filefield provides a way to give the name of a file that contains a sequence. When the form is submitted, that file is uploaded from the user's computer onto your computer where it can be used to find the restriction map. As you can see in Figure 7-5, the user can type in the pathname of a sequence file or use a mouse to interactively browse until the desired sequence file is found.

The -name option names the parameter through which the uploaded file is reached; -size and -maxlength set the width of the displayed field and the maximum length of the filename.

Figure 7-5. Rebase1 in a browser window

strong(em("or"))

This prints the word or in some strong fashion (usually in a bold font, but at the discretion of the user's web browser) and with some emphasis (usually in italics).

textarea(-name=>'typedseq', -rows=>10, -columns=>60, -maxlength=>1000,)

textarea provides a box in which the user can type or use the mouse to cut and paste the sequence directly, as opposed to giving the name of a file that contains the sequence.

submit

This button collects the values the user has given on the form into the named parameters and restarts the program by submitting the form to the web server, this time with the parameters set. webrebase1 has a section that uses the parameters to perform a computation, as you will see shortly.

end_form

This closes the start_multipart_form given earlier.

end_html

This CGI function is called at the end of the webrebase1 program, and it prints the final required HTML tags for the page before sending it back to the user's web browser.

When the user hits the Submit button, the parameters are assigned the values the user has indicated, and the program is called again. This time, after printing the form, the program gets to the conditional block beginning:

 if (param(  )) { 

The param( ) CGI function returns a true value if parameters have been sent to the program, so at this point the block is entered. The block does some error checking, extracts the information from the parameters, computes the restriction map, and displays the results.

The error checking ensures that exactly one of the two sequence inputs, the typed-in sequence or the uploaded file, has been supplied. If both or neither are present, the program prints an error message and exits.
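The same check can be tried outside the web server with a small sketch such as the one below. The helper name sequence_input_error is hypothetical; it takes the two values as plain arguments instead of calling param, and it returns the message instead of printing it and exiting. Like the original, it treats any false value as "not supplied."

```perl
use strict;
use warnings;

# sequence_input_error is a hypothetical helper that mirrors the checks in
# webrebase1. It returns an error message, or undef when exactly one of the
# two sequence inputs was supplied.
sub sequence_input_error {
    my ($typedseq, $fileseq) = @_;
    if ($typedseq and $fileseq) {
        return 'You have given a file AND typed in sequence: do only one!';
    }
    if (not $typedseq and not $fileseq) {
        return 'You must give a sequence file OR type in sequence!';
    }
    return;
}

# Example calls with made-up values: typed only, both, neither
foreach my $case (['ACGT', ''], ['ACGT', 'seq.fasta'], ['', '']) {
    my $error = sequence_input_error(@$case);
    print defined $error ? "Error: $error\n" : "Input OK\n";
}
```

Returning the message rather than exiting makes the check easy to test from the command line, at the cost of one extra line in the caller to print the message.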

Assuming the parameters have been set correctly, the program gets the sequence to be mapped from either the typed-in textbox field:

 }elsif(param('typedseq')) {
     $sequence = param('typedseq');

or from the uploaded file:

 }elsif(param('fileseq')) {
     my $fh = upload('fileseq');
     while (<$fh>) {
         /^\s*>/ and next; # handles fasta file headers
         $sequence .= $_;
     }
 }

As you can see, the uploaded file is provided as an opened filehandle to your webrebase1 program. The while loop assumes that the file is in FASTA format (see the exercises for this chapter), skips the header, and collects the sequence.
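Because the loop needs nothing more than an opened filehandle, it can be exercised without a web server. The following sketch wraps the same loop in a hypothetical subroutine, read_sequence, and feeds it an in-memory filehandle holding made-up FASTA data in place of the uploaded file.

```perl
use strict;
use warnings;

# read_sequence is a hypothetical helper that applies the same loop as
# webrebase1 to any opened filehandle.
sub read_sequence {
    my ($fh) = @_;
    my $sequence = '';
    while (my $line = <$fh>) {
        next if $line =~ /^\s*>/;    # skip FASTA header lines
        $sequence .= $line;
    }
    return $sequence;
}

# An in-memory filehandle stands in for the uploaded file (made-up data).
my $fasta = ">sample sequence\nacgtacgt\nggatcc\n";
open(my $fh, '<', \$fasta) or die "Cannot open in-memory file: $!";
my $sequence = read_sequence($fh);
close $fh;

print $sequence;    # header removed; newlines are still present at this stage
```

Note that every header line is skipped, so a file holding several FASTA records is read as one concatenated sequence.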

After cleaning up the sequence by stripping out whitespace (including newlines) and making it uppercase, the program then calls the Rebase and Restrictionmap modules to calculate the restriction map for the requested enzymes, which are available by means of the param('enzyme') CGI function call.
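The cleanup step is shown below as a small sketch. clean_sequence does what webrebase1 does; unexpected_characters is a hypothetical extra check that webrebase1 does not perform, and its alphabet (A, C, G, T, and N) is an assumption made for illustration, since real sequence data may legitimately use other ambiguity codes.

```perl
use strict;
use warnings;

# clean_sequence repeats the cleanup from webrebase1:
# remove all whitespace, then convert to uppercase.
sub clean_sequence {
    my ($sequence) = @_;
    $sequence =~ s/\s//g;
    return uc $sequence;
}

# unexpected_characters is a hypothetical extra check, not in webrebase1.
# It returns a sorted list of the distinct characters outside A, C, G, T, N
# (an assumed alphabet; widen it if your data uses other ambiguity codes).
sub unexpected_characters {
    my ($sequence) = @_;
    my %seen;
    $seen{$_}++ for $sequence =~ /([^ACGTN])/g;
    my @found = sort keys %seen;
    return @found;
}

my $clean = clean_sequence("acgt acgt\nggatcc\n");
print "$clean\n";

my @odd = unexpected_characters(clean_sequence("acgt12xx"));
print "Unexpected characters: @odd\n" if @odd;
```

A check like this could be used to warn the user about stray characters before the map is calculated, rather than silently passing them on to the mapping code.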

Finally, the program is ready to display the results. As you can see in Figure 7-6, the results appear after the form. First, webrebase1 prints out the names of the requested enzymes:

 print "Your requested enzyme(s): ",em(param('enzyme')),p,

Figure 7-6. Results of the Rebase1 query

Recall that the trailing ,p, is a CGI directive to start a new paragraph; the em( ) CGI function asks the user's web browser to emphasize the text, probably with italics.

The next line is not a CGI function call but a literal HTML directive that ensures the following lines are printed in a fixed-width font. Every character will thus take the same amount of horizontal space, and all lines and spaces will line up just as they do when they're printed to your screen. Without this directive, the web browser could use a proportional font, and the map would not display properly:

 "<code><pre>\n"; 

Of course, as you now know, HTML tags are almost always required to appear in pairs, so this directive has a closing tag after the map is displayed:

 "</pre></code>\n", 

The restriction map for each enzyme, by which I mean the simple list of locations for each enzyme in the sequence, is displayed by the following code:

 (my $paramenzyme = param('enzyme')) =~ s/,/ /g;
 foreach my $enzyme (split(" ", $paramenzyme)) {
     print "Locations for $enzyme: ",
     join(' ', $restrict->get_enzyme_map($enzyme)), "\n";
 }
 print "\n\n\n";

First the space- or comma-separated list of enzymes is collected from the parameter enzyme, and any commas are replaced with spaces. The list is then split on whitespace, and for each enzyme, the Restrictionmap method get_enzyme_map is called to obtain the list of locations, which is printed.
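The parsing of the enzyme list can be seen in isolation in the following sketch. The subroutine name parse_enzyme_list is hypothetical; the two statements inside it are the ones webrebase1 uses, applied to a plain string instead of param('enzyme').

```perl
use strict;
use warnings;

# parse_enzyme_list is a hypothetical helper wrapping the two statements
# webrebase1 uses to turn the user's input into a list of enzyme names.
sub parse_enzyme_list {
    my ($input) = @_;
    (my $spaced = $input) =~ s/,/ /g;    # commas become spaces
    my @enzymes = split(" ", $spaced);   # split on runs of whitespace
    return @enzymes;
}

# Mixed separators and stray spaces, as a user might type them
my @enzymes = parse_enzyme_list(' EcoRI, HindIII,BamHI ');
print join('|', @enzymes), "\n";    # prints EcoRI|HindIII|BamHI
```

Splitting on the string " " rather than on a pattern such as / / is what discards the leading whitespace and collapses runs of spaces, so no empty enzyme names reach get_enzyme_map.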

Finally, the graphic map is displayed. This is accomplished, as before, by a simple call to the Restrictionmap method get_graphic, which returns a simple text version of the graphic (because graphictype => 'text' was specified when the Restrictionmap object $restrict was created):

 print $restrict->get_graphic, 


Mastering Perl for Bioinformatics
ISBN: 0596003072
Year: 2003
Pages: 156
