Ruby and CGI Programming

	Ruby Way By Hal Fulton

Anyone familiar with Web programming has at least heard of CGI (Common Gateway Interface). CGI was created in the early days of the Web to enable programmatically implemented sites and to allow for more interaction between the end user and the Web server. Although countless replacement technologies have been introduced since its inception, CGI is still alive and well in the world of Web programming. Much of CGI's success and longevity can be attributed to its simplicity. Because of this simplicity, it is quite easy to implement CGI programs in any language. The CGI standard specifies how a Web server process will pass data between itself and its children. Most of this interaction occurs through standard environment variables and streams in the implementation operating system.

CGI programming, and HTTP for that matter, are based around a "stateless" request and response mechanism. Generally, a single TCP connection is made, and the client (usually a Web browser) initiates conversation with a single HTTP command. The two most commonly used commands in the protocol are GET and POST. (We'll get to the meaning of these shortly.) After the client issues the command, the Web server responds and closes its output stream.

The following code sample, only slightly more advanced than the standard "Hello world," shows how to do input and output via CGI.

    def parse_query_string
      inputs = Hash.new
      raw = ENV['QUERY_STRING']
      raw.split("&").each do |pair|
        name, value = pair.split("=")
        inputs[name] = value
      end
      inputs
    end

    inputs = parse_query_string
    print "Content-type: text/html\r\n\r\n"
    print "<HTML><BODY>"
    print "<B><I>Hello</I>, #{inputs['name']}!</B>"
    print "</BODY></HTML>"

Accessing the URL (for example) http://mywebserver/cgi-bin/hello.cgi?name=Dali would produce the output "Hello, Dali!" in your Web browser.

As we previously mentioned, there are two main ways to access a URL: the HTTP GET and POST methods. For the sake of brevity, we offer extremely simple explanations of these methods, rather than rigorous definitions. The GET method is usually called when clicking a link or directly referencing a URL (as in the preceding example). Any parameters are passed via the URL query string, which is made accessible to CGI programs via the QUERY_STRING environment variable. The POST method is usually used in HTML form processing. The parameters sent in a POST are included in the message body, and are not visible via the URL. They are delivered to CGI programs via the standard input stream.
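The difference between the two delivery paths can be sketched in a few lines. This is a minimal illustration rather than a complete implementation: raw_parameters is a hypothetical helper, and it relies on the standard CGI environment variables REQUEST_METHOD and CONTENT_LENGTH, which are not used elsewhere in this chapter. The requests are simulated with plain hashes and StringIO objects so that the sketch runs outside a Web server.

```ruby
require "stringio"

# Hypothetical helper: returns the raw, still-encoded parameter string
# for a request, given a CGI-style environment hash and an input stream.
def raw_parameters(env, input)
  case env["REQUEST_METHOD"]
  when "POST"
    # POST data arrives on standard input; CONTENT_LENGTH says how much to read.
    length = env["CONTENT_LENGTH"].to_i
    input.read(length) || ""
  else
    # GET (and anything else in this sketch) carries its parameters in the URL.
    env["QUERY_STRING"] || ""
  end
end

# Simulated requests; a real CGI program would pass ENV and $stdin instead.
get_env  = { "REQUEST_METHOD" => "GET",  "QUERY_STRING" => "name=Dali" }
post_env = { "REQUEST_METHOD" => "POST", "CONTENT_LENGTH" => "9" }

puts raw_parameters(get_env, StringIO.new(""))            # name=Dali
puts raw_parameters(post_env, StringIO.new("name=Dali"))  # name=Dali
```

Either way the program ends up with the same encoded string; only the channel differs.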

Though the previous example was very simple, anything less trivial could quickly become messy. Programs needing to deal with multiple HTTP methods, file uploads, cookies, "stateful" sessions, and other complexities are best served by a general purpose library for working with the CGI environment. Thankfully, Ruby provides a full-featured set of classes that automate much of the mundane work one would otherwise have to do manually.

We should mention that recently there has been much discussion of a "next generation" CGI library for Ruby, one with enhanced capabilities, a better interface, and separation of real CGI issues from mere HTML generation. We hope that great things come from this; but as we go to press, it is sheer vaporware. We can only document what already exists; and though it might be imperfect, it certainly is stable and usable.

Overview: Using the CGI Library

The CGI library is in the file cgi.rb in the standard Ruby distribution. Most of its functionality is implemented around a central class, aptly named CGI. One of the first things you'll want to do when using the library, then, is to create an instance of CGI.

    require "cgi"
    cgi = CGI.new("html4")

The initializer for the CGI class takes a single parameter, which specifies the level of HTML that should be supported by the HTML generation methods in the CGI package. These methods keep the programmer from having to embed a truckload of escaped HTML into otherwise pristine Ruby code:

    cgi.out do
      cgi.html do
        cgi.body do
          cgi.h1 { "Hello Again, " } +
          cgi.b { cgi['name'] }
        end
      end
    end

Here, we've used the CGI libraries to almost exactly reproduce the functionality of the previous program. As you can see, the CGI class takes care of parsing any input, and stores the resulting values internally as a hash-like structure. So if you specified the URL as some_program.cgi?age=4, the value could be accessed via cgi['age'].

Note in the previous code fragment that it's really only the return value of a block that is used; the HTML is built up gradually and stored, rather than being output immediately. This means that the string concatenation we see here is absolutely necessary; without it, only the last string evaluated would appear.
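The point about return values can be demonstrated without the CGI library at all. In the sketch below, tag is a hypothetical stand-in for an element-generating method such as cgi.h1; it is an assumption made for illustration, not part of cgi.rb.

```ruby
# Hypothetical stand-in for an element-generating method such as cgi.h1:
# it wraps whatever string the block RETURNS in a pair of tags.
def tag(name)
  "<#{name}>#{yield}</#{name}>"
end

# Without "+", the first string is evaluated and discarded;
# only the last expression becomes the block's value.
lost = tag("body") do
  tag("h1") { "Hello Again, " }
  tag("b") { "Dali" }
end

# With "+", both pieces are joined into the single value the block returns.
kept = tag("body") do
  tag("h1") { "Hello Again, " } +
  tag("b") { "Dali" }
end

puts lost   # <body><b>Dali</b></body>
puts kept   # <body><h1>Hello Again, </h1><b>Dali</b></body>
```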

The CGI class also provides some convenience mechanisms for dealing with URL-encoded strings and escaped HTML or XML. URL encoding is the process of translating strings with unsafe characters to a format that is representable in a URL string. The result is all those weird-looking % strings you see in some URLs while you browse the Web. These strings are actually the numeric ASCII codes represented in hexadecimal with % prepended.

    require "cgi"
    s = "This| is^(aT$test"
    s2 = CGI.escape(s)        # "This%7C+is%5E%28aT%24test"
    puts CGI.unescape(s2)     # Prints "This| is^(aT$test"
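To make the encoding rule concrete, here is a hand-rolled sketch of the same translation. It is for illustration only and is not how CGI.escape is implemented; percent_encode and its choice of "safe" characters are assumptions.

```ruby
# Hand-rolled sketch of URL encoding, for illustration only.
def percent_encode(str)
  # Assumption: letters, digits, and a few punctuation marks pass through
  # unchanged; a space becomes "+"; everything else becomes %XX.
  safe = /[a-zA-Z0-9_.~-]/
  str.each_char.map do |ch|
    if ch == " "
      "+"
    elsif ch =~ safe
      ch
    else
      # One %XX group per byte, using the byte's value in hexadecimal.
      ch.bytes.map { |b| "%%%02X" % b }.join
    end
  end.join
end

puts percent_encode("This| is^(aT$test")   # This%7C+is%5E%28aT%24test
```

The vertical bar has the ASCII code 124, which is 7C in hexadecimal, hence %7C.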

Similarly, the CGI class can be used to escape HTML or XML text that should be displayed verbatim in a browser. For example, the string <some_stuff> would not display properly in a browser. If there is a need to display HTML or XML literally in a browser (in an HTML tutorial, for example), the CGI class offers support for translating special characters to their appropriate entities:

    require "cgi"
    some_text = "<B>This is how you make text bold</B>"
    translated = CGI.escapeHTML(some_text)
    # "&lt;B&gt;This is how you make text bold&lt;/B&gt;"
    puts CGI.unescapeHTML(translated)
    # Prints "<B>This is how you make text bold</B>"

Displaying and Processing Forms

The most common way of interacting with CGI programs is through HTML forms. HTML forms are created by using specific tags that will be translated to input widgets in a browser. A full discussion or reference is beyond the scope of this text, but there are numerous references available, both in books and on the Web.

The CGI class offers generation methods for all the HTML form elements. The following example shows how to both display and process an HTML form:

    require "cgi"

    def reverse_ramblings(ramblings)
      if ramblings[0] == nil then return "" end
      chunks = ramblings[0].split(/\s+/)
      chunks.reverse.join(" ")
    end

    cgi = CGI.new("html4")
    cgi.out do
      cgi.html do
        cgi.body do
          cgi.h1 { "sdrawkcaB txeT" } +
          cgi.b { reverse_ramblings(cgi['ramblings']) } +
          cgi.form("action" => "/cgi-bin/rb/form.cgi") do
            cgi.textarea("ramblings") { cgi['ramblings'] } + cgi.submit
          end
        end
      end
    end

This example displays a text area, the contents of which will be tokenized into words and reversed. For example, typing This is a test into the text area would yield test a is This after processing. The form method of the CGI class can accept a method parameter, which will set the HTTP method (GET, POST, and so on) to be used on form submittal. The default, used in this example, is POST.

This example contains only a small sample of the form elements available in an HTML page. For a complete list, go to any HTML reference.

Working with Cookies

HTTP is, as mentioned previously, a stateless protocol. This means that after a browser finishes a request to a Web site, the Web server has no way to distinguish its next request from any other arbitrary browser on the Web. This is where HTTP cookies come into the picture. Cookies offer a way, albeit somewhat crude, to maintain state between requests from the same browser.

The cookie mechanism works by way of the Web server issuing a command to the browser, via an HTTP response header, asking the browser to store a name/value pair. The data can be stored either in memory or on disk. For every successive request to the cookie's specified domain, the browser will send the cookie data in an HTTP request header.

Of course, you could read and write all these cookies manually, but you've probably already guessed that you're not going to need to. Ruby's CGI libraries provide a Cookie class that conveniently handles these chores.

    require "cgi"
    lastacc = CGI::Cookie.new("kabhi",
                              "lastaccess=#{Time.now.to_s}")
    cgi = CGI.new("html3")
    if cgi.cookies.size < 1
      cgi.out("cookie" => lastacc) do
        "Hit refresh for a lovely cookie"
      end
    else
      cgi.out("cookie" => lastacc) do
        cgi.html do
          "Hi, you were last here at: " +
          "#{cgi.cookies['kabhi'].join.split('=')[1]}"
        end
      end
    end

Here, a cookie called kabhi is created, with the key lastaccess set to the current time. Then, if the browser has a previous value stored for this cookie, it is displayed. The cookies are represented as an instance variable on the CGI class and stored as a Hash. Each cookie can store multiple key/value pairs, so when you access a cookie by name, you will receive an array.
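For comparison, the header exchange that the Cookie class hides can be sketched by hand. The two helpers below are hypothetical and handle only the simplest case (no path, domain, or expiry attributes); they use the standard uri library for encoding rather than cgi.rb.

```ruby
require "uri"

# Hypothetical helpers showing the two header lines involved.

# Server -> browser: ask the browser to store a name/value pair.
def set_cookie_header(name, value)
  "Set-Cookie: #{name}=#{URI.encode_www_form_component(value)}"
end

# Browser -> server: the stored pairs come back in one Cookie header,
# separated by "; ". Returns a Hash of name => decoded value.
def parse_cookie_header(header)
  header.split(/;\s*/).each_with_object({}) do |pair, cookies|
    name, value = pair.split("=", 2)
    cookies[name] = URI.decode_www_form_component(value.to_s)
  end
end

puts set_cookie_header("kabhi", "lastaccess=noon")
# Set-Cookie: kabhi=lastaccess%3Dnoon

cookies = parse_cookie_header("kabhi=lastaccess%3Dnoon; theme=dark")
puts cookies["kabhi"].split("=")[1]   # noon
```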

Working with User Sessions

Cookies are fine if you want to store simple data, and you don't mind the browser being responsible for persistence. But, in many cases, data persistence needs are a bit more complex. What if you've got a lot of data you want to maintain persistently, and you don't want to have to send it back and forth from the client and server with each request? What if there is sensitive data you need associated with a session, and you don't trust the browser with it?

For more advanced persistence in Web applications, use the CGI::Session class. Working with this class is similar to working with the CGI::Cookie class, in that values are stored and retrieved via a hash-like structure.

    require "cgi"
    require "cgi/session"
    cgi = CGI.new("html4")
    sess = CGI::Session.new(cgi, "session_key" => "a_test",
                                 "prefix" => "rubysess.")
    lastaccess = sess["lastaccess"].to_s
    sess["lastaccess"] = Time.now
    if cgi['bgcolor'][0] =~ /[a-z]/
      sess["bgcolor"] = cgi['bgcolor']
    end
    cgi.out do
      cgi.html do
        cgi.body("bgcolor" => sess["bgcolor"]) do
          "The background of this page " +
          "changes based on the 'bgcolor' " +
          "each user has in session. " +
          "Last access time: #{lastaccess}"
        end
      end
    end

Accessing /thatscript.cgi?bgcolor=red would turn the page red for a single user for each successive hit until a new bgcolor was specified via the URL. CGI::Session is instantiated with a CGI object and a set of options in a Hash. The optional session_key parameter specifies the key that will be used by the browser to identify itself on each request. Session data is stored in a temporary file for each session, and the prefix parameter assigns a string to be prepended to the filename, making your sessions easy to identify on the filesystem of the server.
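The idea of one temporary file per session, named with a prefix, can be sketched as follows. This is a stripped-down illustration and not how CGI::Session is actually implemented: TinySession, its file format, and its method names are all assumptions.

```ruby
require "tmpdir"
require "securerandom"

# Hypothetical, stripped-down illustration of file-backed sessions.
class TinySession
  attr_reader :id

  def initialize(id = nil, prefix: "rubysess.", dir: Dir.tmpdir)
    @id   = id || SecureRandom.hex(8)      # what the browser would send back
    @path = File.join(dir, prefix + @id)   # one file per session
    @data = File.exist?(@path) ? Marshal.load(File.binread(@path)) : {}
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value.to_s                # this sketch stores strings only
    File.binwrite(@path, Marshal.dump(@data))
  end

  # Remove the backing file when the session is no longer needed.
  def delete
    File.delete(@path) if File.exist?(@path)
  end
end

# First request: no id yet, so a new session is created.
first = TinySession.new
first["bgcolor"] = "red"

# Later request: the browser presents the same id, and the value is still there.
later = TinySession.new(first.id)
puts later["bgcolor"]   # red
later.delete
```

A real implementation would also have to validate the id it receives from the browser before using it in a filename.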

There are still many features that CGI::Session is lacking, such as the ability to store objects other than Strings, session storage across multiple servers, and other "nice-to-have" capabilities. Fortunately, a pluggable database_manager mechanism is already in place, and would make some of these features quite easy to add. If you do anything exciting with CGI::Session, be sure to let us know.

Using FastCGI

The most criticized shortcoming of CGI is that it requires a new process to be created for every invocation. The effect on performance is significant. The lack of a capability to leave objects in memory between requests can also have a negative impact on design. The combination of these difficulties has led to the creation of something called FastCGI.

FastCGI is basically nothing more than a protocol definition, a design, and a set of software implementing that protocol. Usually implemented as a Web server plug-in, such as an Apache module, it enables an in-process helper to intercept HTTP requests and route them via socket to a long running backend process. This has a very positive effect on speed compared to the traditional forking approach. It also gives the programmer the freedom to put things in memory and still find them there on the next request.

A fair question would be: How do mod_ruby and FastCGI compare? There are definite tradeoffs involved.

Because Apache is a forking Web server, resources are allocated and freed without the full knowledge of the application, making it problematic to store session information. In FastCGI, all requests are handled by a single process, making it easy to cache data, keep database connections open, and store session data in memory (where arguably it should be).

FastCGI offers no access to Apache's internals. If you really need that kind of access, mod_ruby is a better choice.

FastCGI also works with other Web servers such as Zeus and Netscape. Potentially any server can be supported by using the CGI-to-FastCGI adapter, which is a tiny CGI script that handles CGI connections for you. It is not as efficient as a plug-in like mod_fastcgi, but does still eliminate the overhead of (for example) reconnecting to a database and reloading config files every time a CGI executes.

Conveniently, servers for FastCGI have been implemented in a number of languages, including Ruby. Eli Green created a module (available via the RAA) entirely in Ruby, which implements the FastCGI protocol and eases the development of FastCGI programs.

We present a sample application in Listing 9.8. As you can see, this code fragment mirrors the functionality of the earlier example.

Listing 9.8 A FastCGI Example

    require "fastcgi"
    require "cgi"

    last_time = ""

    def get_ramblings(instream)
      # Unbeautifully retrieve the value of the first name/value pair
      # CGI would have done this for us.
      data = ""
      if instream != nil
        data = instream.split("&")[0].split("=")[1] || ""
      end
      return CGI.unescape(data)
    end

    def reverse_ramblings(ramblings)
      if ramblings == nil then return "" end
      chunks = ramblings.split(/\s+/)
      chunks.reverse.join(" ")
    end

    server = FastCGI::TCP.new('localhost', 9000)
    begin
      server.each_request do |request|
        stuff = request.in.read
        out = request.out
        out << "Content-type: text/html\r\n\r\n"
        out << "<html>"
        out << "<head><title>Text Backwardizer</title></head>"
        out << "<body>"
        out << "<h1>sdrawkcaB txeT</h1>"
        out << "<i>You previously said: #{last_time}</i><BR>"
        out << "<b>#{reverse_ramblings(get_ramblings(stuff))}</b>"
        out << "<form method=\"POST\" action=\"/fast/serv.rb\">"
        out << "<textarea name=\"ramblings\">"
        out << "</textarea>"
        out << "<input type=\"submit\" name=\"submit\">"
        out << "</form>"
        out << "</body></html>"
        # Remember this request's text for the next request.
        last_time = get_ramblings(stuff)
        request.finish
      end
    ensure
      server.close
    end

The first thing that strikes you about this code (if you've read the previous section) is the couple of things that you have to do manually in FastCGI that you wouldn't have had to do with the CGI library. One is the messy hard-coding of escaped HTML. The other is the get_ramblings method, which manually parses the input and returns only the relevant value. This code, by the way, only works with the HTTP POST method, another convenience lost when not using the CGI library.

That being said, FastCGI is by no means without its advantages. We didn't run any benchmarks on this example, but (it's in the name) FastCGI is faster than normal CGI. The overhead of starting up a new process is avoided in favor of making a local network connection to port 9000 (FastCGI::TCP.new('localhost', 9000)). Also, the last_time variable in this example is used to maintain a piece of state in memory in between requests, something impossible with traditional CGI. Of course, the actual speed increase will depend on a number of complex factors, such as the choice of OS and Web server, the nature of the CGI, the amount of Web traffic, and so on.

We'll also point out that it's possible to a limited extent to mix and match these libraries. The helper functions from cgi.rb can be used on their own (without actually using this library to drive the application). For example, CGI.escapeHTML can be used in isolation from the rest of the library. This would make the previous example a little more readable.
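As a small sketch of that idea, the line that echoes the user's previous text could be built with CGI.escapeHTML and no CGI object at all. The method name previous_line is a hypothetical one chosen for illustration.

```ruby
require "cgi"

# Sketch: only the escaping helper from cgi.rb is used; no CGI object is
# created. The method name and markup are illustrative assumptions.
def previous_line(last_time)
  "<i>You previously said: #{CGI.escapeHTML(last_time)}</i><BR>"
end

puts previous_line("<b>bold</b> & brash")
# <i>You previously said: &lt;b&gt;bold&lt;/b&gt; &amp; brash</i><BR>
```

Because the text comes straight from the user, escaping it also keeps any markup the user typed from being interpreted by the browser.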

Case Study: A Message Board

One of the most exciting things about the Web today is its ability to create a sense of virtual community. With the Internet, you have the potential to communicate in real-time with people from around the world. People who have never actually met in person can share common interests and even strike up friendships.

There are many ways by which this type of communication can happen via the Internet. The oldest and most firmly established way is the bulletin board metaphor. Since the advent of Usenet or the good old days of BBSs (Bulletin Board Systems), online communities have prospered around this asynchronous form of communication. The new breed of this age-old species is the Web-based message board application. From cheesy online matchmaking services to geek sites like userfriendly.org, the bulletin board metaphor is alive and well on the Web.

If you've ever wondered how to write your own bulletin board, we're here to help you. What follows is a treatise on our own lean, not-so-mean bulletin board system, RuBoard.

Although RuBoard works, it certainly isn't good for much more than a starting point in the ways of the bulletin board. It is not the most robust or secure application ever written for the Web, but it should illustrate some concepts that will put you well on your way to making something deployable. During the discussion, we will point out some areas of potential improvement and leave them as the proverbial exercise for the reader.

This is the most lengthy example in the book, consisting of several files. For the sake of completeness, we've included them all in print. The following is a list of the files and their purposes:

board.cgi: The main piece of code or "driver" for the entire CGI, through which all requests must pass.
mainlist.rb: The main or default screen that displays the entire list of messages previously posted (see Figure 9.2).
message.rb: The Message and MessageStore classes that handle the loading and storing of messages.
savepost.rb: The code for the save-post page (saving a post or reply).
viewmessage.rb: The code for the view-message page, displaying a single message with all the relevant fields.
post.rb: The code for the post-page, enabling the creation of a new post (message).
reply.rb: The code for the reply-page, displayed when replying to a previous post.
authenticate.rb: The code for authenticating a user (rudimentary in this example). Redirects to whatever page the user was originally trying to reference.

Figure 9.2. RuBoard MainList page.

One thing will probably stand out after you've read some of the previous examples. Despite its relative complexity, RuBoard has only one actual CGI program. It consists of several screens (or pages), but there's only one CGI program controlling them all. In past examples, with simplicity in mind, we have always demonstrated CGI programs that served only one distinct page. In this case study, you'll see a central program board.cgi acting as a controller for all the bulletin board's logic and presentation-related activities. Refer to Listing 9.9 for the board.cgi source.

Listing 9.9 Message Board CGI (`board.cgi`)

    #!/usr/local/bin/ruby
    require "cgi"
    require "cgi/session"

    $session = nil

    def header(cgi)
      "<B>Welcome, #{get_session(cgi)['user']}!</B> - " +
      "<i><a href=\"/cgi-bin/rb/board.cgi?cmd=post\">" +
      "post a message</a></i> - <i>" +
      "<a href=\"/cgi-bin/rb/board.cgi?cmd=mainlist\">home</a></i>." +
      "<BR><HR>"
    end

    def do_oops_page(cgi, err)
      cgi.out do
        cgi.html do
          cgi.body do
            "It appears that you have invoked the" +
            " message board incorrectly. Oops.<BR>|#{err}|"
          end
        end
      end
    end

    def do_login_page(cgi)
      cgi.out do
        cgi.html do
          cgi.body do
            cgi.h1 { "Welcome to Ruby Board" } +
            cgi.b { "Please Login:" } +
            cgi.form("METHOD" => "get",
                     "action" => "/cgi-bin/rb/board.cgi") do
              cgi.text_field({ "name" => "user" }) +
              cgi.submit("Login") +
              cgi.input({ "name" => "cmd",
                          "value" => "authenticate",
                          "type" => "hidden" }) +
              cgi.input({ "name" => "page",
                          "value" => "#{cgi['cmd'][0]}",
                          "type" => "hidden" })
            end
          end
        end
      end
    end

    def run_command(cgi)
      command = cgi['cmd'][0]
      if command == "" || command == nil
        command = "mainlist"
      end
      if command_safe?(command)
        methname = "do_#{command}_page"
      else
        do_oops_page(cgi, "Command \"#{command}\"" +
                          " inappropriately formatted.")
        exit    # Nothing sensible to run for a malformed command
      end
      begin
        eval "#{methname}(cgi)"
      rescue NameError
        # The page method is not defined yet: load its file and retry.
        begin
          require "#{command}.rb"
        rescue LoadError
          do_oops_page(cgi, "Error loading #{command}")
          exit
        end
        eval "#{methname}(cgi)"
      end
      exit
    end

    def command_safe?(command)
      if command =~ (/^[a-zA-Z0-9]+$/) then
        return true
      end
      false
    end

    def get_session(cgi)
      if $session == nil then
        $session = CGI::Session.new(cgi, "session_key" => "a_test",
                                         "prefix" => "rubysess.")
      end
      return $session
    end

    def validate_session(cgi)
      session = get_session(cgi)
      if cgi["user"][0] =~ /[a-zA-Z0-9]/
        return
      end
      if session["user"] !~ /[a-zA-Z0-9]/
        do_login_page(cgi)
        exit    # Do not fall through to the requested page
      end
    end

    if __FILE__ == $0 then
      cgi = CGI.new("html4")
      validate_session(cgi)
      err = run_command(cgi)
    end

This is a vague hint of the Model View Controller (MVC) design pattern, referenced in Design Patterns, published by Addison-Wesley and authored by the so-called "Gang of Four" (Gamma, Helm, Johnson, and Vlissides). The main advantage of our trimmed down, almost-MVC architecture is that application-wide changes can be implemented in a single place. For example, if we wanted to add a central logging facility, we could easily add it to board.cgi, and every page request would invoke the new utility.

The first thing a user must do when attempting to use the bulletin board is log in. As a function of our centralized CGI design, authentication is handled in one place. (Refer to the validate_session method in board.cgi.) This code won't enable unauthenticated requests through to any page other than the login page.

As you can see, the authentication scheme used ("we'll believe that you are whoever you say you are") isn't all that secure. But it suffices as a more-than-stub example of how you might force authentication in your own application. The basic flow is that a user comes to the application requesting a page, and validate_session first checks to make sure that the user is either already logged in or trying to log in. If the user hasn't logged in and isn't passing in a userid with which to attempt to authenticate himself, the login page will be displayed to prompt for a username. Because the central board.cgi handles all authentication in the application, it would be trivial to replace validate_session with a more robust security implementation.

You might have noticed the get_session call in board.cgi, which gets a handle to the current user's session; this is an implementation of the singleton design pattern (not a rigorous implementation). Here we've used a global variable; this is OK because CGI programs are forked as separate processes, giving each invocation its own memory space. If we were making, for example, a FastCGI program, we would need to devise a different strategy to keep users' sessions from clobbering each other. For this reason, we've hidden the session retrieval logic behind a method, as opposed to directly referencing the global variable from any code that needs access to the session. Again, we could change this method alone if we needed to move our program to an environment that was less friendly to global variables.

The next important piece of RuBoard's design emerges in the calls to the do_login_page and run_command methods. Each page in our humble framework can be referenced internally by a call to do_PAGENAME_page. So, for example, the login page is called via the do_login_page method. The run_command method, also in board.cgi, is responsible for determining which page the user is trying to reach and dynamically invoking the necessary code to fulfill the request.

This code is both a little tricky and a little dangerous. It determines which method to run by looking at the value of the cmd key, passed in as a QUERY_STRING parameter from the Web browser. So, for example, invoking /board.cgi?cmd=hello would attempt to run a method called do_hello_page. If the method is not defined, a NameError will be raised, and the program will attempt to require a separate library containing the requested page. There are two red flags here, both involving the use of the cmd key. With the call to eval, we are executing arbitrary code, passed in from an anonymous Internet user, and with the call to require, we are reading arbitrary files from the server's hard disk. The command_safe? method's job is to allay these fears. Of course, this specific implementation leaves much to be desired. The code provided here is a simple starting point for a more robust set of checks.
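One possible direction for those more robust checks is to avoid building code from the input at all. The sketch below swaps eval for a fixed lookup table and send; this is a different technique from the one RuBoard uses, and the names dispatch and pages, the stub page methods, and the plain hash standing in for the cgi object are all hypothetical.

```ruby
# Sketch of a stricter dispatcher. The do_*_page methods here are stubs
# for illustration, not RuBoard's real pages.
def do_mainlist_page(request)
  "main list for #{request[:user]}"
end

def do_post_page(request)
  "post form for #{request[:user]}"
end

def dispatch(command, request)
  # Only commands named here can ever run; nothing is built from raw input.
  pages = {
    "mainlist" => :do_mainlist_page,
    "post"     => :do_post_page
  }
  command = "mainlist" if command.nil? || command.empty?
  # \A and \z anchor the whole string; ^ and $ would match per line.
  return "oops: malformed command" unless command =~ /\A[a-zA-Z0-9]+\z/
  handler = pages[command]
  return "oops: unknown command" unless handler
  send(handler, request)
end

puts dispatch("post", { user: "dali" })            # post form for dali
puts dispatch("mainlist\nsystem", { user: "x" })   # oops: malformed command
puts dispatch("hello", { user: "x" })              # oops: unknown command
```

The anchors matter: in Ruby, ^ and $ match at line boundaries, so a pattern like /^[a-zA-Z0-9]+$/ accepts a multi-line string as long as one of its lines is alphanumeric.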

After the user has successfully logged in, run_command defaults to an invocation of do_mainlist_page (see Listing 9.10), which creates the page called mainlist. This page gets the current list of messages on the bulletin board, and displays them to the user in a list. The user can then choose to read one of the messages, or to post a new message.

Listing 9.10 Message Board CGI (`mainlist.rb`)

    require "message"

    def do_mainlist_page(cgi)
      messages = get_messages
      user = get_session(cgi)['user']
      template = get_template
      messagerows = get_message_rows(template, messages)
      template.gsub!(/%%LISTROW%%/, messagerows)
      template.gsub!(/%%USER%%/, user)
      template.gsub!(/%%HEADER%%/, header(cgi))
      cgi.out { template }
    end

    def get_message_rows(template, messages)
      rows = ""
      messages.each do |message|
        rows << "<TR><TD><a href=\"/cgi-bin/rb/board.cgi?" +
                "cmd=viewmessage&id=#{message.id}\">" +
                "#{message.id}</a></TD><TD>#{message.title}" +
                "</TD><TD>#{message.sender}</TD><TD>" +
                "#{message.date}</TD></TR>"
      end
      rows
    end

    def get_template
      "<HTML><HEAD>
        <TITLE>RuBoard!</TITLE>
        </HEAD>
        <BODY>
        %%HEADER%%
        <B>Message List</B><BR>
        <TABLE border=1>
          <TR>
          <TD>ID</TD>
          <TD>Title</TD>
          <TD>Sender</TD>
          <TD>Time</TD>
          </TR>
          %%LISTROW%%
        </TABLE>
      </BODY></HTML>"
    end

For a screenshot of a simple mainlist page, refer to Figure 9.2. This figure shows a list with only two messages in it.

The most interesting thing about this listing is the generation of the HTML. We have created our own scaled-down templating system. For a feature-filled, robust templating solution, see eruby or ERb in the Ruby Application Archive. (Also see the section "Using Embedded Ruby.") To keep our examples simple, we're sticking to a basic text replacement here. The advantage of these sorts of templating methods is that they enable the programmer to deal with HTML in a very familiar way: as simple HTML source code. This can sometimes be easier to visualize than the built-in "elements-as-methods" approach of the Ruby CGI library.

When this page is called, run_command will either find cmd=mainlist in the input parameters for the CGI program, or it will default to the mainlist page. It will then load mainlist.rb and execute the do_mainlist_page method. This method uses the Message and MessageStore classes (described later in this section) to retrieve the list of all messages currently on the bulletin board. It then makes a call to get_template and replaces the specially labeled keys, arbitrarily marked with surrounding double percent markers (%%), with dynamically generated text. After we've created a String with the desired presentation, we simply spit it into the cgi object's output stream and the program then exits.
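The marker replacement could also be gathered into a single helper. The following is a minimal sketch under that assumption; fill_template is a hypothetical name, and it deliberately uses the block form of gsub, which inserts the replacement text literally. With a plain replacement string, sequences such as \1 in user-supplied text would be treated as back-references.

```ruby
# Hypothetical helper that generalizes the %%KEY%% replacement.
def fill_template(template, values)
  # The block form of gsub inserts the replacement text literally, so
  # backslashes in user-supplied values are not treated as back-references.
  template.gsub(/%%([A-Z]+)%%/) do
    key = $1
    values.fetch(key) { "%%#{key}%%" }   # leave unknown markers untouched
  end
end

page = fill_template("<B>%%USER%%</B> wrote: %%BODY%% %%FOOTER%%",
                     "USER" => "dali", "BODY" => "C:\\temp\\1")
puts page   # <B>dali</B> wrote: C:\temp\1 %%FOOTER%%
```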

So, where do these messages come from? How are they stored? Let's have a look at Listing 9.11.

Listing 9.11 Message Board CGI (`message.rb`)

    $filepath = "/tmp/messagestore.dat"

    class Message
      attr_accessor :id, :title, :sender, :replies, :date, :body

      def initialize(title, sender)
        @title = title
        @sender = sender
        @date = Time.now
        @replies = Array.new
      end

      def add_reply(message)
        @replies.push message
      end
    end

    class MessageStore
      attr_accessor :messages, :filepath, :id, :message_table

      def MessageStore.load(filepath)
        if !FileTest.exist?(filepath)
          welcomemsg = Message.new("Welcome to RuBoard", "chad")
          welcomemsg.body = "Please enjoy your stay!"
          mstore = MessageStore.new
          mstore.add_message(welcomemsg)
          f = File.new(filepath, "w")
          Marshal.dump(mstore, f)
          f.close
        end
        file = File.open(filepath, "r")
        Marshal.load(file)
      end

      def save(filepath)
        File.delete(filepath)
        Marshal.dump(self, File.new(filepath, "w"))
      end

      def initialize
        @message_table = Hash.new
        @id = 0
        @messages = Array.new
      end

      def add_message(message)
        message.id = next_id
        @message_table[message.id] = message
        @messages.push message
      end

      def get_message(num)
        @message_table[num]
      end

      private

      def next_id
        @id += 1
      end
    end

    # Auxiliary methods...

    def get_message_view(message)
      template = "<i>Message %%NUM%%</i><BR>
      <i>From: %%SENDER%%</i><BR>
      <i>Date: %%DATE%%</i><BR>
      <i>Title: %%TITLE%%</i><BR>
      <HR>
      %%BODY%%
      <HR>"
      template.gsub!(/%%NUM%%/, message.id.to_s)
      template.gsub!(/%%SENDER%%/, message.sender)
      template.gsub!(/%%DATE%%/, message.date.to_s)
      template.gsub!(/%%TITLE%%/, message.title)
      template.gsub!(/%%BODY%%/, message.body)
      template
    end

    def get_messages
      mstore = get_message_store
      mstore.messages
    end

    def get_message_store
      MessageStore.load($filepath)
    end

The two most important items here are the Message class, providing a simple object-oriented view of a message, and the MessageStore class, which handles the storage and retrieval of messages. MessageStore is where most of the message-related work actually takes place. This also happens to be one of those areas of potential improvement that we alluded to earlier.

Looking at MessageStore.load and MessageStore.save, you'll notice that the entire set of bulletin board messages is stored in a single file of marshalled Ruby objects on the server system. Although this design is simple, there are some problems with it. The worst of the problems is that the system can't handle concurrent users correctly. If two users were to attempt to update the message board at the same time, the result would certainly include failure and data loss. A better approach would be to use an RDBMS, such as MySQL or PostgreSQL, as the storage mechanism for bulletin board items. The internals of MessageStore could easily be replaced with database access or some other more suitable solution, because its interface doesn't fully expose the underlying storage strategy. Refer to Chapter 4, "External Data Manipulation," for more ideas about data storage in Ruby, including examples of how to interface with MySQL, a very popular database for Web application programming.
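Short of moving to a database, one smaller step would be to serialize access to the file with an advisory lock. The sketch below shows that idea using File#flock; it is a different technique from the RDBMS approach suggested above, RuBoard itself does no locking, and update_store, the separate lock file, and the demo path are assumptions made for illustration.

```ruby
require "tmpdir"

# Sketch: serialize read-modify-write cycles on a marshalled file with an
# exclusive lock held on a companion ".lock" file.
def update_store(path)
  File.open(path + ".lock", File::RDWR | File::CREAT, 0644) do |lock|
    lock.flock(File::LOCK_EX)              # blocks until no one else holds it
    data = File.exist?(path) ? Marshal.load(File.binread(path)) : []
    yield data                             # caller changes the data in place
    File.binwrite(path, Marshal.dump(data))
  end                                      # closing the file releases the lock
end

path = File.join(Dir.tmpdir, "messagestore_demo.dat")
File.delete(path) if File.exist?(path)

update_store(path) { |messages| messages << "first post" }
update_store(path) { |messages| messages << "a reply" }

puts Marshal.load(File.binread(path)).inspect   # ["first post", "a reply"]
```

Because each update reloads the file while holding the lock, two processes posting at the same moment would take turns instead of overwriting each other's changes.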

The four methods at the end of message.rb are convenience methods to avoid duplication in the various pages of the message board that require access to this data. Because each Web page is a separate, viewable entity, CGI applications can quickly degrade into an unmaintainable mound of copies and pastes. For this reason, it's important to be especially careful to look for chances to generalize and remove duplication when making CGI programs.

Five other files make up the rest of RuBoard. We've combined them into a single listing (Listing 9.12) because they are fairly short. The run_command method requires each file as it is needed; the files are kept separate for maintainability.

Listing 9.12 Message Board CGI (Other Files)

    #
    # File: authenticate.rb
    #
    require "message"

    def do_authenticate_page(cgi)
      session = get_session(cgi)
      session['user'] = cgi['user'][0]
      page = cgi['page'][0]
      if page == nil || page == ""
        cgi.out do
          '<HTML><HEAD><META HTTP-EQUIV="REFRESH"' +
          ' CONTENT="1;URL=/cgi-bin/rb/board.cgi?cmd=mainlist">' +
          '</HEAD><BODY></BODY></HTML>'
        end
      else
        cgi['cmd'][0] = page
        run_command(cgi)
      end
    end

    #
    # File: post.rb
    #
    require "message"

    def do_post_page(cgi)
      mstore = get_message_store
      user = get_session(cgi)['user']
      num = cgi['id'][0]
      message = mstore.get_message(num.to_i)
      template = get_template
      template.gsub!(/%%HEADER%%/, header(cgi))
      template.gsub!(/%%USER%%/, user)
      cgi.out { template }
    end

    def get_template
      "<HTML><BODY>
      %%HEADER%%
      <FORM ACTION=\"/cgi-bin/rb/board.cgi\" METHOD=\"GET\">
      <INPUT TYPE=HIDDEN NAME=cmd VALUE=savepost>
      <INPUT TYPE=HIDDEN NAME=SENDER VALUE=%%USER%%>
      <TABLE BORDER=0>
      <TR>
      <TD>Title:</TD><TD><INPUT TYPE=TEXT NAME=TITLE></TD>
      </TR>
      <TR>
      <TD>Message Body:</TD>
      <TD> <TEXTAREA rows=25 cols=80 NAME=BODY> </TEXTAREA></TD>
      </TR>
      <TR><TD><INPUT TYPE=SUBMIT NAME=SUBMIT></TD><TD></TD></TR>
      </TABLE>
      </FORM>
      </BODY></HTML>"
    end

    #
    # File: reply.rb
    #
    require "message"

    def do_reply_page(cgi)
      mstore = get_message_store
      user = get_session(cgi)['user']
      num = cgi['id'][0]
      message = mstore.get_message(num.to_i)
      template = get_template(message)
      template.gsub!(/%%HEADER%%/, header(cgi))
      template.gsub!(/%%USER%%/, user)
      template.gsub!(/%%NUM%%/, num)
      cgi.out { template }
    end

    def get_template(message)
      "<HTML><BODY>
      %%HEADER%%
      #{get_message_view(message)}
      <FORM ACTION=\"/cgi-bin/rb/board.cgi\" METHOD=\"GET\">
      <INPUT TYPE=HIDDEN NAME=SENDER VALUE=%%USER%%>
      <INPUT TYPE=HIDDEN NAME=cmd VALUE=savepost>
      <INPUT TYPE=HIDDEN NAME=id VALUE=%%NUM%%>
      <TABLE BORDER=0>
      <TR>
      <TD>Title:</TD><TD><INPUT TYPE=TEXT NAME=TITLE></TD>
      </TR>
      <TR>
      <TD>Message Body:</TD>
      <TD> <TEXTAREA rows=25 cols=80 NAME=BODY> </TEXTAREA></TD>
      </TR>
      <TR><TD><INPUT TYPE=SUBMIT NAME=SUBMIT></TD><TD></TD></TR>
      </TABLE>
      </FORM>
      </BODY></HTML>"
    end

    #
    # File: savepost.rb
    #
    require "message"
    require "viewmessage"

    def do_savepost_page(cgi)
      user = get_session(cgi)['user']
      mstore = get_message_store
      newmsg = Message.new(cgi['TITLE'][0], user)
      newmsg.body = cgi['BODY'][0]
      mstore.add_message(newmsg)
      viewid = do_reply(mstore, newmsg, cgi)
      if !viewid
        viewid = newmsg.id
      end
      mstore.save($filepath)
      cgi.out { "<HTML><HEAD><META HTTP-EQUIV=\"REFRESH\"" +
                " CONTENT=\"1;URL=/cgi-bin/rb/board.cgi?" +
                "cmd=viewmessage&id=#{viewid}\"></HEAD>" +
                "<BODY></BODY></HTML>" }
    end

    def do_reply(mstore, newmsg, cgi)
      id = cgi['id'][0]
      if id != nil && id != ""
        orig = mstore.get_message(cgi['id'][0].to_i)
        orig.add_reply(newmsg)
        return id
      end
    end

    #
    # File: viewmessage.rb
    #
    require "message"

    def do_viewmessage_page(cgi)
      mstore = get_message_store
      user = get_session(cgi)['user']
      num = cgi['id'][0]
      message = mstore.get_message(num.to_i)
      template = get_template(message)
      template.gsub!(/%%USER%%/, user)
      template.gsub!(/%%HEADER%%/, header(cgi))
      template.gsub!(/%%NUM%%/, num)
      template.gsub!(/%%RESPONSES%%/,
                     get_message_rows(message.replies))
      cgi.out { template }
    end

    def get_message_rows(messages)
      if messages.size < 1
        return ""
      end
      rows = String.new("<TABLE border=1>")
      messages.each do |message|
        rows << "<TR><TD><a href=\"/cgi-bin/rb/board.cgi?" +
                "cmd=viewmessage&id=#{message.id}\">" +
                "#{message.id}</a></TD><TD>#{message.title}" +
                "</TD><TD>#{message.sender}</TD><TD>" +
                "#{message.date}</TD></TR>"
      end
      rows << "</TABLE>"
      rows
    end

    def get_template(message)
      "<HTML><BODY>
        %%HEADER%%
        #{get_message_view(message)}
        <a href=\"/cgi-bin/rb/board.cgi?cmd=reply&id=%%NUM%%\">reply to this message.</a>
        <HR>
        <B>Previous Responses:</B>
        <BR>
        %%RESPONSES%%
      </BODY></HTML>"
    end

We hope this case study has given you a good feeling for what it's like to program a real CGI-based Web application. As with all the examples of significant size, an online copy of the full source code is available at the Web site for this book, which is referenced in Appendix D, "Resources on the Web (and Elsewhere)."