19.1. CGI Programming with Ruby

Anyone familiar with web programming has at least heard of CGI ("Common Gateway Interface"). CGI was created in the early days of the Web to enable programmatically implemented sites and to allow for more interaction between the end user and the web server. Although countless replacement technologies have been introduced since its inception, CGI is still alive and well in the world of web programming. Much of CGI's success and longevity can be attributed to its simplicity, which makes it easy to implement CGI programs in any language. The CGI standard specifies how a web server process passes data between itself and its child processes. Most of this interaction occurs through standard environment variables and the standard input and output streams of the underlying operating system.

CGI programming, and HTTP for that matter, is based around a "stateless" request-and-response mechanism. Generally, a single TCP connection is made, and the client (usually a web browser) initiates the conversation with a single HTTP command. The two most commonly used commands in the protocol are GET and POST (we'll get to the meaning of these shortly). After the client issues the command, the web server responds and closes its output stream.

The following code sample, only slightly more advanced than the standard "Hello, world," shows how to do input and output via CGI.

    def parse_query_string
      inputs = Hash.new
      raw = ENV['QUERY_STRING']
      raw.split("&").each do |pair|
        name, value = pair.split("=")
        inputs[name] = value
      end
      inputs
    end

    inputs = parse_query_string
    print "Content-type: text/html\n\n"
    print "<HTML><BODY>"
    print "<B><I>Hello</I>, #{inputs['name']}!</B>"
    print "</BODY></HTML>"

Accessing the URL (for example) http://mywebserver/cgi-bin/hello.cgi?name=Dali would produce the output "Hello, Dali!" in your web browser.
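The hand-rolled parser in the example leaves names and values exactly as they appear in the URL, so an encoded value such as `Salvador+Dali` would be displayed with the plus sign intact. The following is a minimal sketch of a decoding variant, assuming the standard library's `CGI.unescape` is acceptable for this purpose; the helper name `decode_query_string` is hypothetical and not part of any library.

```ruby
require "cgi"

# Hypothetical helper: splits a raw query string into a Hash and
# URL-decodes both the names and the values.
def decode_query_string(raw)
  raw.to_s.split("&").each_with_object({}) do |pair, inputs|
    # Split on the first "=" only, so values may themselves contain "=".
    name, value = pair.split("=", 2)
    next if name.nil? || name.empty?
    inputs[CGI.unescape(name)] = CGI.unescape(value.to_s)
  end
end

inputs = decode_query_string("name=Salvador+Dali&note=a%26b")
puts inputs["name"]   # prints "Salvador Dali"
puts inputs["note"]   # prints "a&b"
```

If a name appears more than once, this sketch keeps only the last value; a fuller parser would collect the values in an array.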

As we previously mentioned, there are two main ways to access a URL: the HTTP GET and POST methods. For the sake of brevity, we offer simple explanations of these rather than rigorous definitions. The GET method is usually used when clicking a link or directly referencing a URL (as in the preceding example). Any parameters are passed via the URL query string, which is made accessible to CGI programs via the QUERY_STRING environment variable. The POST method is usually used in HTML form processing. The parameters sent in a POST are included in the message body and are not visible via the URL. They are delivered to CGI programs via the standard input stream.
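To make the difference concrete, here is a rough sketch of how a CGI program might locate its raw parameters for each method. It assumes the standard CGI variables `REQUEST_METHOD` and `CONTENT_LENGTH` (defined by the CGI specification but not discussed above); the helper name `raw_parameters` is hypothetical, and the environment and input stream are passed in as arguments so the sketch can be tried outside a web server.

```ruby
require "stringio"

# Hypothetical helper: returns the raw, still-encoded parameter string.
# env   - a Hash-like object such as ENV
# input - an IO-like object such as $stdin
def raw_parameters(env, input)
  case env["REQUEST_METHOD"]
  when "POST"
    # POST data arrives on standard input; CONTENT_LENGTH says how much to read.
    length = env["CONTENT_LENGTH"].to_i
    input.read(length).to_s
  else
    # GET data arrives in the URL query string.
    env["QUERY_STRING"].to_s
  end
end

get_env  = { "REQUEST_METHOD" => "GET", "QUERY_STRING" => "name=Dali" }
post_env = { "REQUEST_METHOD" => "POST", "CONTENT_LENGTH" => "9" }

puts raw_parameters(get_env, StringIO.new(""))            # prints "name=Dali"
puts raw_parameters(post_env, StringIO.new("name=Dali"))  # prints "name=Dali"
```

In a real CGI script the call would be `raw_parameters(ENV, $stdin)`.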

Though the preceding example was simple, anything less trivial could quickly become messy. Programs needing to deal with multiple HTTP methods, file uploads, cookies, "stateful" sessions, and other complexities are best served by a general-purpose library for working with the CGI environment. Thankfully, Ruby provides a full-featured set of classes that automate much of the mundane work one would otherwise have to do manually.

Many other toolkits and libraries attempt to make CGI development easier. Among the best of these is Patrick May's ruby-web (formerly Narf). If you want a great deal of low-level control but the standard CGI library isn't to your liking, you might try this library instead (http://ruby-web.org).

If you want a templating solution, Amrita (http://amrita.sourceforge.jp) might be good for you. Also look at Cerise, the web application server based on Amrita (http://cerise.rubyforge.org).

There are probably still other libraries out there. As usual, if you don't find what you're looking for listed here, do an online search or ask on the newsgroup.

19.1.1. Introduction to the `cgi.rb` Library

The CGI library is in the file cgi.rb in the standard Ruby distribution. Most of its functionality is implemented around a central class aptly named CGI. One of the first things you'll want to do when using the library, then, is to create an instance of CGI.

    require "cgi"
    cgi = CGI.new("html4")

The initializer for the CGI class takes a single parameter, which specifies the level of HTML that should be supported by the HTML generation methods in the CGI package. These methods keep the programmer from having to embed a truckload of escaped HTML in otherwise pristine Ruby code:

    cgi.out do
      cgi.html do
        cgi.body do
          cgi.h1 { "Hello Again, " } +
          cgi.b { cgi['name'] }
        end
      end
    end

Here, we've used the CGI libraries to almost exactly reproduce the functionality of the previous program. As you can see, the CGI class takes care of parsing any input and stores the resulting values internally as a hashlike structure. So if you specified the URL as some_program.cgi?age=4, the value could be accessed via cgi['age'].

Note in the preceding code that really only the return value of a block is used; the HTML is built up gradually and stored rather than being output immediately. This means that the string concatenation we see here is absolutely necessary; without it, only the last string evaluated would appear.
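The effect of dropping the concatenation can be seen without the CGI library at all. The sketch below uses a hypothetical stand-in method named `tag`, which is not cgi.rb's implementation; like the generation methods, it wraps only the value its block returns.

```ruby
# Hypothetical stand-in for an HTML generation method: it wraps
# whatever its block returns in an opening and closing tag.
def tag(name)
  "<#{name}>#{yield}</#{name}>"
end

# Without "+", the first string is built and then thrown away;
# the block returns only the last expression.
without_plus = tag("body") do
  tag("h1") { "Hello Again, " }
  tag("b") { "Dali" }
end

# With "+", the two strings are joined into the block's single return value.
with_plus = tag("body") do
  tag("h1") { "Hello Again, " } +
  tag("b") { "Dali" }
end

puts without_plus   # prints "<body><b>Dali</b></body>"
puts with_plus      # prints "<body><h1>Hello Again, </h1><b>Dali</b></body>"
```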

The CGI class also provides some convenience mechanisms for dealing with URL encoded strings and escaped HTML or XML. URL encoding is the process of translating strings with unsafe characters to a format that is representable in a URL string. The result is all of those weird-looking "%" strings you see in some URLs while you browse the web. These strings are actually the numeric ASCII codes represented in hexadecimal with "%" prepended.

    require "cgi"
    s = "This| is^(aT$test"
    s2 = CGI.escape(s)        # "This%7C+is%5E%28aT%24test"
    puts CGI.unescape(s2)     # Prints "This| is^(aT$test"
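To see where the "%" sequences come from, the following sketch encodes a string by hand, one byte at a time. The helper name `percent_encode` is hypothetical; unlike `CGI.escape`, it encodes every byte it is given, including letters and spaces, so it is an illustration of the notation rather than a replacement.

```ruby
require "cgi"

# Hypothetical helper: writes every byte of the string as "%" followed
# by the byte's value in two uppercase hexadecimal digits.
def percent_encode(str)
  str.bytes.map { |byte| format("%%%02X", byte) }.join
end

puts "|".ord.to_s(16)       # prints "7c", the ASCII code of "|" in hex
puts percent_encode("|")    # prints "%7C"
puts CGI.escape("|")        # prints "%7C"
puts percent_encode("^(")   # prints "%5E%28"
```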

Similarly, the CGI class can be used to escape HTML or XML text that should be displayed verbatim in a browser. For example, the string "<some_stuff>" would not display properly in a browser. If there is a need to display HTML or XML literally in a browser (in an HTML tutorial, for example), the CGI class offers support for translating special characters to their appropriate entities:

    require "cgi"
    some_text = "<B>This is how you make text bold</B>"
    translated = CGI.escapeHTML(some_text)
    # "&lt;B&gt;This is how you make text bold&lt;/B&gt;"
    puts CGI.unescapeHTML(translated)
    # Prints "<B>This is how you make text bold</B>"
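The same method is useful whenever user-supplied text is placed into a page, as in the greeting from the first example of this section. The sketch below assumes a hypothetical helper named `greeting`; it is one possible way to keep markup typed by a user from being interpreted by the browser.

```ruby
require "cgi"

# Hypothetical helper: builds the greeting from the first example,
# escaping the name so that any markup in it is shown literally.
def greeting(name)
  "<B><I>Hello</I>, #{CGI.escapeHTML(name)}!</B>"
end

puts greeting("Dali")      # prints "<B><I>Hello</I>, Dali!</B>"
puts greeting("<script>")  # prints "<B><I>Hello</I>, &lt;script&gt;!</B>"
```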

19.1.2. Displaying and Processing Forms

The most common way of interacting with CGI programs is through HTML forms. HTML forms are created by using specific tags that will be translated to input widgets in a browser. A full discussion or reference is beyond the scope of this text, but numerous references are available both in books and on the Web.

The CGI class offers generation methods for all of the HTML form elements. The following example shows how to both display and process an HTML form.

    require "cgi"

    # In current Ruby versions, cgi['ramblings'] returns a String
    # (empty when the field is absent), so it is used directly.
    def reverse_ramblings(ramblings)
      return " " if ramblings.empty?
      chunks = ramblings.split(/\s+/)
      chunks.reverse.join(" ")
    end

    cgi = CGI.new("html4")
    cgi.out do
      cgi.html do
        cgi.body do
          cgi.h1 { "sdrawkcaB txeT" } +
          cgi.b { reverse_ramblings(cgi['ramblings']) } +
          cgi.form("action" => "/cgi-bin/rb/form.cgi") do
            cgi.textarea("ramblings") { cgi['ramblings'] } + cgi.submit
          end
        end
      end
    end

This example displays a text area, the contents of which will be tokenized into words and reversed. For example, typing "This is a test" into the text area would yield "test a is This" after processing. The form method of the CGI class can accept a method parameter, which will set the HTTP method (GET, POST, and so on) to be used on form submittal. The default, used in this example, is POST.

This example contains only a small sample of the form elements available in an HTML page. For a complete list, go to any HTML reference.

19.1.3. Working with Cookies

HTTP is, as mentioned previously, a stateless protocol. This means that, after a browser finishes a request to a website, the web server has no way to distinguish its next request from any other arbitrary browser on the Web. This is where HTTP cookies come into the picture. Cookies offer a way, albeit somewhat crude, to maintain state between requests from the same browser.

The cookie mechanism works by way of the web server issuing a command to the browser, via an HTTP response header, asking the browser to store a name/value pair. The data can be stored either in memory or on disk. For every successive request to the cookie's specified domain, the browser will send the cookie data in an HTTP request header.
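As a rough illustration of that exchange, the sketch below builds the response header a server would send and parses the request header a browser would send back. The helper names `set_cookie_header` and `parse_cookie_header` are hypothetical, and cookie attributes such as expiry, path, and domain are deliberately left out.

```ruby
require "cgi"

# Hypothetical helper: the response header asking the browser to
# store a name/value pair. Both parts are URL-encoded.
def set_cookie_header(name, value)
  "Set-Cookie: #{CGI.escape(name)}=#{CGI.escape(value)}"
end

# Hypothetical helper: turns the value of a "Cookie:" request header,
# such as "a=1; b=2", back into a Hash of decoded names and values.
def parse_cookie_header(header_value)
  header_value.to_s.split(/;\s*/).each_with_object({}) do |pair, cookies|
    name, value = pair.split("=", 2)
    next if name.nil? || name.empty?
    cookies[CGI.unescape(name)] = CGI.unescape(value.to_s)
  end
end

puts set_cookie_header("kabhi", "lastaccess=now")
# prints "Set-Cookie: kabhi=lastaccess%3Dnow"

cookies = parse_cookie_header("kabhi=lastaccess%3Dnow; theme=dark")
puts cookies["kabhi"]   # prints "lastaccess=now"
puts cookies["theme"]   # prints "dark"
```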

Of course, you could read and write all of these cookies manually, but you've probably already guessed that you're not going to need to. Ruby's CGI libraries provide a Cookie class that conveniently handles these chores.

    require "cgi"

    lastacc = CGI::Cookie.new("kabhi",
                              "lastaccess=#{Time.now.to_s}")
    cgi = CGI.new("html3")
    if cgi.cookies.size < 1
      cgi.out("cookie" => lastacc) do
        "Hit refresh for a lovely cookie"
      end
    else
      cgi.out("cookie" => lastacc) do
        cgi.html do
          "Hi, you were last here at: " +
          "#{cgi.cookies['kabhi'].join.split('=')[1]}"
        end
      end
    end

Here, a cookie called "kabhi" is created, with the key "lastaccess" set to the current time. Then, if the browser has a previous value stored for this cookie, it is displayed. The cookies are represented as an instance variable on the CGI class and stored as a Hash. Each cookie can store multiple key/value pairs, so when you access a cookie by name, you will receive an array.

19.1.4. Working with User Sessions

Cookies are fine if you want to store simple data and you don't mind the browser being responsible for persistence. But, in many cases, data persistence needs are a bit more complex. What if you have a lot of data you want to maintain persistently and you don't want to have to send it back and forth from the client and server with each request? What if there is sensitive data you need associated with a session and you don't trust the browser with it?

For more advanced persistence in web applications, use the CGI::Session class. Working with this class is similar to working with the CGI::Cookie class in that values are stored and retrieved via a hashlike structure.

    require "cgi"
    require "cgi/session"

    cgi = CGI.new("html4")
    sess = CGI::Session.new(cgi, "session_key" => "a_test",
                                 "prefix"      => "rubysess.")
    lastaccess = sess["lastaccess"].to_s
    sess["lastaccess"] = Time.now.to_s   # session values are stored as Strings
    if cgi['bgcolor'] =~ /[a-z]/
      sess["bgcolor"] = cgi['bgcolor']
    end

    cgi.out do
      cgi.html do
        cgi.body("bgcolor" => sess["bgcolor"]) do
          "The background of this page " +
          "changes based on the 'bgcolor' " +
          "each user has in session. " +
          "Last access time: #{lastaccess}"
        end
      end
    end

Accessing "/thatscript.cgi?bgcolor=red" would turn the page red for a single user for each successive hit until a new "bgcolor" was specified via the URL. CGI::Session is instantiated with a CGI object and a set of options in a Hash. The optional session_key parameter specifies the key that will be used by the browser to identify itself on each request. Session data is stored in a temporary file for each session, and the prefix parameter assigns a string to be prepended to the filename, making your sessions easy to identify on the filesystem of the server.
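The file-per-session idea can be sketched in a few lines. The class below, `TinyFileSession`, is hypothetical and is not how CGI::Session is actually implemented; it only illustrates storing String values in one file per session, with a prefix prepended to the filename. The file format (one URL-encoded `key=value` pair per line) is likewise an assumption made for the example.

```ruby
require "cgi"
require "securerandom"
require "tmpdir"

# Hypothetical, simplified session store: one file per session ID.
class TinyFileSession
  attr_reader :id, :path

  # The ID must never contain path separators; random hex digits are safe.
  def initialize(dir, prefix: "rubysess.", id: SecureRandom.hex(16))
    @id = id
    @path = File.join(dir, prefix + id)
    @data = read_file
  end

  def [](key)
    @data[key]
  end

  # Values are converted to Strings before being stored.
  def []=(key, value)
    @data[key] = value.to_s
  end

  # Writes one URL-encoded "key=value" pair per line.
  def save
    lines = @data.map { |k, v| "#{CGI.escape(k)}=#{CGI.escape(v)}" }
    File.write(@path, lines.join("\n"))
  end

  private

  def read_file
    return {} unless File.exist?(@path)
    File.readlines(@path, chomp: true).each_with_object({}) do |line, data|
      key, value = line.split("=", 2)
      data[CGI.unescape(key)] = CGI.unescape(value.to_s)
    end
  end
end

Dir.mktmpdir do |dir|
  first = TinyFileSession.new(dir)
  first["bgcolor"] = "red"
  first.save

  # A later request presenting the same ID sees the stored value.
  second = TinyFileSession.new(dir, id: first.id)
  puts second["bgcolor"]                            # prints "red"
  puts File.basename(second.path).start_with?("rubysess.")  # prints "true"
end
```

In a real application, the session ID would travel between browser and server in a cookie or request parameter, which is the role the session_key option plays for CGI::Session.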

CGI::Session still lacks many features, such as the capability to store objects other than Strings, session storage across multiple servers, and other "nice-to-haves." Fortunately, a pluggable database_manager mechanism is already in place and would make some of these features easy to add. If you do anything exciting with CGI::Session, be sure to share it.
