19.7. Ruby and the Web Server

One of the most common servers in use today is Apache. If you use Apache, you need to know about the mod_ruby module, presented in section 19.7.1, "Using mod_ruby."

Another useful concept on the server side is embedded Ruby; two tools for this job are erb (covered here) and eruby. This enables you to embed Ruby code into text (typically HTML or XML) so that it can have data inserted dynamically. This is covered in section 19.7.2, "Using erb."

Some developers in the Ruby community have implemented web servers in Ruby. Of course, you might be wondering why anyone would be concerned with writing a new web server when plenty of good ones, such as Apache, already exist.

There are several situations in which you might actually want your own dedicated web server in Ruby. One is to handle web pages in a specialized way, such as sacrificing functionality for speed, or automatically translating special markup to HTML.

Second, you might also want to experiment with the behavior of the server and its interaction with external code such as CGIs; you might want to play with your own ideas for creating an application server or a server-side development environment. We all know that Ruby is a fun language for software experimentation.

Third, you might want to embed a web server inside another application. This possibility is sometimes exploited by developers who want to expose the functionality of a software system to the outside world; the HTTP protocol is well-defined and simple, and web browsers that serve as clients are everywhere. This trick can even be used as a remote debugging tool, assuming that the system updates its internal state frequently and makes it available to the embedded server.

A final reason is that a small self-contained web server can simplify deployment and configuration. For example, restarting the server in a Rails application is much simpler with WEBrick than if it used Apache by default.

With these ideas in mind, let's look at what is available in the Ruby arena where web servers are concerned. In the past, there have been at least four such servers; as of summer 2006, the two most important ones are WEBrick and Mongrel. These are presented in sections 19.7.3 and 19.7.4, respectively.

19.7.1. Using `mod_ruby`

Typically when a CGI script is written in an interpreted language, an instance of the interpreter is launched with every invocation of the CGI. This can be expensive in terms of server utilization and execution time.

The Apache server solves this problem by allowing loadable modules that in effect attach themselves to the server and become part of it. These are loaded dynamically as needed and are shared by all the scripts that depend on them. The mod_ruby package (available from the Ruby Application Archive) is such a module.

The mod_ruby package implements several Apache directives. Some of these are:

RubyRequire: Specify one or more libraries needed.
RubyHandler: Specify a handler for ruby-object.
RubyPassEnv: Specify names of environment variables to pass to scripts.
RubySetEnv: Set environment variables.
RubyTimeOut: Specify a timeout value for Ruby scripts.
RubySafeLevel: Set the $SAFE level.
RubyKanjiCode: Set the Ruby character encoding.

The software also provides Ruby classes and modules for interacting with Apache. The Apache module (using module here in the Ruby sense) has a few module functions such as server_version and unescape_url; it also contains the Request and Table classes.

Apache::Request is a wrapper for the request_rec data type, defining methods such as request_method, content_type, readlines, and more. The Apache::Table class is a wrapper for the table data type, defining methods such as get, add, and each.

Extensive instructions are available for compiling and installing the mod_ruby package. Refer to its accompanying documentation (or the equivalent information on the Web).

19.7.2. Using `erb`

First, let's dispel any confusion over terminology. We're not talking here about embedding a Ruby interpreter in an electronic device such as a TV or a toaster. We're talking about embedding Ruby code inside text.

Second, we'll note that there is more than one scheme for embedding Ruby code in text files. This section discusses only the most common tool, which is erb (created by Masatoshi Seki).

Why do we mention such a tool in connection with the Web? Obviously, it's because the most common form of text in which we'll embed Ruby code is HTML (or XML).

Having said that, it's conceivable there might be other uses for it. Perhaps it could be used in an old-fashioned text-based adventure game, or in some kind of mail-merge utility, or as part of a cron job to create a customized message-of-the-day file (/etc/motd) every night at midnight. Don't let your creativity be constrained by our lack of imagination. Feel free to dig up new and interesting uses for erb and share them with the rest of us. Most of the examples we give here are generic (and thus contrived); they don't have much to do with HTML specifically.

The erb utility is simply a filter or preprocessor. A special notation is used to delimit Ruby code, expressions, and comments; all other text is simply passed through "as is."

The symbols <% and %> are used to mark the pieces of text that will be treated specially. There are three forms of this notation, varying in the first character inside the "tag."

If it is an equal sign (=), the tag is treated as a Ruby expression that is evaluated; the resulting value is inserted at the current location in the text file. Here is a sample text file:

This is <%= "ylno".reverse %> a test. Do <%= "NOT".downcase %> be alarmed.

Assuming the file for this example is called myfile.txt, we can filter it in this way:

erb myfile.txt

The output, by default written to standard output, will look like this:

This is only a test. Do not be alarmed.

We can also use the character # to indicate a comment.

Life <%# so we've heard %> is but a dream.

As you'd expect, the comment is ignored. The line above will produce this line of output.

Life is but a dream.
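The same substitutions can also be performed from inside a Ruby program by using the ERB class from the standard library instead of the erb command. The following is a minimal sketch; the helper name render_template is our own invention for illustration.

```ruby
require 'erb'

# Hypothetical helper: run a template string through ERB and return the text.
def render_template(source)
  ERB.new(source).result
end

# <%= ... %> inserts the value of the expression.
puts render_template('This is <%= "ylno".reverse %> a test.')

# <%# ... %> is a comment and contributes nothing to the output.
puts render_template("Life <%# so we've heard %> is but a dream.")
```

Note that the text on either side of a comment tag is passed through untouched, including the blank space before and after it.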

Any other character following the percent sign will be taken as a piece of Ruby code. The code is executed, but its value is not placed into the text stream. For readability, I recommend using a blank space here, though erb does not demand it.

In this example, neither tag inserts any text. The first simply evaluates a string and discards the value. The second calls puts, which under erb writes directly to standard output at the moment the tag is executed rather than into the text being generated.

The answer is <% "42" %>. Or rather, the answer is <% puts "42" %>.

So when this file is run through erb, the 42 appears on a line of its own, ahead of the template text:

42
The answer is . Or rather, the answer is .

To place a value into the text itself, use the <%= form.
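In practice, plain <% %> tags are most often used for control flow wrapped around ordinary text and <%= %> tags. Here is a small sketch using the ERB class from the standard library; the template and the names LOOP_TEMPLATE and render_loop are invented for illustration, and the trim_mode keyword assumes a reasonably recent Ruby.

```ruby
require 'erb'

# Hypothetical template: the <% %> tags carry the loop, and the <%= %> tag
# inside the loop body inserts each value.
LOOP_TEMPLATE = <<~TEMPLATE
  Answers:
  <% [40, 41, 42].each do |n| %>
  * <%= n %>
  <% end %>
TEMPLATE

# trim_mode "<>" drops the newline after a line that starts with <% and
# ends with %>, so the loop tags leave no blank lines behind.
def render_loop
  ERB.new(LOOP_TEMPLATE, trim_mode: "<>").result
end

puts render_loop
```

The loop body is repeated once per element, so the text between the two code tags acts much like the body of an ordinary Ruby block.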

The effect of the Ruby code is cumulative. For example, a variable defined in one tag may be used in a subsequent tag.

<% x=3; y=4; z=5 %> Given a triangle of sides <%=x%>, <%=y%>, and <%=z%>, we know it is a right triangle because <%= x*x %> + <%= y*y %> = <%= z*z %>.

The spaces we used inside the tags in the last line are not necessary, but they do increase readability as we've said. The output will be:

Given a triangle of sides 3, 4, and 5, we know it is a right triangle because 9 + 16 = 25.
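When a template is evaluated from a Ruby program, the tags can also see variables that were defined outside the template, provided a binding is passed in. The following sketch assumes the ERB class from the standard library; the names TRIANGLE_TEMPLATE and describe_triangle are hypothetical.

```ruby
require 'erb'

TRIANGLE_TEMPLATE =
  'Sides <%= x %>, <%= y %>, and <%= z %>: ' \
  '<%= x*x %> + <%= y*y %> = <%= z*z %>.'

# Hypothetical helper: x, y, and z are ordinary method arguments; passing
# `binding` makes them visible to the tags in the template.
def describe_triangle(x, y, z)
  ERB.new(TRIANGLE_TEMPLATE).result(binding)
end

puts describe_triangle(3, 4, 5)
```

This keeps the data in Ruby code and the wording in the template, which is usually the more maintainable arrangement.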

Try putting a syntax error inside a tag. You'll find that erb has very verbose reporting; it actually prints out the generated Ruby code and tells us as precisely as it can where the error is.

What if we want to include one of the "magic" strings as a literal part of our text? You might be tempted to try a backslash to escape the characters, but this won't work. We recommend a technique like the following.

There is a less-than-percent <%="<%"%> on this line and a percent-greater-than <%="%"+">"%> on this one. Here we see <%="<%="%> and <%="<%#"%> as well.

The output then will be:

There is a less-than-percent <% on this line and a percent-greater-than %> on this one. Here we see <%= and <%# as well.

Note that it's a little easier to embed an opening symbol than a closing one. This is because they can't be nested, and erb is not smart enough to ignore a closing symbol inside a string.
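As an alternative to building the delimiters out of strings, the ERB library documents the escape sequences <%% and %%>. The sketch below uses <%% in ordinary text and %%> inside a tag; the constant names are ours, and it is worth checking the behavior against the version of erb you have installed.

```ruby
require 'erb'

# <%% in ordinary template text produces a literal <% in the output.
OPENING = ERB.new('An opening delimiter: <%%').result

# %%> inside a tag stands for a literal %> instead of ending the tag.
CLOSING = ERB.new('A closing delimiter: <%= "%%>" %>').result

puts OPENING
puts CLOSING
```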

The older eruby tool, for its part, has certain features that are tailored to HTML. Its -M flag can be used to specify a mode of operation; the valid modes are f, c, and n.

The f mode (filter) is the default, so -Mf never needs to be given explicitly. The -Mc option means CGI mode; it prints all errors as HTML. The -Mn option means NPH-CGI mode ("no-parse-headers"); it outputs extra HTTP headers automatically. Both CGI and NPH-CGI modes set $SAFE to be 1 for security reasons (assuming that the application is a CGI and thus may be invoked by a hostile user). The -n flag (or the equivalent --noheader) will suppress CGI header output.

It's possible to set up the Apache web server to recognize embedded Ruby pages. You do this by associating the type application/x-httpd-erb with some extension (.rhtml being a logical choice) and defining an action that associates this type with the eruby executable. For more information, consult the Apache documentation.

19.7.3. Using WEBrick

WEBrick is the work of Masayoshi Takahashi and Yuuzou Gotou (with patches from many others). It is a full-featured HTTP server library and is part of the standard Ruby distribution. The name apparently is related to the word brick, meaning that it is small, compact, and self-contained.

WEBrick is ignorant of most of the details of web applications. It doesn't know about user sessions or any such thing; it only knows servlets which act independently of each other. If you want higher-level functionality, look for it in another library (possibly layered on top of WEBrick, like IOWA or Tofu), or write it yourself.

The basic usage of WEBrick is as follows: Create a server instance; define any mount handlers; define signal handlers; and start the server. Here is a small example:

require 'webrick'
server = WEBrick::HTTPServer.new(:DocumentRoot => '.')
# (No mount handlers in this simple example)
trap('INT')  { server.shutdown }
trap('TERM') { server.shutdown }
server.start

If you run the preceding code example, you will get a web server running on port 80 like any other. It serves files from the current directory.

To create a servlet, inherit from the WEBrick::HTTPServlet::AbstractServlet class. Then mount that servlet using a URI prefix. When the server tries to handle a URL, it looks for the longest prefix (that is, the best match). Here is an "empty" example (with handlers that don't do anything):

class EventsHandler < WEBrick::HTTPServlet::AbstractServlet
  # ...
end
class RecentHandler < WEBrick::HTTPServlet::AbstractServlet
  # ...
end
class AlphaHandler  < WEBrick::HTTPServlet::AbstractServlet
  # ...
end
# ...
server.mount('/events', EventsHandler)
server.mount('/events/recent', RecentHandler)
server.mount('/events/alpha', AlphaHandler)
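To make the longest-prefix rule concrete, here is a simplified model of a mount table in plain Ruby. It is only an illustration of the idea, not WEBrick's own lookup code; the MountTable class and the use of symbols in place of servlet classes are assumptions made for the example.

```ruby
# A simplified model of longest-prefix mounting. This is only an
# illustration of the idea, not WEBrick's own lookup code.
class MountTable
  def initialize
    @mounts = {}          # URI prefix => handler
  end

  def mount(prefix, handler)
    @mounts[prefix] = handler
  end

  # Return the handler whose prefix is the longest one matching the path.
  # A prefix matches when the path equals it or continues with a slash.
  def lookup(path)
    prefix = @mounts.keys
                    .select { |p| path == p || path.start_with?(p + '/') }
                    .max_by(&:length)
    prefix && @mounts[prefix]
  end
end

table = MountTable.new
table.mount('/events', :EventsHandler)
table.mount('/events/recent', :RecentHandler)
table.mount('/events/alpha', :AlphaHandler)

p table.lookup('/events/recent/today')   # the most specific mount wins
p table.lookup('/events/archive')        # falls back to the /events mount
```

A path that matches no prefix at all yields nil in this model; a real server would answer such a request with a "not found" status.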

How does a servlet work? The basic idea is that for every HTTP operation you want to support (such as GET), you define a corresponding method (such as do_GET). If you're used to writing software to contact a web server, you now have to think backwards. Now your code is the web server. Rather than getting back a code like 404, you'll be sending that code. Here is a very simple example:

class TinyHandler < WEBrick::HTTPServlet::AbstractServlet
  def do_GET(request, response)
    # Process request, return response
    status, ctype, body = process_request(request)
    response.status = status
    response['Content-type'] = ctype
    response.body = body
  end
  def process_request(request)
    text = "A very short web page..."
    return 200, "text/html", text
  end
end
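The way an HTTP method name selects a do_ method can be modeled in a few lines of plain Ruby. The sketch below is a toy, not WEBrick's implementation: ToyRequest, ToyResponse, ToyServlet, and HelloServlet are hypothetical names, and the choice of a 405 status for unsupported methods is our assumption about reasonable behavior.

```ruby
# Simplified stand-ins for the request and response objects; these Structs
# are hypothetical and much smaller than WEBrick's real classes.
ToyRequest  = Struct.new(:request_method, :path)
ToyResponse = Struct.new(:status, :body)

# A toy base class showing the idea: the HTTP method name selects do_GET,
# do_POST, and so on. Unsupported methods get a 405 status.
class ToyServlet
  def service(request, response)
    handler = "do_#{request.request_method}"
    if respond_to?(handler)
      public_send(handler, request, response)
    else
      response.status = 405
      response.body = "Method Not Allowed"
    end
    response
  end
end

class HelloServlet < ToyServlet
  def do_GET(request, response)
    response.status = 200
    response.body = "Hello from #{request.path}"
  end
end

servlet = HelloServlet.new
p servlet.service(ToyRequest.new("GET", "/hi"), ToyResponse.new)
p servlet.service(ToyRequest.new("DELETE", "/hi"), ToyResponse.new)
```

Supporting another HTTP operation in this model means nothing more than defining another do_ method in the subclass.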

A more sophisticated servlet would likely have an initialize method. If it did, any parameters you needed to pass to it would go on the end of the server.mount call.

Fortunately, you don't have to write your own servlet for every little task you want WEBrick to perform. It has several predefined handlers of its own (all in the WEBrick::HTTPServlet namespace):

FileHandler
ProcHandler
CGIHandler
ERBHandler

Since ProcHandler is especially interesting, let's look at it briefly. It allows us to be "lazy" and avoid subclassing the AbstractServlet class. Instead, we pass in a simple proc:

# Mount a block directly...
server.mount_proc('/here') do |req, resp|
  resp.body = "This is the output of my block."
end
# Create a Proc and mount it...
some_proc = Proc.new do |req, resp|
  resp.body = "This is the output from my Proc."
end
server.mount_proc('/there', some_proc)
# Another way to mount a Proc...
my_handler = WEBrick::HTTPServlet::ProcHandler.new(some_proc)
server.mount('/another', my_handler)

WEBrick also has many other convenient features such as hooks that you can use to perform extra little tasks as needed (for example, trigger a task at server startup). WEBrick also has extensive logging capabilities, HTTP authentication, and other features. For more details, consult the online documentation at http://ruby-doc.org or elsewhere.

19.7.4. Using Mongrel

Mongrel is the work of Zed Shaw (with contributions by other people). It was created largely to address the performance issues that WEBrick has. As such, it is very successful; it is many times faster than WEBrick (though exact measurements are hard to make, since they depend on so many variables).

Mongrel is commonly used in conjunction with Rails, and the Mongrel documentation tends to be Rails-centric. However, it is not "tied" to Rails; it can be used in many other contexts.

It is true that Mongrel is more of an application, whereas WEBrick is more of a library. They do certainly have many features in common, but their usage and their APIs are different.

Very often you will invoke Mongrel as an application without writing any code that specifically supports it. It takes the three basic commands start, stop, and restart; the start command has a large number of command line parameters that can modify its behavior. Among these are --port portnum, --log filename, and --daemonize. For a full list, issue this command:

mongrel_rails start -h

Running with the defaults in place is reasonable, but sooner or later you will want to do something special or uncommon. For these cases, you can use configuration files.

The easy way to write a config file with Mongrel is to use the -G option. This will write a config file containing all the other options you specified on the command line. For example, you could use this command line:

mongrel_rails start -G myconfig.yml -p 3000 -r /home/hal/docs -l my.log

Then these options would be stored (in YAML form) in the myconfig.yml file. (With -G, the server exits after writing the config file.)

To read a config file, use -C:

mongrel_rails start -C myconfig.yml

Don't mix -C with other options. This flag assumes that the specified file contains all the options you want to use.

Mongrel has its own API for tweaking the server's behavior with fine granularity. The -S option allows you to specify the name of a script written with this API (which is like a small DSL or Domain-Specific Language). The documentation gives this example script (which adds a directory handler for another directory):

# File: config/mongrel.conf
uri "/newstuff", :handler => DirHandler.new("/var/www/newstuff")
# Invoke this by issuing the command:
# mongrel_rails start -S config/mongrel.conf

It's also possible to use Mongrel much in the way we use WEBrick. The following example works fine and is fairly intuitive:

require 'mongrel'
class TinyHandler < Mongrel::HttpHandler
  def process(request, response)
    response.start(200) do |head, out|
      head["Content-Type"] = "text/html"
      out.write <<-EOF
        This is only a test...
      EOF
    end
  end
end
server = Mongrel::HttpServer.new("0.0.0.0", "3000")
server.register("/stuff", TinyHandler.new)
server.register("/other", Mongrel::DirHandler.new("./other"))
server.run.join    # Wait on server thread

If you are a sophisticated user of Mongrel, you might be interested in its GemPlugin system. These are basically autoloaded gems which become "a part" of the functionality of Mongrel. For example, the "Mongrel cluster" plugin allows easy management of a cluster of Mongrel servers.

There is much more to Mongrel than we've seen here. For a discussion of logging, debugging, the details of the gem plugin system, and more, you can go to the online documentation at http://mongrel.rubyforge.org.
