Prethreading


 
Network Programming with Perl
By Lincoln D. Stein
Chapter 15. Preforking and Prethreading

If you are working with a threading version of Perl, you can design your server to use a prethreading architecture. Prethreading is similar to preforking, except that instead of launching multiple processes to call accept() , the prethreading server creates multiple threads to deal with incoming connections. As it is for the preforking server, the rationale is to avoid the overhead of creating a new thread for every incoming connection.

In this section we develop a prethreading Web server that implements the same adaptive features as the preforking server from the previous sections.

A Threaded Web Server

Before we jump into the prethreaded Web server, we'll review a plain threaded version (Figure 15.8). The idea is that the main thread runs an accept() loop. Each incoming connection is handed off to a new "worker" thread that handles the connection and then exits. Thus, one new thread is created and destroyed for each incoming connection.

Figure 15.8. Threaded Web server

graphics/15fig08.gif

Lines 1–7: Load modules. In addition to the standard modules, we load the Thread module, making Perl's threaded API available for our use.

Lines 8–10: Create constants, globals, and interrupt handlers. We select a path to use for the server's PID file and install signal handlers that will gracefully terminate the server when it receives either a TERM or INT signal.

Lines 11–17: Create listening socket and autobackground. We create the listening socket and go into the background by calling the Daemon module's init_server() routine. We again create an IO::Select object to use in the main loop to avoid being blocked in accept() when a termination signal is received.

Lines 18–24: accept() loop. The accept() loop is similar to others we've seen in this chapter. For each new incoming connection, we call Thread->new() to launch a new thread of execution. The new thread will run the do_thread() subroutine.

Lines 26–31: do_thread() subroutine. In the do_thread() subroutine, we first detach our thread so that it isn't necessary for the main thread to join() us after we are through. We then call handle_connection(), and when this is done, close the socket.

Like the other servers in this chapter, web_thread1.pl autobackgrounds itself at startup time. Its status messages are written to the syslog, and you can stop it by sending it a TERM or INT signal in this way:

    % kill -TERM `cat /tmp/web_thread.pid`

The simplicity of the threaded design is striking.
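Because the figure is not reproduced here, the following is a minimal sketch of the same structure, not the book's web_thread1.pl. It assumes a Perl built with thread support and uses the current threads and threads::shared modules in place of the older Thread module. It leaves out the Daemon and Web modules, syslog output, and the PID file. The handle_connection() shown is a stand-in that returns a fixed page, and the client thread at the end is there only so that the sketch connects to itself once and then stops.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use IO::Socket::INET;
use IO::Select;

my $DONE :shared = 0;                        # true when the server should stop
$SIG{INT} = $SIG{TERM} = sub { $DONE = 1 };

# Stand-in for the book's handle_connection(): skip the request, send a fixed page.
sub handle_connection {
    my $c = shift;
    while (defined(my $line = <$c>)) { last if $line =~ /^\r?\n$/ }
    print $c "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\r\n";
}

sub do_thread {
    my $c = shift;
    threads->detach;                         # the main thread never has to join() us
    handle_connection($c);
    close $c;
}

sub accept_loop {
    my $listen = shift;
    my $in     = IO::Select->new($listen);
    until ($DONE) {
        next unless $in->can_read(0.2);      # time out so that $DONE is rechecked
        next unless my $c = $listen->accept;
        threads->create(\&do_thread, $c);    # one new thread per connection
    }
}

# Demo only: listen on an ephemeral loopback port and fetch one page from ourselves.
my $listen = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1', LocalPort => 0, Listen => 5, ReuseAddr => 1,
) or die "listen: $!";
my $port = $listen->sockport;

my $client = threads->create(sub {
    my $s = IO::Socket::INET->new(PeerAddr => '127.0.0.1', PeerPort => $port)
        or die "connect: $!";
    print $s "GET / HTTP/1.0\r\n\r\n";
    my $reply = do { local $/; <$s> };       # read until the worker closes the socket
    $DONE = 1;                               # plays the role of a TERM signal
    return $reply;
});

accept_loop($listen);
my $reply = $client->join;
print $reply;
```

The can_read() timeout of 0.2 seconds is an arbitrary choice for the demo; it only bounds how long the loop takes to notice that $DONE has been set.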

Simple Prethreaded Server

In contrast to the threaded server, the prethreaded server works by creating its worker threads in advance. Each worker independently runs an accept() loop. The very simplest prethreaded server looks like this:

    use Thread;
    use IO::Socket;
    use Web;
    use constant PRETHREAD => 5;

    my $socket = IO::Socket::INET->new( LocalPort => $port,
                                        Listen    => SOMAXCONN,
                                        Reuse     => 1 ) or die;
    Thread->new(\&do_thread,$socket) for (1..PRETHREAD);
    sleep;

    sub do_thread {
        my $socket = shift;
        while (1) {
            next unless my $c = $socket->accept;
            handle_connection($c);
            close $c;
        }
    }

The main thread creates a listening socket and then launches PRETHREAD threads of execution, each running the subroutine do_thread(). The main thread then goes to sleep. Meanwhile, each thread enters an accept() loop in which it waits for an incoming connection, handles it, and then goes back to waiting. Which thread handles which connection is nondeterministic.

Of course, things are not quite this simple. As written, this code won't work on all platforms, because on some systems the call to accept() fails if more than one thread calls it simultaneously. We need a mechanism to ensure that only one thread calls accept() at a time.

Fortunately, because we are using threads, we can take advantage of the built-in lock() call and don't have to resort to locking an external file. We simply declare a scalar global variable $ACCEPT_LOCK and modify the do_thread() routine to look like this:

    sub do_thread {
        my $socket = shift;
        while (1) {
            my $c;
            {
                lock $ACCEPT_LOCK;
                $c = $socket->accept;
            }
            # test outside the bare block: a next inside it would only leave the block
            next unless $c;
            handle_connection($c);
            close $c;
        }
    }

The while() loop now contains an inner block that defines the scope of the lock. Within that block we attempt to get a lock on $ACCEPT_LOCK. Due to the nature of thread locking, only one thread can obtain a lock at a time; the others are suspended until the lock becomes available. After obtaining the lock, we call accept(), blocking until there is an incoming connection. Immediately after accepting a new connection, we release the lock by virtue of leaving the scope of the inner braces. This allows another thread to obtain the lock and call accept(). We now handle the connection as before.
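The scoping rule can be checked in isolation. The following is a minimal sketch rather than the book's code: it assumes a Perl built with thread support, uses the current threads and threads::shared modules instead of Thread, and replaces the blocking accept() with a short sleep. The counters $inside and $max_inside are hypothetical names introduced only to record how many threads are inside the locked block at once.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $ACCEPT_LOCK :shared;       # same role as $ACCEPT_LOCK in the text
my $inside      :shared = 0;   # hypothetical: threads currently inside the locked block
my $max_inside  :shared = 0;   # hypothetical: highest value $inside ever reached

sub do_thread {
    for (1 .. 3) {
        {
            lock $ACCEPT_LOCK;                  # suspends until no other thread holds the lock
            $inside++;
            $max_inside = $inside if $inside > $max_inside;
            select(undef, undef, undef, 0.01);  # stand-in for a blocking accept()
            $inside--;
        }                                       # lock released here, at the closing brace
        # the connection would be handled here, outside the lock
    }
}

my @workers = map { threads->create(\&do_thread) } 1 .. 5;
$_->join for @workers;
print "most threads inside the locked block at once: $max_inside\n";
```

Because both counters are touched only while the lock is held, $max_inside can never exceed 1, whichever way the threads are scheduled.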

Adaptive Prethreading

Another deficiency in the basic prethreading server is that if all the threads launched at server startup are busy serving connections, incoming connections have to wait. We would like the main thread to launch new threads when needed to handle an increased load on the server and to delete excess threads when the load diminishes.

We can accomplish this using a strategy similar to that of the preforking server by maintaining a global %STATUS hash that the main server thread monitors and each thread updates. Unlike with the preforking server, there's no need to use pipes or shared memory to keep this hash updated. Since the threads are all running in the same process, they can modify %STATUS directly, provided that they take appropriate steps to synchronize their access to the hash by locking it before modifying it.

The keys of %STATUS are the thread identifiers (TIDs), and the values are one of the strings "busy," "idle," or "goner." The first two have the same meaning they did in the preforking servers. We'll explain the third status code later. To simplify the management of %STATUS, we use a small subroutine named status() that allows threads to examine and change the hash in a thread-safe manner. Given a TID, status() returns the status of the indicated thread:

    my $tid    = Thread->self->tid;
    my $status = status($tid);

With two arguments, status() changes the status of the thread:

 status($tid => 'busy'); 

If the second argument is undef, status() deletes the indicated thread from %STATUS entirely.

Each worker thread's accept() loop invokes status() to change the status of the current thread to "idle" before calling accept() and to "busy" after it accepts a connection.

The main thread monitors changes to %STATUS and acts on them. To do its job efficiently, the main thread must have a way to know when a worker has changed %STATUS. The best way is to use a condition variable. Each time through the main thread's loop, it calls cond_wait() on the condition variable, putting itself to sleep until one of the worker threads indicates that the variable has changed. Code in the status() subroutine calls cond_broadcast() whenever a thread updates %STATUS, waking up the main thread and allowing it to manage the change.
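The handshake between the waiting main thread and a broadcasting worker can be sketched as follows. This is an illustration under stated assumptions, not the book's listing: it uses the current threads and threads::shared modules, a single worker, and a hypothetical $changes counter whose only purpose is to let the demo know when to stop waiting.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %STATUS  :shared;       # TID => status string
my $STATUS  :shared;       # condition variable, also used as the lock for %STATUS
my $changes :shared = 0;   # hypothetical counter so that the demo can terminate

sub worker {
    my $tid = threads->tid;
    for my $state (qw(idle busy idle)) {
        lock $STATUS;
        $STATUS{$tid} = $state;
        $changes++;
        cond_broadcast($STATUS);   # wake any thread waiting on $STATUS
    }                              # lock released at the end of each iteration
}

my ($thr, $wakeups) = (undef, 0);
{
    lock $STATUS;                  # taken before the worker starts, so no broadcast is missed
    $thr = threads->create(\&worker);
    while ($changes < 3) {
        cond_wait($STATUS);        # unlocks $STATUS, sleeps, relocks before returning
        $wakeups++;                # a real server would tally idle threads here
    }
}
$thr->join;
my $tid = $thr->tid;
print "worker $tid is $STATUS{$tid} after $changes changes ($wakeups wakeups)\n";
```

The number of wakeups is not fixed. The worker may complete more than one update before the main thread reacquires the lock, which is why the main thread inspects the whole hash after each wakeup instead of assuming that exactly one entry changed.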

The last detail is that the adaptive server needs a way to shut down gracefully. As before, the server responds to the TERM and INT signals by shutting down, but how does the main thread tell its various worker threads that shutdown time has arrived?

There is currently no way to deliver a signal specifically to a thread. The way we finesse this is to have each worker periodically check its status code for a special value of "goner" and then exit. To decommission a worker, the master simply calls status() to set the worker's status code appropriately.

Figure 15.9 lists the prethreaded Web server. Its increased size relative to the simple threaded server indicates the substantial complexity of code that is required to coordinate the activities of the multiple threads.

Figure 15.9. Prethreaded Web server

graphics/15fig09.gif

Lines 1–8: Load modules. We bring in the IO::Socket, IO::File, and IO::Select modules, along with the Thread module. Thread doesn't import the cond_wait() and cond_broadcast() functions by default, so we import those functions explicitly.

Lines 9–14: Define constants. We define the various constants used by the server, including PRETHREAD, the number of threads to launch at startup time; the high and low water marks, which have the same significance as in the preforked servers; and a DEBUG flag to turn on status messages. We also define a MAX_REQUEST constant to control the number of transactions a thread will accept before spontaneously exiting.

Lines 15–18: Declare global variables. $ACCEPT_LOCK, as discussed previously, is used for protecting accept() so that only one thread can accept from the listening socket at a time. %STATUS reports the state of each thread, indexed by its TID, and $STATUS is a condition variable used both to lock %STATUS and to indicate when it has changed. $DONE flags the main thread that the server is shutting down.

Line 19: Install signal handlers. We install a signal handler named terminate() for the INT and TERM signals. This handler sets $DONE to true and returns.

Lines 20–25: Create listening server socket and go into background. We create a listening socket and autobackground by calling init_server(). We also create an IO::Select object containing the listening socket for use by each worker thread.

Line 26: Prelaunch some threads. We launch PRETHREAD threads by calling launch_thread() the appropriate number of times before we enter the main loop.

Lines 27–40: Main thread: monitor worker threads for status changes. The main thread now enters a loop that runs until $DONE is true, indicating that the user has requested server termination. Each time through the loop, we lock the $STATUS condition variable and immediately call cond_wait(), unlocking the condition variable and putting the main thread to sleep until another thread calls cond_broadcast() on the variable.

When cond_wait() returns, we know that a worker thread has signalled that its status has changed and that $STATUS is again locked, protecting us against further changes to %STATUS. We count the number of idle threads and either launch new ones or shut down existing ones to keep the number of idle threads between the low and high water marks. The way we do this is similar to the adaptive preforking servers, except that we cannot kill worker threads with a signal. Instead, we set their status to "goner" and allow them to exit by themselves.
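The water-mark decision itself needs no threads and can be sketched as a pure function. The name plan() and the values chosen for HI_WATER_MARK and LO_WATER_MARK are assumptions made for illustration; the figure may use different values and certainly organizes the code differently.

```perl
use strict;
use warnings;

use constant HI_WATER_MARK => 5;   # assumed value: most idle threads we tolerate
use constant LO_WATER_MARK => 2;   # assumed value: fewest idle threads we tolerate

# Given a snapshot of %STATUS (TID => status), return the number of threads
# to launch followed by the TIDs that should be marked "goner".
sub plan {
    my %status = @_;
    my @idle = sort { $a <=> $b } grep { $status{$_} eq 'idle' } keys %status;
    if (@idle < LO_WATER_MARK) {
        return (LO_WATER_MARK - @idle);                        # too few idle: launch more
    }
    if (@idle > HI_WATER_MARK) {
        return (0, @idle[0 .. @idle - HI_WATER_MARK - 1]);     # too many idle: retire the excess
    }
    return (0);                                                # within bounds: do nothing
}

# Four busy threads and one idle one: one short of the low water mark.
my ($launch, @goners) = plan(1 => 'busy', 2 => 'busy', 3 => 'busy', 4 => 'busy', 5 => 'idle');
print "launch $launch, retire [@goners]\n";

# Eight idle threads: three more than the high water mark.
my ($launch2, @goners2) = plan(map { $_ => 'idle' } 1 .. 8);
print "launch $launch2, retire [@goners2]\n";
```

In the server, the main thread would call launch_thread() the indicated number of times and set the status of each returned TID to "goner" while it still holds the lock on $STATUS.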

Lines 41–47: Clean up. After the main loop has finished, we set each worker thread's status to "goner" and call exit(). Although the main thread has now finished, the server process itself won't exit until each thread has finished processing pending transactions, checked its status code, and exited as well.

Lines 48–67: do_thread() routine. The do_thread() routine forms the body of each worker thread. We begin by recovering the current thread's TID and initializing our status to "idle." We now enter a loop that terminates when our status code becomes "goner" or we have serviced the number of transactions specified by MAX_REQUEST.

We need to poll our status on a periodic basis to recognize when termination has been requested, so we don't want to get blocked in lock() or accept(). To do this, we take advantage of the IO::Select object created by the main thread to call can_read() with a timeout of 1 second. If an incoming connection arrives within that time, we service it. Otherwise, we return to the top of the loop so that we can check that our status hasn't changed.

If can_read() returns true, the socket is ready for accept(). We serialize access to accept() by locking the $ACCEPT_LOCK variable, and call accept(). If this is successful, we set our status to "busy" and handle the connection. After the connection is done, we again set our status to "idle." After the accept() loop is done, we set our status to undef, causing the status() subroutine to remove our TID from the %STATUS hash.
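A worker loop along these lines can be sketched as follows. This is an illustration rather than the code in the figure. It assumes the current threads and threads::shared modules, runs a single worker, shortens the timeout so that the demo finishes quickly, and substitutes a fixed reply for handle_connection(). The status() helper here is a simplified stand-in with one deliberate difference that is our own assumption: once a thread has been marked "goner" the mark cannot be overwritten, so a worker returning to "idle" cannot undo a shutdown request.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use IO::Socket::INET;
use IO::Select;

my %STATUS      :shared;   # TID => 'idle' | 'busy' | 'goner'
my $ACCEPT_LOCK :shared;

# Simplified stand-in for status(): read, or set unless already "goner".
sub status {
    my $tid = shift;
    lock %STATUS;
    $STATUS{$tid} = shift if @_ && ($STATUS{$tid} // '') ne 'goner';
    return $STATUS{$tid};
}

sub do_thread {
    my ($socket, $timeout) = @_;
    my $tid    = threads->tid;
    my $in     = IO::Select->new($socket);
    my $served = 0;
    status($tid => 'idle');
    until (status($tid) eq 'goner') {
        next unless $in->can_read($timeout);   # wake up periodically to recheck our status
        my $c;
        {
            lock $ACCEPT_LOCK;                 # only one thread in accept() at a time
            $c = $socket->accept;
        }
        next unless $c;
        status($tid => 'busy');
        while (defined(my $line = <$c>)) { last if $line =~ /^\r?\n$/ }   # skip the request
        print $c "HTTP/1.0 200 OK\r\n\r\nhello from thread $tid\r\n";
        close $c;
        $served++;
        status($tid => 'idle');
    }
    { lock %STATUS; delete $STATUS{$tid}; }    # remove ourselves before exiting
    return $served;
}

# Demo only: one worker, one request, then decommission the worker.
my $listen = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1', LocalPort => 0, Listen => 5, ReuseAddr => 1,
) or die "listen: $!";
my $thr = threads->create(\&do_thread, $listen, 0.2);

my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1', PeerPort => $listen->sockport,
) or die "connect: $!";
print $client "GET / HTTP/1.0\r\n\r\n";
my $reply = do { local $/; <$client> };        # read until the worker closes the socket
close $client;

status($thr->tid => 'goner');                  # ask the worker to exit
my $served = $thr->join;
print "worker served $served request(s)\n";
```

The worker notices the "goner" mark the next time can_read() times out, so the timeout is the upper bound on how long an idle worker takes to shut down.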

Lines 71–83: status() subroutine. The status() subroutine is responsible for keeping %STATUS up to date. We begin by locking $STATUS so that the hash doesn't change from underneath us. If we were called with only the TID of a thread, we look up its status in %STATUS and return it. Otherwise, if we were provided with a new status code for the TID, we change %STATUS accordingly and call cond_broadcast() on the $STATUS variable in order to alert any threads that are waiting on the variable that %STATUS has been updated.
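A status() subroutine matching this description might look like the sketch below. It is written against the current threads and threads::shared modules, not the Thread module used in the figure, and the value returned when setting a status is our own choice, since the text does not say what the original returns in that case.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %STATUS :shared;   # TID => 'idle' | 'busy' | 'goner'
my $STATUS :shared;   # lock and condition variable for %STATUS

sub status {
    my $tid = shift;
    lock $STATUS;                        # held until status() returns
    return $STATUS{$tid} unless @_;      # one argument: look up and return
    my $new = shift;
    if (defined $new) { $STATUS{$tid} = $new }
    else              { delete $STATUS{$tid} }   # undef: forget this thread
    cond_broadcast($STATUS);             # alert threads waiting on $STATUS
    return $new;                         # assumption: return the value just set
}

status(1 => 'idle');
status(2 => 'busy');
status(2 => undef);                      # thread 2 is removed from %STATUS
print join(' ', map { "$_=>$STATUS{$_}" } sort keys %STATUS), "\n";
```

Testing @_ after the first shift, instead of testing whether the second argument is defined, is what lets the subroutine tell a plain lookup apart from a request to delete the entry.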

When we run the prethreaded Web server with DEBUG true, we can see messages appear in the syslog that indicate the birth and death of each worker thread, interspersed with messages from the master thread that indicate its tally of each worker's status:

 Jun 25 14:03:36 pesto web_prethread1.pl: Thread 1: starting Jun 25 14:03:36 pesto web_prethread1.pl: Thread 2: starting Jun 25 14:03:36 pesto web_prethread1.pl: Thread 3: starting Jun 25 14:03:36 pesto web_prethread1.pl: Thread 4: starting Jun 25 14:03:36 pesto web_prethread1.pl: Thread 5: starting Jun 25 14:03:36 pesto web_prethread1.pl:                 1=>idle 2=>idle 3=>idle 4=>idle 5=>idle Jun 25 14:03:40 pesto web_prethread1.pl:                 1=>busy 2=>idle 3=>idle 4=>idle 5=>idle Jun 25 14:03:40 pesto web_prethread1.pl: Thread 1: handling connection Jun 25 14:03:44 pesto web_prethread1.pl:                 1=>busy 2=>idle 3=>busy 4=>idle 5=>idle Jun 25 14:03:44 pesto web_prethread1.pl: Thread 3: handling connection Jun 25 14:03:47 pesto web_prethread1.pl: Thread 2: handling connection Jun 25 14:03:47 pesto web_prethread1.pl:                 1=>busy 2=>busy 3=>busy 4=>idle 5=>idle Jun 25 14:03:52 pesto web_prethread1.pl: Thread 4: handling connection Jun 25 14:03:52 pesto web_prethread1.pl:                 1=>busy 2=>busy 3=>busy 4=>busy 5=>idle Jun 25 14:03:52 pesto web_prethread1.pl: Thread 6: starting Jun 25 14:03:52 pesto web_prethread1.pl:                 1=>busy 2=>busy 3=>busy 4=>busy 5=>idle 6=>idle 

The NetServer::Generic Module

The NetServer::Generic module, written by Charlie Stross, is a framework for writing your own server applications. It provides the core functionality for managing multiple TCP connections, supporting both the forking and preforking models. It is being updated to provide support for the multiplexing and threading architectures and will probably be ready to use by the time you read this.

NetServer::Generic, which is on CPAN, is straightforward to use. You simply provide the module with configuration information and a callback function to be called when an incoming connection is received, and the module takes care of the rest. Configuration variables allow you to control preforking parameters, such as the maximum and minimum number of idle children and the number of requests a preforked child will accept before it shuts down. NetServer::Generic also sports a flexible access control system that accepts or denies access to the server based on the hostname or IP address of the remote client.

The callback function is invoked with STDIN and STDOUT attached to the socket. This means that a Web server with all the functionality of the examples in this chapter plus access control runs to fewer than 100 lines.

NetServer::Generic does not provide customized logging functions, autobackgrounding, PID file handling, or other more specialized functions frequently required by production servers. However, you can always layer these onto your application. In any case, the module is perfect when you need to get a server up and running fast and inetd provides insufficient performance for your needs.

ISBN: 0201615711
Year: 2000
Pages: 173