Summary

	Network Programming with Perl By Lincoln D. Stein

	Chapter 18. The UDP Protocol

Content

Increasing the Robustness of UDP Applications

Because UDP is unreliable, problems arise when you least expect them. Although the echo client of Figure 18.5 looks simple, it actually contains a hidden bug. To bring out this bug, try pointing the client at an echo server running on a remote UNIX host somewhere on the Internet. Instead of typing directly into the client, redirect its standard input from a large text file, such as /usr/dict/words:

 %  udp_echo_cli1.pl wuarchive.wustl.edu echo </usr/dict/words

If the quality of your connection is excellent, you may see the entire contents of the file scroll by and the command-line prompt reappear after the last line is echoed. More likely, though, you will see the program get partway through the text file and then hang indefinitely. What happened?

Remember that UDP is an unreliable protocol. Any datagram sent to the remote server may fail to reach its destination, and any datagram returned from the server to the local host may vanish into the ether. If the remote server is very busy, it may not be able to keep up with the flow of incoming packets, resulting in buffer overrun errors.

Our echo client doesn't take these possibilities into account. After we send() the message, we blithely call recv(), assuming that a response will be forthcoming. If the response never arrives, we block indefinitely, making the script hang.

This is yet another example of deadlock. We won't get a message from the server until we send it one to echo back, but we can't do that because we're waiting for a message from the server!

As with TCP, we can avoid deadlock either by timing out the call to recv() or by using some form of concurrency to decouple the input from the output.

Timing Out UDP Receives

It's straightforward to time out a call to recv() using an eval{} block and an ALRM handler:

 eval {
    local $SIG{ALRM} = sub { die "timeout\n" };
    alarm($timeout);
    $result = $sock->recv($msg_in, MAX_MSG_LEN);
    alarm(0);
 };
 if ($@) {
    die $@ unless $@ eq "timeout\n";
    warn "Timed out!\n";
 }

We wrap recv() in an eval{} block and set a local ALRM handler that invokes die(). Just prior to making the system call, we call the alarm() function with the desired timeout value. If the function returns normally, we call alarm(0) to cancel the alarm. Otherwise, if the alarm clock goes off before the function returns, the ALRM handler runs and we die. But since this fatal error is trapped within an eval{} block, the effect is to abort the entire block and to leave the error message in the $@ variable. Our last step is to examine this variable and issue a warning if a timeout occurred or die if the variable contains an unexpected error.
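The idiom can be packaged as a subroutine, which makes it easy to try against a local socket. The following is a minimal sketch rather than code from the book: the name recv_with_timeout() and the MAX_MSG_LEN value of 5000 are assumptions made for illustration, and alarm() only accepts whole seconds.

```perl
use strict;
use warnings;
use IO::Socket::INET;

use constant MAX_MSG_LEN => 5000;    # assumed buffer size

# Hypothetical helper: return the next datagram on $sock, or undef if
# nothing arrives within $timeout (whole) seconds.
sub recv_with_timeout {
    my ($sock, $timeout) = @_;
    my $msg_in;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($timeout);
        my $peer = $sock->recv($msg_in, MAX_MSG_LEN);
        my $err  = $!;               # save errno before doing anything else
        alarm(0);                    # recv() returned: cancel the alarm
        defined $peer or die "recv(): $err\n";
        1;
    };
    return $msg_in if $ok;
    alarm(0);                        # make sure no alarm is left pending
    die $@ unless $@ eq "timeout\n"; # rethrow anything unexpected
    return undef;                    # timed out
}
```

A caller can then treat an undefined return value as "no reply yet" and decide for itself whether to retransmit or give up.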

Using a variant of this strategy, we can design a version of the echo client that transmits a message and waits up to a predetermined length of time for a response. If the recv() call times out, we try again by retransmitting the request. If a predetermined number of retransmissions fail, we give up.

Figure 18.6 shows a modified version of the echo client, udp_echo_cli2.pl.

Figure 18.6. udp_echo_cli2.pl implements a timeout on `recv()`


Lines 1–15: Initialize module, create socket. The main changes are two new constants to control the timeouts. TIMEOUT specifies the time, in seconds, that the client will allow recv() to wait for a message. We set it to 2 seconds. MAX_RETRIES is the number of times the client will try to retransmit a message before it assumes that the remote server is not answering.

Lines 16–30: Main loop. We now place a do{} loop around the calls to send() and recv(). The do{} loop retransmits the outgoing message every time a timeout occurs, up to MAX_RETRIES times. Within the do{} loop, we call send() to transmit the message as before, but recv() is wrapped in an eval{} block. The only difference between this code and the generic idiom is that the local ALRM handler bumps up a variable named $retries each time it is invoked. This allows us to track the number of timeouts. After the eval{} block completes, we check whether the number of retries is greater than the maximum retry setting. If so, we issue a short warning and die.
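The retransmission logic described in this walkthrough can be sketched as a single subroutine. This is not the listing from Figure 18.6: send_with_retries() is a hypothetical name, it counts attempts in a for loop instead of incrementing $retries inside the ALRM handler, and it returns undef rather than dying when it gives up.

```perl
use strict;
use warnings;
use IO::Socket::INET;

use constant MAX_MSG_LEN => 5000;    # assumed buffer size

# Hypothetical helper: send $msg on a connected UDP socket, wait up to
# $timeout seconds for a reply, and retransmit on each timeout.
# Returns the reply, or undef after $max_retries unanswered attempts.
sub send_with_retries {
    my ($sock, $msg, $timeout, $max_retries) = @_;
    for my $attempt (1 .. $max_retries) {
        defined($sock->send($msg)) or die "send(): $!\n";
        my $reply;
        my $ok = eval {
            local $SIG{ALRM} = sub { die "timeout\n" };
            alarm($timeout);
            my $peer = $sock->recv($reply, MAX_MSG_LEN);
            my $err  = $!;
            alarm(0);
            defined $peer or die "recv(): $err\n";
            1;
        };
        return $reply if $ok;
        alarm(0);
        die $@ unless $@ eq "timeout\n";   # only timeouts are retried
        warn "Retrying...$attempt\n";
    }
    return undef;
}
```

Note that a reply to the first transmission that arrives late will be accepted as the answer to a later attempt; this sketch makes no effort to tell the two apart.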

The easiest way to test the new and improved echo client is to point it at a port that isn't running the echo service, for example, 2008 on the local host:

 % udp_echo_cli2.pl localhost 2008
 anyone home?
 Retrying...1
 Retrying...2
 Retrying...3
 Retrying...4
 Retrying...5
 timeout

Duplicates and Out-of-Sequence Datagrams

While this timeout code fixes the problem with deadlocks, it opens the door on a new one: duplicates. Instead of being lost, it is possible that the missing response was merely delayed and that it will arrive later. In this case, the program will receive an extra message that it isn't prepared to deal with.

If you are sufficiently dexterous and are using a UNIX machine, you can demonstrate this with the reverse-echo server/echo client pair from Figures 18.4 and 18.6. Launch the echo server and echo client in separate windows. Type a few lines into the echo client to get things going. Now suspend the echo server by typing ^Z, and go back to the client window and type another line. The client will begin to generate timeout messages. Quickly go back to the server window and resume the server by typing the fg command. The client will recover from the timeout and print the server's response. Unfortunately, the client and server are now hopelessly out of synch! The responses the client displays are those from the retransmitted requests, not the current request.

Another problem that we can encounter in UDP communications is out-of-sequence datagrams, in which two datagrams arrive in a different order from that in which they were sent. The general technique for dealing with both these problems is to attach a sequence number to each outgoing message and design the client/server protocol in such a way that the server returns the same sequence number in its response. In this section, we develop a better echo client that implements this scheme. In so doing, we show how select() can be used with UDP sockets to implement timeouts and prevent deadlock.

To implement a sequence number scheme, both client and server have to agree on the format of the messages. Our scheme is a simple one. Each request from client to server consists of a sequence number followed by a ":" character, a space, and a payload of arbitrary length. Sequence numbers begin at 0 and count upward. For example, in this message, the sequence number is 42 and the payload is "the meaning of life":

 42: the meaning of life

The reverse-echo server generates a response that preserves this format. The server's response to the sample request given earlier would be:

 42: efil fo gninaem eht

The modifications to the reverse-echo server of Figure 18.4 are trivial. We simply replace line 19 with a few lines of code that detect messages having the sequence number/payload format and generate an appropriately formatted response.

 if ( $msg_in =~ /^(\d+): (.*)/ ) {
   $msg_out = "$1: " . reverse $2;
 } else {
   $msg_out = reverse $msg_in;
 }

For backward compatibility, messages that are not in the proper format are simply reversed as before. Another choice would be to have the server discard unrecognized messages.
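To check the format handling without starting a server, the same logic can be wrapped in a small function. This is only a sketch: reverse_reply() is a name made up for illustration and does not appear in Figure 18.4.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the server-side change: reverse only the
# payload when a sequence number is present, else reverse the whole message.
sub reverse_reply {
    my ($msg_in) = @_;
    if ($msg_in =~ /^(\d+): (.*)/) {
        return "$1: " . scalar reverse $2;   # keep the sequence number intact
    }
    return scalar reverse $msg_in;           # old-style message
}

print reverse_reply("42: the meaning of life"), "\n";   # 42: efil fo gninaem eht
print reverse_reply("hello there"), "\n";               # ereht olleh
```

The explicit scalar is there because reverse() reverses a list when called in list context, which would leave a single string unchanged.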

All the interesting changes are in the client, which we will call udp_echo_cli3.pl (Figure 18.7). Our strategy is to maintain a hash named %PENDING to contain a record of every request that has been sent. The hash is indexed by the sequence number of the outgoing request and contains both a copy of the original request and a counter that keeps track of the number of times the request has been sent.

Figure 18.7. The udp_echo_cli3.pl script detects duplicate and misordered messages


A global variable $seqout is incremented by 1 every time we generate a new request, and another global, $seqin, keeps track of the sequence number of the last response received from the server so that we can detect out-of-order responses.

We must abandon the send-and-wait paradigm of the earlier UDP clients and assume that responses from the server can arrive at unpredictable times. To do this, we use select() with a timeout to multiplex between STDIN and the socket. Whenever the user types a new request (i.e., a string to be reversed), we bump up the $seqout variable and create a new request entry in the %PENDING hash.

Whenever a response comes in from the server, we check its sequence number to see if it corresponds to a request that we have made. If it does, we print the response and delete the request from %PENDING. If a response comes in whose sequence number is not found in %PENDING, then it is a duplicate response, which we discard. We store the most recent sequence number of an incoming response in $seqin, and use it to detect out-of-order responses. In the case of this client, we simply warn about out-of-order responses, but don't take any more substantial action.

If the call to select() times out before any new messages arrive, we check the %PENDING hash to see whether any requests are still unsatisfied. If so, we retransmit each of them and bump up its counter for the number of times the request has been tried.

In order to mix line-oriented reads from STDIN with multiplexing, we take advantage of the IO::Getline module that we developed in Chapter 13 (Figure 13.2). Let's walk through the code now:

Lines 1–9: Load modules, define constants. We bring in the IO::Socket, IO::Select, and IO::Getline modules.

Lines 10–12: Define the structure of the %PENDING hash. The %PENDING hash is indexed by request sequence number. Its values are two-element array references containing the original request and the number of times the request has been sent. We use symbolic constants for the indexes of this array reference, such that $PENDING{$seqno}[REQUEST] is the text of the request and $PENDING{$seqno}[TRIES] is the number of times the request has been sent to the server.
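The data structure is easiest to see in isolation. The snippet below is a sketch that assumes REQUEST and TRIES are defined as the indexes 0 and 1; the figure's actual constant definitions are not shown here.

```perl
use strict;
use warnings;

use constant REQUEST => 0;    # assumed index of the request text
use constant TRIES   => 1;    # assumed index of the send counter

my %PENDING;                  # seqno => [ request text, times sent ]

# Record a first transmission of request 0, then one retransmission.
$PENDING{0}[REQUEST] = 'hello there';
$PENDING{0}[TRIES]++;         # autovivifies the counter and sets it to 1
$PENDING{0}[TRIES]++;         # now 2

printf "seqno %d: '%s' sent %d time(s)\n",
    0, $PENDING{0}[REQUEST], $PENDING{0}[TRIES];
```

Because ++ on an undefined value yields 1 without a warning, no separate initialization step is needed for the counter.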

Lines 13–18: Global variables. $seqout is the master counter that is used to assign unique sequence numbers to each outgoing request. $seqin keeps track of the sequence number of the last response we received. The server $host and $port are read from the command line as before.

Lines 19–22: Create socket, IO::Select objects, and IO::Getline objects. We create a UDP socket as before. If successful, we create an IO::Select set initialized to contain the socket and STDIN, as well as an IO::Getline object wrapped around STDIN.

Lines 23–25: The select() loop. We now enter the main loop of the program. Each time through the loop we call the select set's can_read() method with the desired timeout. This returns a list of filehandles that are ready for reading, or if the timeout expired, an empty list. We loop through each of the filehandles that are ready for reading. There are only two possibilities. One is that the user has typed something and STDIN has some data for us to read. The other is that a message has been received and we can call recv() on the socket without blocking.
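The timeout behavior of can_read() can be tried on its own with a UDP socket. The sketch below leaves STDIN out of the select set so that it runs unattended; wait_for_datagram() is a hypothetical helper name, and the 5000-byte buffer size is an assumption.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Hypothetical helper: wait up to $timeout seconds (fractions allowed) for
# a datagram on any socket in $select. Returns the message, or nothing at
# all if can_read() timed out and returned an empty list.
sub wait_for_datagram {
    my ($select, $timeout) = @_;
    my @ready = $select->can_read($timeout);
    return unless @ready;                       # empty list: timed out
    my $msg;
    defined $ready[0]->recv($msg, 5000) or die "recv(): $!\n";
    return $msg;
}
```

Unlike the alarm() approach, this needs no signal handler, and the timeout may be a fraction of a second.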

Lines 26–32: Handle input on STDIN. If STDIN is ready to read, we fetch a line from its IO::Getline wrapper by calling the getline() method. Recall that the syntax for IO::Getline->getline() works like read(). It copies the line into a scalar variable (in this case, $_) and returns a result code indicating the success of the operation.

If getline() returns false, we know we've encountered the end of file and we exit the loop. Otherwise, we check whether we got a complete line by looking at the line length returned by getline(), and if so, remove the terminating end-of-line sequence and call send_message() with the message text and a new sequence number.

Lines 33–37: Handle a message on the socket. If the socket is ready to read, then we've received a response from the server. We retrieve it by calling the socket's recv() method and pass the message to our receive_message() subroutine.

Lines 39–41: Handle retries. If @ready is empty, then we have timed out. We call the do_retries() subroutine to retransmit any requests that are pending.

Lines 42–49: The send_message() subroutine. This subroutine is responsible for transmitting a request to the server given a unique sequence number and the text of the request. We construct the message using the simple format discussed earlier and send() it to the server.

We then add the request to the %PENDING hash. This subroutine is also called on to retransmit requests, so rather than setting the TRIES field to 1, we increment it and let Perl take care of creating the field if it doesn't yet exist.

Lines 50–66: The receive_message() subroutine. This subroutine is responsible for processing an incoming response. We begin by parsing the sequence number and the payload. If it doesn't fit the format, we print a warning and return. Having recovered the response's sequence number, we check to see whether it is known to the %PENDING hash. If not, this response is presumably a duplicate. We print a warning and return. We check to see whether the sequence number of this response is greater than the sequence number of the last one. If not, we print a warning, but don't take any other action.

If all these checks pass, then we have a valid response. We print it out, remember its sequence number, and delete the request from the %PENDING hash.
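The checks just described can be sketched as follows. This is an interpretation of the walkthrough, not the listing from Figure 18.7: the return values, the warning texts other than the duplicate message, the initial $seqin of -1, and the decision to still print an out-of-order response after warning about it are all assumptions.

```perl
use strict;
use warnings;

use constant REQUEST => 0;
use constant TRIES   => 1;

my %PENDING;       # seqno => [ request text, times sent ]
my $seqin = -1;    # assumed starting value: no response seen yet

# Process one incoming response; returns 'malformed', 'duplicate', or 'ok'.
sub receive_message {
    my ($msg) = @_;
    unless ($msg =~ /^(\d+): (.*)/) {
        warn "Bad format message: $msg\n";
        return 'malformed';
    }
    my ($seqno, $payload) = ($1, $2);
    unless (exists $PENDING{$seqno}) {          # already answered or abandoned
        warn "Discarding duplicate message seqno = $seqno\n";
        return 'duplicate';
    }
    warn "Out of order message seqno = $seqno\n" if $seqno < $seqin;
    print $PENDING{$seqno}[REQUEST], " => $payload\n";
    $seqin = $seqno;                            # remember the latest response
    delete $PENDING{$seqno};                    # request is now satisfied
    return 'ok';
}
```

Deleting the entry on the first valid response is what turns every later copy of that response into a detectable duplicate.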

Lines 67–77: The do_retries() subroutine. This subroutine is responsible for retransmitting pending requests whose responses are late. We loop through the keys of the %PENDING hash and examine each one's TRIES field. If TRIES is greater than the MAX_RETRIES constant, then we print a warning that we are giving up on the request and delete it from %PENDING. Otherwise, we invoke send_message() on the request in order to retransmit it.
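The interplay between send_message() and do_retries() can be sketched like this. The subroutine names come from the walkthrough, but the bodies are a reconstruction under assumptions: the socket is passed as an argument, MAX_RETRIES is set to 5, and the warning texts are illustrative.

```perl
use strict;
use warnings;
use IO::Socket::INET;

use constant REQUEST     => 0;
use constant TRIES       => 1;
use constant MAX_RETRIES => 5;    # assumed value

my %PENDING;                      # seqno => [ request text, times sent ]

# Transmit (or retransmit) a request and record it as pending.
sub send_message {
    my ($sock, $seqno, $text) = @_;
    defined($sock->send("$seqno: $text")) or die "send(): $!\n";
    $PENDING{$seqno}[REQUEST] = $text;
    $PENDING{$seqno}[TRIES]++;    # created on first use, incremented on retries
}

# Called after a select() timeout: retransmit or abandon each late request.
sub do_retries {
    my ($sock) = @_;
    for my $seqno (sort { $a <=> $b } keys %PENDING) {
        if ($PENDING{$seqno}[TRIES] > MAX_RETRIES) {
            warn "$seqno: too many retries. Giving up.\n";
            delete $PENDING{$seqno};
            next;
        }
        warn "$seqno: retrying...\n";
        send_message($sock, $seqno, $PENDING{$seqno}[REQUEST]);
    }
}
```

With these definitions a request is transmitted at most MAX_RETRIES + 1 times before it is dropped, because the comparison is "greater than" and the first transmission already counts as one try.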

To test udp_echo_cli3.pl, I modified the reverse-echo server to make it behave unreliably. The modification occurs at line 20 of Figure 18.4 and consists of this:

 for (1..3) {
    $sock->send($msg_out) or die "send(): $!\n" if rand() > 0.7;
 }

Instead of sending a single response as before, we now loop three times and send a response on each pass only when Perl's rand() function returns a value greater than 0.7, which happens about 30 percent of the time. Sometimes the server sends one response, sometimes none, and sometimes several.
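How often each outcome occurs follows from the binomial distribution: three independent passes, each sending with probability 0.3. The short calculation below is an added illustration, not part of the original test setup, and the subroutine names are made up.

```perl
use strict;
use warnings;

my $send_prob = 0.3;    # rand() > 0.7 is true about 30% of the time
my $passes    = 3;      # the loop runs for (1..3)

# Binomial coefficient: number of ways to pick $k passes out of $total.
sub choose {
    my ($total, $k) = @_;
    my $c = 1;
    $c = $c * ($total - $_ + 1) / $_ for 1 .. $k;
    return $c;
}

# Probability that the server sends exactly $k responses to one request.
sub response_probability {
    my ($k) = @_;
    return choose($passes, $k) * $send_prob**$k * (1 - $send_prob)**($passes - $k);
}

printf "%d response(s): %.3f\n", $_, response_probability($_) for 0 .. $passes;
```

The result is about 0.343 for no response, 0.441 for exactly one, 0.189 for two, and 0.027 for three, so roughly a third of requests go unanswered on the first try and about a fifth produce duplicates.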

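When we run udp_echo_cli3.pl against this unreliable server, we see output like the following. In this transcript, the plain lines of text are typed by the user, the "retrying..." and "Discarding duplicate message" lines are written to standard error, and the lines containing => are the output of the script.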

 % udp_echo_cli3.pl localhost 2007
 hello there
 0: retrying...
 hello there => ereht olleh
 Discarding duplicate message seqno = 0
 Discarding duplicate message seqno = 0
 this is unreliable communications
 1: retrying...
 this is unreliable communications => snoitacinummoc elbailernu si siht
 but it works anyway
 2: retrying...
 but it works anyway => yawyna skrow ti tub
 Discarding duplicate message seqno = 2
 Discarding duplicate message seqno = 2

Even though some responses were dropped and others were duplicated, the client still managed to associate the correct response with each request.

A cute thing about this client is that it will work with unmodified UDP echo servers. This is because we designed the message protocol in such a way that the protocol is correct even if the server just returns the incoming message without modification.

As written in Figure 18.7, the client is slightly inefficient because we time out can_read() even when there's nothing in %PENDING to wait for. We can fix this problem by modifying line 23 of Figure 18.7 to read this way:

 my @ready = $select->can_read ( %PENDING ? TIMEOUT : () );

If %PENDING is nonempty, we call can_read() with a timeout. Otherwise, we pass an empty list for the arguments, causing can_read() to block indefinitely until either the socket or STDIN is ready to read.
