Signals | Network Programming with Perl

	Network Programming with Perl, by Lincoln D. Stein

	Chapter 2. Processes, Pipes, and Signals

As with filehandles, understanding signals is fundamental to network programming. A signal is a message sent to your program by the operating system to tell it that something important has occurred. A signal can indicate an error in the program itself such as an attempt to divide by zero, an event that requires immediate attention such as an attempt by the user to interrupt the program, or a noncritical informational event such as the termination of a subprocess that your program has launched.

In addition to signals sent by the operating system, processes can signal each other. For example, when the user presses control-C (^C) on the keyboard to send an interrupt signal to the currently running program, that signal is sent not by the operating system, but by the command shell that processes and interprets keystrokes. It is also possible for a process to send signals to itself.

Common Signals

The POSIX standard defines nineteen signals. Each has a small integer value and a symbolic name. We list them in Table 2.2 (the gaps in the integer sequence represent nonstandard signals used by some systems).

The third column of the table indicates what happens when a process receives the signal. Some signals do nothing. Others cause the process to terminate immediately, and still others terminate the process and cause a core dump. Most signals can be "caught." That is, the program can install a handler for the signal and take special action when the signal is received. A few signals, however, cannot be intercepted in this way.

You don't need to understand all of the signals listed in Table 2.2 because either they won't occur during the execution of a Perl script, or their generation indicates a low-level bug in Perl itself that you can't do anything about. However, a handful of signals are relatively common, and we'll look at them in more detail now.

HUP signals a hangup event. This typically occurs when a user is running a program from the command line, and then closes the command-line window or exits the interpreter shell. The default action for this signal is to terminate the program.

INT signals a user-initiated interruption. It is generated when the user presses the interrupt key, typically ^C. The default behavior of this signal is to terminate the program. QUIT is similar to INT, but also causes the program to generate a core file (on UNIX systems). This signal is issued when the user presses the "quit" key, ordinarily ^\.

Table 2.2. POSIX Signals

Signal Name	Value	Notes	Comment
`HUP`	1	A	Hangup detected
`INT`	2	A	Interrupt from keyboard
`QUIT`	3	C	Quit from keyboard
`ILL`	4	C	Illegal Instruction
`ABRT`	6	C	Abort
`FPE`	8	C	Floating point exception
`KILL`	9	AF	Termination signal
`USR1`	10	A	User-defined signal 1
`SEGV`	11	C	Invalid memory reference
`USR2`	12	A	User-defined signal 2
`PIPE`	13	A	Write to pipe with no readers
`ALRM`	14	A	Timer signal from alarm clock
`TERM`	15	A	Termination signal
`CHLD`	17	B	Child terminated
`CONT`	18	E	Continue if stopped
`STOP`	19	DF	Stop process
`TSTP`	20	D	Stop typed at tty
`TTIN`	21	D	tty input for background process
`TTOU`	22	D	tty output for background process
Notes:
A: Default action is to terminate process.
B: Default action is to ignore the signal.
C: Default action is to terminate process and dump core.
D: Default action is to stop the process.
E: Default action is to resume the process.
F: Signal cannot be caught or ignored.

By convention, TERM and KILL are used by one process to terminate another. By default, TERM causes immediate termination of the program, but a program can install a signal handler for TERM to intercept the terminate request and possibly perform some cleanup actions before quitting. The KILL signal, in contrast, is uncatchable. It causes an immediate shutdown of the process without chance of appeal. For example, when a UNIX system is shutting down, the script that handles the shutdown process first sends a TERM to each running process in turn, giving it a chance to clean up. If the process is still running a few tens of seconds later, then the shutdown script sends a KILL.

The PIPE signal is sent when a program writes to a pipe or socket but the process at the remote end has either exited or closed the pipe. This signal is so common in networking applications that we will look at it closely in the Handling PIPE Exceptions section.

ALRM is used in conjunction with the alarm() function to send the program a prearranged signal after a certain amount of time has elapsed. Among other things, ALRM can be used to time out blocked I/O calls. We will see examples of this in the Timing Out Long-Running Operations section.

CHLD occurs when your process has launched a subprocess, and the status of the child has changed in some way. Typically the change in status is that the child has exited, but CHLD is also generated whenever the child is stopped or continued. We discuss how to deal with CHLD in much greater detail in Chapters 4 and 9.

STOP and TSTP both have the effect of stopping the current process. The process is put into suspended animation indefinitely; it can be resumed by sending it a CONT signal. STOP is generally used by one program to stop another. TSTP is issued by the interpreter shell when the user presses the stop key (^Z on UNIX systems). The other difference between the two is that TSTP can be caught, but STOP cannot be caught or ignored.
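Signal numbers such as those in Table 2.2 can differ between systems, so it can be handy to ask Perl which numbers it was built with. The following is a minimal sketch, assuming the standard Config module's sig_name and sig_num entries; the %signo hash is a name chosen here for illustration.

```perl
use strict;
use warnings;
use Config;

# Build a name => number map from the signal list this Perl was compiled with.
my @names = split ' ', $Config{sig_name};
my @nums  = split ' ', $Config{sig_num};
my %signo;
@signo{@names} = @nums;

# Print a few of the common signals discussed above.
for my $name (qw(HUP INT QUIT KILL TERM)) {
    printf "%-5s %s\n", $name, $signo{$name} // 'not defined here';
}
```

Symbolic names are the safer choice in portable scripts, since the numbers are the part that varies.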

Catching Signals

You can catch a signal by adding a signal handler to the %SIG global hash. Use the name of the signal you wish to catch as the hash key. For example, use $SIG{INT} to get or set the INT signal handler. As the value, use a code reference: either an anonymous subroutine or a reference to a named subroutine. For example, Figure 2.6 shows a tiny script that installs an INT handler. Instead of terminating when we press the interrupt key, it prints out a short message and bumps up a counter. This goes on until the script counts three interruptions, at which point it finally terminates. In the transcript that follows, the "Don't interrupt me!" message was triggered each time I typed ^C:

Figure 2.6. Catching the INT signal (script listing; image not reproduced)

 % interrupt.pl
 I'm sleeping.
 I'm sleeping.
 Don't interrupt me! You've already interrupted me 1x.
 I'm sleeping.
 I'm sleeping.
 Don't interrupt me! You've already interrupted me 2x.
 I'm sleeping.
 Don't interrupt me! You've already interrupted me 3x.

Let's look at the script in detail.

Lines 1-3: Initialize script. We turn on strict syntax checking, and declare a global counter named $interruptions. This counter will keep track of the number of times the script has received INT.

Line 4: Install INT handler. We install a handler for the INT signal by setting $SIG{INT} to a reference to the subroutine handle_interruptions().

Lines 5-8: Main loop. The main loop of the program simply prints a message and calls sleep with an argument of 5. This puts the program to sleep for 5 seconds, or until a signal is received. This continues until the $interruptions counter becomes 3 or greater.

Lines 9-12: The handle_interruptions() subroutine. The handle_interruptions() subroutine is called whenever the INT signal arrives, even if the program is busy doing something else at that moment. In this case, our signal handler bumps up $interruptions and prints out a warning.
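A script along the lines of this walk-through might look like the sketch below. It is a reconstruction from the description, not the book's listing, and it makes two assumptions of its own so that it finishes without a keyboard: it delivers INT to itself with kill() on $$ (covered under Sending Signals) in place of the user typing ^C, and it sleeps for 1 second rather than 5.

```perl
#!/usr/bin/perl
# Sketch of an INT-catching script following the walk-through above.
use strict;
use warnings;

my $interruptions = 0;
$SIG{INT} = \&handle_interruptions;

while ($interruptions < 3) {
    print "I'm sleeping.\n";
    kill INT => $$;   # assumption: stands in for the user pressing ^C
    sleep 1;          # would return early if a signal arrived mid-sleep
}

sub handle_interruptions {
    $interruptions++;
    warn "Don't interrupt me! You've already interrupted me ${interruptions}x.\n";
}
```

To try it interactively, remove the kill line, raise the sleep to 5, and press the interrupt key while it runs.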

For short signal handlers, you can use an anonymous subroutine as the handler. For example, this code fragment is equivalent to that in Figure 2.6, but we don't have to formally name the handler subroutine:

 $SIG{INT} = sub {
     $interruptions++;
     warn "Don't interrupt me! You've already interrupted me ${interruptions}x.\n";
 };

In addition to code references, %SIG recognizes two special cases. The string "DEFAULT" restores the default behavior of the signal. For example, setting $SIG{INT} to "DEFAULT" will cause the INT signal to terminate the script once again. The string "IGNORE" will cause the signal to be ignored altogether.
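One common use of "IGNORE" is to shield a short stretch of code from interruption. The sketch below is an illustration under stated assumptions rather than a recipe from the text: it uses local so that the previous INT setting comes back when the block ends, and it sends INT to itself with kill() on $$ (described under Sending Signals) to stand in for a user pressing the interrupt key. The $survived flag is a name invented for the example.

```perl
use strict;
use warnings;

my $survived = 0;
{
    local $SIG{INT} = 'IGNORE';   # ignore INT only inside this block
    kill INT => $$;               # assumption: simulated ^C; discarded here
    $survived = 1;                # reached because the signal was ignored
}
# On leaving the block, the earlier INT disposition is back in force.
print "Still running after INT\n" if $survived;
```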

As previously mentioned, don't bother installing a handler for either KILL or STOP. These signals can be neither caught nor ignored, and their default actions will always be performed.

If you wish to use the same routine to catch several different signals, and it is important for the subroutine to distinguish one signal from another, it can do so by looking at its first argument, which will contain the name of the signal. For example, for INT signals, the handler will be called with the string "INT":

 $SIG{TERM} = $SIG{HUP} = $SIG{INT} = \&handler;

 sub handler {
     my $sig = shift;    # name of the signal that triggered the call
     warn "Handling a $sig signal.\n";
 }

Handling PIPE Exceptions

We now have what we need to deal with PIPE exceptions. Recall the write_ten.pl and read_three.pl examples from Figures 2.4 and 2.5 in The Dreaded PIPE Error section. write_ten.pl opens a pipe to read_three.pl and tries to write ten lines of text to it, but read_three.pl is only prepared to accept three lines, after which it exits and closes its end of the pipe. write_ten.pl, not knowing that the other end of the connection has exited, attempts to write a fourth line of text, generating a PIPE signal.

We will now modify write_ten.pl so that it detects the PIPE error and handles it more gracefully. We will use variants on this technique in later chapters that deal with common issues in network communications.

The first technique is shown in Figure 2.7, the write_ten_ph.pl script. Here we set a global flag, $ok, which starts out true. We then install a PIPE handler using this code:

 $SIG{PIPE} = sub { undef $ok };

When a PIPE signal is received, the handler will undefine the $ok flag, making it false.

Figure 2.7. The write_ten_ph.pl script (script listing; image not reproduced)

The other modification is to replace the simple for() loop in the original version with a more sophisticated version that checks the status of $ok. If the flag becomes false, the loop exits. When we run the modified script, we see that the program runs to completion, and correctly reports the number of lines successfully written:

 % write_ten_ph.pl
 Writing line 1
 Read_three got: This is line number 1
 Writing line 2
 Read_three got: This is line number 2
 Writing line 3
 Read_three got: This is line number 3
 Writing line 4
 Wrote 3 lines of text
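The flag-and-handler technique can be sketched as a self-contained script. This is a reconstruction from the description, not the book's listing. As assumptions made for the example, a one-line child started with $^X stands in for read_three.pl, a lexical filehandle replaces the named pipe handle, and the writer pauses one second between lines so the reader has time to exit.

```perl
use strict;
use warnings;
use IO::Handle;

STDOUT->autoflush(1);

my $ok = 1;
$SIG{PIPE} = sub { undef $ok };   # PIPE handler just clears the flag

# Assumption: this one-line child stands in for read_three.pl.
my $reader = 'for (1..3) { my $l = <STDIN>; last unless defined $l; print "Read_three got: $l" }';
open(my $pipe, '|-', $^X, '-e', $reader) or die "Can't open pipe: $!";
$pipe->autoflush(1);

my $count = 0;
for (my $i = 1; $ok && $i <= 10; $i++) {
    print "Writing line $i\n";
    print $pipe "This is line number $i\n" and $count++;
    sleep 1;                      # give the reader time to consume the line
}
close $pipe;
print "Wrote $count lines of text\n";
```

The loop condition tests $ok on every pass, so the write that triggers PIPE is the last one attempted.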

Another general technique is to set $SIG{PIPE} to 'IGNORE', in order to ignore the PIPE signal entirely. It is now our responsibility to detect that something is amiss, which we can do by examining the result code from print(). If print() returns false, we exit the loop.

Figure 2.8 shows the code for write_ten_i.pl, which illustrates this technique. This script begins by setting $SIG{PIPE} to the string 'IGNORE', suppressing PIPE signals. In addition, we modify the print loop so that if print() is successful, we bump up $count as before, but if it fails, we issue a warning and exit the loop via last.

Figure 2.8. The write_ten_i.pl script (script listing; image not reproduced)

When we run write_ten_i.pl we get this output:

 % write_ten_i.pl
 Writing line 1
 Read_three got: This is line number 1
 Writing line 2
 Read_three got: This is line number 2
 Writing line 3
 Read_three got: This is line number 3
 Writing line 4
 An error occurred during writing: Broken pipe
 Wrote 3 lines of text
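The ignore-and-check technique can be sketched the same way. Again this is a reconstruction from the description rather than the book's listing, and the one-line child standing in for read_three.pl, the lexical filehandle, and the one-second pause are assumptions of the example.

```perl
use strict;
use warnings;
use IO::Handle;

STDOUT->autoflush(1);
$SIG{PIPE} = 'IGNORE';            # no PIPE signal; print() reports the failure

# Assumption: this one-line child stands in for read_three.pl.
my $reader = 'for (1..3) { my $l = <STDIN>; last unless defined $l; print "Read_three got: $l" }';
open(my $pipe, '|-', $^X, '-e', $reader) or die "Can't open pipe: $!";
$pipe->autoflush(1);

my $count = 0;
for my $i (1 .. 10) {
    print "Writing line $i\n";
    if (print $pipe "This is line number $i\n") {
        $count++;
    } else {
        warn "An error occurred during writing: $!\n";
        last;
    }
    sleep 1;                      # give the reader time to consume the line
}
close $pipe;
print "Wrote $count lines of text\n";
```

Autoflush matters here: without it the lines would sit in a buffer, and print() would not see the failure at the moment of the write.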

Notice that the error message that appears in $! after the unsuccessful print is "Broken pipe." If we wanted to treat this error separately from other I/O errors, we could explicitly test its value via a pattern match, or, better still, check its numeric value against the numeric error constant EPIPE. For example:

 use Errno ':POSIX';
 ...
 unless (print PIPE "This is line number $_\n") {  # handle write error
     last if $! == EPIPE;   # on PIPE, just terminate the loop
     die "I/O error: $!";   # otherwise die with an error message
 }

Sending Signals

A Perl script can send a signal to another process using the kill() function:

$count = kill($signal,@processes)

The kill() function sends signal $signal to one or more processes. You may specify the signal numerically, for example 2, or symbolically as in "INT". @processes is a list of one or more process IDs to deliver the signal to. The number of processes successfully signaled is returned as the kill() function result.

One process can only signal another if it has sufficient privileges to do so. In general, a process running under a normal user's privileges can signal only other processes that are running under the same user's privileges. A process running with root or superuser privileges, however, can signal any other process.

The kill() function provides a few tricks. If you use the special signal number 0, then kill() will return the number of processes that could have been signaled, without actually delivering the signal. If you use a negative number for the process ID, then kill() will treat the absolute value of the number as a process group ID and deliver the signal to all members of the group.

A script can send a signal to itself by calling kill() on the variable $$, which holds the current process ID. For example, here's a fancy way for a script to commit suicide:

 kill INT => $$; # same as kill('INT',$$)
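The return value of kill() and the signal-0 trick can be seen in a short sketch. It assumes a UNIX-like system with fork(), and the variable names are invented for the example. The child does nothing but sleep, the parent terminates it with TERM, and the low seven bits of $? after waitpid() give the number of the signal that ended the child.

```perl
use strict;
use warnings;

# Signal 0 delivers nothing; it only reports whether we could signal the process.
my $alive = kill 0 => $$;

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {        # child: idle until signaled
    sleep 60;
    exit 0;
}

my $signaled = kill TERM => $pid;   # number of processes signaled
waitpid($pid, 0);                   # collect the child's exit status
my $term_signal = $? & 127;         # low 7 bits: signal that ended the child
my $gone = kill 0 => $pid;          # 0 now that the child no longer exists

print "alive=$alive signaled=$signaled ended_by=$term_signal still_there=$gone\n";
```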

Signal Handler Caveats

Because a signal can arrive at any time during a program's execution, it can arrive while the Perl interpreter is doing something important, like updating one of its memory structures, or even while inside a system call. If the signal handler does something that rearranges memory, such as allocating or disposing of a big data structure, then on return from the handler Perl may find its world changed from underneath it, and get confused, which occasionally results in an ugly crash.

To avoid this possibility, signal handlers should do as little as possible. The safest course is to set a global variable and return, as we did in the PIPE handler in Figure 2.7. In addition to memory-changing operations, I/O operations within signal handlers should also be avoided. Although we liberally pepper our signal handlers with diagnostic warn() statements throughout this book, these operations should be stripped out in production programs.
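The "set a flag and return" style might look like the following sketch. The handler only records that TERM arrived, and the main loop notices the flag and does the cleanup and I/O itself. As an assumption to keep the example self-contained, the script sends TERM to itself after three units of simulated work; $got_term and $units_done are names invented here.

```perl
use strict;
use warnings;

my $got_term = 0;
$SIG{TERM} = sub { $got_term = 1 };   # handler only records the event

my $units_done = 0;
until ($got_term) {
    $units_done++;                         # stand-in for one unit of real work
    kill TERM => $$ if $units_done == 3;   # assumption: simulated outside TERM
}

# Back in the main program it is safe to do I/O and cleanup.
print "TERM received after $units_done units of work; cleaning up.\n";
```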

It's generally OK to call die() and exit() within signal handlers. The exception is on Microsoft Windows systems, where due to limitations in the signal library, these two calls may cause "Dr. Watson" errors if invoked from within a signal handler.

Indeed, the implementation of signals on Windows systems is currently extremely limited. Simple things, such as an INT handler to catch the interrupt key, will work. More complex things, such as CHLD handlers to catch the death of a subprocess, do not work. This is an area of active development so be sure to check the release notes before trying to write or adapt any code that depends heavily on signals.

Signal handling is not implemented in MacPerl.

Timing Out Slow System Calls

A signal may occur while Perl is executing a system call. In most cases, Perl automatically restarts the call and it takes up exactly where it left off.

A few system calls, however, are exceptions to this rule. One is sleep(), which suspends the script for the indicated number of seconds. If a signal interrupts sleep(), it exits early, returning the number of seconds it slept before being awakened. This property of sleep() is quite useful because it can be used to put the script to sleep until some expected event occurs.

$slept = sleep([$seconds])

Sleep for the indicated number of seconds or until a signal is received. If no argument is provided, this function will sleep forever. On return, sleep() will return the number of seconds it actually slept.
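The early return from sleep() can be observed with a small sketch. It assumes a UNIX-like system with fork(); the choice of USR1 and the variable names are ours. A child process waits one second and then signals the parent, which was prepared to sleep for ten.

```perl
use strict;
use warnings;

$SIG{USR1} = sub { };      # empty handler: enough to interrupt sleep()
my $parent = $$;

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {           # child: wait a moment, then wake the parent
    sleep 1;
    kill USR1 => $parent;
    exit 0;
}

my $slept = sleep 10;      # returns early when USR1 arrives
waitpid($pid, 0);
print "Asked for 10 seconds, slept about $slept\n";
```

Without the handler, the default action for USR1 would terminate the parent instead of merely waking it.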

Another exception is the four-argument version of select(), which can be used to perform a timed wait until one or more of a set of filehandles are ready for I/O. This function is described in detail in Chapter 12.

Sometimes the automatic restarting of system calls is not what you want. For example, consider an application that prompts a user to type her password and tries to read the response from standard input. You might want the read to time out after some period of time in case the user has wandered off and left the terminal unattended. This fragment of code might at first seem to do the trick:

 my $timed_out = 0;
 $SIG{ALRM} = sub { $timed_out = 1 };
 print STDERR "type your password: ";
 alarm (5);    # five second timeout
 my $password = <STDIN>;
 alarm (0);
 print STDERR "you timed out\n" if $timed_out;

Here we use the alarm() function to set a timer. When the timer expires, the operating system generates an ALRM signal, which we intercept with a handler that sets the $timed_out global to true. In this code we call alarm() with a five-second timeout, and then read a line of input from standard input. After the read completes, we call alarm() again with an argument of zero, turning the timer off. The idea is that the user will have five seconds in which to type a password. If she doesn't, the alarm clock goes off and we fall through to the rest of the program.

$seconds_left = alarm($seconds)

Arrange for an ALRM signal to be delivered to the process after $seconds . The function result is the number of seconds left from the previous timer, if any. An argument of zero disables the timer.

The problem is that Perl automatically restarts slow system calls, including <>. Even though the alarm clock has gone off, we remain in the <> call, waiting for the user's keyboard input.

The solution to this problem is to use eval{} and a local ALRM handler to abort the read. The general idiom is this:

 print STDERR "type your password: ";
 my $password =
   eval {
     local $SIG{ALRM} = sub { die "timeout\n" };
     alarm (5);    # five second timeout
     return <STDIN>;
   };
 alarm (0);
 print STDERR "you timed out\n" if $@ =~ /timeout/;

Instead of having an ALRM handler in the main body of the program, we localize it within an eval{} block. The eval{} block sets the alarm, as before, and attempts to read from STDIN. If <> returns before the timer goes off, then the line of input is returned from the eval{} block, and assigned to $password.

If the timer goes off before the input is complete, the ALRM handler executes, dying with the error message "timeout." Because we are dying within an eval{} block, the effect is for eval{} to return undef, setting the variable $@ to the last error message. We pattern match $@ for the timeout message, and print a warning if found.

In either case, we turn off the timer immediately after returning from the eval{} block in order to avoid having the timer go off at an inconvenient moment.
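The same idiom can be wrapped in a subroutine and tried without a keyboard. In the sketch below, read_line_with_timeout() is a hypothetical helper written for this example, and a pipe created with pipe() stands in for the terminal: the first read finds nothing to read and times out, the second finds a line waiting.

```perl
use strict;
use warnings;

# Hypothetical helper: read one line from $fh, giving up after $seconds.
# Returns (line, 0) on success or (undef, 1) on timeout.
sub read_line_with_timeout {
    my ($fh, $seconds) = @_;
    my $line = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);
        return scalar <$fh>;
    };
    my $err = $@;
    alarm(0);                               # timer off on every path
    return ($line, 0) unless $err;
    die $err unless $err eq "timeout\n";    # pass along unrelated errors
    return (undef, 1);
}

pipe(my $reader, my $writer) or die "pipe failed: $!";

# Nothing has been written yet, so this read times out after one second.
my ($line, $timed_out) = read_line_with_timeout($reader, 1);
print "you timed out\n" if $timed_out;

# With a line waiting in the pipe, the read returns at once.
syswrite($writer, "secret\n");
my ($line2, $timed_out2) = read_line_with_timeout($reader, 1);
print "read: $line2" unless $timed_out2;
```

Comparing $@ against the exact string, rather than pattern matching, is a design choice of this sketch: it keeps an unrelated error that happens to mention "timeout" from being mistaken for the alarm.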

We will use this technique several times in later chapters when we need to time out slow network calls.
