Signals
As with filehandles, understanding signals is fundamental to network programming. A signal is a message sent to your program by the operating system to tell it that something important has occurred. A signal can indicate an error in the program itself such as an attempt to divide by zero, an event that requires immediate attention such as an attempt by the user to interrupt the program, or a noncritical informational event such as the termination of a subprocess that your program has launched.
In addition to signals sent by the operating system, processes can signal each other. For example, when the user presses control-C (^C) on the keyboard to send an interrupt signal to the currently running program, that signal is sent not by the operating system, but by the command shell that processes and interprets keystrokes. It is also possible for a process to send signals to itself.
Common Signals
The POSIX standard defines nineteen signals. Each has a small integer value and a symbolic name. We list them in Table 2.2 (the gaps in the integer sequence represent nonstandard signals used by some systems).
The third column of the table indicates what happens when a process receives the signal. Some signals do nothing. Others cause the process to terminate immediately, and still others terminate the process and cause a core dump. Most signals can be "caught." That is, the program can install a handler for the signal and take special action when the signal is received. A few signals, however, cannot be intercepted in this way.
You don't need to understand all of the signals listed in Table 2.2 because either they won't occur during the execution of a Perl script, or their generation indicates a low-level bug in Perl itself that you can't do anything about. However, a handful of signals are relatively common, and we'll look at them in more detail now.
HUP signals a hangup event. This typically occurs when a user is running a program from the command line, and then closes the command-line window or exits the interpreter shell. The default action for this signal is to terminate the program.
INT signals a user-initiated interruption. It is generated when the user presses the interrupt key, typically ^C. The default behavior of this signal is to terminate the program.
QUIT is similar to INT, but also causes the program to generate a core file (on UNIX systems). This signal is issued when the user presses the "quit" key, ordinarily ^\.
Table 2.2. POSIX Signals
| Signal Name | Value | Notes | Comment                           |
|-------------|-------|-------|-----------------------------------|
| HUP         | 1     | A     | Hangup detected                   |
| INT         | 2     | A     | Interrupt from keyboard           |
| QUIT        | 3     | A     | Quit from keyboard                |
| ILL         | 4     | A     | Illegal Instruction               |
| ABRT        | 6     | C     | Abort                             |
| FPE         | 8     | C     | Floating point exception          |
| KILL        | 9     | AF    | Termination signal                |
| USR1        | 10    | A     | User-defined signal 1             |
| SEGV        | 11    | C     | Invalid memory reference          |
| USR2        | 12    | A     | User-defined signal 2             |
| PIPE        | 13    | A     | Write to pipe with no readers     |
| ALRM        | 14    | A     | Timer signal from alarm clock     |
| TERM        | 15    | A     | Termination signal                |
| CHLD        | 17    | B     | Child terminated                  |
| CONT        | 18    | E     | Continue if stopped               |
| STOP        | 19    | DF    | Stop process                      |
| TSTP        | 20    | D     | Stop typed at tty                 |
| TTIN        | 21    | D     | tty input for background process  |
| TTOU        | 22    | D     | tty output for background process |
Notes:
A: Default action is to terminate process.
B: Default action is to ignore the signal.
C: Default action is to terminate process and dump core.
D: Default action is to stop the process.
E: Default action is to resume the process.
F: Signal cannot be caught or ignored.
By convention, TERM and KILL are used by one process to terminate another. By default, TERM causes immediate termination of the program, but a program can install a signal handler for TERM to intercept the terminate request and possibly perform some cleanup actions before quitting. The KILL signal, in contrast, is uncatchable. It causes an immediate shutdown of the process without chance of appeal. For example, when a UNIX system is shutting down, the script that handles the shutdown process first sends a TERM to each running process in turn, giving it a chance to clean up. If the process is still running a few tens of seconds later, then the shutdown script sends a KILL.
The PIPE signal is sent when a program writes to a pipe or socket but the process at the remote end has either exited or closed the pipe. This signal is so common in networking applications that we will look at it closely in the Handling PIPE Exceptions section.
ALRM is used in conjunction with the alarm() function to send the program a prearranged signal after a certain amount of time has elapsed. Among other things, ALRM can be used to time out blocked I/O calls. We will see examples of this in the Timing Out Slow System Calls section.
CHLD occurs when your process has launched a subprocess, and the status of the child has changed in some way. Typically the change in status is that the child has exited, but CHLD is also generated whenever the child is stopped or continued. We discuss how to deal with CHLD in much greater detail in Chapters 4 and 9.
STOP and TSTP both have the effect of stopping the current process. The process is put into suspended animation indefinitely; it can be resumed by sending it a CONT signal. STOP is generally used by one program to stop another. TSTP is issued by the interpreter shell when the user presses the stop key (^Z on UNIX systems). The other difference between the two is that TSTP can be caught, but STOP cannot be caught or ignored.
Catching Signals
You can catch a signal by adding a signal handler to the %SIG global hash. Use the name of the signal you wish to catch as the hash key. For example, use $SIG{INT} to get or set the INT signal handler. As the value, use a code reference: either an anonymous subroutine or a reference to a named subroutine. For example, Figure 2.6 shows a tiny script that installs an INT handler. Instead of terminating when we press the interrupt key, it prints out a short message and bumps up a counter. This goes on until the script counts three interruptions, at which point it finally terminates. In the transcript that follows, the "Don't interrupt me!" message was triggered each time I typed ^C:
Figure 2.6. Catching the INT signal
% interrupt.pl
I'm sleeping.
I'm sleeping.
Don't interrupt me! You've already interrupted me 1x.
I'm sleeping.
I'm sleeping.
Don't interrupt me! You've already interrupted me 2x.
I'm sleeping.
Don't interrupt me! You've already interrupted me 3x.
Let's look at the script in detail.
Lines 1-3: Initialize script
We turn on strict syntax checking, and declare a global counter named $interruptions. This counter will keep track of the number of times the script has received INT.
Line 4: Install INT handler
We install a handler for the INT signal by setting $SIG{INT} to a reference to the subroutine handle_interruptions().
Lines 5-8: Main loop
The main loop of the program simply prints a message and calls sleep with an argument of 5. This puts the program to sleep for 5 seconds, or until a signal is received. This continues until the $interruptions counter becomes 3 or greater.
Lines 9-12: The handle_interruptions() subroutine
The handle_interruptions() subroutine is called whenever the INT signal arrives, even if the program is busy doing something else at that moment. In this case, our signal handler bumps up $interruptions and prints out a warning.
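The walk-through above can be assembled into a runnable sketch. This is a reconstruction from the description, not the original listing, so its line numbers will not match the ones cited. One addition is purely an assumption made so that the sketch runs unattended: a child process stands in for the user and sends INT to the parent three times.

```perl
#!/usr/bin/perl
# interrupt.pl, reconstructed from the walk-through (not the original listing)
use strict;
use warnings;

my $interruptions = 0;                # number of INT signals received so far

$SIG{INT} = \&handle_interruptions;   # install the INT handler

# Assumption for this sketch only: a child process plays the part of the
# user pressing ^C, sending INT to the parent three times, one second apart.
my $parent = $$;
my $child  = fork();
die "Can't fork: $!" unless defined $child;
if ($child == 0) {
    $SIG{INT} = 'DEFAULT';            # the child does not need the handler
    for (1 .. 3) {
        sleep 1;
        kill INT => $parent;
    }
    exit 0;
}

# Main loop: sleep() returns early whenever a signal arrives.
while ($interruptions < 3) {
    print "I'm sleeping.\n";
    sleep 5;
}
waitpid($child, 0);                   # reap the stand-in child

sub handle_interruptions {
    $interruptions++;
    warn "Don't interrupt me! You've already interrupted me ${interruptions}x.\n";
}
```

To try it interactively instead, remove the fork() block and press ^C yourself.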
For short signal handlers, you can use an anonymous subroutine as the handler. For example, this code fragment is equivalent to that in Figure 2.6, but we don't have to formally name the handler subroutine:
$SIG{INT} = sub {
    $interruptions++;
    warn "Don't interrupt me! You've already interrupted me ${interruptions}x.\n";
};
In addition to code references, %SIG recognizes two special cases. The string "DEFAULT" restores the default behavior of the signal. For example, setting $SIG{INT} to "DEFAULT" will cause the INT signal to terminate the script once again. The string "IGNORE" will cause the signal to be ignored altogether.
As previously mentioned, don't bother installing a handler for either KILL or STOP. These signals can be neither caught nor ignored, and their default actions will always be performed.
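As a small illustration of the two special strings, the sketch below ignores INT inside a block and then restores the default action. Sending the signal to ourselves with kill() (covered in the Sending Signals section) and the $survived flag are assumptions made only so that the effect can be observed.

```perl
use strict;
use warnings;

my $survived = 0;

{
    # Ignore INT only inside this block; local() puts the previous
    # setting back automatically when the block exits.
    local $SIG{INT} = 'IGNORE';
    kill INT => $$;       # send ourselves an INT; it is discarded
    $survived = 1;        # reached only because the signal was ignored
}

$SIG{INT} = 'DEFAULT';    # explicitly select the default (terminate) action
print "Survived an ignored INT\n" if $survived;
```

Had the signal been left at its default while the kill() ran, the script would have terminated before setting $survived.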
If you wish to use the same routine to catch several different signals, and it is important for the subroutine to distinguish one signal from another, it can do so by looking at its first argument, which will contain the name of the signal. For example, for INT signals, the handler will be called with the string "INT":
$SIG{TERM} = $SIG{HUP} = $SIG{INT} = \&handler;
sub handler {
    my $sig = shift;
    warn "Handling a $sig signal.\n";
}
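To see the signal name arrive as the first argument, the following sketch installs one handler for three signals and then sends each of them to the current process. The @seen array and the self-signaling loop are illustrative additions, not part of the original fragment.

```perl
use strict;
use warnings;

my @seen;                     # names of the signals handled so far

sub handler {
    my $sig = shift;          # Perl passes the signal name, e.g. "INT"
    push @seen, $sig;
    warn "Handling a $sig signal.\n";
}

$SIG{TERM} = $SIG{HUP} = $SIG{INT} = \&handler;

# Send each signal to ourselves to exercise the shared handler.
for my $name (qw(INT HUP TERM)) {
    kill $name => $$;
}

print "Handled: @seen\n";
```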
Handling PIPE Exceptions
We now have what we need to deal with PIPE exceptions. Recall the write_ten.pl and read_three.pl examples from Figures 2.4 and 2.5 in The Dreaded PIPE Error section. write_ten.pl opens a pipe to read_three.pl and tries to write ten lines of text to it, but read_three.pl is only prepared to accept three lines, after which it exits and closes its end of the pipe. write_ten.pl, not knowing that the other end of the connection has exited, attempts to write a fourth line of text, generating a PIPE signal.
We will now modify write_ten.pl so that it detects the PIPE error and handles it more gracefully. We will use variants on this technique in later chapters that deal with common issues in network communications.
The first technique is shown in Figure 2.7, the write_ten_ph.pl script. Here we set a global flag, $ok, which starts out true. We then install a PIPE handler using this code:
$SIG{PIPE} = sub { undef $ok };
When a PIPE signal is received, the handler will undefine the $ok flag, making it false.
Figure 2.7. The write_ten_ph.pl script
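A self-contained sketch of write_ten_ph.pl, reconstructed from the description in the text rather than copied from the original listing, might look like the following. As an assumption made for convenience, an inline one-liner stands in for read_three.pl so the example does not depend on a second file, and a lexical filehandle replaces the bareword PIPE.

```perl
#!/usr/bin/perl
# write_ten_ph.pl (reconstruction): stop writing once a PIPE signal arrives
use strict;
use warnings;
use IO::Handle;                       # provides autoflush()

my $ok = 1;                           # cleared by the PIPE handler
$SIG{PIPE} = sub { undef $ok };

# Assumption: this one-liner stands in for read_three.pl. It reads three
# lines from standard input, echoes them, and exits.
my @reader = ($^X, '-e',
    'for (1..3) { my $l = <STDIN>; last unless defined $l; print "Read_three got: $l" }');

open(my $pipe, '|-', @reader) or die "Can't open pipe: $!";
$pipe->autoflush(1);                  # push every line into the pipe at once

my $count = 0;
for (my $i = 1; $ok && $i <= 10; $i++) {
    warn "Writing line $i\n";
    print {$pipe} "This is line number $i\n" and $count++;
    sleep 1;
}
close $pipe;                          # may fail because the reader is gone
print "Wrote $count lines of text\n";
```

The loop condition tests $ok on every pass, so the first write that raises PIPE is also the last one attempted.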
The other modification is to replace the simple for() loop in the original version with a more sophisticated version that checks the status of $ok. If the flag becomes false, the loop exits. When we run the modified script, we see that the program runs to completion, and correctly reports the number of lines successfully written:
% write_ten_ph.pl
Writing line 1
Read_three got: This is line number 1
Writing line 2
Read_three got: This is line number 2
Writing line 3
Read_three got: This is line number 3
Writing line 4
Wrote 3 lines of text
Another general technique is to set $SIG{PIPE} to 'IGNORE', in order to ignore the PIPE signal entirely. It is now our responsibility to detect that something is amiss, which we can do by examining the result code from print(). If print() returns false, we exit the loop.
Figure 2.8 shows the code for write_ten_i.pl, which illustrates this technique. This script begins by setting $SIG{PIPE} to the string 'IGNORE', suppressing PIPE signals. In addition, we modify the print loop so that if print() is successful, we bump up $count as before, but if it fails, we issue a warning and exit the loop via last.
Figure 2.8. The write_ten_i.pl script
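A sketch of write_ten_i.pl along the lines just described is given below. It is a reconstruction, not the original listing. Two things are assumptions: an inline one-liner stands in for read_three.pl, and the numeric value of $! is saved in $errno purely so that it can be compared against the EPIPE constant afterward.

```perl
#!/usr/bin/perl
# write_ten_i.pl (reconstruction): ignore PIPE and check print()'s result
use strict;
use warnings;
use IO::Handle;                       # provides autoflush()
use Errno qw(EPIPE);

$SIG{PIPE} = 'IGNORE';                # a failed write now reports an error instead

# Assumption: this one-liner stands in for read_three.pl.
my @reader = ($^X, '-e',
    'for (1..3) { my $l = <STDIN>; last unless defined $l; print "Read_three got: $l" }');

open(my $pipe, '|-', @reader) or die "Can't open pipe: $!";
$pipe->autoflush(1);

my $count = 0;
my $errno = 0;                        # numeric $! from the failed print, if any
for my $i (1 .. 10) {
    warn "Writing line $i\n";
    if (print {$pipe} "This is line number $i\n") {
        $count++;
    }
    else {
        $errno = $! + 0;              # save the number before $! can change
        warn "An error occurred during writing: $!\n";
        last;
    }
    sleep 1;
}
close $pipe;                          # may fail because the reader is gone
print "Wrote $count lines of text\n";
```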
When we run write_ten_i.pl we get this output:
% write_ten_i.pl
Writing line 1
Read_three got: This is line number 1
Writing line 2
Read_three got: This is line number 2
Writing line 3
Read_three got: This is line number 3
Writing line 4
An error occurred during writing: Broken pipe
Wrote 3 lines of text
Notice that the error message that appears in $! after the unsuccessful print is "Broken pipe." If we wanted to treat this error separately from other I/O errors, we could explicitly test its value via a pattern match, or, better still, check its numeric value against the numeric error constant EPIPE. For example:
use Errno ':POSIX';
...
unless (print PIPE "This is line number $_\n") { # handle write error
    last if $! == EPIPE;    # on PIPE, just terminate the loop
    die "I/O error: $!";    # otherwise die with an error message
}
Sending Signals
A Perl script can send a signal to another process using the kill() function:
$count = kill($signal,@processes)
The kill() function sends signal $signal to one or more processes. You may specify the signal numerically, for example 2, or symbolically as in "INT". @processes is a list of one or more process IDs to deliver the signal to. The number of processes successfully signaled is returned as the kill() function result.
One process can only signal another if it has sufficient privileges to do so. In general, a process running under a normal user's privileges can signal only other processes that are running under the same user's privileges. A process running with root or superuser privileges, however, can signal any other process.
The kill() function provides a few tricks. If you use the special signal number 0, then kill() will return the number of processes that could have been signaled, without actually delivering the signal. If you use a negative number for the process ID, then kill() will treat the absolute value of the number as a process group ID and deliver the signal to all members of the group.
A script can send a signal to itself by calling kill() on the variable $$, which holds the current process ID. For example, here's a fancy way for a script to commit suicide:
kill INT => $$; # same as kill('INT',$$)
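The following sketch exercises two of the behaviors described above: probing a process with signal 0, and terminating a child with TERM. The child that does nothing but sleep, and the variable names, are illustrative assumptions. The sketch also assumes TERM has its default disposition when the script starts.

```perl
use strict;
use warnings;
use POSIX qw(SIGTERM);

# Signal 0 delivers nothing; kill() just reports how many of the listed
# processes could have been signaled.
my $alive = kill 0 => $$;

# Start a child that only sleeps, then ask it to terminate.
my $pid = fork();
die "Can't fork: $!" unless defined $pid;
if ($pid == 0) {
    sleep 60;
    exit 0;
}

my $signaled = kill TERM => $pid;     # number of processes signaled
waitpid($pid, 0);                     # reap the child; its status is in $?
my $killed_by = $? & 127;             # low 7 bits: terminating signal number

print "alive=$alive signaled=$signaled killed_by=$killed_by\n";
```

A nonzero $killed_by equal to SIGTERM shows that the child ended because of the signal rather than by running to its exit().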
Signal Handler Caveats
Because a signal can arrive at any time during a program's execution, it can arrive while the Perl interpreter is doing something important, like updating one of its memory structures, or even while inside a system call. If the signal handler does something that rearranges memory, such as allocating or disposing of a big data structure, then on return from the handler Perl may find its world changed from underneath it, and get confused, which occasionally results in an ugly crash.
To avoid this possibility, signal handlers should do as little as possible. The safest course is to set a global variable and return, as we did in the PIPE handler in Figure 2.7. In addition to memory-changing operations, I/O operations within signal handlers should also be avoided. Although we liberally pepper our signal handlers with diagnostic warn() statements throughout this book, these operations should be stripped out in production programs.
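One way to follow this advice is to let the handler set a flag and do everything else in the main flow of the program. The sketch below is a minimal illustration under that assumption. The variable names are made up, and the script sends TERM to itself only to simulate a request arriving from outside.

```perl
use strict;
use warnings;

my $shutdown_requested = 0;
$SIG{TERM} = sub { $shutdown_requested = 1 };   # the handler only sets a flag

my $work_done  = 0;
my $cleaned_up = 0;

until ($shutdown_requested) {
    $work_done++;                         # stand-in for one unit of real work
    kill TERM => $$ if $work_done == 3;   # simulate an external TERM request
}

# Cleanup and I/O happen here, in normal program flow, not in the handler.
$cleaned_up = 1;
print "Stopped after $work_done units of work\n";
```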
It's generally OK to call die() and exit() within signal handlers. The exception is on Microsoft Windows systems, where due to limitations in the signal library, these two calls may cause "Dr. Watson" errors if invoked from within a signal handler.
Indeed, the implementation of signals on Windows systems is currently extremely limited. Simple things, such as an INT handler to catch the interrupt key, will work. More complex things, such as CHLD handlers to catch the death of a subprocess, do not work. This is an area of active development, so be sure to check the release notes before trying to write or adapt any code that depends heavily on signals.
Signal handling is not implemented in MacPerl.
Timing Out Slow System Calls
A signal may occur while Perl is executing a system call. In most cases, Perl automatically restarts the call and it takes up exactly where it left off.
A few system calls, however, are exceptions to this rule. One is sleep(), which suspends the script for the indicated number of seconds. If a signal interrupts sleep(), it will exit early, returning the number of seconds it slept before being awakened. This property of sleep() is quite useful because it can be used to put the script to sleep until some expected event occurs.
$slept = sleep([$seconds])
Sleep for the indicated number of seconds or until a signal is received. If no argument is provided, this function will sleep forever. On return, sleep() will return the number of seconds it actually slept.
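The early return from sleep() can be observed with a short sketch. Here a child process sends USR1 to its parent after one second. The choice of USR1 as the "expected event" and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

my $woken = 0;
$SIG{USR1} = sub { $woken = 1 };      # note that the event arrived

my $parent = $$;
my $pid    = fork();
die "Can't fork: $!" unless defined $pid;
if ($pid == 0) {
    sleep 1;                          # the child waits a moment...
    kill USR1 => $parent;             # ...then wakes the parent
    exit 0;
}

my $slept = sleep 10;                 # returns early when USR1 arrives
waitpid($pid, 0);                     # reap the child
print "Asked for 10 seconds, slept about $slept\n";
```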
Another exception is the four-argument version of select(), which can be used to perform a timed wait until one or more of a set of filehandles are ready for I/O. This function is described in detail in Chapter 12.
Sometimes the automatic restarting of system calls is not what you want. For example, consider an application that prompts a user to type her password and tries to read the response from standard input. You might want the read to time out after some period of time in case the user has wandered off and left the terminal unattended. This fragment of code might at first seem to do the trick:
my $timed_out = 0;
$SIG{ALRM} = sub { $timed_out = 1 };
print STDERR "type your password: ";
alarm (5); # five second timeout
my $password = <STDIN>;
alarm (0);
print STDERR "you timed out\n" if $timed_out;
Here we use the alarm() function to set a timer. When the timer expires, the operating system generates an ALRM signal, which we intercept with a handler that sets the $timed_out global to true. In this code we call alarm() with a five-second timeout, and then read a line of input from standard input. After the read completes, we call alarm() again with an argument of zero, turning the timer off. The idea is that the user will have five seconds in which to type a password. If she doesn't, the alarm clock goes off and we fall through to the rest of the program.
$seconds_left = alarm($seconds)
Arrange for an ALRM signal to be delivered to the process after $seconds. The function result is the number of seconds left from the previous timer, if any. An argument of zero disables the timer.
The problem is that Perl automatically restarts slow system calls, including <>. Even though the alarm clock has gone off, we remain in the <> call, waiting for the user's keyboard input.
The solution to this problem is to use eval{} and a local ALRM handler to abort the read. The general idiom is this:
print STDERR "type your password: ";
my $password =
    eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm (5);    # five second timeout
        return <STDIN>;
    };
alarm (0);
print STDERR "you timed out\n" if $@ =~ /timeout/;
Instead of having an ALRM handler in the main body of the program, we localize it within an eval{} block. The eval{} block sets the alarm, as before, and attempts to read from STDIN. If <> returns before the timer goes off, then the line of input is returned from the eval{} block, and assigned to $password.
However, if the timer goes off before the input is complete, the ALRM handler executes, dying with the error message "timeout." Because we are dying within an eval{} block, the effect is for eval{} to return undef and to set the variable $@ to the last error message. We pattern match $@ for the timeout message, and print a warning if found.
In either case, we turn off the timer immediately after returning from the eval{} block in order to avoid having the timer go off at an inconvenient moment.
We will use this technique several times in later chapters when we need to time out slow network calls.
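The idiom can be wrapped in a small helper so that it is written only once. The with_timeout() function below is hypothetical (it is not a Perl built-in or a routine from this book), and the demonstration reads from a pipe that nobody writes to, so the read can end only by timing out.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code in scalar context, giving up after
# $seconds. Returns the result and a true/false "timed out" flag.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);
        $code->();
    };
    my $error = $@;                   # save $@ before anything can reset it
    alarm(0);                         # make sure the timer is off on every path
    my $timed_out = ($error eq "timeout\n") ? 1 : 0;
    die $error if $error && !$timed_out;   # re-throw unrelated errors
    return ($result, $timed_out);
}

# Demonstration: the write end stays open but silent, so the read blocks
# until the one-second alarm goes off.
pipe(my $reader, my $writer) or die "Can't create pipe: $!";
my ($line, $timed_out) = with_timeout(1, sub { scalar <$reader> });
print STDERR "you timed out\n" if $timed_out;
```

Returning the timeout flag explicitly spares the caller from inspecting $@, which can be overwritten by later code.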