12.1. Signal Concepts

12.1.1. Life Cycle of a Signal

Signals have a well-defined life; they are created, they are stored until the kernel can take an action based on the signal, and then they cause an action to occur. Creating a signal is variously called raising, generating, or sending a signal. Normally, a process sends a signal to another process while the kernel generates a signal to send to a process. When a process sends itself a signal, it is often called raising the signal. These terms are not used with complete consistency, however.

During the time between a signal being sent and the signal causing an action to occur, the signal is called pending. This means that the kernel knows the signal needs to be handled, but it has not yet had the chance to do so. Once the signal is given to the target process, the signal has been delivered. If delivering the signal causes a special piece of code (a signal handler) to be run, the signal has been caught. There are ways for a process to prevent asynchronous delivery of a signal but still handle the signal (through the sigwait() system call, for example). When this happens, the signal has been accepted.

To help keep things clear, we use this terminology throughout the book.^[1]

^[1] This terminology is also used in much of the standards literature, including the Single Unix Specification.
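The pending and accepted states described above can be observed in a few lines of C. The following is only a sketch: the function name accept_one_signal() is ours, and SIGUSR1 is an arbitrary choice of signal. It blocks the signal, raises it so that it becomes pending, and then accepts it with sigwait() without any signal handler running.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: blocks SIGUSR1, raises it, and accepts it
 * synchronously. Returns the accepted signal number, or -1 on failure. */
int accept_one_signal(void)
{
    sigset_t set, old;
    int signo = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Block the signal so that it stays pending instead of being delivered. */
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    raise(SIGUSR1);                 /* generated; now pending */

    /* sigwait() removes the pending signal; no handler is invoked. */
    if (sigwait(&set, &signo) != 0)
        signo = -1;

    sigprocmask(SIG_SETMASK, &old, NULL);   /* restore the original mask */
    return signo;
}
```

Calling accept_one_signal() from a single-threaded program should return SIGUSR1.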

12.1.2. Simple Signals

Originally, handling signals was simple. The signal() system call was used to tell the kernel how to deliver a particular signal to the process:

 #include <signal.h>
 void * signal(int signum, void * handler);

signum is the signal to handle, and handler defines the action to perform when the signal is delivered. Normally, the handler is a pointer to a function that takes the signal number as its only parameter and returns no value. When the signal is delivered to the process, the kernel executes the handler function as soon as possible. Once the function returns, the kernel resumes process execution wherever it was interrupted. System-level engineers will recognize this type of signal mechanism as analogous to hardware interrupt delivery; interrupts and signals are very similar and present many of the same problems.
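As a minimal sketch of this interface, the code below installs a handler with signal() and then raises the signal itself. Current versions of <signal.h> declare the handler as a function that receives the signal number as an int, which is the form used here. The names on_signal() and install_and_raise() are ours, and SIGUSR1 is an arbitrary choice.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t can be written safely from a handler. */
static volatile sig_atomic_t got_signal = 0;

/* Handler: records which signal arrived. */
static void on_signal(int signum)
{
    got_signal = signum;
}

/* Hypothetical helper: installs on_signal() for SIGUSR1, raises the signal,
 * and returns the recorded signal number (0 if the handler never ran,
 * -1 if the handler could not be installed). */
int install_and_raise(void)
{
    got_signal = 0;
    if (signal(SIGUSR1, on_signal) == SIG_ERR)
        return -1;

    raise(SIGUSR1);              /* the handler runs before raise() returns */

    signal(SIGUSR1, SIG_DFL);    /* put the default action back */
    return got_signal;
}
```

The handler does nothing more than store a value, which keeps it safe no matter what the rest of the program was doing when it was interrupted.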

There are many signal numbers available. Table 12.1 on page 217 lists all the non-real-time signals Linux currently supports. They have symbolic names that begin with SIG, and we use SIGFOO when we talk about a generic signal.

Table 12.1. Signals
Signal	Description	Default Action
`SIGABRT`	Delivered by `abort()`	Terminate, core
`SIGALRM`	An `alarm()` has expired	Terminate
`SIGBUS`	Hardware-dependent error	Terminate, core
`SIGCHLD`	Child process terminated	Ignored
`SIGCONT`	Process has been continued after being stopped	Ignored
`SIGFPE`	Arithmetic exception	Terminate, core
`SIGHUP`	The process's controlling tty was closed	Terminate
`SIGILL`	An illegal instruction was encountered	Terminate, core
`SIGINT`	User sent the interrupt character (`^C`)	Terminate
`SIGIO`	Asynchronous I/O has been received	Terminate
`SIGKILL`	Uncatchable process termination	Terminate
`SIGPIPE`	Process wrote to a pipe without any readers	Terminate
`SIGPROF`	Profiling segment ended	Terminate
`SIGPWR`	Power failure detected	Terminate
`SIGQUIT`	User sent the quit character (`^\`)	Terminate, core
`SIGSEGV`	Memory violation	Terminate, core
`SIGSTOP`	Stops the process without terminating it	Process stopped
`SIGSYS`	An invalid system call was made	Terminate, core
`SIGTERM`	Catchable termination request	Terminate
`SIGTRAP`	Breakpoint instruction encountered	Terminate, core
`SIGTSTP`	User sent suspend character (`^Z`)	Process stopped
`SIGTTIN`	Background process read from controlling tty	Process stopped
`SIGTTOU`	Background process wrote to controlling tty	Process stopped
`SIGURG`	Urgent I/O condition	Ignored
`SIGUSR1`	Process-defined signal	Terminate
`SIGUSR2`	Process-defined signal	Terminate
`SIGVTALRM`	`setitimer()` timer has expired	Terminate
`SIGWINCH`	Size of the controlling tty has changed	Ignored
`SIGXCPU`	CPU resource limit exceeded	Terminate, core
`SIGXFSZ`	File-size resource limit exceeded	Terminate, core

The handler can take on two special values, SIG_IGN and SIG_DFL (both of which are defined through <signal.h>). If SIG_IGN is specified, the signal is ignored; SIG_DFL tells the kernel to perform the default action for the signal, usually killing the process or ignoring the signal. Two signals, SIGKILL and SIGSTOP, cannot be caught or ignored. The kernel always performs the default action for these two signals, killing the process and stopping the process, respectively.
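The two special values can be tried out with a short sketch such as the one below. The function names are ours and SIGUSR1 is an arbitrary choice. The first function ignores a signal and then restores the default action; the second checks that the kernel refuses any attempt to change the handling of SIGKILL and SIGSTOP, which signal() reports by returning SIG_ERR.

```c
#include <signal.h>

/* Hypothetical helper: returns 1 if SIGUSR1 could be ignored and then
 * restored to its default action. */
int ignore_then_restore(void)
{
    if (signal(SIGUSR1, SIG_IGN) == SIG_ERR)
        return 0;

    raise(SIGUSR1);     /* discarded; the process keeps running */

    /* signal() hands back the disposition that was in effect, SIG_IGN. */
    return signal(SIGUSR1, SIG_DFL) == SIG_IGN;
}

/* Hypothetical helper: returns 1 if the kernel refuses to change the
 * handling of both SIGKILL and SIGSTOP. */
int kill_and_stop_are_fixed(void)
{
    return signal(SIGKILL, SIG_IGN) == SIG_ERR
        && signal(SIGSTOP, SIG_IGN) == SIG_ERR;
}
```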

The signal() function returns the previous signal handler (which could have been SIG_IGN or SIG_DFL). Signal handlers are preserved when new processes are created by fork(), and any signals that are set to SIG_IGN remain ignored after an exec().^[2] All signals not being ignored are set to SIG_DFL after an exec().

^[2] This is the mechanism used by the nohup utility.

All this seems simple enough until you ask yourself: What will happen if SIGFOO is sent to a process that is already running a signal handler for SIGFOO? The obvious thing for the kernel to do is interrupt the process and run the signal handler again. This creates two problems. First, the signal handler must function properly if it is invoked while it is already running. Although this may be easy, signal handlers that manipulate program-wide resources, such as global data structures or files, need to be written very carefully. Functions that behave properly when they are called in this manner are called reentrant functions.^[3]

^[3] The need for reentrant functions is not limited to signal handlers. Multithreaded applications must take great care to ensure proper reentrancy and locking.

The simple locking techniques that are sufficient to coordinate data access between concurrent processes do not allow reentrancy. For example, the file-locking techniques presented in Chapter 13 cannot be used to allow a signal handler that manipulates a data file to be reentrant. When the signal handler is called the first time, it can lock the data file just fine and begin writing to it. If the signal handler is then interrupted by another signal while it holds the lock, the second invocation of the signal handler cannot lock the file, because the first invocation holds the lock. Unfortunately, the invocation that holds the lock is suspended until the invocation that wants the lock finishes running.

The difficulty of writing reentrant signal handlers is a major reason for the kernel not to deliver signals that a process is already handling. Such a model also makes it difficult for processes to handle a large number of signals that are being sent to the process very rapidly. As each signal results in a new invocation of the signal handler, the process's stack grows without bound, despite the program itself being well-behaved.

The first solution to this problem was ill-conceived. Before the signal handler was invoked, the handler for that signal was reset to SIG_DFL, and the signal handler was expected to set a more appropriate signal disposition as soon as it could. Although this did simplify writing signal handlers, it made it impossible for a developer to handle signals in a reliable fashion. If two occurrences of the same signal arrived in quick succession, the kernel handled the second one in the default fashion. That meant that the second signal was ignored (and lost forever) or the process was terminated. This signal implementation is known as unreliable signals because it makes it impossible to write well-behaved signal handlers.

Unfortunately, this is exactly the signal model used in the ANSI/ISO C standard.^[4] Although reliable signal APIs that fix these shortcomings are widespread, the unreliable signal() function standardized by ANSI/ISO will probably be around forever.

^[4] Well, not exactly. The ANSI/ISO C signal handling model is not as well specified as the one we just presented. It does, however, mandate that signal handlers be reset to SIG_DFL before a signal is delivered, forcing all ANSI/ISO C signal() functions to be unreliable.
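The reset-to-SIG_DFL behavior of unreliable signals can be reproduced on a modern system, which makes the problem easier to see. The sketch below does not rely on any particular signal() implementation; instead it asks for the old behavior explicitly through the sigaction() flag SA_RESETHAND. The function names are ours and SIGUSR1 is an arbitrary choice.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t calls = 0;

static void count_call(int signum)
{
    (void)signum;
    calls++;
}

/* Hypothetical helper: installs a one-shot handler, delivers the signal
 * once, and returns 1 if the disposition fell back to SIG_DFL afterward
 * (-1 on error). */
int disposition_was_reset(void)
{
    struct sigaction sa, now;

    calls = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_call;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;     /* reset to SIG_DFL as the handler is entered */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    raise(SIGUSR1);                 /* first occurrence: the handler runs */

    /* Query the current disposition without changing it. */
    if (sigaction(SIGUSR1, NULL, &now) != 0)
        return -1;

    /* A second raise(SIGUSR1) at this point would terminate the process. */
    return calls == 1 && now.sa_handler == SIG_DFL;
}
```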

12.1.3. Reliable Signals

The implementers of BSD realized that a solution to the multiple signals problem would be to simply wait until the process finishes handling the first signal to deliver the second signal. This ensures that both signals are received and removes the risk of stack overflows. Recall that when the kernel is holding a signal for later delivery, the signal is said to be pending.

However, if a process is sent SIGFOO while a SIGFOO signal is already pending, only one of those SIGFOO signals is delivered to the process. There is no way for a process to know how many times a signal was sent to it, as multiple signals may have been coalesced into one. This is not normally much of a problem, however. As signals do not carry any information other than the signal number with them, sending a signal twice in a very short period of time is usually the same as sending it a single time, so if the program receives the signal only once, it does not matter much. This is different from performing the default action on the second signal (which occurs with unreliable signals).^[5]

^[5] The POSIX Real Time Signal specification allows some signals to be queued and for signals to carry a limited amount of data, changing this model significantly. Real-time signals are discussed on pages 227-230.
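The coalescing of pending signals is easy to demonstrate. The sketch below assumes Linux behavior for ordinary (non-real-time) signals and a single-threaded process; the function names are ours. It blocks SIGUSR1, sends it twice, and then counts how many times the handler runs once the signal is unblocked.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t deliveries = 0;

static void count_delivery(int signum)
{
    (void)signum;
    deliveries++;
}

/* Hypothetical helper: sends SIGUSR1 twice while it is blocked and returns
 * the number of times the handler ran after unblocking (-1 on error). */
int deliveries_after_two_sends(void)
{
    struct sigaction sa;
    sigset_t block, old;

    deliveries = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    raise(SIGUSR1);     /* becomes pending */
    raise(SIGUSR1);     /* already pending: merged with the first */

    /* Restoring the mask delivers the single pending signal. */
    sigprocmask(SIG_SETMASK, &old, NULL);

    sa.sa_handler = SIG_DFL;        /* put the default action back */
    sigaction(SIGUSR1, &sa, NULL);
    return deliveries;
}
```

On Linux this should return 1, not 2: the second signal was neither lost to the default action nor delivered separately.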

The idea of a signal's being automatically blocked has been extended to allow a process to explicitly block signals. This makes it easy to protect critical pieces of code, while still handling all the signals that are sent. Such protection lets the signal handlers manipulate data structures that are maintained by other pieces of the code by providing simple synchronization.

Although BSD provided the basic signal model POSIX adopted, the POSIX standard committee made it simpler for system calls to modify the disposition of groups of signals by introducing new system calls that operate on sets of signals. A set of signals is represented by the data type sigset_t, and a set of macros is provided to manipulate it.^[6]

^[6] This is similar to the fd_set type used by the select() system call discussed in Chapter 13.
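As a brief illustration of signal sets, the following sketch builds a sigset_t, uses it to block a signal explicitly, and inspects the set of pending signals with sigpending(). The function name is ours, and the choice of SIGUSR1 and SIGUSR2 is arbitrary.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the set operations behave as expected
 * and SIGUSR1 shows up as pending while it is blocked (-1 on error). */
int usr1_pending_while_blocked(void)
{
    sigset_t set, old, pending;
    int members_ok, was_pending, signo;

    sigemptyset(&set);              /* start with an empty set */
    sigaddset(&set, SIGUSR1);
    sigaddset(&set, SIGUSR2);
    sigdelset(&set, SIGUSR2);       /* the set now holds only SIGUSR1 */
    members_ok = sigismember(&set, SIGUSR1) == 1
              && sigismember(&set, SIGUSR2) == 0;

    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    raise(SIGUSR1);                 /* held by the kernel while blocked */
    sigpending(&pending);
    was_pending = sigismember(&pending, SIGUSR1) == 1;

    /* Accept the pending signal so it is not delivered when we unblock. */
    sigwait(&set, &signo);
    sigprocmask(SIG_SETMASK, &old, NULL);

    return members_ok && was_pending;
}
```

In a real program, the code between the two sigprocmask() calls would be the critical section that must not be interrupted by the handler.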

12.1.4. Signals and System Calls

A signal is often delivered to a process that is waiting for an external event to occur. For instance, a text editor is often waiting for read() to return input from the terminal. When the system administrator sends the process a SIGTERM signal (the normal signal sent by the kill command, allowing a process to terminate cleanly), the process could handle it in a few ways:

It could make no attempt to catch the signal and be terminated by the kernel (the default handling of SIGTERM). This would leave the user's terminal in a nonstandard configuration, making it difficult for them to continue.
It could catch the signal, have the signal handler clean up the terminal, and then exit. Although this is appealing, in complex programs it is difficult to write a signal handler that knows enough about what the program was doing when it was interrupted to clean it up properly.
It could catch the signal, set a flag indicating that the signal occurred, and somehow cause the blocked system call (in this case, read()) to exit with an error indicating something unusual happened. The normal execution pathway could then check for the flag and handle it appropriately.

As the final choice seems much cleaner and easier than the others, the original signal implementation caused slow system calls to return EINTR when they were interrupted by a signal, whereas fast system calls were completed before the signal was delivered.
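The third approach can be sketched as follows. This is only an illustration under several assumptions: an interval timer delivering SIGALRM stands in for the administrator's SIGTERM, an empty pipe stands in for the terminal, and the function names are ours. The handler only sets a flag; the blocked read() fails with EINTR, and the normal code path can then check the flag.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t terminate_requested = 0;

static void note_signal(int signum)
{
    (void)signum;
    terminate_requested = 1;        /* just record that the signal occurred */
}

/* Hypothetical helper: returns 1 if a blocked read() failed with EINTR and
 * the handler set the flag (-1 if the pipe could not be created). */
int read_interrupted_by_signal(void)
{
    struct sigaction sa;
    struct itimerval tick, off;
    int fds[2], saved_errno;
    ssize_t n;
    char c;

    terminate_requested = 0;
    if (pipe(fds) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                /* do not restart interrupted system calls */
    sigaction(SIGALRM, &sa, NULL);

    /* Send SIGALRM every 50 ms; repeating avoids a hang if the first
     * signal arrives before read() has started. */
    memset(&tick, 0, sizeof tick);
    tick.it_value.tv_usec = 50000;
    tick.it_interval.tv_usec = 50000;
    setitimer(ITIMER_REAL, &tick, NULL);

    n = read(fds[0], &c, 1);        /* nothing was written, so this blocks */
    saved_errno = errno;

    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);     /* stop the timer */
    close(fds[0]);
    close(fds[1]);

    return n == -1 && saved_errno == EINTR && terminate_requested == 1;
}
```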

Slow system calls take an indeterminate amount of time to complete. System calls that wait for unpredictable resources, such as other processes, network data, or a Homo sapiens to perform some action, are considered slow. The wait() family of system calls, for example, does not normally return until a child process exits. As there is no way to know how long that may take, wait() is a slow system call. File access system calls are considered slow if they access slow files, and fast if they access fast files.^[7]

^[7] The difference between fast files and slow files is the same as the difference between fast and slow system calls and is discussed in more detail on page 167.

It was the process's job to handle EINTR and restart system calls as necessary. Although this provided all the functionality people needed, it made it more difficult to write code that handled signals. Every time read() was called on a slow file descriptor, the code had to be modified to check for EINTR and restart the call, or the code might not perform as expected.
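The check-and-restart logic usually takes the form of a small loop around the system call. The wrapper below is a sketch of that idiom; the name read_retry() is ours and is not part of any library.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: behaves like read(), but restarts the call whenever
 * a signal interrupts it before any data has been transferred. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    return n;       /* bytes read, 0 at end of file, or -1 for a real error */
}
```

A program that wants to notice the signal instead of ignoring the interruption would test a flag set by its signal handler inside the loop before retrying.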

To "simplify" things, 4.2BSD automatically restarted certain system calls (notably read() and write()). For the most common operations, programs no longer needed to worry about EINTR because the system call would continue after the process handled the signal. Later versions of Unix changed which system calls would be automatically restarted, and 4.3BSD allows you to choose whether to restart system calls. The POSIX signal standard does not specify which behavior should be used, but all popular systems agree on how to handle this case. By default, system calls are not restarted, but for each signal, the process can set a flag that indicates that it would like the system to automatically restart system calls interrupted by that signal.
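The per-signal restart flag is SA_RESTART, set through sigaction(). The sketch below illustrates it under a few assumptions of ours: a one-shot timer supplies the signal, and the handler itself writes a byte into a pipe so that the restarted read() has something to return. The function names are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;
static volatile sig_atomic_t handler_ran = 0;

/* Handler: writes one byte into the pipe. write() is async-signal-safe. */
static void wake_reader(int signum)
{
    int saved_errno = errno;

    (void)signum;
    handler_ran = 1;
    if (write(wake_fd, "x", 1) == -1) {
        /* nothing useful can be done about a failure here */
    }
    errno = saved_errno;
}

/* Hypothetical helper: returns 1 if read() returned the byte written by
 * the handler instead of failing with EINTR (-1 on setup failure). */
int read_is_restarted(void)
{
    struct sigaction sa;
    struct itimerval once;
    int fds[2];
    ssize_t n;
    char c = 0;

    handler_ran = 0;
    if (pipe(fds) != 0)
        return -1;
    wake_fd = fds[1];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wake_reader;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;       /* restart calls interrupted by SIGALRM */
    sigaction(SIGALRM, &sa, NULL);

    memset(&once, 0, sizeof once);
    once.it_value.tv_usec = 50000;  /* one SIGALRM after 50 ms */
    setitimer(ITIMER_REAL, &once, NULL);

    /* Blocks on the empty pipe. If the signal interrupts the call, the
     * kernel restarts it after the handler returns; if the signal arrived
     * earlier, the byte is simply already there. */
    n = read(fds[0], &c, 1);

    close(fds[0]);
    close(fds[1]);
    return n == 1 && c == 'x' && handler_ran == 1;
}
```

With sa_flags set to 0 instead, the same read() would be expected to fail with EINTR and the caller would have to retry it.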
