Signals

UNIX programs often interact with their environment and other programs through the use of signals. Signals are software interrupts that the kernel raises in a process at the behest of other processes, or as a reaction to events that occur in the kernel.

Note

The Windows POSIX subsystem is capable of dealing with signals as well, but they are primarily a UNIX feature.

Each process defines how to handle its incoming signals by choosing to associate one of the following actions with a signal:

Ignoring the signal: A process can ignore a signal by informing the kernel that it wants to ignore the signal. Two signals can't be ignored: SIGKILL and SIGSTOP. SIGKILL always kills a process, and SIGSTOP always stops a process.
Blocking the signal: A process can postpone handling a signal by blocking it, in which case the signal remains pending until the process unblocks it. As with ignoring, the SIGKILL and SIGSTOP signals can't be blocked.
Installing a signal handler: A process can install a signal handler, which is a function called when a signal is delivered. This function is called completely asynchronously: When a signal is delivered, the execution context of a process is suspended, and a new one is created where execution starts in the designated signal handler function. When that handler returns, execution resumes where it left off.

If a process doesn't indicate specifically how it deals with a particular signal, then a default action will be taken. Table 13-2 lists the signals provided by a typical POSIX-compliant implementation and the default actions associated with those signals. This table is taken from the Linux signal(7) man page.

Table 13-2. Signals and Their Default Actions
Signal Number	Signal Name	Meaning	Default Action
1	`SIGHUP`	Hang up from controlling terminal	Terminate
2	`SIGINT`	Interrupt	Terminate
3	`SIGQUIT`	Quit	Core dump
4	`SIGILL`	Illegal instruction	Core dump
5	`SIGTRAP`	Software trap	Core dump
6	`SIGABRT`	Abort	Core dump
7	`SIGEMT`	EMT instruction	Terminate
8	`SIGFPE`	Floating point exception	Core dump
9	`SIGKILL`	Kill	Terminate
10	`SIGBUS`*	Data bus error	Core dump
11	`SIGSEGV`	Segmentation fault	Core dump
12	`SIGSYS`*	Invalid system call parameter	Core dump
13	`SIGPIPE`	Write to a pipe when there's no process to read from it	Terminate
14	`SIGALRM`	Alarm	Terminate
15	`SIGTERM`	Terminate	Terminate
16	`SIGURG`	Urgent data on I/O channel	Ignored
17	`SIGSTOP`	Stop process	Stop
18	`SIGTSTP`	Interactive stop	Stop
19	`SIGCONT`	Continue	Continue a stopped process
20	`SIGCHLD`	Child exited	Ignored
21	`SIGTTIN`	Background read attempt from terminal	Stop
22	`SIGTTOU`	Background write attempt from terminal	Stop
23	`SIGIO`	I/O available or completed	Terminate
24	`SIGXCPU`	CPU time limit exceeded	Core dump
25	`SIGXFSZ`	File size limit exceeded	Core dump
26	`SIGVTALRM`	Virtual time alarm	Terminate
27	`SIGPROF`	Profiling time alarm	Terminate
28	`SIGWINCH`	Window size change	Ignored
29	`SIGINFO`	Information request	Terminate
30	`SIGUSR1`	User-defined signal	Terminate
31	`SIGUSR2`	User-defined signal	Terminate

Note that the numbers assigned to signals might vary among operating systems and architectures, and not all signals are available on all architectures. For example, SIGBUS isn't defined for machines with an Intel architecture, but is defined for machines with a Sun SPARC architecture. If a signal isn't defined for a specific architecture, it might be ignored instead of performing the default action listed in Table 13-2.

Each process has a signal mask, which is a bitmask describing which signals should be blocked by a process and which signals should be delivered. A process can block a signal by altering this signal mask, as you see shortly in "Handling Signals."

Signal handling is an important part of many UNIX applications. Although signals are a fairly simple mechanism, there are some subtleties to dealing with them correctly when implementing software. So before you move on to signal-related problems, the following sections briefly describe the signal API.

Sending Signals

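The kill() system call is used to send a signal to a process. You can also test whether a process is present by sending it signal zero (which performs the existence and permission checks without delivering a signal), or by trying an invalid signal, and then looking at the resulting error: a permission denied error indicates that the process exists.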
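The signal-zero probe can be sketched as follows. This is a minimal illustration, not code from any particular program; the helper name process_exists() is hypothetical, and it assumes the POSIX error codes ESRCH (no such process) and EPERM (process exists, but the caller may not signal it).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a process with this PID exists,
 * 0 if it does not, and -1 for an unusable PID or unexpected error.
 */
int process_exists(pid_t pid)
{
    if (pid <= 0)
        return -1;      /* 0 and negative values address process groups */

    if (kill(pid, 0) == 0)
        return 1;       /* exists, and we would be allowed to signal it */
    if (errno == EPERM)
        return 1;       /* exists, but we lack permission to signal it */
    if (errno == ESRCH)
        return 0;       /* no such process */
    return -1;
}
```

Note that the answer can be stale by the time it is used: the process might exit, and its PID might be reused, immediately after the check.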

To send a signal to a process in Linux and Solaris, the sender must be the superuser or have a real or effective user ID equal to the receiver's real or saved set user ID. However, a sender can always send SIGCONT to a process in its session.

To send a signal to a process in the BSD OSs, the sender must be the superuser, or the real or effective user IDs must match the receiver's real or effective user IDs. Note that this means a daemon that temporarily assumes the role of an unprivileged user with seteuid() opens itself to signals being delivered from that user.

Earlier versions of Linux had the same behavior as BSD. For example, if the Network File System (NFS) userland daemon temporarily set its effective user ID to that of a normal user, that normal user could send signals to the daemon and potentially kill it. This is what precipitated the introduction of file system user IDs (FSUIDs) in Linux. They are now largely redundant in Linux because temporarily assuming an effective user ID no longer exposes a daemon to signals.

FTP daemons are another good example of a situation in which a daemon running as root assumes the effective user permissions of a nonprivileged user. If a normal user logs in to an FTP daemon, the daemon uses that user's effective user ID so that it can perform file system interaction safely. On a BSD system, therefore, if that same user is logged in to a shell, he or she can send signals to the daemon and kill it. In previous versions, this had more significant consequences, as a core dump often contained password information from the system authentication database.

OpenBSD has a unique restriction: A nonroot user can send only the following signals to a setuid or setgid process: SIGKILL, SIGINT, SIGTERM, SIGSTOP, SIGTTIN, SIGTTOU, SIGTSTP, SIGHUP, SIGUSR1, SIGUSR2, and SIGCONT.

Handling Signals

There are a number of ways to instruct a process how to respond to a signal. First, the signal() function installs a handler routine for the specified signal. The semantics from the man page are shown in the following prototype:

    #include <signal.h>

    typedef void (*sighandler_t)(int);

    sighandler_t signal(int signum, sighandler_t handler);

The signum parameter indicates what signal to handle, and the handler argument indicates the routine that should be called for this signal. The signal() function returns the old handler for the specified signal. Instead of specifying a new signal-handling routine, the developer can elect to specify one of two constants for the handler parameter: SIG_IGN if the signal should be ignored and SIG_DFL if the default action should be taken when a signal is received.
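As a rough sketch of these options, the following installs a handler for one signal and ignores another. The helper names (install_handlers(), usr1_seen()) and the choice of SIGUSR1 and SIGPIPE are assumptions made purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Set from the handler; sig_atomic_t can be written safely from a handler. */
static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int signum)
{
    (void)signum;
    got_usr1 = 1;
}

/* Hypothetical helper: catch SIGUSR1 and ignore SIGPIPE. Returns 0 on success. */
int install_handlers(void)
{
    if (signal(SIGUSR1, on_usr1) == SIG_ERR)
        return -1;
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;
    return 0;
}

/* Returns nonzero once SIGUSR1 has been delivered. */
int usr1_seen(void)
{
    return got_usr1 != 0;
}
```

Because signal() returns the previous disposition, a later call such as signal(SIGPIPE, SIG_DFL) made after install_handlers() would return SIG_IGN.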

Note

The default action varies depending on what signal is received. For example, the default action for SIGSEGV is to create a core image and terminate the process. The default action for SIGSTOP is to stop (suspend) the process until it receives SIGCONT. The default actions for each signal were presented earlier in Table 13-2.

Developers can also set handlers via the sigaction() interface, which has the following prototype:

    #include <signal.h>

    int sigaction(int sig, const struct sigaction *act,
                  struct sigaction *oact);

This interface enables you to set and retrieve slightly more detailed attributes for each signal an application handles. These attributes are supplied in the form of the sigaction structure, which is roughly defined like this:

    struct sigaction {
        void      (*sa_handler)(int);
        void      (*sa_sigaction)(int, siginfo_t *, void *);
        sigset_t  sa_mask;
        int       sa_flags;
    };

The exact structure definition varies slightly between implementations. Basically, there are two function pointers: one to a signal handler (sa_handler) and one to a signal catcher (sa_sigaction). Developers set one or the other to be called upon receipt of the specified signal.

Note

Which handler is called from the sigaction structure: the handler (sa_handler) or the catcher (sa_sigaction)? It depends on the sa_flags member in the structure. If the SA_SIGINFO flag is set, sa_sigaction is called. Otherwise, sa_handler is called. In reality, because you are supposed to specify only one and can't define both, often these two structure members are coded as a union, so defining one overrides a previous definition of the other.

The sa_mask field describes a set of signals that should be blocked while the signal handler is running, and the sa_flags member describes some additional behavioral characteristics for how to handle the signal, which are mentioned in "Signal Vulnerabilities" later in this chapter.
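The following is a minimal sketch of installing a catcher through sigaction(). The names install_catcher() and caught_from_self() are hypothetical, and adding SIGTERM to sa_mask is only an example of blocking an extra signal while the catcher runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t sent_by_self = 0;

/* Three-argument catcher; it is used because SA_SIGINFO is set below. */
static void catcher(int signo, siginfo_t *info, void *context)
{
    (void)context;
    last_signo = signo;
    /* si_pid identifies the sender for signals sent with kill(). */
    sent_by_self = (info->si_pid == getpid());
}

/* Hypothetical helper: install the catcher for signo. Returns 0 on success. */
int install_catcher(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = catcher;          /* set the catcher, not sa_handler */
    sa.sa_flags = SA_SIGINFO;           /* selects sa_sigaction */
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGTERM);    /* also block SIGTERM while running */
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if the last caught signal was signo and we sent it ourselves. */
int caught_from_self(int signo)
{
    return last_signo == signo && sent_by_self;
}
```

For instance, after install_catcher(SIGUSR1) succeeds, calling kill(getpid(), SIGUSR1) runs the catcher with the sender's PID available in info->si_pid.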

The following function is used to change the process signal mask so that previously blocked signals can be delivered or to block the delivery of certain signals:

    int sigprocmask(int how, const sigset_t *set, sigset_t *oset);

The how argument specifies how the set parameter should be interpreted and can take one of three values:

SIG_BLOCK: Indicates that the set parameter contains a set of signals to be added to the process signal mask
SIG_UNBLOCK: Indicates that the set parameter contains a set of signals to be unblocked from the current signal mask
SIG_SETMASK: Indicates that the set parameter should replace the current signal mask

The oset parameter is filled in with the previous signal mask of the process.

In addition to these functions, you can make a multitude of other signal-related library calls. Only the ones to declare signal handlers and set actions are described in the following sections.

Jump Locations

On UNIX systems, you can return to a point in a program from any other point in a program contingent on a certain condition. To do this, you use setjmp(), longjmp(), sigsetjmp(), and siglongjmp(). Although these functions aren't part of the signal API, they are quite relevant, as they are often used in signal-handling routines to return to a certain location in the program in order to continue processing after a signal has been caught.

The setjmp() function is used to designate a point in the program to which execution control is returned when the longjmp() function is called:

    int setjmp(jmp_buf env);
    void longjmp(jmp_buf env, int val);

The context the program is in when setjmp() is called is restored when returned to via longjmp(); that is, the register contents are reset to the state they were in when setjmp() was originally called, including the program counter and stack pointer, so that execution can continue at that point. A return value of 0 indicates a direct call of setjmp(), and a value of nonzero indicates that execution has returned to this point from a longjmp(). The val parameter supplied to longjmp() indicates what setjmp() returns when longjmp() is called. Because longjmp() hands execution off to a different part of the program, it doesn't return. Here's an example of these two functions in action:

    #include <setjmp.h>

    jmp_buf env;

    int process_message(int sock)
    {
        struct pkt_header header;

        for(;;)
        {
            /* a longjmp(env, ...) from deeper code resumes here */
            if(setjmp(env) != 0)
                log("Invalid request received, ignoring message");

            if(read_packet_header(sock, &header) < 0)
                return -1;

            switch(header.type)
            {
                case USER:
                    parse_username_request(sock);
                    break;
                case PASS:
                    parse_password_request(sock);
                    break;
                case OPEN:
                    parse_openfile_request(sock);
                    break;
                case QUIT:
                    parse_quit_request(sock);
                    break;
                default:
                    log("invalid message");
                    break;
            }
        }
    }

Say you had a function such as the one in this example, and then several functions deep from the parse_openfile_request(), you had the following function for opening a file on the system:

    int open_file_internal(unsigned char *filename)
    {
        /* reject path traversal by jumping back into process_message() */
        if(strstr((char *)filename, "../"))
            longjmp(env, 1);

        ... open file ...
    }

In this case, the longjmp() call causes the program to restart execution at the location of the corresponding setjmp() function, in process_message(). The setjmp() function will return a nonzero value (in this case, 1, because 1 was specified as the second parameter to longjmp()).
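A self-contained version of this pattern might look like the following sketch. The function names (process_request(), handle_request(), open_file_checked()) are hypothetical stand-ins for the message loop and the nested parsing functions, and no file is actually opened.

```c
#include <setjmp.h>
#include <string.h>

static jmp_buf env;

/* Deepest layer: bails out with longjmp() on a suspicious filename. */
static void open_file_checked(const char *filename)
{
    if (strstr(filename, "../") != NULL)
        longjmp(env, 1);    /* setjmp() in process_request() returns 1 */

    /* a real implementation would open the file here */
}

/* Intermediate layer, standing in for several nested calls. */
static void handle_request(const char *filename)
{
    open_file_checked(filename);
}

/* Returns 0 if the request was accepted, 1 if it was rejected via longjmp(). */
int process_request(const char *filename)
{
    if (setjmp(env) != 0)
        return 1;           /* reached only through longjmp() */

    handle_request(filename);
    return 0;
}
```

Calling process_request("notes.txt") returns 0, whereas process_request("../etc/passwd") unwinds through handle_request() and returns 1.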

There are also two other very similar functions, sigsetjmp() and siglongjmp(), that are used to achieve a similar effect, except that they take process signal masks into consideration as well. This is achieved through the savesigs parameter passed to sigsetjmp():

    int sigsetjmp(sigjmp_buf env, int savesigs);
    void siglongjmp(sigjmp_buf env, int val);

If the savesigs value is nonzero, the signal mask of the process at the time sigsetjmp() is called is also saved so that when siglongjmp() is called, it can be restored. In the next section, you see why mixing these functions with signal handlers is a dangerous practice.

Signal Vulnerabilities

A signal-handling routine can be called at any point during program execution, from the moment the handler's installed until the point it's removed. Therefore, any actions that take place between those two points in time can be interrupted. Depending on what the signal handler does, this interruption could turn out to be a security vulnerability. To understand the text in this section, you must be familiar with the term asynchronous-safe (sometimes referred to as async-safe, or signal-safe). An asynchronous-safe function is a function that can safely and correctly run even if it is interrupted by an asynchronous event, such as a signal handler or interrupting thread. An asynchronous-safe function is by definition reentrant, but has the additional property of correctly dealing with signal interruptions. Generally speaking, all signal handlers need to be asynchronous-safe; the reasons why will become clear throughout this section.

Basic Interruption

The first problem with handling signals occurs when the handler relies on some sort of global program state, such as the assumption that global variables are initialized when in fact they aren't. Listing 13-1 presents a short example.

Listing 13-1.

    char *user;

    void cleanup(int sig)
    {
        printf("caught signal! Cleaning up..\n");
        free(user);
        exit(1);
    }

    int main(int argc, char **argv)
    {
        signal(SIGTERM, cleanup);
        signal(SIGINT, cleanup);

        ... do stuff ...

        process_file(fd);

        free(user);
        close(fd);

        printf("bye!\n");
        return 0;
    }

    int process_file(int fd)
    {
        char buffer[1024];

        ... read from file into buffer ...

        user = malloc(strlen(buffer)+1);
        strcpy(user, buffer);

        ... do stuff ...

        return 0;
    }

The problem with this code is that cleanup() can be called at any time after it's installed to handle the SIGTERM and SIGINT signals. If either signal is sent to the process before process_file() is called, the user variable hasn't been assigned yet. This isn't much of a problem because user is a global variable, so its initial value is NULL, and free(NULL) does nothing. However, what if a signal is delivered after free(user) and before the program exits? The user variable is deallocated with the free() function twice! That's definitely not good. You would be in even more trouble if the signal handler didn't exit the program, because a signal could be sent during the strcpy() operation to free the buffer being copied into. The function would continue to copy data into a freed heap chunk, which can lead to memory corruption and possibly arbitrary code execution.
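One common way to avoid this class of bug, sketched here under the assumption that the program has a main loop that can poll a flag, is to have the handler do nothing but record the request and to perform all cleanup from normal program flow. This is a general technique, not the fix applied to any specific program discussed in this chapter, and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

static volatile sig_atomic_t shutdown_requested = 0;
static char *user = NULL;

/* The handler only sets a flag; it never touches the heap. */
static void request_shutdown(int sig)
{
    (void)sig;
    shutdown_requested = 1;
}

/* Hypothetical helper: returns 0 if both handlers were installed. */
int install_shutdown_handlers(void)
{
    struct sigaction sa;

    sa.sa_handler = request_shutdown;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGINT, &sa, NULL);
}

/* Polled by the main loop between units of work. */
int shutdown_pending(void)
{
    return shutdown_requested != 0;
}

/* Stores a private copy of name in the global. Returns 0 on success. */
int set_user(const char *name)
{
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);

    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    free(user);
    user = copy;
    return 0;
}

/* Called only from normal program flow, so it cannot interrupt itself. */
void cleanup(void)
{
    free(user);
    user = NULL;
}
```

The main loop would then check shutdown_pending() at convenient points and call cleanup() before exiting, so free() is never reached from a signal handler.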

In order to see how a bug of this nature might look in production code, take a look at a real-world example: OpenSSH. The following signal-handling routine is installed in OpenSSH in the main() function. It is called when OpenSSH receives an alarm signal (SIGALRM), the intention being to limit the amount of time a connecting client has to complete a successful login:

    static void
    grace_alarm_handler(int sig)
    {
        /* XXX no idea how fix this signal handler */
        if (use_privsep && pmonitor != NULL && pmonitor->m_pid > 0)
            kill(pmonitor->m_pid, SIGALRM);

        /* Log error and exit. */
        fatal("Timeout before authentication for %s", get_remote_ipaddr());
    }

Most of this code is not that interesting, except for the call to fatal(). If you examine the implementation of fatal() in the OpenSSH source code, you can see it calls the cleanup_exit() function, which in turn calls do_cleanup() to deallocate global structures and exit the process. The do_cleanup() implementation is shown.

    void
    do_cleanup(Authctxt *authctxt)
    {
        static int called = 0;

        debug("do_cleanup");

        /* no cleanup if you're in the child for login shell */
        if (is_child)
            return;

        /* avoid double cleanup */
        if (called)
            return;
        called = 1;

        if (authctxt == NULL)
            return;

    #ifdef KRB5
        if (options.kerberos_ticket_cleanup &&
            authctxt->krb5_ctx)
            krb5_cleanup_proc(authctxt);
    #endif

        ... more stuff ...

        /*
         * Cleanup ptys/utmp only if privsep is disabled
         * or if running in monitor.
         */
        if (!use_privsep || mm_is_monitor())
            session_destroy_all(session_pty_cleanup2);
    }

As you can see, the do_cleanup() function is somewhat reentrant, because it checks whether it has already been called, and if it has, it just returns immediately. This prevents fatal() from calling itself, or being interrupted by a signal that results in a call to fatal(), such as the grace_alarm_handler() function. However, any functions called in do_cleanup() are also required to be reentrant if they're called elsewhere in the program. If any called function is not reentrant, then it would be possible for the vulnerable function to be interrupted by the SIGALRM signal, which will eventually lead to the same non-reentrant function being invoked again. Now take a look at the krb5_cleanup_proc() function:

    void
    krb5_cleanup_proc(Authctxt *authctxt)
    {
        debug("krb5_cleanup_proc called");

        if (authctxt->krb5_fwd_ccache) {
            krb5_cc_destroy(authctxt->krb5_ctx, authctxt->krb5_fwd_ccache);
            authctxt->krb5_fwd_ccache = NULL;
        }
        if (authctxt->krb5_user) {
            krb5_free_principal(authctxt->krb5_ctx,
                authctxt->krb5_user);
            authctxt->krb5_user = NULL;
        }
        if (authctxt->krb5_ctx) {
            krb5_free_context(authctxt->krb5_ctx);
            authctxt->krb5_ctx = NULL;
        }
    }

This function simply frees a series of elements and sets them to NULL, thus preventing potential double-free scenarios. However, the krb5_user element is a structure composed of a number of pointers to strings designated by the client and limited by how much input OpenSSH accepts, which is quite a lot. The Kerberos library essentially frees these pointers one by one in a loop. After the krb5_user element is cleaned up, the authctxt->krb5_user element is set to NULL. Although this makes the function less susceptible to reentrancy problems, it is still not entirely safe. If this function were to be interrupted while deallocating the individual strings contained within krb5_user, then it is possible that krb5_user could be accessed when it is in an inconsistent state.

The krb5_user variable is filled out by krb5_parse_name(), which is called by auth_krb5_password() when authenticating clients using Kerberos authentication. The auth_krb5_password() implementation is shown:

    int
    auth_krb5_password(Authctxt *authctxt, const char *password)
    {
        krb5_error_code problem;
        krb5_ccache ccache = NULL;
        int len;

        temporarily_use_uid(authctxt->pw);

        problem = krb5_init(authctxt);
        if (problem)
            goto out;

        problem = krb5_parse_name(authctxt->krb5_ctx,
            authctxt->pw->pw_name,
            &authctxt->krb5_user);
        if (problem)
            goto out;

    #ifdef HEIMDAL
        problem = krb5_cc_gen_new(authctxt->krb5_ctx,
            &krb5_mcc_ops, &ccache);
        if (problem)
            goto out;

        problem = krb5_cc_initialize(authctxt->krb5_ctx, ccache,
            authctxt->krb5_user);
        if (problem)
            goto out;

        restore_uid();

        problem = krb5_verify_user(authctxt->krb5_ctx,
            authctxt->krb5_user, ccache, password, 1, NULL);

        ... more stuff ...

     out:
        restore_uid();

        if (problem) {
            if (ccache)
                krb5_cc_destroy(authctxt->krb5_ctx, ccache);

            ... more stuff ...

            krb5_cleanup_proc(authctxt);

            if (options.kerberos_or_local_passwd)
                return (-1);
            else
                return (0);
        }
        return (authctxt->valid ? 1 : 0);
    }

When an error occurs at any point during the auth_krb5_password() function, krb5_cleanup_proc() is called. This error normally occurs when krb5_verify_user() is called for a user lacking valid credentials. So, what would happen if krb5_cleanup_proc() is in the process of freeing thousands of strings when the signal timeout occurs? The signal handler is called, which in turn calls krb5_cleanup_proc() again. This second call to krb5_cleanup_proc() receives the krb5_user element, which is not NULL because it's already in the middle of processing; so krb5_cleanup_proc() once again starts deallocating all of the already deallocated string elements in this structure, which could lead to exploitable memory corruption.

Non-Returning Signal Handlers

Non-returning signal handlers are those that never return execution control back to the interrupted function. There are two ways this can happen: the signal handler can explicitly terminate the process by calling exit(), or the signal handler can return to another part of the application using longjmp(). It's generally safe for a signal handler to simply terminate the program. However, a signal handler that uses longjmp() to return to another part of the application is very unlikely to be completely asynchronous-safe, because any of the code reachable via the signal handler must be asynchronous-safe as well. This section will focus on the various problems that can arise from attempting to restart execution using the longjmp() function.

To see this in action, consider the Sendmail SMTP server signal race vulnerability. It occurs when reading e-mail messages from a client. The collect() function responsible for reading e-mail messages is shown in part:

    void
    collect(fp, smtpmode, hdrp, e, rsetsize)
        SM_FILE_T *fp;
        bool smtpmode;
        HDR **hdrp;
        register ENVELOPE *e;
        bool rsetsize;
    {
        ... other declarations ...
        volatile time_t dbto;
        ...
        dbto = smtpmode ? TimeOuts.to_datablock : 0;

        /*
        **  Read the message.
        **
        **    This is done using two interleaved state machines.
        **    The input state machine is looking for things like
        **    hidden dots; the message state machine is handling
        **    the larger picture (e.g., header versus body).
        */

        if (dbto != 0)
        {
            /* handle possible input timeout */
            if (setjmp(CtxCollectTimeout) != 0)
            {
                if (LogLevel > 2)
                    sm_syslog(LOG_NOTICE, e->e_id,
                          "timeout waiting for input from %s during message collect",
                          CURHOSTNAME);
                errno = 0;
                if (smtpmode)
                {
                    /*
                    **  Override e_message in usrerr() as this
                    **  is the reason for failure that should
                    **  be logged for undelivered recipients.
                    */
                    e->e_message = NULL;
                }
                usrerr("451 4.4.1 timeout waiting for input during message collect");
                goto readerr;
            }
            CollectTimeout = sm_setevent(dbto, collecttimeout,
                dbto);
        }

This block of code essentially sets up a handler for the SIGALRM signal, which is called when dbto seconds have elapsed. Sendmail uses an event abstraction instead of just using signals, but the call to sm_setevent() instructs Sendmail to call the collecttimeout() function when the time dbto indicates has expired. Notice the setjmp() call, indicating that you return to this function later. When the corresponding longjmp() occurs, you can see that you log some kind of message and then jump to readerr, which logs some sender information and then returns to the main Sendmail SMTP processing code. Now look at how collecttimeout() works:

    static void
    collecttimeout(timeout)
        time_t timeout;
    {
        int save_errno = errno;

        /*
        **  NOTE: THIS CAN BE CALLED FROM A SIGNAL HANDLER. DO NOT ADD
        **    ANYTHING TO THIS ROUTINE UNLESS YOU KNOW WHAT YOU ARE
        **    DOING.
        */

        if (CollectProgress)
        {
            /* reset the timeout */
            CollectTimeout = sm_sigsafe_setevent(timeout,
                 collecttimeout, timeout);
            CollectProgress = false;
        }
        else
        {
            /* event is done */
            CollectTimeout = NULL;
        }

        /* if no progress was made or problem resetting event,
           die now */
        if (CollectTimeout == NULL)
        {
            errno = ETIMEDOUT;
            longjmp(CtxCollectTimeout, 1);
        }
        errno = save_errno;
    }

In certain cases, the collecttimeout() function can issue a call to longjmp(), which will return back into collect(). This alone should be setting off alarm bells in your head; the presence of this longjmp() call virtually guarantees that this function isn't asynchronous-safe because you already know that the target of the jump winds up back in the main SMTP processing code. So if this signal-handling routine is called when any non-asynchronous-safe operation is being conducted, and you can reach that code again from the SMTP processing code, you have a bug. As it turns out, there are a few non-asynchronous-safe operations; the most dangerous is the logging function sm_syslog():

    sm_syslog(level, id, fmt, va_alist)
        int level;
        const char *id;
        const char *fmt;
        va_dcl
    #endif /* __STDC__ */
    {
        static char *buf = NULL;
        static size_t bufsize;
        char *begin, *end;
        int save_errno;
        int seq = 1;
        int idlen;
        char buf0[MAXLINE];
        char *newstring;
        extern int SyslogPrefixLen;
        SM_VA_LOCAL_DECL

        ... initialization ...

        if (buf == NULL)
        {
            buf = buf0;
            bufsize = sizeof buf0;
        }

        ... try to fit log message in buf, else reallocate it
            on the heap

        if (buf == buf0)
            buf = NULL;
        errno = save_errno;
    }

This code might need a little explanation because it has been edited to fit the page. The sm_syslog() function has a static character pointer buf, which is initialized to NULL. On function entry, it is immediately set to point to a stack buffer. If the message being logged is too large, a bigger buffer on the heap is allocated to hold the log message. In this case, the heap buffer is retained for successive calls to sm_syslog(), since buf is static. Otherwise, buf is just set back to NULL and uses a stack buffer again next time. So, what would happen if you interrupt this function with collecttimeout()? The call to longjmp() in collecttimeout() would invalidate part of the stack (remember, longjmp() resets program stack and frame pointers to what they were when setjmp() was called), but the static buf variable isn't reset to NULL; it points to an invalidated region of the stack. Therefore, the next time sm_syslog() is called, buf is not NULL (indicating that a heap buffer has been allocated, although in this case buf is really pointing to a stack location), so the log message is written to the wrong part of the stack!

When you are attempting to evaluate whether code is asynchronous-safe, you must account for the entire state of the program, not just global variables. The state of the program can also include static variables, privilege levels, open and closed file descriptors, the process signal mask, and even local stack variables. This last item might seem counter-intuitive since stack variables only have a local scope inside the function that declares them. However, consider the fact that a function might be interrupted at any point during execution by a signal, and then a different part of the function is returned to through the use of longjmp(). In this scenario, it is possible that stack variables used by that function are not in an expected state.

A security researcher from the FreeBSD project named David Greenman pointed out a perfect example of exploiting a state change bug in WU-FTPD v2.4, which is detailed in a mail he sent to the bugtraq security mailing list (archived at http://seclists.org/bugtraq/1997/Jan/0011.html). Essentially, the program installed two signal handlers, one to handle SIGPIPE and one to handle SIGURG. The SIGPIPE handler is shown in Listing 13-2.

Listing 13-2. Signal Race Vulnerability in WU-FTPD

    static void
    lostconn(signo)
        int signo;
    {
        if (debug)
            syslog(LOG_DEBUG, "lost connection");
        dologout(-1);
    }

    /*
     * Record logout in wtmp file
     * and exit with supplied status.
     */
    void
    dologout(status)
        int status;
    {
        if (logged_in) {
            (void) seteuid((uid_t)0);
            logwtmp(ttyline, "", "");
    #if defined(KERBEROS)
            if (!notickets && krbtkfile_env)
                unlink(krbtkfile_env);
    #endif
        }
        /* beware of flushing buffers after a SIGPIPE */
        _exit(status);
    }

Upon receipt of a SIGPIPE signal, the process sets its effective user ID to 0, logs some information, and then exits. Here's the SIGURG handler:

    static void
    myoob(signo)
        int signo;
    {
        char *cp;

        /* only process if transfer occurring */
        if (!transflag)
            return;
        cp = tmpline;
        if (getline(cp, 7, stdin) == NULL) {
            reply(221, "You could at least say goodbye.");
            dologout(0);
        }
        upper(cp);
        if (strcmp(cp, "ABOR\r\n") == 0) {
            tmpline[0] = '\0';
            reply(426, "Transfer aborted. Data connection closed.");
            reply(226, "Abort successful");
            longjmp(urgcatch, 1);
        }
        if (strcmp(cp, "STAT\r\n") == 0) {
            if (file_size != (off_t) -1)
                reply(213, "Status: %qd of %qd bytes transferred",
                    byte_count, file_size);
            else
                reply(213, "Status: %qd bytes transferred",
                    byte_count);
        }
    }

    ...

    void
    send_file_list(whichf)
        char *whichf;
    {
    ...
        if (setjmp(urgcatch)) {
            transflag = 0;
            goto out;
        }

Upon receipt of a SIGURG signal (which can be delivered by sending a TCP segment with the URG flag set in the TCP header), some data is read. If it's ABOR\r\n, the process calls longjmp() to go back to another part of the program, which eventually goes back to the main processing loop for receiving FTP commands. It's possible for a SIGPIPE to arrive while the data connection is being handled, and for the SIGPIPE handler to then be interrupted by a SIGURG signal after it has set the effective user ID to 0 but before it calls _exit(). In this case, the program returns to the main processing loop with an effective user ID of 0, thus allowing users to modify files with root privileges.

Another problem with signal handlers that use longjmp() to return back into the program is a situation where the jump target is invalid. For setjmp() and sigsetjmp() to work correctly, the function that calls them must still be on the runtime execution stack at any point where longjmp() or siglongjmp() is called from. This is a requirement because state restoration performed by longjmp() is achieved by restoring the stack pointer and frame pointer to the values they had when setjmp() was invoked. So, if the original function has since terminated, the stack pointer and frame pointer restored by longjmp() point to undefined data on the stack. Therefore, if a longjmp() can be activated at any point after the function that calls setjmp() has returned, the possibility for exploitation exists. Take a look at a modified version of the process_message() example used earlier in this section:

jmp_buf env;

void pipe_handler(int signo)
{
    longjmp(env, 1);
}

int process_message(int sock)
{
    struct pkt_header header;
    int err = ERR_NONE;

    if(setjmp(env) != 0)
    {
        log("user disconnected!");
        err = ERR_DISCONNECTED;
        goto cleanup;
    }

    signal(SIGPIPE, pipe_handler);

    for(;;)
    {
        /* early return: the SIGPIPE handler stays installed */
        if(read_packet_header(sock, &header) < 0)
            return ERR_BAD_HEADER;

        switch(header.type)
        {
        case USER:
            parse_username_request(sock);
            break;
        case PASS:
            parse_password_request(sock);
            break;
        case OPEN:
            parse_openfile_request(sock);
            break;
        case QUIT:
            parse_quit_request(sock);
            goto cleanup;
        default:
            log("invalid message");
            break;
        }
    }

cleanup:
    signal(SIGPIPE, SIG_DFL);
    return err;
}

In this example, longjmp() is called when a SIGPIPE is received. You can safely assume that users are able to trigger this signal in any of the parsing functions for the different commands, because the program might be required to write some data back to the client. However, this code has a subtle error: If read_packet_header() returns less than 0, process_message() returns without removing the SIGPIPE handler. So, if a SIGPIPE is delivered to the application later, pipe_handler() calls longjmp(), which attempts to return into the process_message() function. Because process_message() is no longer on the call stack, the stack and frame pointers point to stack space used by some other part of the program, and memory corruption most likely occurs.

To summarize, signal handlers with longjmp() calls require special attention when auditing code for the following reasons:

The signal handler doesn't return, so it's highly unlikely that it will be asynchronous-safe unless it exits immediately.
It might be possible to find a code path where the function that did the setjmp() returns, but the signal handler with the longjmp() isn't removed.
The signal mask might have changed, which could be an issue if sigsetjmp() and siglongjmp() aren't used. If they are, does restoring the old signal mask cause problems as well?
Permissions might have changed (as in the WU-FTPD example).
Program state might have changed, so variables that were valid when setjmp() was originally called are not necessarily valid when longjmp() is called.

Signal Interruption and Repetition

The bug presented in WU-FTPD introduces an interesting concept: The signal handler itself can also be interrupted, or it can be called more than once. An interesting paper by Michal Zalewski, "Delivering Signals for Fun and Profit," describes these two related attacks (available at www.bindview.com/Services/Razor/Papers/2001/signals.cfm).

Sometimes developers construct signal handlers with the expectation that they are executed only once, or not at all. If a signal handler can be invoked more than once because multiple signals are delivered, the handler might inadvertently repeat an operation that is safe to perform only once. As an example, consider the cleanup() function presented in Listing 13-1 at the beginning of this section; it can be invoked by the delivery of either a SIGTERM or a SIGINT signal. As such, it would be possible to deliver a SIGTERM signal to the process followed rapidly by a SIGINT signal, and thus have it execute multiple times, resulting in deallocating the user variable more than once. When you're auditing instances of sigaction(), note that the SA_RESETHAND flag (or SA_ONESHOT, an older synonym for it) indicates that the signal handler is used only once, and then the default action for that signal is restored.

Note

The signal() function does not behave the same way on every system. With System V semantics, which the Linux kernel's signal() system call and older Linux C libraries provide, the default action is restored for a signal after its handler is triggered once. Conversely, BSD systems (and glibc 2 and later, by default) leave the signal handler defined by the user in place until it's explicitly removed. So the program behaves a little differently depending on the platform and C library it runs on, which might determine whether a signal handler is vulnerable to attacks such as those detailed previously.

The second problem that can arise is that a signal handler itself can be interrupted by another signal, which might cause problems if the signal handler isn't asynchronous-safe. A signal handler can be interrupted only if a signal is delivered to the process that's not blocked. Typically, a process blocks signals by using the sigprocmask() function (except for SIGKILL and SIGSTOP, which can't be caught or blocked). With this function, developers can define a set of signals in the form of a sigset_t argument that describes all signals that should be blocked while the handler is running. If a process receives a signal while it's blocked, the kernel makes a note of the signal and delivers it to the process after it's unblocked.

In addition, when a signal handler is running, certain signals can be implicitly blocked, which might affect whether a signal handler can be interrupted. In a signal handler installed with signal(), the signal the handler catches is blocked for the period of time the signal handler is running. So, for example, a signal handler installed to handle SIGINT can't be interrupted by the delivery of another SIGINT while it's running. This is also the case with sigaction(), except when the SA_NODEFER flag is supplied in the sa_flags member of the sigaction structure. The sigaction() function also enables developers to supply additional signals that are blocked for the duration of the signal-handling routine by supplying them in the sa_mask field of the sigaction structure.

Therefore, when you're evaluating whether a signal handler can be interrupted by another signal, you need to establish what the process's signal mask is while the handler is running. It's quite common for signal handlers to be interruptible by other signals; for example, a SIGINT handler might be interrupted by a SIGALRM signal. Again returning to our cleanup() example from Listing 13-1, you would be able to interrupt the handler that has caught SIGINT by sending a SIGTERM at the appropriate time, thus having the cleanup() function interrupt itself because it's the handler for both.

One nasty problem that tends to catch developers off-guard is the use of library functions within a signal handler. In "Delivering Signals for Fun and Profit," Zalewski talks about libc functions that are and are not asynchronous-safe. The complete list of functions guaranteed to be asynchronous-safe by POSIX standards is shown here (taken from the OpenBSD signal(3) man page):

Base Interfaces: _exit(), access(), alarm(), cfgetispeed(), cfgetospeed(), cfsetispeed(), cfsetospeed(), chdir(), chmod(), chown(), close(), creat(), dup(), dup2(), execle(), execve(), fcntl(), fork(), fpathconf(), fstat(), fsync(), getegid(), geteuid(), getgid(), getgroups(), getpgrp(), getpid(), getppid(), getuid(), kill(), link(), lseek(), mkdir(), mkfifo(), open(), pathconf(), pause(), pipe(), raise(), read(), rename(), rmdir(), setgid(), setpgid(), setsid(), setuid(), sigaction(), sigaddset(), sigdelset(), sigemptyset(), sigfillset(), sigismember(), signal(), sigpending(), sigprocmask(), sigsuspend(), sleep(), stat(), sysconf(), tcdrain(), tcflow(), tcflush(), tcgetattr(), tcgetpgrp(), tcsendbreak(), tcsetattr(), tcsetpgrp(), time(), times(), umask(), uname(), unlink(), utime(), wait(), waitpid(), write()

Real-time Interfaces: aio_error(), clock_gettime(), sigpause(), timer_getoverrun(), aio_return(), fdatasync(), sigqueue(), timer_gettime(), aio_suspend(), sem_post(), sigset(), timer_settime()

ANSI C Interfaces: strcpy(), strcat(), strncpy(), strncat(), and perhaps some others

Extension Interfaces: strlcpy(), strlcat(), syslog_r()

Everything else is considered not safe. Notice the lack of some commonly used functions in this list: syslog(), malloc(), free(), and the printf() functions. Signal handlers that use any functions not listed here are potentially at risk. Exactly what level of risk they are exposed to depends on the function they use and its implementation specifics; a signal handler that interrupts a malloc() or free() and then calls malloc() or free() is at risk of corrupting the heap because it might be in an inconsistent state when the signal handler is called. Many of the functions not included in the safe list use these heap functions internally.

Although functions that manipulate the system heap might initially appear to be the biggest concern, they are much less of a problem than they used to be. Many libc implementations now contain some sort of concurrency controls over the system heap that prevent more than one heap function from being entered at a time. Still, a signal handler that uses the heap in an unsafe manner should be flagged, as you can't assume the system will handle concurrency correctly, especially when you don't know what system the software is running on.

Signals Scoreboard

A signal handler has the special property that it can run at any time from installation to removal, so you need to give signal handlers special attention. The procedure for auditing a signal-handling function involves an extra step on top of the standard code-auditing practices you have already learned in this book. Specifically, you need to assess whether the signal handler is asynchronous-safe. As you have learned, asynchronous-safe isn't quite the same as thread safe. In fact, sometimes thread APIs aren't asynchronous-safe; for example, in PThreads, the use of a mutex data type in a signal handler can cause the program to become deadlocked! When examining a signal handler, therefore, you might find it helpful to record some basic statistics on your analysis of the function, as shown in Table 13-3. These logs are similar to the Synchronization Scoreboards introduced earlier in this chapter.

Table 13-3. Signal Handler Scoreboard
Function name	`Alrmhandler`
Location	`src/util.c`, line 140
Signal	SIGALRM
Installed	`src/main.c`, line 380
Removed	Never
Unsafe library functions used	`malloc()`, `free()`, `syslog()`
Notes	This function is used to handle a network timeout from reading data. By default, it occurs after three minutes of inactivity. Interesting if you can interrupt `read_data()` in `src/net.c`, particularly when the buffer length is updated but before the buffer has been reallocated.

When you're determining the risk level associated with a signal handler running at a certain time, you should use your scoreboard to help identify any issues. First, attempt to locate non-reentrant functions called while the signal handler is installed. This means finding functions that have static variables or that modify global variables or resources without any sort of locking mechanisms.

Next, you should look for signal handlers using the longjmp() and siglongjmp() functions. They cause the signal handler to never return and practically guarantee that the signal handler is not asynchronous-safe unless it jumps to a location that immediately exits. Also, remember the point from the "Jump Locations" section earlier in this chapter: When setjmp() is returned to from a longjmp(), the context of the process might be much different than it was when the function containing the setjmp() was originally called. Stack variable values might have changed, and global variables and shared resources are likely to have changed. However, it's quite easy for developers to make assumptions about the state of a variable based on conditions when the function was originally called. When you encounter a signal handler that uses the *jmp() functions, it's definitely worth noting and attempting to verify whether any of the five conditions listed in the "Signal Vulnerabilities" section can result in a vulnerability in the program.
