Process Cores

Now that we have sufficiently covered structure and hangs as they pertain to Linux processes, let us move on to process core dumps. A core dump enables the user to visually inspect a process's last steps. This section details how cores are created and how to best use them.

Signals

Process core dumps are initiated by the process receiving a signal. Signals are similar to hardware interrupts. As with interrupts, a signal causes a task to branch from its normal execution, run a handling routine, and return to the point of interruption. Normally executing threads encounter signals throughout their life cycles. However, only a specific set of signal types results in a core dump, whereas others, such as SIGTERM, terminate the process without producing one.

A process can receive a signal from three sources: the user, the process, or the kernel.

From the User

A user can send a signal in two ways: by using an external command such as kill, or, within a controlling tty, by typing Ctrl+c to send SIGINT (the intr character reported by stty -a). (Note that by definition, daemons do not have a controlling tty and therefore cannot be signaled in this manner.)

# stty -a
speed 9600 baud; rows 41; columns 110; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; start = ^Q; stop = ^S;
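The first method, an external kill command, can be tried safely against a throwaway process. The following is a minimal sketch rather than part of the original text: the helper name signal_exit_status and the background sleep are assumptions made for illustration. It relies on the shell convention that wait reports 128 plus the signal number for a child that was killed by a signal, and it sets the core size limit to zero inside the helper so that no core file is left behind.

```shell
# Hypothetical helper: start a throwaway process, signal it, report its exit status.
signal_exit_status() {
    (
        ulimit -S -c 0         # keep the sketch from leaving a core file behind
        sleep 30 &             # the victim process
        pid=$!
        kill -s "$1" "$pid"    # what a user does with "kill -<signal> <pid>"
        status=0
        wait "$pid" || status=$?
        echo "$status"         # 128 + signal number for a signaled child
    )
}

signal_exit_status TERM   # prints 143 (128 + 15)
signal_exit_status ABRT   # prints 134 (128 + 6)
```

The status is the same whether or not a core file is written, so it identifies the signal but not the presence of a dump.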

From the Program

From a program, you can call raise() or alarm(), which allows the program to signal itself. (alarm() is a system call; raise() is a C library function.) Consider this example: a ten-second sleep without using the sleep call.

#include <unistd.h>
int main(void) { alarm(10); pause(); }

From the Kernel

The kernel can send a signal, such as SIGSEGV, to a process when the process attempts an illegal action, such as accessing memory that it does not own or that lies outside its address range.

Linux supports two types of signals: standard and real-time. A complete overview of signals is outside the scope of this chapter; however, there are a few key differences to note. Standard signals have predefined meanings, whereas real-time signals are defined by the programmer. Additionally, only one standard signal of each type can be queued per process, whereas real-time signals can build up. An example of this was shown earlier in this chapter when a process was blocked on I/O. A SIGKILL (kill -9) was sent to the process and remained pending in SigPnd.

SigPnd: 0000000000000100   <-- A signal waiting to be processed, in this case SIGKILL
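Reading such a mask by eye is error prone, so a small decoder can help. The sketch below is illustrative only: decode_sigmask is a hypothetical helper, and it assumes a shell whose arithmetic accepts hexadecimal constants and is at least 64 bits wide (true of bash and dash on common Linux systems). Bit n - 1 of the mask corresponds to signal number n, which is why 0x100 (bit 8) means signal 9.

```shell
# Hypothetical helper: list the signal numbers set in a hex mask such as SigPnd.
decode_sigmask() {
    mask=$((0x$1))
    sig=1
    set_signals=""
    while [ "$sig" -le 64 ]; do
        # Bit (n - 1) of the mask corresponds to signal number n.
        if [ $(( (mask >> (sig - 1)) & 1 )) -eq 1 ]; then
            set_signals="$set_signals $sig"
        fi
        sig=$((sig + 1))
    done
    echo "${set_signals# }"
}

decode_sigmask 0000000000000100   # prints 9 (SIGKILL)
```

On a live system, the mask to decode can be read with grep SigPnd /proc/<pid>/status.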

In troubleshooting a process, a user might want to force a process to dump core. As stated, this is accomplished by sending the appropriate signal to the process. Sometimes the dump does not follow, because the process has not returned from an interrupt due to some other issue. The result is a pending signal that still needs to be processed. Because the signals that result in a core are standard signals, sending the same signal repeatedly does not help: additional instances are discarded while one is already pending. Pending signals are processed after the program returns from the interrupt but before it proceeds to user space. This fact is illustrated in the entry.S source file, as shown in the following:

arch/i386/kernel/entry.S
...
ret_from_intr()
...
_reschedule()
...
_signal_return()
...
        jsr     do_signal        ; arch/cris/kernel/signal.c
...

It is also possible to have difficulty achieving the dump because signals are being blocked (masked), caught, or ignored. An application might have signal handlers that catch the signal and perform their own actions. Blocking a signal holds it pending, undelivered, for as long as the process keeps it blocked. Ignoring a signal means that the process throws it away upon delivery. Additionally, the signal structure of a process is like any other structure in that the child inherits the parent's configuration. Thus, if a signal is blocked or ignored in the parent, it is blocked or ignored in the child as well. However, some signals cannot be caught, blocked, or ignored, as detailed in the man page on signal. Two such signals are SIGKILL and SIGSTOP.
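The inheritance of an ignored signal is easy to observe from the shell. The following sketch is an illustration under stated assumptions: child_survives_term is a hypothetical helper, and SIGTERM is used instead of a core-generating signal only so that no core file is produced; the same inheritance applies to signals such as SIGSEGV. The shell's trap '' form sets a signal to be ignored.

```shell
# Hypothetical helper: does a child shell survive a SIGTERM it sends to itself?
# With the argument "ignore", the parent ignores SIGTERM before starting the child.
child_survives_term() {
    (
        if [ "$1" = "ignore" ]; then
            trap '' TERM                       # ignore SIGTERM in the parent
        fi
        sh -c 'kill -TERM $$; echo survived'   # the child inherits the disposition
    ) 2>/dev/null || true
}

child_survives_term ignore    # prints "survived"
child_survives_term default   # prints nothing: the child is terminated
```

When a forced dump does not appear, checking whether the target (or the parent that launched it) ignores or blocks the signal is therefore a reasonable first step.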

The user can obtain a list of signals from the kill command. This yields a list of signals that the user can send to a process. Possible signals include the following (note that this is not a complete list):

$ kill -l
 1) SIGHUP      2) SIGINT      3) SIGQUIT     4) SIGILL
 5) SIGTRAP     6) SIGABRT     7) SIGBUS      8) SIGFPE
 9) SIGKILL    10) SIGUSR1    11) SIGSEGV    12) SIGUSR2
13) SIGPIPE    14) SIGALRM    15) SIGTERM    17) SIGCHLD
...

As mentioned earlier and illustrated next, the man page on signal details the signals that produce a core file.

$ man 7 signal
...
Signal     Value     Action    Comment
-------------------------------------------------------------------------
SIGHUP        1       Term     Hangup detected on controlling terminal
                               or death of controlling process
SIGINT        2       Term     Interrupt from keyboard
SIGQUIT       3       Core     Quit from keyboard
SIGILL        4       Core     Illegal Instruction
SIGABRT       6       Core     Abort signal from abort(3)
SIGFPE        8       Core     Floating point exception
...

The kernel source for signal handling also provides this list, as illustrated next:

linux/kernel/signal.c
...
#define SIG_KERNEL_COREDUMP_MASK (\
        M(SIGQUIT)   | M(SIGILL)    | M(SIGTRAP)   | M(SIGABRT)  |\
        M(SIGFPE)    | M(SIGSEGV)   | M(SIGBUS)    | M(SIGSYS)   |\
        M(SIGXCPU)   | M(SIGXFSZ)   | M_SIGEMT                    )
...
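When scripting around core dumps, the mask above can be mirrored in a simple lookup. This is only a sketch: dumps_core_by_default is a hypothetical helper whose list is copied from the macro shown above, and it describes the default action only, not what happens when a program catches, blocks, or ignores the signal.

```shell
# Hypothetical helper: succeed if the named signal is in the core dump mask above.
# Accepts names with or without the SIG prefix.
dumps_core_by_default() {
    case "${1#SIG}" in
        # EMT is included because of M_SIGEMT; it exists only on some architectures.
        QUIT|ILL|TRAP|ABRT|FPE|SEGV|BUS|SYS|XCPU|XFSZ|EMT) return 0 ;;
        *) return 1 ;;
    esac
}

for name in SIGQUIT SIGTERM SIGSEGV SIGKILL; do
    if dumps_core_by_default "$name"; then
        echo "$name: core"
    else
        echo "$name: no core"
    fi
done
```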

Limits

By default, most Linux distributions disable the creation of process core dumps; however, the user can enable this capability. Whether core dumps are created is controlled by resource limits, specifically the core file size limit. Users can display and modify their resource limits by using the ulimit command.

In this listing, we depict core dumps being disabled by displaying the user soft limits:

$ ulimit -a
core file size      (blocks, -c) 0          <-- COREs have been disabled
data seg size       (kbytes, -d) unlimited
file size           (blocks, -f) unlimited
max locked memory   (kbytes, -l) unlimited
max memory size     (kbytes, -m) unlimited
open files                  (-n) 1024
pipe size        (512 bytes, -p) 8
stack size          (kbytes, -s) 8192
cpu time           (seconds, -t) unlimited
max user processes          (-u) 4095
virtual memory      (kbytes, -v) unlimited

There are two limits for each resource: a soft limit (shown previously) and a hard limit. The two limits differ in how they can be modified. The hard limit can be thought of as a ceiling that defines the maximum value of a soft limit. An unprivileged user can lower a hard limit but cannot raise it again afterward, whereas soft limits can be changed to any value at any time as long as they do not exceed the hard limit.

Rerunning the ulimit command with the -Ha option as shown below, we see the hard limits for each resource.

$ ulimit -Ha
core file size      (blocks, -c) unlimited
data seg size       (kbytes, -d) unlimited
file size           (blocks, -f) unlimited
max locked memory   (kbytes, -l) unlimited
max memory size     (kbytes, -m) unlimited
open files                  (-n) 1024
pipe size        (512 bytes, -p) 8
stack size          (kbytes, -s) unlimited
cpu time           (seconds, -t) unlimited
max user processes          (-u) 4095
virtual memory      (kbytes, -v) unlimited
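The relationship between the two limits can be exercised directly. The sketch below is a hedged illustration: enable_core_dumps is a hypothetical helper that raises the soft core limit as far as the hard limit allows, and the demonstration runs in a subshell so that the limits of the calling shell are left untouched.

```shell
# Hypothetical helper: raise the soft core limit as far as the hard limit allows.
enable_core_dumps() {
    hard=$(ulimit -H -c)     # the ceiling: "unlimited" or a block count
    ulimit -S -c "$hard"     # a soft limit may rise up to, but not beyond, the hard limit
}

(
    ulimit -S -c 0           # disable cores, as in the soft limit listing
    echo "soft limit before: $(ulimit -S -c)"
    enable_core_dumps
    echo "soft limit after:  $(ulimit -S -c)"
)
```

If the hard limit itself is 0, the soft limit stays at 0 and cores remain disabled until the hard limit is changed by a privileged user or in the system configuration.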

A user can set a hard or soft limit to unlimited, as in the previous example. unlimited just means that the process does not have an artificial limit imposed by setrlimit. However, the kernel must represent "unlimited" with a value so that it has a manageable range. The program is limited by what the kernel can address or the physical limits of the machine, whichever comes first. Thus, even when set to unlimited, a limit exists. The 32-bit representation of unlimited (denoted "infinity") is defined in sys_ia32.c as indicated next:

...
#define RLIM_INFINITY32 0xffffffff   <--  Equals  4294967295 bytes ~ 4Gig
#define RESOURCE32(x) ((x > RLIM_INFINITY32) ? RLIM_INFINITY32 : x)
struct rlimit32 {
        unsigned        rlim_cur;     <-- soft limit
        unsigned        rlim_max;     <-- hard limit
};
...
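The arithmetic behind these definitions can be checked from the shell. This is a rough rendering for illustration, not kernel code: resource32 is a hypothetical function that mimics the RESOURCE32() macro, and the sketch assumes shell arithmetic wider than 32 bits (as in bash and dash on common Linux systems).

```shell
RLIM_INFINITY32=$((0xffffffff))    # the 32-bit "infinity"

# Hypothetical shell rendering of RESOURCE32(): clamp a value to RLIM_INFINITY32.
resource32() {
    if [ "$1" -gt "$RLIM_INFINITY32" ]; then
        echo "$RLIM_INFINITY32"
    else
        echo "$1"
    fi
}

echo "$RLIM_INFINITY32"     # prints 4294967295
resource32 8192             # prints 8192
resource32 5000000000       # prints 4294967295
```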

Anytime a process dumps core and the core file size resource limit is anything other than zero, the kernel writes the core image. There are times, however, when user limits are set too low, resulting in a corrupt or unusable core image. If the core file resource limit is not adequate to accommodate the process's core image, the kernel either does not produce a dump, truncates the dump, or attempts to save only the stack portion of the process's context.

What occurs if the kernel is unable to create the dump depends on the type of executing process. Linux supports a multitude of executable formats. Originally, the a.out binary was used, which contains a magic number in its header. Traditionally, this magic number was used to characterize the binary type, for example, exec magic, demand magic, shared_mem magic, and so on. However, it was decided early on that the Executable and Linking Format (ELF) would be Linux's default binary format because of its flexibility. Although AT&T defined the original ELF-32 binary format, UNIX System Laboratories performed the original development of this format. Later, HP and Intel defined the ELF-64 binary format. Today's Linux systems contain very few, if any, a.out binaries, and support has been removed from the main kernel and placed into a module called binfmt_aout.o, which must be loaded before executing one of these binaries.

The binfmt source for each format shows what action is taken when a process attempts to produce a core file, as illustrated next.

The following snippet is from fs/binfmt_aout.c.

...
/* If the size of the dump file exceeds the rlimit, then see what would happen
   if we wrote the stack, but not the data area. */
...

The next snippet is from fs/binfmt_elf.c.

...
/*
 * Actual dumper
 *
 * This is a two-pass process; first we find the offsets of the bits,
 * and then they are actually written out. If we run out of core limit
 * we just truncate.
 */

The Core File

After the core file is generated, we can use it to determine the reason for the core dump. First, we must identify the process that created the core and the signal that caused the process to die. The most common way of determining this information is through the file command. Next, we determine whether the program in question has had its symbols stripped. This information can be determined by executing the file command against the binary. As mentioned earlier, the core file is the process's context, which includes the magic number or type of executable that created the core file. The file command uses a data file to keep track of file types, which by default is located in /etc/magic.
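The magic number mentioned above can also be inspected by hand. As a minimal sketch (is_elf is a hypothetical helper, not something the file command provides), the following checks whether a file begins with the four-byte ELF magic: 0x7f followed by the letters E, L, and F. ELF executables and ELF core files share this magic, so the check identifies the container format only, not which of the two the file is.

```shell
# Hypothetical helper: succeed if the file starts with the ELF magic (0x7f 'E' 'L' 'F').
is_elf() {
    magic=$(od -A n -t x1 -N 4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

if is_elf /bin/sh; then
    echo "/bin/sh is an ELF file"
else
    echo "/bin/sh is not an ELF file"
fi
```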

In Scenario 8-4, we show an example of a program with an easily reproducible hang. We can use tools such as gdb and other GNU debuggers/wrappers such as gstack to solve the problem.

Scenario 8-4: Using GDB to Evaluate a Process That Hangs

We use gdb to evaluate a core file created when a program was terminated because it hung.

$ ll gmoo*
-rwxr-xr-x    1 chris   chris    310460 Jan  2 20:25 gmoo.stripped*
-rwxr-xr-x    1 chris   chris    321486 Jan  2 22:25 gmoo.not.stripped*

The listing above shows two builds of the same program, one stripped and one not stripped; the program hangs when executed. The file command informs us of the type of each executable (as defined in /etc/magic).

It is helpful to determine the type of binary, as in the following example:

$ file gmoo.*
gmoo.not.stripped: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped
gmoo.stripped:     ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), stripped

When the command is hung, we send a kill -11 (SIGSEGV) to the program, causing the program to exit and dump core. An example of such a resulting core file follows:

$ file core.6753
core.6753: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, SVR4-style, from 'gmoo.stripped'

Using the GNU Project Debugger (GDB), we get the following:

$ gdb -q ./gmoo.stripped ./core.6753
...
Core was generated by './gmoo.stripped'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libgtk-1.2.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgtk-1.2.so.0
Reading symbols from /usr/lib/libgdk-1.2.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgdk-1.2.so.0
Reading symbols from /usr/lib/libgmodule-1.2.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgmodule-1.2.so.0
Reading symbols from /usr/lib/libglib-1.2.so.0...(no debugging symbols found)...done.
...
(gdb) backtrace
#0  0x4046b8e6 in connect () from /lib/i686/libpthread.so.0
#1  0x0806bef1 in gm_net_connect ()
#2  0x080853e1 in gm_world_connect ()
#3  0x0806c7cf in gm_notebook_try_add_world ()
#4  0x0806cd8c in gm_notebook_try_restore_status ()
#5  0x08061eab in main ()
#6  0x404c5c57 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) list
No symbol table is loaded. Use the "file" command.
(gdb)

Without the source, we have gone about as far as we can. We must use other tools, such as strace, in combination with gdb. Other tool suites such as valgrind can also prove useful.

Now, let us look at an example of the same hang with a non-stripped version of the binary.

$ gdb -q ./gmoo.not.stripped ./core.6881
...
Core was generated by './gmoo.not.stripped'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libgtk-1.2.so.0...done.
Loaded symbols for /usr/lib/libgtk-1.2.so.0
Reading symbols from /usr/lib/libgdk-1.2.so.0...done.
Loaded symbols for /usr/lib/libgdk-1.2.so.0
Reading symbols from /usr/lib/libgmodule-1.2.so.0...done.
...
(gdb) backtrace
#0  0x40582516 in poll () from /lib/i686/libc.so.6
(gdb)

Although the stack trace appears to be different, we have identified the root cause. The program is hung in a network poll() call, which, according to the man page, waits for events on a set of file descriptors passed in as an array of structures. Using other tools, such as lsof and strace, we can determine exactly which network connection, and therefore which IP address, the process is hung on.
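The interactive sessions above can also be run non-interactively, which is convenient when several core files need a first look. The wrapper below is a sketch under stated assumptions: core_backtrace is a hypothetical name, and it relies on gdb's -batch and -ex options to run a single backtrace command and exit. It adds nothing to what the interactive backtrace command reports.

```shell
# Hypothetical wrapper: print a backtrace from a core file without an interactive session.
core_backtrace() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "usage: core_backtrace BINARY COREFILE" >&2
        return 2
    fi
    gdb -q -batch -ex backtrace "$1" "$2"
}

# Example invocation, using the file names from the scenario:
# core_backtrace ./gmoo.not.stripped ./core.6881
```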