9.7 The savecrash Process

     

When the system crashes, the kernel uses entry points into the IODC code to help it write the crashdump to disk. Before the dumpsys() routines are invoked, we are given a 10-second window of opportunity on the system console to interrupt the crashdump process and decide what kind of crashdump to create. We can choose a Full dump (an image of memory), a Selective dump (only the page classes defined earlier), or No dump. Once the dump has been written to disk, the system will reboot.

The problem we have is that a dump device is an empty volume with no filesystem in it. As such, we need a utility to read the binary information from the dump device(s) and create a series of files in the filesystem; this is the job of the savecrash command. After the system has rebooted, savecrash is run in order to get the crashdump out of the dump devices and into a usable format in the filesystem. The default location for the files created by savecrash is the directory /var/adm/crash. The files are created in subdirectories named crash.0, crash.1, and so on, one for each crash that has occurred. The savecrash command may decide to compress the files, depending on the size of the crashdump and the amount of available space under /var/adm/crash.

There will be a number of files in each crash.N subdirectory. There are basically two types of file: an INDEX file and image files. The INDEX file is a text file giving a brief description of when the crash happened, the files making up the crashdump, and a panic string. Don't get confused about my use of the term panic string. The panic string in the INDEX file is simply the one-liner issued by the kernel to describe the instruction or event that caused the system to crash. The image files are a series of files (referenced in the INDEX file) that contain the crashdump itself. The number of image files created depends on how big the crashdump is and is determined by the savecrash command.
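
As a rough illustration of the directory layout just described, the following shell sketch finds the highest-numbered crash.N subdirectory and prints the panic line from its INDEX file. The function names latest_crash_dir and show_panic are made up for this example, and the sketch assumes that the INDEX file holds the panic string on a line beginning with the keyword panic; check an INDEX file on your own system before relying on that.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of HP-UX.

# Print the path of the highest-numbered crash.N directory under $1
# (default /var/adm/crash). Prints nothing if no such directory exists.
latest_crash_dir() {
    root=${1:-/var/adm/crash}
    best=
    best_n=-1
    for d in "$root"/crash.*; do
        [ -d "$d" ] || continue
        n=${d##*.}
        # Skip anything whose suffix is not a plain number.
        case $n in
            ''|*[!0-9]*) continue ;;
        esac
        # Compare numerically so that crash.10 ranks above crash.9.
        if [ "$n" -gt "$best_n" ]; then
            best_n=$n
            best=$d
        fi
    done
    if [ -n "$best" ]; then
        echo "$best"
    fi
}

# Print the panic line from the INDEX file of the most recent crashdump.
show_panic() {
    dir=$(latest_crash_dir "$1")
    if [ -z "$dir" ] || [ ! -r "$dir/INDEX" ]; then
        echo "no crashdump INDEX found" >&2
        return 1
    fi
    # ASSUMPTION: the panic string sits on a line starting with "panic".
    grep '^panic' "$dir/INDEX"
}
```

Calling show_panic with no argument looks under /var/adm/crash; passing a directory name makes it look there instead, which is useful if the crashdumps are stored elsewhere.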

If our system is crashing often, we may decide to create /var/adm/crash as a separate volume or use a different directory altogether to store our crashdumps. To configure the behavior of savecrash, e.g., which directory to store crashdumps in, we edit the file /etc/rc.config.d/savecrash:

 

root@hpeos003[] vi /etc/rc.config.d/savecrash
#!/sbin/sh
# @(#)B.11.11_LR
# Savecrash configuration
#
#
# SAVECRASH:    Set to 0 to disable saving system crash dumps.
# SAVECRASH=1
# SAVECRASH_DIR:Directory name for system crash dumps.  Note: the filesystem
#               in which this directory is located should have as much free
#               space as your system has RAM.
# SAVECRASH_DIR=/var/adm/crash
root@hpeos003[]

As you can see, most of the options are commented out and, hence, take default values. The comment above regarding the amount of space in the filesystem is a little out of date because crashdumps are no longer a Full dump by default.
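
To make the idea of commented-out options taking default values concrete, here is a minimal shell sketch of how a script might read the file. It is not the actual HP-UX startup script; the function name savecrash_settings is hypothetical, and the fallback values are simply the ones shown in the commented-out lines of the listing.

```shell
#!/bin/sh
# Sketch only: savecrash_settings is a hypothetical helper.

# Print the effective settings from config file $1
# (default /etc/rc.config.d/savecrash).
savecrash_settings() {
    conf=${1:-/etc/rc.config.d/savecrash}
    (
        # Run in a subshell so the caller's variables are left untouched.
        unset SAVECRASH SAVECRASH_DIR
        if [ -r "$conf" ]; then
            . "$conf"
        fi
        # A commented-out line never assigns the variable, so the
        # fallback after ":-" applies (values assumed from the comments).
        echo "SAVECRASH=${SAVECRASH:-1}"
        echo "SAVECRASH_DIR=${SAVECRASH_DIR:-/var/adm/crash}"
    )
}
```

For example, uncommenting the last line of the file and changing it to SAVECRASH_DIR=/dumps/crash (a hypothetical directory) would make the function report that directory instead of /var/adm/crash.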

After a system crash, we should also see a message added by the savecrash command in the file /var/adm/shutdownlog (if it exists). Ultimately, the files under the /var/adm/crash directory need to be analyzed by an experienced Response Center engineer trained in kernel internals. It is their job to find the root cause as to why our system crashed. In the next chapter, we discuss how we can assist in that process by distinguishing between an HPMC, a TOC, and a PANIC.



HP-UX CSE(c) Official Study Guide and Desk Reference
Year: 2006
Pages: 434
