System and Crash Dump Analysis


When a fatal hardware or software error causes OpenVMS to fail, the operating system copies the contents of memory to a system dump file and also records the hardware context of all the processors in that file. This overwrites the previous dump in that file. When the computer is booted again, processing of the dump file differs according to the architecture. On a VAX, certain data about the failure is written into CLUE$OUTPUT:CLUE$HISTORY.DATA. That information can be viewed as in the following example:

     $ CLUE/DISPLAY
     # Node   Time              Type         Process Name     Module
     1 LOON    4-SEP-2002 09:28 OPERATOR     SYSTEM           UNKNOWN
     2 BEAVER 31-AUG-2002 18:12 OPERATOR     SYSTEM           UNKNOWN
     3 BEAVER 31-AUG-2002 17:58 OPERATOR     SYSTEM           UNKNOWN
     4 OTTER  11-JUN-2002 11:29 OPERATOR     SYSTEM           UNKNOWN

To get a sense of how rich this command is, look at the first-level help output:

     CLUE_DISPLAY >help

     CLUE

       /DISPLAY

          /DISPLAY = display_command/quals.. filename

         The display module of CLUE reads the specified CLUE History file,
         generated by the CLUE/BINARY command, and prompts for user action.
         A number of commands, as listed below, are available to the user
         from the "CLUE_DISPLAY >" prompt. These commands may also be given
         as a value with /DISPLAY from the DCL command line, for example:
         CLUE/DISPLAY=DIR/SINCE=1-JAN/OUT=TMP.LIS CLUE$HISTORY.DATA.

         If no filename is specified, the default filename is
         CLUE$HISTORY.DATA.

       Additional information available:

       DIRECTORY  SHOW       EXTRACT    DELETE     EXIT

     CLUE /DISPLAY Subtopic?

On an Alpha, a one-line summary is added to SYS$ERRORLOG:CLUE$HISTORY.DAT, and extensive information about the failure is written to SYS$ERRORLOG:CLUE$node_ddmmyy_hhmm.LIS. These two files are in simple ASCII format and can be displayed with TYPE, as illustrated. Only part of each file is shown in these examples:

     $ type clue$history.dat /page
     Date              Vers System/CPU         Node   Bugcheck Process    PC       Module
     ----------------- ---- ------------------ ------ -------- ---------- -------- ----------------
      8-JAN-1999 11:48 V7.1 DEC 2000 Model 300 EAGLE  CLUEXIT  NULL       801A82A4 SYS$CLUSTER
     11-JAN-1999 10:38 V7.1 DEC 2000 Model 300 EAGLE  CLUEXIT  NULL       801A82A4 SYS$CLUSTER
     16-FEB-1999 19:00 V7.1 DEC 2000 Model 300 EAGLE  CLUEXIT  NULL       801A82A4 SYS$CLUSTER
     22-OCT-1999 01:18 V7.1 DEC 2000 Model 300 EAGLE  CLUEXIT  NULL       801A82A4 SYS$CLUSTER
     28-MAR-2002 17:40 V7.1 DEC 2000 Model 300 EAGLE  CLUEXIT  CONFIGURE  801B6448 SYS$CLUSTER
     29-MAR-2002 07:36 V7.1 DEC 2000 Model 300 EAGLE  PROCGONE STARTUP    8664E820 IMAGE_MANAGEMENT

     $ type CLUE$EAGLE_290302_0736.LIS /page
     OpenVMS (TM) Alpha Operating System, Version V7.1 -- System Dump Analysis   29-MAR-2002 07:36:37.81

     Crashdump Summary Information:
     Crash Time:        29-MAR-2002 07:36:37.81
     Bugcheck Type:     PROCGONE, Process not in system
     Node:              EAGLE   (Clustered)
     CPU Type:          DEC 2000 Model 300
     VMS Version:       V7.1
     Current Process:   STARTUP
     Current Image:     <not available>
     Failing PC:        FFFFFFFF.8664E820    IMAGE_MANAGEMENT_PRO+0A820
     Failing PS:        18000000.00000001
     Module:            IMAGE_MANAGEMENT
     Offset:            00012820
     Boot Time:         29-MAR-2002 07:36:09.00
     System Uptime:               0 00:00:28.81
     Crash/Primary CPU: 00/00
     System/CPU Type:   0602
     Saved Processes:   3
     Pagesize:          8 KByte (8192 bytes)
     Physical Memory:   256 MByte (32768 PFNs, contiguous memory)
     Dumpfile Pagelets: 12459 blocks
     Dump Flags:        writecomp,errlogcomp,dump_style
     Dump Type:         compressed,selective
     EXE$GL_FLAGS:      poolpging,init,bugdump
     Paging Files:      1 Pagefile and 1 Swapfile installed
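
Because the crash summary listing is plain ASCII made up of "Label: value" lines, it is easy to pull into a script for record keeping. The following is a minimal sketch, not part of CLUE or SDA: the function name `parse_crash_summary` is hypothetical, and it assumes every field of interest sits on one line in the "Label:   value" form seen in the listing above.

```python
import re

def parse_crash_summary(text):
    """Collect 'Label: value' pairs from a CLUE listing into a dict.

    Assumption: each field is on its own line, as in the sample listing.
    """
    fields = {}
    for line in text.splitlines():
        # Split at the first colon that is followed by whitespace, so values
        # that themselves contain colons (such as times) stay intact.
        match = re.match(r"^\s*([^:]+):\s+(\S.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields

# A few lines copied from the sample listing in the text.
sample = """Crashdump Summary Information:
Crash Time:        29-MAR-2002 07:36:37.81
Bugcheck Type:     PROCGONE, Process not in system
Current Process:   STARTUP
Module:            IMAGE_MANAGEMENT
"""

summary = parse_crash_summary(sample)
print(summary["Bugcheck Type"])
print(summary["Crash Time"])
```

The heading line "Crashdump Summary Information:" has no value after its colon, so the sketch skips it rather than storing an empty field.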

In either case, the dump file should be saved for later analysis by Compaq/HP personnel. This can be done with the COPY command, but Compaq/HP recommends that SDA be used instead. By default, the dump file, SYS$SYSTEM:SYSDUMP.DMP, is never backed up by BACKUP. Use the following on either a VAX or Alpha system:

     $ ANALYZE/CRASH
     SDA> COPY SAVEDUMP.DMP

A more extensive examination of the dump can be performed with ANALYZE/CRASH at a later time if necessary. SDA has about 20 commands that display most of the OpenVMS data structures symbolically, such as those for process management, memory management, lock management, cluster management, and multiprocessor synchronization. Other displays are available as well, for instance:

  • Display the contents of a specific process stack.

  • Display the call frame.

  • Read the OpenVMS global variables and display them.

  • Display device status.

  • Validate the integrity of queue links.
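
To make the last item concrete, the sketch below shows what checking queue links means in principle for a circular, doubly linked queue: following each forward link and confirming that the backward link of the next entry points back. It is a conceptual illustration only, not how SDA is implemented; the names `Entry`, `insert_tail`, and `validate_queue` are hypothetical.

```python
class Entry:
    """One queue entry with a forward link (flink) and a backward link (blink)."""
    def __init__(self, name):
        self.name = name
        self.flink = self   # an empty queue header points at itself
        self.blink = self

def insert_tail(head, entry):
    """Link entry in just before the header, that is, at the tail."""
    last = head.blink
    entry.flink = head
    entry.blink = last
    last.flink = entry
    head.blink = entry

def validate_queue(head, max_entries=10000):
    """Return (ok, entries_checked) after walking the forward links once."""
    current = head
    count = 0
    while True:
        nxt = current.flink
        if nxt.blink is not current:
            return False, count      # backward link does not match
        if nxt is head:
            return True, count       # back at the header: queue is intact
        count += 1
        if count > max_entries:
            return False, count      # guard against a loop that skips the header
        current = nxt

head = Entry("header")
a, b = Entry("A"), Entry("B")
insert_tail(head, a)
insert_tail(head, b)
good = validate_queue(head)

b.blink = head                       # simulate a corrupted backward link
bad = validate_queue(head)
print(good, bad)
```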

OpenVMS differs internally on the two architectures, not only because of the obvious RISC/CISC differences but also because virtual-to-physical address mapping and the I/O structures differ. For instance, to translate a virtual address to a physical address, the VAX uses two mapping tables and 512-byte pages. To accommodate its much larger address space, the Alpha uses three mapping tables and 8,192-byte pages.
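
The effect of the page size can be shown with a little arithmetic. The sketch below splits an address into a page number and a byte offset for each page size, and recomputes the page frame count that appears in the Alpha crash summary earlier in this section. It does not model the mapping tables themselves; the function names and the sample address 0x12345 are illustrative assumptions.

```python
VAX_PAGE_BYTES = 512      # page size given in the text for the VAX
ALPHA_PAGE_BYTES = 8192   # page size given in the text for the Alpha

def split_address(virtual_address, page_bytes):
    """Return (page number, byte offset within the page).

    Assumes page_bytes is a power of two, as both sizes above are.
    """
    offset_bits = page_bytes.bit_length() - 1   # 9 bits for 512, 13 for 8192
    page_number = virtual_address >> offset_bits
    offset = virtual_address & (page_bytes - 1)
    return page_number, offset

def page_frames(memory_bytes, page_bytes):
    """Number of whole pages that fit in the given amount of memory."""
    return memory_bytes // page_bytes

address = 0x12345  # arbitrary sample address, not taken from the dump
vax_split = split_address(address, VAX_PAGE_BYTES)
alpha_split = split_address(address, ALPHA_PAGE_BYTES)
alpha_frames = page_frames(256 * 1024 * 1024, ALPHA_PAGE_BYTES)
print(vax_split, alpha_split, alpha_frames)
```

With 8,192-byte pages, 256 MB of memory works out to 32,768 page frames, which agrees with the "Physical Memory" line in the sample listing.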

Furthermore, some features, such as context switching of process threads, have been added to OpenVMS Alpha but not to OpenVMS VAX. This feature is intended to take better advantage of advanced symmetric multiprocessing (SMP) on the Alpha. Dump processing illustrates the differences between the VAX and Alpha implementations of OpenVMS. Even though the tool is called SDA in both cases, it is a distinct program on each architecture, and a separate manual for each architecture describes how it works.

In addition to crash analysis, the running system can also be examined. There are two commands, ANALYZE/CRASH (to examine a crash dump) and ANALYZE/SYSTEM (to examine a running system), and both use SDA.

For OpenVMS on VAX systems, the Crash Log Utility Extractor (CLUE) runs automatically when the system is booted after a crash. CLUE maintains a history file, which contains key system parameters pertaining to each crash. The system manager may access this database with CLUE at any time to review crashes. The database is called CLUE$OUTPUT:CLUE$HISTORY.DATA. CLUE is documented in the OpenVMS System Management Utilities Reference Manual. SDA is documented in the OpenVMS VAX System Dump Analyzer Utility Manual.

OpenVMS on Alpha systems does not run CLUE automatically, and ANALYZE/CRASH_DUMP (or SDA, as it is called in some documents) must be used, as indicated previously. ANALYZE/CRASH_DUMP is documented in the OpenVMS DCL Dictionary and in the OpenVMS Alpha System Analysis Tools Manual. On Alpha, CLUE is called from within SDA.




Getting Started with OpenVMS System Management (HP Technologies)
ISBN: 1555582818
Year: 2004
Pages: 130
Authors: David Miller
