Monitor Hardware Status


When a hardware component detects an error, OpenVMS reports it. Information about the failure, whether recoverable or not, is recorded in SYS$ERRORLOG:ERRLOG.SYS. The primary command for a quick summary of these errors is SHOW ERROR; its use in a cluster environment is as follows:

     $ mcr sysman
     SYSMAN> set environment/cluster
     command environment:
             Clusterwide on local cluster
             Username SYSTEM       will be used on nonlocal nodes
     SYSMAN> do show error
     command execution on node BEAVER
     Device                           Error Count
     PEA0:                                    3
     command execution on node LOON
     Device                           Error Count
     PEA0:                                    2
     command execution on node CSWWW
     Device                           Error Count
     PEA0:                                    2
     command execution on node EAGLE
     %SHOW-S-NOERRORS, no device errors found

The previous report shows errors on the cluster communications device, PEA0. Clusters are discussed in Chapter 10.

SHOW ERROR displays a summary of hardware errors and thus alerts the manager to possible hardware failures. It is important to monitor all errors, because hardware usually does not fail catastrophically without warning; rather, minor problems tend to appear before total failure. If an error shows up in the SHOW ERROR display, the manager should use additional programs and commands to investigate it further.
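Because the warning sign is usually an error count that keeps growing, it can help to compare today's clusterwide SHOW ERROR report with yesterday's. The following is a minimal Python sketch of that idea, not part of OpenVMS itself. It assumes the SYSMAN output has been captured to text in the layout shown in this section; the function names `parse_show_error` and `new_errors` are hypothetical, and the line patterns may need adjusting for other OpenVMS versions.

```python
import re
from typing import Dict, List, Tuple

# Assumed line layouts, taken from the SYSMAN report shown in this section.
NODE_RE = re.compile(r"command execution on node (\S+)")
COUNT_RE = re.compile(r"^\s*(\S+:)\s+(\d+)\s*$")

SAMPLE = """\
command execution on node BEAVER
Device                           Error Count
PEA0:                                    3
command execution on node LOON
Device                           Error Count
PEA0:                                    2
command execution on node CSWWW
Device                           Error Count
PEA0:                                    2
command execution on node EAGLE
%SHOW-S-NOERRORS, no device errors found
"""

Counts = Dict[str, Dict[str, int]]


def parse_show_error(text: str) -> Counts:
    """Return {node: {device: error_count}} from captured SYSMAN output."""
    counts: Counts = {}
    node = None
    for line in text.splitlines():
        match = NODE_RE.search(line)
        if match:
            # A new node section starts; nodes without errors keep an empty dict.
            node = match.group(1)
            counts[node] = {}
            continue
        match = COUNT_RE.match(line)
        if match and node is not None:
            counts[node][match.group(1)] = int(match.group(2))
    return counts


def new_errors(previous: Counts, current: Counts) -> List[Tuple[str, str, int]]:
    """List (node, device, increase) for each count that grew since `previous`."""
    grown = []
    for node, devices in current.items():
        for device, count in devices.items():
            before = previous.get(node, {}).get(device, 0)
            if count > before:
                grown.append((node, device, count - before))
    return grown


if __name__ == "__main__":
    today = parse_show_error(SAMPLE)
    yesterday = {"BEAVER": {"PEA0:": 1}}  # hypothetical earlier snapshot
    for node, device, increase in new_errors(yesterday, today):
        print(f"{node} {device} +{increase}")
```

A device that is missing from the earlier snapshot is treated as having had zero errors, so a device reporting errors for the first time is flagged as well.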

Device and CPU errors are examined with DIAGNOSE (on OpenVMS/Alpha) and ANALYZE/ERROR. The latter command is illustrated next. As with many other commands, several reporting levels are available; /BRIEF (or /SUMMARY) commonly displays the least information.

     $ ANALYZE/ERROR/BRIEF
     Error Log Report Generator                                      Version V7.1
     ******************************* ENTRY       1. *******************************
     ERROR SEQUENCE 0.                               LOGGED ON:        SID 13002602
     DATE/TIME 16-SEP-2002 00:05:35.93                            SYS_TYPE 04130002
     SYSTEM UPTIME: 11 DAYS 14:23:12
     SCS NODE: BEAVER                                              VAX/VMS V7.1
     ERRLOG.SYS CREATED KA49  CPU Microcode Rev # 2.  CONSOLE FW REV# 1.3
                          Standard Microcode Patch    Patch Rev # 19.
     ******************************* ENTRY       2. *******************************
     ERROR SEQUENCE 2343.                            LOGGED ON:        SID 13002602
     DATE/TIME 16-SEP-2002 22:55:36.05                            SYS_TYPE 04130002
     SYSTEM UPTIME: 12 DAYS 13:12:45
     SCS NODE: BEAVER                                              VAX/VMS V7.1
     TIME STAMP KA49  CPU Microcode Rev # 2.  CONSOLE FW REV# 1.3
                          Standard Microcode Patch    Patch Rev # 19.
     ******************************* ENTRY       3. *******************************
     ERROR SEQUENCE 2344.                            LOGGED ON:        SID 13002602
     DATE/TIME 16-SEP-2002 22:57:12.85                            SYS_TYPE 04130002
     SYSTEM UPTIME: 12 DAYS 13:14:22
     SCS NODE: BEAVER                                              VAX/VMS V7.1
     ERL$LOGMESSAGE KA49  CPU Microcode Rev # 2.  CONSOLE FW REV# 1.3
                          Standard Microcode Patch    Patch Rev # 19.
     NI-SCS SUB-SYSTEM, _BEAVER$PEA0:
            PORT HAS CLOSED VIRTUAL CIRCUIT
     ******************************* ENTRY       4. *******************************
     ERROR SEQUENCE 2350.                            LOGGED ON:        SID 13002602
     DATE/TIME 16-SEP-2002 23:55:36.05                            SYS_TYPE 04130002
     SYSTEM UPTIME: 12 DAYS 14:12:45
     SCS NODE: BEAVER                                              VAX/VMS V7.1
     TIME STAMP KA49  CPU Microcode Rev # 2.  CONSOLE FW REV# 1.3
                          Standard Microcode Patch    Patch Rev # 19.

On my system, ERRLOG.SYS is reinitialized every day to keep its size manageable. Entries 1, 2, and 4 in the previous display pertain to stopping and restarting the error logger at midnight and identifying the hardware. Entry 3 relates to the PEA0 error. More data about that error is available but is not presented here.

In my experience, memory problems start with correctable errors that may later become permanent. In the case of memory errors, OpenVMS automatically stops using memory pages that reported errors during the boot memory scan. SHOW MEMORY is the primary command used to examine memory status, although ANALYZE/ERROR can also be used. SHOW MEMORY is illustrated as follows:

     $ SHOW MEMORY
                   System Memory Resources on 21-SEP-2002 18:51:31.05
     Physical Memory Usage (pages):     Total        Free      In Use    Modified
       Main Memory (256.00Mb)           32768       28089        4357         322
     Virtual I/O Cache (Kbytes):        Total        Free      In Use
       Cache Memory                      3200          16        3184
     Granularity Hint Regions (pages):  Total        Free      In Use    Released
       Execlet code region                512           0         506           6
       Execlet data region                 96           4          92           0
       S0/S1 Executive data region        477           0         477           0
       S2 Executive data region           160           0         160           0
       Resident image code region         512           0         319         193
     Slot Usage (slots):                Total        Free    Resident     Swapped
       Process Entry Slots                 35          15          20           0
       Balance Set Slots                   33          15          18           0
     Dynamic Memory Usage (bytes):      Total        Free      In Use     Largest
       Nonpaged Dynamic Memory        4243456      373824     3869632        5952
       Paged Dynamic Memory           4661248     1119936     3541312     1069792
     Buffer object Usage (pages):                  In Use        Peak
       32-bit System Space Windows (S0/S1)              0           0
       64-bit System Space Windows (S2)                 0           0
     Memory Reservations (pages):                Reserved      In Use       Type
       Total (0 Mb reserved)                            0           0
     Paging File Usage (blocks):                     Free  Reservable      Total
       DISK$ALPHASYS:[SYS0.SYSEXE]SWAPFILE.SYS       4480        4480       4480
       DISK$ALPHASYS:[SYS0.SYSEXE]PAGEFILE.SYS     532480      479472     532480
     Of the physical pages in use, 2847 pages are permanently allocated to OpenVMS.

This command presents a plethora of statistics about various aspects of how the memory is used. This data can be used for performance tuning, described later. It will also include error reports, if there are any.
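As a small illustration of reading these statistics, the Python sketch below pulls the "Main Memory" line out of captured SHOW MEMORY text and works out what fraction of physical memory is free. It is only a sketch under stated assumptions: the function name `physical_memory` is hypothetical, and the pattern assumes the line layout shown in the display above, which may differ on other OpenVMS versions.

```python
import re

# Assumed layout of the physical memory line, copied from the display above.
MAIN_MEMORY_RE = re.compile(
    r"Main Memory \(([\d.]+)[Mm][Bb]\)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
)

SAMPLE = """\
Physical Memory Usage (pages):     Total        Free      In Use    Modified
  Main Memory (256.00Mb)           32768       28089        4357         322
"""


def physical_memory(text: str) -> dict:
    """Return page counts and the free fraction from captured SHOW MEMORY text."""
    match = MAIN_MEMORY_RE.search(text)
    if match is None:
        raise ValueError("no 'Main Memory' line found")
    total, free, in_use, modified = (int(g) for g in match.groups()[1:])
    return {
        "size_mb": float(match.group(1)),
        "total": total,
        "free": free,
        "in_use": in_use,
        "modified": modified,
        "free_fraction": free / total,
    }


if __name__ == "__main__":
    stats = physical_memory(SAMPLE)
    print(f"{stats['free_fraction']:.1%} of {stats['total']} pages are free")
```

In the display above, the free, in-use, and modified pages add up to the total of 32768, and roughly 86 percent of physical memory is free.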




Getting Started with OpenVMS System Management (HP Technologies)
ISBN: 1555582818
Year: 2004
Pages: 130
Authors: David Miller
