Chapter Review

The diagnostic monitoring tools we have looked at so far give us access to critical information about the state of our system. Getting to this information quickly and passing it on to a qualified HP engineer can assist in diagnosing potential problems, especially if the problem involves some form of system crash. In doing so, we can help maintain system availability by scheduling any necessary outages ourselves rather than having them happen unexpectedly.

     

Test Your Knowledge

1:

kcweb has a facility to monitor the usage of kernel parameters with the help of the kcmond process. This process in fact sets up an EMS HA Monitor resource to monitor the specific kernel parameters. When an alarm is activated, kcmond reports the event to the specified destination. True or False?

2:

Every time syslogd is started up, it renames the original log file(s) listed in /etc/syslog.conf and then starts a new logfile. True or False?

3:

EMS hardware monitors notify utilities such as ServiceGuard of the change of status of monitored devices. If appropriate, ServiceGuard can alter the status of packages that are under its control. True or False?

4:

Some would say that the resls command is somewhat inconvenient in navigating the list of resources that can be monitored. In order to set up EMS HA Monitors, it is more appropriate to use the monconfig command. True or False?

5:

An HPMC is caused by an underlying hardware problem. Although a system crashdump is created under /var/adm/crash, the HP engineer assigned to our hardware call will need immediate access to the tombstone file created as a result of the HPMC in order to diagnose the cause of the problem. True or False?

     

Answers to Test Your Knowledge

A1:

True. kcmond is an add-on kernel resource monitor for the EMS subsystem.

A2:

False. It is the startup script /sbin/init.d/syslogd that renames existing logfiles. If we add any additional logfiles to /etc/syslog.conf, it may be appropriate to update the startup script, or create a new one, to rename them as well.
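The renaming step for an additional logfile could be sketched as follows. This is only an illustration, not the contents of /sbin/init.d/syslogd: the function name rotate_log is hypothetical, and the OLD prefix is an assumption modeled on the OLDsyslog.log naming commonly seen on HP-UX.

```shell
#!/bin/sh
# Hypothetical helper, not taken from /sbin/init.d/syslogd.
# Renames an existing logfile to OLD<name> in the same directory
# so that the logging daemon starts with a fresh file.
rotate_log() {
    log=$1
    # Nothing to do if the logfile does not exist yet.
    [ -f "$log" ] || return 0
    dir=$(dirname "$log")
    base=$(basename "$log")
    # Assumed naming convention: prefix the old file with OLD.
    mv "$log" "$dir/OLD$base"
}
```

A startup script would call something like rotate_log with the path of each extra logfile listed in /etc/syslog.conf before the daemon is started.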

A3:

False. EMS hardware monitors simply monitor resources. When an event occurs, the monitor will simply report the event. The monitor has no memory of what state the device was in before and, hence, the monitor cannot make a decision as to whether the status of the device has changed. It is the job of the Peripheral Status Monitor to report whether a device has changed status. PSM receives messages from the hardware monitors and makes decisions accordingly.

A4:

False. The monconfig command cannot be used to set up EMS HA Monitors. The monconfig command is used to set up basic hardware monitors.

A5:

False. Not all HPMCs are caused by hardware problems. While the tombstone file is a useful source of information, the resulting crashdump is vital in order for the HP engineer to fully diagnose the cause of the problem.

     

Chapter Review Questions

1:

Is it possible to set the kernel parameter nproc to equal 5? If so, what would be the result after the next system reboot?

2:

You have reconfigured your kernel and rebooted your system. Unfortunately, the new kernel keeps causing your system to PANIC. You have booted from your backup kernel, and you decide to leave the kernel changes to another day. Your system is currently booted from the kernel /stand/vmunix.prev. You are wondering what will happen if you let the system run with this kernel image. Why is it important that the kernel you boot from, and consequently the kernel referenced by the device file /dev/kmem, be the same as the file /stand/vmunix?

3:

Your system has been up and running for over 12 months without the need to reboot. You notice that syslog.log is now over 15MB in size. You decide that it would be a good idea to back up and then return the syslog.log file to zero bytes in size without rebooting or shutting down the syslog daemon. Comment on the following commands to perform these tasks:



# tar cvf /dev/rmt/0m /var/adm/syslog/syslog.log

# rm /var/adm/syslog/syslog.log

# touch /var/adm/syslog/syslog.log

4:

You have noticed a sequence of messages in the syslog.log file of the following form:




Oct 29 17:48:02 hpeos003 vmunix: LVM: vg[0]: pvnum=0 (dev_t=0x1f01f000) is POWERFAILED
Oct 29 17:48:02 hpeos003 vmunix: SCSI: Write error -- dev: b 31 0x01f000, errno: 126, resid: 10240,
Oct 29 17:48:02 hpeos003 vmunix:        blkno: 2438, sectno: 4876, offset: 2496512, bcount: 10240.

The disk appears to be working okay at the moment, but you suspect that there is a problem with the disk and after using STM diagnostics, you establish that there was in fact a write error logged for the disk at the time specified. What should you do next?
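To confirm that the LVM and SCSI messages refer to the same device, the dev_t value can be split into its major and minor numbers. The sketch below rests on two assumptions that do not come from the messages themselves: a 32-bit dev_t with an 8-bit major and 24-bit minor number, and the usual HP-UX disk minor layout of instance, target, and LUN. The function name decode_dev is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for cross-checking device numbers in log messages.
# Assumption: dev_t is 32 bits, major in the top 8 bits, minor in the low 24.
# Assumption: disk minor number layout is 0xIITL00
#             (II = card instance, T = SCSI target, L = LUN).
decode_dev() {
    dev=$(( $1 ))
    major=$(( (dev >> 24) & 0xff ))
    minor=$(( dev & 0xffffff ))
    instance=$(( (minor >> 16) & 0xff ))
    target=$(( (minor >> 12) & 0xf ))
    lun=$(( (minor >> 8) & 0xf ))
    printf 'major=%d minor=0x%06x disk=c%dt%dd%d\n' \
        "$major" "$minor" "$instance" "$target" "$lun"
}

decode_dev 0x1f01f000
# prints: major=31 minor=0x01f000 disk=c1t15d0
```

The decoded major and minor numbers match the "dev: b 31 0x01f000" field in the SCSI message, which suggests both messages concern the same disk.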

5:

Your system has experienced a system crash. You have looked at the crashdump INDEX file and have seen a panic string of the form:



"panic: Data page fault."

There is no tombstone file in /var/tombstones. What should you do next?