Error/Fault Data for Router Processors
The format of this section is identical to that of the performance sections, except that in addition to MIBs and CLI commands, we also present relevant SNMP traps and syslog messages.
SNMP traps require a trap receiver (a trapd daemon) running on an SNMP server, and the router must be pointed at that server with the snmp-server host configuration command. Syslog messages can be sent to a syslog server or kept on the router itself, either displayed on the console or stored in a buffer. If they are stored in the buffer, the show logging command displays the same messages that a syslog server would receive. Buffering messages on the router in addition to using a syslog server is especially important when the network is down or the route to the syslog server is unavailable. See "Setting Up SNMP" in Chapter 18 for more information on best practices for SNMP configuration.
MIB Variables for Memory Leaking or Depletion
The bufferNoMem variable, from OLD-CISCO-MEMORY-MIB, counts the buffer create failures caused by a lack of free memory. If there is not enough system memory to allocate a buffer of the appropriate size, the bufferNoMem counter is incremented. Increments in bufferNoMem can also help you locate some of your network issues. If this counter increases at all, a packet is probably being dropped.
In "Performance Data for Router Processors," we focused on CPU processes and memory usage. Here, we look at the way system buffers affect memory. System buffers are dynamic: the router constantly creates and trims them, and each create or trim changes how much system memory the buffers consume. Ideally, you want the trims and creates to stay fairly constant. We'll look at the trims and creates in the "show" command output a little later in this section.
The bufferNoMem variable can be correlated with the memory MIB objects ciscoMemoryPoolFree, freeMem, or ciscoMemoryPoolLargestFree.
The recommended baseline threshold for bufferNoMem is that the value should be judged relative to sysUpTime. But even a value of 1 can be a reason to start looking at where the misses are occurring, because misses cause creates, and create failures cause "no memory" conditions.
A related MIB object from OLD-CISCO-MEMORY-MIB is bufferFail, which counts the buffer allocation failures. This object is really a superset of the bufferNoMem variable, and it typically has the same value as bufferNoMem.
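Judging bufferNoMem relative to sysUpTime can be sketched as a simple rate calculation between two polls. The following is a minimal illustration, not a definitive implementation: the Sample type and the no_mem_per_hour function are hypothetical names, the polled values are assumed to have been collected already by whatever SNMP tool you use, and the only protocol fact relied on is that sysUpTime is reported in hundredths of a second.

```python
from dataclasses import dataclass
from typing import Optional

TICKS_PER_HOUR = 100 * 3600  # sysUpTime is reported in hundredths of a second


@dataclass
class Sample:
    """One poll of a router (hypothetical container for this sketch)."""
    sys_uptime_ticks: int  # polled sysUpTime value
    buffer_no_mem: int     # polled bufferNoMem value


def no_mem_per_hour(prev: Sample, curr: Sample) -> Optional[float]:
    """Return bufferNoMem increments per hour of uptime between two polls.

    Returns None when sysUpTime or the counter went backwards, which most
    likely means the router reloaded between polls and the delta is
    meaningless.
    """
    elapsed = curr.sys_uptime_ticks - prev.sys_uptime_ticks
    delta = curr.buffer_no_mem - prev.buffer_no_mem
    if elapsed <= 0 or delta < 0:
        return None
    return delta * TICKS_PER_HOUR / elapsed
```

Because even a single increment is worth investigating, a rate greater than zero is a reasonable starting point for an alert; the rate itself mainly helps you compare routers with very different uptimes.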
CLI Commands for Analyzing Memory Usage
See "CLI Commands for Buffer Usage" for details on the show buffers command and its output (Example 11-7). For details on the show memory command and its output (Example 11-5), see "Using the show memory Command."
Syslog Messages Relating to Memory Issues
A number of syslog messages are useful for memory fault management, and apply directly to the MIB objects and CLI commands previously discussed. They are collected in Table 11-3.
MIB Variables for Identifying Router Reloads
The whyReload variable, from OLD-CISCO-SYSTEM-MIB, is a printable octet string that gives the reason why the system was last restarted. Reasons include things such as "power on," user-initiated reload, exception, or some other error.
The whyReload variable can help you track change management windows, such as scheduled power-downs, and can point to possible IOS defects when reasons such as exceptions are seen. Used in conjunction with the SNMP reload trap, whyReload can provide further insight. For example, when a reload trap arrives at the trap receiver that the router is configured to use with snmp-server host, you can trigger an SNMP poll of the whyReload variable to find out why the router reloaded.
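The trap-then-poll flow described above can be sketched as a small handler. This is an illustration under stated assumptions: on_trap is a hypothetical function name, the trap is assumed to have been decoded already into a name and a source router by your trap receiver, and snmp_get is a hypothetical callable that you supply to perform the actual SNMP poll (no particular SNMP library is implied).

```python
from typing import Callable, Optional

WHY_RELOAD = "whyReload"  # object name as used in the text


def on_trap(trap_name: str, router: str,
            snmp_get: Callable[[str, str], str]) -> Optional[str]:
    """On a reload trap, poll whyReload on the sending router.

    snmp_get(router, object_name) is a caller-supplied (hypothetical)
    function that returns the polled value as a string. Traps other than
    "reload" are ignored and yield None.
    """
    if trap_name != "reload":
        return None
    reason = snmp_get(router, WHY_RELOAD)
    return f"{router} reloaded: {reason}"
```

Passing the poller in as a parameter keeps the sketch independent of any one management platform and makes the handler easy to exercise with a stub.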
The recommended baseline threshold is that any value other than "power-on" or "reload" should be flagged because it can identify possible software or hardware errors.
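As a rough sketch of this threshold, the check below flags any whyReload value other than the two expected reasons. The function name is_suspect_reload is hypothetical, and the exact strings a router returns are an assumption: they may differ between IOS releases, so the comparison normalizes case, surrounding whitespace, and the "power on" versus "power-on" spelling.

```python
EXPECTED_REASONS = ("power-on", "reload")  # values treated as normal in the text


def is_suspect_reload(why_reload: str) -> bool:
    """Return True when whyReload is anything other than an expected reason."""
    # Normalize so that "Power On", "power-on", and " reload " all match.
    normalized = why_reload.strip().lower().replace(" ", "-")
    return normalized not in EXPECTED_REASONS
```

For example, is_suspect_reload("exception") returns True, while is_suspect_reload("power on") returns False.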
CLI Commands for Analyzing Reload Crash Conditions
The show stacks command is useful for troubleshooting software-forced crashes on the router that caused the reload SNMP trap to be sent. The value of the whyReload variable can lead you to look at this command's output. See the section "Router Health from show stack" in Chapter 10 for details on the output from the show stacks command.
SNMP Traps Relating to Reload Conditions
The reload trap, from CISCO-GENERAL-TRAPS, indicates that your router reloaded for some reason. It is sent when the router detects that it is booting, because a trap is unlikely to be sent successfully while the router is in the act of rebooting itself. The following section lists the syslog messages relating to reload conditions. When a reload syslog message is reported, the reload trap follows and corresponds to that message. Refer to your network management vendor for details on the format of the reload trap.
Syslog Messages Relating to Reload Conditions
A number of syslog messages are useful for analyzing why routers reload, and apply directly to the MIB objects and CLI commands previously discussed. They are collected in Table 11-4.
The two syslog messages in Table 11-4 are good baseline or "threshold" points to monitor: one marks when the reload was requested, and the other marks when the router is back online. You can also determine how long the router takes to boot by measuring the time between the two messages, from the reload request to the "restarted" message.
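The boot-time measurement can be sketched as follows. Several things here are assumptions rather than facts from this chapter: the function name boot_duration is hypothetical, the two messages are assumed to carry the mnemonics %SYS-5-RELOAD and %SYS-5-RESTART, and each stored line is assumed to begin with an ISO 8601 timestamp added by the syslog server, followed by a space. Adjust the parsing to match the timestamp format your own server writes.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional

RELOAD_TAG = "%SYS-5-RELOAD"    # assumed mnemonic of the reload-request message
RESTART_TAG = "%SYS-5-RESTART"  # assumed mnemonic of the "restarted" message


def boot_duration(lines: Iterable[str]) -> Optional[timedelta]:
    """Return the time from the first reload request to the next restart.

    Each line is assumed to look like "<ISO timestamp> <message text>".
    Returns None if no request/restart pair is found.
    """
    requested = None
    for line in lines:
        stamp, _, message = line.partition(" ")
        if RELOAD_TAG in message and requested is None:
            requested = datetime.fromisoformat(stamp)
        elif RESTART_TAG in message and requested is not None:
            return datetime.fromisoformat(stamp) - requested
    return None
```

A restart message with no preceding reload request is skipped by this sketch; in practice such a message is itself worth flagging, because it suggests the router restarted without an operator asking it to.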