This section looks in detail at the following performance issues:
MIB Variables for Modem Status Monitoring

The Modem Management MIB, from CISCO-MODEM-MGMT-MIB, is arranged into three groups:
However, from the cmGroupInfo group, the variables cmSlotIndex and cmPortIndex are the modem feature card slot and port number, respectively, in the group. Each of the following variables is a counter. They are correlated with the CLI show modem command in Example 17-3. These are some of the most significant variables for monitoring the status of your modems. These objects exist only for modems for which cmManageable is true.
The recommended threshold value for success is >= 95 percent. The incoming and outgoing calls need to be observed and measured for your environment. At that point, you should be able to come up with threshold values for your network. In the preceding cmLineInfo group variables, the one variable not defined is the percentage of success or failure. It would not be difficult to write a script to calculate the percentage of incoming or outgoing successes and failures, which could then be used to set a threshold in your network management platform. The same threshold value of >= 95 percent would be appropriate.

CLI Commands for Modem Status Monitoring

The CLI command related to the MIBs just discussed is the show modem command. This command displays the number of calls received by each modem in the NAS. The show modem command is useful for both fault and performance management, as well as when troubleshooting the NAS. Its output contains several valuable data points that can help you identify possible problem areas with the modems or in the network itself. Looking at the number of failed modem connections can assist in determining the number of calls that the NAS is receiving and attempting to process. If all the modems show similar failure characteristics over time, the number of calls being received probably exceeds the NAS's capability to handle them. The modem data can also help you diagnose modems that need to be replaced due to hardware failure. Example 17-3 shows output from the show modem command.

Example 17-3 Obtaining modem status information with the show modem command.
AS5300# show modem
                         Inc calls         Out calls  Busied   Failed         No     Succ
  Mdm(A) Usage(B)  Succ(C)  Fail(D)  Succ(E)  Fail(F)  Out(G)  Dial(H)  Answer(I)  Pct.(J)
* 1/0         17%       74        3        0        0       0        0          0      96%
b 1/1         15%       80        4        0        0       0        1          1      95%
* 1/2         15%       82        0        0        0       0        0          0     100%
  1/3         21%       62        1        0        0       0        0          0      98%
B 1/4         21%       49       25        0        0       0        0          0      50%
* 1/5         18%       65        3        0        0       0        0          0      95%
d 1/6         19%       58        2        0        0       0        0          0      96%
* 1/7         17%       67        5        0        0       0        1          1      93%
* 1/8         20%       68        3        0        0       0        0          0      95%
  1/9         16%       67        2        0        0       0        0          0      97%
  1/10        18%       56        2        0        0       0        1          1      96%
* 1/11        15%       76        3        0        0       0        0          0      96%
* 1/12        16%       62        1        0        0       0        0          0      98%
  1/13        17%       51        4        0        0       0        0          0      92%
D 1/14        16%       51        5        0        0       0        0          0      91%
  1/15        17%       65        0        0        0       0        0          0     100%
  1/16        15%       73        3        0        0       0        0          0      96%
T 1/17        17%       67        2        0        0       0        0          0      97%
T 1/18        17%       61        2        0        0       0        0          0      96%
* 1/19        17%       74        2        0        0       0        0          0      97%
  1/20        16%       65        1        0        0       0        0          0      98%
* 1/21        16%       58        3        0        0       0        0          0      95%
* 1/22        18%       56        4        0        0       0        0          0      93%
* 1/23        20%       60        4        0        0       0        0          0      93%

See Table 17-1 for a description of the modem states under the "Mdm" column of Example 17-3. These states are important in understanding what is occurring with your modems. For instance, the "b" and "B" states are important in telling whether a modem is operational, and can be gathered through either the CLI or SNMP. The annotated items from Example 17-3 are as follows:
Some of the data from show modem can be very useful. First, the number of bad modems should be tabulated (cmBad and cmState). It may not be practical to replace an entire bank of modems because of one or even several bad modems, but keeping track of the bad ones can help with troubleshooting, capacity, and performance management. In an environment where the NAS has only incoming calls, all of the "out" data can be discarded. The incoming data, both success and failure (cmIncomingConnectionCompletions and cmIncomingConnectionFailures), can help you identify problem areas before a complete failure occurs. The same is true of dial-out connections, if your environment allows them. You can produce a useful table by collecting and summarizing information from the show modem command on each of the servers (see Table 17-2). This information could also be collected through SNMP with the following variables:
However, an easier way to gather the data is through the use of the show modem summary command. Example 17-4 shows sample output from show modem summary.

Example 17-4 Obtaining aggregated data on modem status with show modem summary.

NASPPP011#sh mod sum
               Incoming calls        Outgoing calls    Busied  Failed   No  Succ
  Usage    Succ   Fail  Avail    Succ   Fail  Avail       Out    Dial  Ans  Pct.
     0%   39887   1843     48       0      0     48         2       0   65   95%

NASDU04#sh mod sum
               Incoming calls        Outgoing calls    Busied  Failed   No  Succ
  Usage    Succ   Fail  Avail    Succ   Fail  Avail       Out    Dial  Ans  Pct.
     0%   40109   1745     48       0      0     48         0       0   52   96%

The data in Example 17-4 corresponds to that of Tables 17-2 and 17-3, but is obtained through the use of a single command rather than summing the data from all of the modems.
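The success percentage and threshold check discussed above can be sketched in a few lines of Python. This is a minimal sketch, not the router's own calculation: the column order is assumed from Example 17-4, the names ModemSummary, parse_summary_line, incoming_success_pct, and below_threshold are hypothetical, and the formula (successes divided by successes plus failures) is an assumption, so the result may differ slightly from the Succ Pct. column that the CLI prints.

```python
from typing import NamedTuple


class ModemSummary(NamedTuple):
    """Incoming-call counters taken from one 'show modem summary' data row."""
    in_succ: int
    in_fail: int
    no_answer: int


def parse_summary_line(line: str) -> ModemSummary:
    """Parse the data row of 'show modem summary'.

    Assumed column order (from Example 17-4): Usage, In Succ, In Fail,
    In Avail, Out Succ, Out Fail, Out Avail, Busied Out, Failed Dial,
    No Ans, Succ Pct.
    """
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}")
    return ModemSummary(
        in_succ=int(fields[1]),
        in_fail=int(fields[2]),
        no_answer=int(fields[9]),
    )


def incoming_success_pct(summary: ModemSummary) -> float:
    """Percentage of incoming call attempts that succeeded (assumed formula)."""
    attempts = summary.in_succ + summary.in_fail
    if attempts == 0:
        # Design choice: with no calls at all, nothing has failed.
        return 100.0
    return 100.0 * summary.in_succ / attempts


def below_threshold(summary: ModemSummary, threshold: float = 95.0) -> bool:
    """True when the success percentage is under the recommended threshold."""
    return incoming_success_pct(summary) < threshold


# Data row for NASPPP011 from Example 17-4.
row = "0% 39887 1843 48 0 0 48 2 0 65 95%"
summary = parse_summary_line(row)
print(f"incoming success: {incoming_success_pct(summary):.2f}%")
print("alert" if below_threshold(summary) else "ok")
```

The same function could be fed from SNMP instead of the CLI by passing the polled completion and failure counters directly into ModemSummary.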
From Table 17-2, you can see from the Out Calls Succ, Out Calls Fail, and Failed Dial columns that this environment does not allow dial-out from the NAS. Therefore, the table can be improved by removing those columns and adding others.
Removing the dial-out information that was not important in this particular scenario, adding the data on "no answer" and "success percentage," and using "time" to look at the number of calls by hour and day makes the information much more useful. Here, the No Answer column is bold because of its importance to the analysis. In rows 4 and 7 of Table 17-3, the number of failed calls (2262 and 6935, respectively) is quite a bit higher than for the other servers. In those same rows, the success percentages (74 and 81) also indicate that there is some problem with these two servers.

Two other potential problems are indicated in the table. First, NASPPP04 (row 4) has a relatively low number of "no answer" calls (17) and a moderately high number of "average calls per day" (2120) when compared to the other servers. NASPPP04 should be investigated for possible modem problems, either hardware or firmware. Second, NASPPP07's data of 145 "no answer," 3.85 "average calls per day," and 4430 "average calls per box per day" would seem to indicate that this server cannot handle the call volume that it is receiving. Therefore, you would want to investigate capacity issues.

A threshold value of 95 percent for "success %" is recommended. Experience has shown that once the success percentage drops below this threshold, it often falls rapidly. Table 17-4 may not help isolate the cause, but it is the starting place to look at standard NAS health and may even indicate that some standard router health checking might help isolate the problem.

MIB Variables for Monitoring of Connection Statistics

From CISCO-MODEM-MGMT-MIB, the cmLineStatisticsTable is the table that contains the variables to poll for most of the connection statistics data. The entries in cmLineStatisticsTable indicate the number of connection completions, failures, and other important statistical data needed to monitor the modem performance. From the cmLineStatusTable, the cmDisconnectReason is one of the most important variables.
It indicates the reason that the last connection or call attempt disconnected. Because this variable is considered essential to monitoring performance of the NAS, the meaning of each "error type" is explained in Table 17-4. The 15 most common error types have been highlighted. Also, the fields from the CLI show modem statistics command have been mapped into the last column as appropriate.
The cmDisconnectReason variable should be mapped to each of the different error types using a table like Table 17-4. Then, cmDisconnectReason or the data from the CLI show modem statistics command could be used to correlate the SNMP result (column 2) with the error type. This correlated data could then be graphed for each modem or aggregated for each NAS against the different reasons for disconnection. Finally, adding another column to the mapping that sums the number of times each disconnect reason occurred across all the modems in the NAS would provide excellent data on the modem health of the NAS. The next section maps the variable cmDisconnectReason and the entries from the cmLineStatisticsTable to the show modem call-stat data.

CLI Commands for Monitoring of Connection Statistics

Use the command show modem call-stat to get modem statistical data (see Example 17-5). Specifically, look for the category of disconnect reasons that can happen in either a dial-in or dial-out scenario. This command is used to find out why a modem ended its connection or why a modem is not operating at peak performance.

Example 17-5 Sample output from the show modem call-stat command.

tanr#sh modem call-stat

dial-in/dial-out call statistics

         lostCarr   rmtLink  wdogTimr  compress   retrain  inacTout  linkFail  moduFail
  Mdm      #    %    #    %    #    %    #    %    #    %    #    %    #    %    #    %
* 2/0      1   20   13    3    0    0    0    0    0    0    0    0    0    0    0    0
* 2/1      0    0   14    4    0    0    0    0    0    0    0    0    0    0    0    0
* 2/2      0    0   15    4    0    0    0    0    0    0    0    0    0    0    0    0
. . .
* 2/29     0    0   15    4    0    0    0    0    0    0    0    0    0    0    0    0
Total      5       338         0         0         0         0         0         0

dial-out call statistics

           noCarr     abort  noDitone      busy  dialStrg  autoLgon  dialTout   rmtHgup
  Mdm      #    %    #    %    #    %    #    %    #    %    #    %    #    %    #    %
* 2/0      0    0    6    1    0    0    0    0    0    0    0    0    0    0    0    0
* 2/1      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
* 2/2      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
* 2/3      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
. . .
Total      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

dial-out call statistics

The description of fields from show modem call-stat is in the last column of Table 17-4. It is included there to provide you with the correlation between the SNMP variable and the show command fields. The lostCarr, rmtLink, and inacTout errors all result from errors external to the router. The wdogTimr errors are a result of errors within the router. Finally, compress, retrain, linkFail, and moduFail errors need further investigation to determine which side has the problem.

In Example 17-5, the number of times an error occurred on a specific modem is displayed (see the # column). The % column (which does not have an SNMP variable equivalent) shows the percentage of occurrences of a specific disconnect reason charged to a specific modem, compared with all the modems. For example, out of all the times the lostCarr error occurred on all the modems in the system, 20 percent of the occurrences were on modem 2/0.

As you can see in Table 17-4, there is a good deal of overlap between the cmDisconnectReason and the show modem call-stat command. Again, this data should be looked at in a tabular form and possibly graphed to provide information that can be used for trending and fault analysis.

After the data has been collected in the user environment for a time, the use of RMON events and alarms for specific variables could provide a proactive approach to the problems seen. For instance, suppose you found that users received autologin errors only when the access server was experiencing particularly heavy traffic volumes. Using RMON, you could trigger a trap on this particular variable. See the section "Remote Monitoring MIB and Related MIBs" in Chapter 8, "Understanding Network Management Protocols," for more information on how to use RMON for thresholding and proactive fault management.
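Because the % column has no SNMP equivalent, a script that polls per-modem disconnect counters would have to compute it. The following Python sketch shows one way to do so. The function names share_pct and modems_over_share are hypothetical, and rounding down to a whole percent is an assumption inferred from the sample values in Example 17-5 (for instance, 13 of 338 is displayed as 3), not documented behavior.

```python
def share_pct(count: int, total: int) -> int:
    """Whole-number share of one modem's disconnects out of the NAS total.

    Rounds down, which is an assumption inferred from Example 17-5.
    """
    if total <= 0:
        return 0
    return (100 * count) // total


def modems_over_share(counts: dict, total: int, limit_pct: int) -> list:
    """Return the modems whose share of a disconnect reason meets the limit.

    counts maps a modem name such as "2/0" to its disconnect count; total is
    the NAS-wide count for the same reason (the Total row in the CLI output).
    """
    return sorted(
        modem for modem, count in counts.items()
        if share_pct(count, total) >= limit_pct
    )


# lostCarr counts for the modems shown in Example 17-5; the total of 5
# comes from the Total row, since the remaining modems are elided there.
lost_carrier = {"2/0": 1, "2/1": 0, "2/2": 0, "2/29": 0}
print(share_pct(lost_carrier["2/0"], 5))
print(modems_over_share(lost_carrier, 5, limit_pct=10))
```

A modem that accounts for a disproportionate share of one disconnect reason is a natural candidate for the kind of closer inspection described above.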
MIB Variables for Monitoring Modem Connection Speeds

As noted earlier, the entries in cmLineStatisticsTable are the primary set of variables that indicate the number of connection completions, failures, and other important statistical data needed to monitor the modem performance. This table also contains variables to assist you in monitoring the connection speed of your modems. This data is very useful in troubleshooting the modem connections. The following are the connection speed variables:
If connections are consistently being made at 2400 bps or less, or even between 2400 and 14400 bps, you can be fairly sure that there is some type of problem in the system. It could be with the line or with the "training" between the client and NAS modem.

One problem with using SNMP rather than a script that gathers the information through the CLI command is the granularity involved. The CLI commands allow an administrator to gather data on the transmit and receive speed counters at the following speeds: 75, 300, 600, 1200, 2400, 4800, 7200, 9600, 12000, 14400, 16800, 19200, 21600, 24000, 26400, 28800, 31200, 33600, 32000, 34000, 36000, 38000, 40000, 42000, 44000, 46000, 48000, 50000, 52000, 54000, and 56000 bps. The SNMP variables do not provide that level of detail. Although this much detail may not be necessary in every environment, it proved useful in troubleshooting a particularly difficult issue with modems that would not sync except at 300 to 2400 bps. In this particular scenario, some users still had 2400 bps modems, and the additional granularity down to 300 bps was needed. Later, as the users were upgraded to 56 kbps modems, the administrator wanted to see and report to the executive level exactly the speeds at which the users were connecting to the servers.

In the following sections on the CLI commands, several tables and a graph are provided as examples of how to use this data. The variables cm2400OrLessConnections, cm2400To14400Connections, and cmGreaterThan14400Connections could also be used in a similar manner. Again, the greatest difference is the amount of granularity that you get using the CLI command versus the SNMP variables. If this is not an issue in your environment, these variables might meet your needs.

CLI Commands for Monitoring Modem-Connection Speeds

The show modem connect-speeds command displays a log of connection speed statistics, starting from the last time the access server or router was power cycled or the clear modem counters command was issued.
Because most terminal screens are not wide enough to display the entire range of connection speeds at one time (for example, 75 to 56000 bps), the max-speed variable is used. This variable specifies the contents of a shifting baud-rate window, which provides you with a snapshot of modem connection speeds for your system. To display a complete picture of all the connection speeds and counters on the system, you must enter a series of commands. Each time you issue the show modem connect-speeds max-speed command, only nine speed columns can be displayed at the same time. To gather all the data for connection speeds up to 56,000 bps, the following four commands would have to be used:

show modem connect-speeds 56000
show modem connect-speeds 38000
show modem connect-speeds 21600
show modem connect-speeds 12000

Example 17-6 shows output from the show modem connect-speeds 56000 command. For brevity, the other show commands are not shown.

Example 17-6 Obtaining modem connection speed information with the show modem connect-speeds command.

as5800-as5300#show modem connect-speeds 56000

transmit connect speeds

  Mdm(A) 48000(B)    49333    50000    50667    52000    53333    54000    54667    56000 TotCnt(C)
  1/3/00        0        1        0        0        0        0        0        0        0         1
  1/3/01        0        0        0        1        0        0        0        0        0         1
  1/3/02        0        0        0        1        0        0        0        0        0         1
B 1/5/13        0        0        0        0        0        0        0        0        0         0
B 1/5/14        0        0        0        0        0        0        0        0        0         0
B 1/5/15        0        0        0        0        0        0        0        0        0         0
  Tot(D)        0        1        0        8        0        0        0        0        0        13
Tot %(E)        0        7        0       61        0        0        0        0        0

receive connect speeds

  Mdm       48000    49333    50000    50667    52000    53333    54000    54667    56000    TotCnt
  1/3/00        0        0        0        0        0        0        0        0        0         1
  1/3/01        0        0        0        0        0        0        0        0        0         1
  1/3/02        0        0        0        0        0        0        0        0        0         1
  1/3/03        0        0        0        0        0        0        0        0        0         1
B 1/5/13        0        0        0        0        0        0        0        0        0         0
B 1/5/14        0        0        0        0        0        0        0        0        0         0
B 1/5/15        0        0        0        0        0        0        0        0        0         0
  Tot           0        0        0        0        0        0        0        0        0        13
  Tot %         0        0        0        0        0        0        0        0        0
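To compare the detailed CLI counters with the coarser SNMP view, the per-speed counts can be collapsed into the three ranges named by cm2400OrLessConnections, cm2400To14400Connections, and cmGreaterThan14400Connections. The Python sketch below illustrates the idea. The function name bucket_speeds, the bucket keys, the sample counts, and the treatment of the boundary values (2400 counted as "2400 or less," 14400 counted in the middle range) are all assumptions made for illustration.

```python
def bucket_speeds(counts: dict) -> dict:
    """Collapse per-speed connection counts into three SNMP-style ranges.

    counts maps a connect speed in bps to the number of connections seen at
    that speed, as could be gathered from show modem connect-speeds.
    Boundary handling is an assumption: 2400 falls in the lowest range and
    14400 in the middle range.
    """
    buckets = {
        "2400_or_less": 0,
        "2400_to_14400": 0,
        "greater_than_14400": 0,
    }
    for speed_bps, connections in counts.items():
        if speed_bps <= 2400:
            buckets["2400_or_less"] += connections
        elif speed_bps <= 14400:
            buckets["2400_to_14400"] += connections
        else:
            buckets["greater_than_14400"] += connections
    return buckets


# Hypothetical counts, invented for illustration only.
sample_counts = {300: 2, 2400: 5, 4800: 7, 14400: 1, 33600: 40, 50667: 8}
print(bucket_speeds(sample_counts))
```

Keeping the detailed counts and deriving the buckets from them preserves the finer granularity for troubleshooting while still allowing a simple three-way threshold.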
The data (speed counter/modems) can be gathered and reported on a daily, weekly, monthly, and quarterly basis. The data is usually summarized on the weekly, monthly, and quarterly reports for the entire NAS, as shown in Tables 17-5, 17-6 and 17-7. This type of report can help the remote access administrator foresee and prevent problems with user connectivity.

MIB Variables for Measuring Modem Utilization of the NAS

The same variables discussed in the section "MIB Variables for Modem Status Monitoring," from the cmLineInfoGroup of the CISCO-MODEM-MGMT-MIB, provide the data for modem utilization. The line status data provides the basic information needed to begin managing both modem and NAS utilization. As you've seen, the other variables within this MIB provide data and statistics that increase your ability to effectively manage the modems, server, calls, and lines. Through this information, you can monitor for server, modem, and line congestion, which can result in a severe degradation of the network in both response time and throughput.

CLI Commands for Measuring Utilization of the NAS

The three show commands that can provide the data needed to manage the NAS utilization are show modem (see Example 17-3), show modem connect-speeds (see Example 17-6), and show modem summary (see Example 17-4). By summarizing the data gathered from the show modem connect-speeds commands into tables, you will have something like Tables 17-5, 17-6 and 17-7.
By using a relatively simple bar chart to graph the data from all three tables (see Figure 17-1), you have the connection speeds for the NAS (over some specified period of time). This type of data can help you represent the different connection speeds and their frequency. One interesting item to note here is the connections at 4800 bps. This particular environment has deployed 33.6 kbps and 56 kbps modems. In this sample network, all of the lower-speed modems at all of the remote sites have been eliminated.

Figure 17-1. NAS Connection Speeds

In addition to the preceding data, the aggregated data in Tables 17-2 and 17-3, collected from the show modem summary command, is useful in looking at the total number of successful and failed calls over time. As in Table 17-3, the data can then be used to come up with the number of calls per port per hour or the average calls per server per day. Both of these go directly to the question of utilization.

Figure 17-1 is a graphical representation of the data in the previous tables. It shows at a glance that the users are grouped into three basic connection speeds: 4800 to 9600 bps, 26.4 to 33.6 kbps, and 42 to 48 kbps. As stated earlier, this user environment has only 33.6 to 56 kbps modems. However, the figure shows an interesting grouping in the 4.8 to 9.6 kbps range. At this point, the RADIUS log could be used to find the telephone numbers of all users that have connected at this speed. With this list of phone numbers, the network administrator can identify the user locations that have been having the slowest connectivity to the network and begin the process of finding out the reasons for this slowness.

MIB Variables for Measuring ISDN Utilization

From CISCO-CALL-HISTORY-MIB, ciscoCallHistoryTable is the table that provides much of the data needed for call status and utilization. Specifically, the following variables can be of value.
The explanation of each variable is taken from the CISCO-CALL-HISTORY-MIB; comments are from the authors.
In Table 17-8, the variables that were non-zero values for an example environment are placed in a simple format. From this type of table, the network administrator is now able to begin looking at the total number of calls that an interface receives, the number of packets/bytes transmitted and received, and the amount of time the circuits are being utilized.
The call history table provides the basic data needed to begin managing ISDN call utilization. As you can see, the variables within this MIB do provide some useful data. Several cost variables, however, normally do not: ciscoCallHistoryRecordedUnits, ciscoCallHistoryCurrency, ciscoCallHistoryCurrencyAmount, and ciscoCallHistoryMultiplier.

Another repository for this type of data is the AAA server log. Through the Call History Table and the AAA logs, an administrator is able to monitor the number of calls and the numbers to which the calls are made (or, in the case of the AAA logs, the IP addresses), and begin looking at ISDN utilization on the NAS. The use of an AAA server should be considered a best practice for dial access networks. In particular, the AAA server provides a logging function each time a call is authenticated. This log can be used to gather much the same information as the Call History MIB and provides an excellent source for data validation.

Just as with modem management, using the MIB or the show commands to collect data on the number of calls, duration of calls, errors, and type of traffic passed enables you to understand the performance of your server, make decisions on capacity, and actively manage users' perceptions of the network.

CLI Commands for Measuring ISDN Utilization

The show isdn history command displays historic and current call information, including the called number, the time until the call is disconnected, AOC (Advice of Charge) charging time units used during the call, and whether the AOC information is provided during calls or at the end of calls. Example 17-7 shows sample output from show isdn history.

Example 17-7 Obtaining ISDN utilization information with show isdn history.
rtp-isdn>show isdn history
--------------------------------------------------------------------------------
ISDN CALL HISTORY
--------------------------------------------------------------------------------
History table has a maximum of 100 entries. (A)
History table data is retained for a maximum of 15 Minutes. (B)
--------------------------------------------------------------------------------
Call Calling or Called Remote Seconds Seconds Seconds Recorded Charges
Type (C) Phone number (D) Node Name (E) Used (F) Left Idle Units/Currency
--------------------------------------------------------------------------------
In 9194670812 +uller-isdn 242138 0
Out 3625083 +lansk-isdn 184062 0 0
In 9197725168 +riend-isdn 158229 0
In 9198515120 +rinho-isdn 104170 0
In 9195622974 +ltman-isdn 74795 0
In 9198722308 +nkins-isdn 70787 0
In 9198722309 +nkins-isdn 68906 0
In 9195574922 +moore-isdn 36556 0
Out 3626757 +gugan-isdn 28826 0 0
In 9193621042 +odwin-isdn 28435 0
In 9193628901 +hanco-isdn 26133 0
In 9193628902 +hanco-isdn 26127 0
In 9192864873 +lliot-isdn 25767 294 5
In 9194685397 +aylor-isdn 25432 0
Out 4683721 +iralt-isdn 24634 0 0
In 9193872921 +alton-isdn 20892 0
In 9194672574 +brown-isdn 14240 0
In 9197852332 ruthm-isdn 14075 0
In 9198549851 +lland-isdn 8133 0
In 9193045178 jamng-isdn 8066 0
In 9199687239 +aniel-isdn 6964 292 7
In 9197723502 +spain-isdn 6679 0
One of the most important new items in Example 17-7 is the addition of Calling or Called Phone number. By knowing the numbers for business-critical applications and significant users, and times when activity is at its highest, you can begin to build thresholds using these numbers. Another approach would be to identify the numbers that have the most consistently low connection rates and then focus on solving their problem (if there is one). After this has been accomplished, use this group as a "threshold" by having an alert sent if another number is added to the set. This would take some relatively sophisticated scripting, but would be worth the time and effort.
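The scripting idea above can be sketched in Python as follows. This is only an illustration under stated assumptions: each history row is assumed to carry the call type, phone number, node name, and seconds used as its first four whitespace-separated fields (as in Example 17-7), and the names seconds_by_call_type and newly_flagged, along with the baseline set, are hypothetical.

```python
from collections import defaultdict


def seconds_by_call_type(rows: list) -> dict:
    """Sum the Seconds Used column per call type (In or Out).

    Assumes the first four fields of every row are: call type, phone number,
    remote node name, seconds used. Rows without a node name would need
    extra handling that this sketch does not attempt.
    """
    totals = defaultdict(int)
    for row in rows:
        fields = row.split()
        call_type, seconds_used = fields[0], int(fields[3])
        totals[call_type] += seconds_used
    return dict(totals)


def newly_flagged(baseline: set, current: set) -> list:
    """Return numbers in the current problem set that are not in the baseline.

    A non-empty result is the condition on which an alert would be sent.
    """
    return sorted(current - baseline)


# The first three rows of Example 17-7.
history_rows = [
    "In 9194670812 +uller-isdn 242138 0",
    "Out 3625083 +lansk-isdn 184062 0 0",
    "In 9197725168 +riend-isdn 158229 0",
]
print(seconds_by_call_type(history_rows))

# Hypothetical baseline of known slow numbers versus today's set.
known_slow = {"9194670812"}
slow_today = {"9194670812", "9197725168"}
print(newly_flagged(known_slow, slow_today))
```

How a number ends up in the problem set (for example, from connection-speed or RADIUS log analysis) is left out here; the sketch covers only the comparison that would trigger the alert.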