HealthMon is an add-on tool for SMS 2.0 that offers the SMS administrator a real-time, at-a-glance view of the status of any Windows NT 4.0 or Windows 2000 computer. As with status messages, the "health" of a system is graphically represented by a green check mark for OK, a yellow triangle for Warning, and a red "X" for Critical status.
HealthMon includes several built-in monitored objects that look suspiciously like the Performance Monitor objects we saw in the preceding section. In fact, HealthMon uses Performance Monitor objects and counters to collect data and determine status thresholds. These status thresholds are known as monitoring policies and are, of course, configurable by you, the SMS administrator. HealthMon also includes monitors specific to the Microsoft BackOffice application services. Table 6-3 lists the HealthMon objects and the Performance Monitor counters associated with them—or, in the case of a BackOffice application, the services associated with them.
Table 6-3. HealthMon objects and their associated counters
Object | Counter or Service |
---|---|
Processor | Interrupts Per Second; Percent Total System Time |
Memory | Available Memory Bytes; Page Reads Per Second; Pages Per Second; Percent Committed Bytes to Limit; Pool Non-Paged Bytes |
Paging File | Percent Peak Usage; Percent Usage |
Logical Disk | Percent Free Disk Space |
Physical Disk | Disk Queue Length; Diskperf Driver Started; Percent Disk Time |
Network Interface | Excessive Network Traffic Bytes Total/Sec |
Server Work Queues | Context Blocks Queued/Sec; Processor Queue Length |
Security | Errors Access Permission; Errors Logon |
Fault | Pool Non-Paged Failures; Pool Paged Failures; Sessions Errored Out |
SQL Server | MSDTC Service Started; MSSQL Server Service Started |
IIS Server | IIS Service Started |
Exchange Server | MSEXCHANGEDS Service Started; MSEXCHANGEIS Service Started; MSEXCHANGEMTA Service Started; MSEXCHANGESA Service Started |
SNA Server | Host Connection Status; SNABASE Service Started |
SMS | SMS_Executive Service Started; SMS_Site_Component_Manager Service Started; SMS_SQL_Monitor Service Started |
HealthMon is not installed as part of the regular SMS 2.0 setup; instead, it is included on the SMS 2.0 CD as an add-on utility. Because HealthMon is itself a Microsoft Management Console (MMC) snap-in, you can add it to the SMS Administrator Console to keep all your SMS 2.0 utilities in one place. To do so, run the console in Author mode. For more information on adding an MMC snap-in, refer to Chapter 16 and to the MMC online documentation.
HealthMon consists of two components: the client agent and the console. Each system that will be monitored must have the HealthMon Agent installed on it. You will most likely install the HealthMon Console on the SMS administrator's Windows NT workstation.
To install the HealthMon Agent and HealthMon Console, follow these steps:
Now that you've installed the HealthMon Agent and HealthMon Console successfully, let's look at how to run HealthMon on the monitored system.
The first time you run HealthMon, the HealthMon Console will open but no systems will be displayed, as shown in Figure 6-8.
Figure 6-8. The HealthMon Console.
To specify the Windows NT systems that you want the utility to monitor, follow these steps:
Figure 6-9. The New Monitored System dialog box.
Figure 6-10 shows the HealthMon Console with an SMS site server added to it. The server's component status is displayed. In this example, a Warning icon appears next to the Logical Disk entry. As you can see, each component in the Components window corresponds to a HealthMon object listed in Table 6-3.
Figure 6-10. The HealthMon Console with an SMS site system named Scruffy1 added.
A value of 100 appears in the Percent Warning column. If we switch to the Events folder for the same system, as shown in Figure 6-11, the same Warning icon will appear, but with a more understandable message—namely, that drive C has less than 10 percent free disk space available. Normally, you would want to open the Components folder for a quick scan of components and then switch to the Events folder for a more detailed description of a specific event or message. With the HealthMon Console open and the Events folder in view, events will be displayed dynamically as they occur.
Figure 6-11. The HealthMon Events folder for a Windows NT system.
HealthMon and WBEM
The HealthMon Agent, which is installed on all systems that need to be monitored, uses Windows Management Instrumentation (WMI) to collect data through Performance Monitor. WMI is, of course, Microsoft's implementation of Web-Based Enterprise Management (WBEM). Because HealthMon is a WBEM-compliant application, it is capable of monitoring activity from any device or application that includes a WBEM provider to collect health data. For information about WBEM providers and other WBEM-related components and terms, refer to Chapter 1.
Research into creating and marketing WBEM-compliant hardware such as computer motherboards is currently underway. Such devices will expose to the WMI layer on Windows 2000 systems events such as the computer chassis being opened, the fan stopping, or the processor chip exceeding acceptable heat levels. Such technology will make it possible for network administrators to closely monitor all aspects of their systems, from resource utilization to hardware functionality and specification.
As mentioned, the components you monitor can also be configured by you, the administrator. You can enable the components you want to monitor using one of two methods and then configure the components you enabled. The first method is shown here:
Figure 6-12. The System Properties window.
The second method is shown here:
Regardless of which technique you choose to enable components, configuring a component's properties will always be done in the Properties window for that component. For example, you would configure the logical disk from the Logical Disk Properties window, shown in Figure 6-13.
Figure 6-13. The Thresholds tab.
The General tab of the Logical Disk Properties window simply lists the Performance Monitor counters that are used to generate events and gives a brief explanation of the purpose of each counter. The Thresholds tab, however, provides your configuration options. Here you can set the Critical and Warning alarm thresholds for each component counter.
In this example, if the Logical Disk counter detects that less than 10 percent of free disk space is available on any logical drive, a Warning message will be generated. If the value subsequently goes above 15 percent, the warning is reset. If you delete or move files from the monitored drive to create more free space, you can eventually get the Warning message to go away. Similarly, if the amount of free space falls below 5 percent, a Critical message will be generated. The Duration value represents the number of seconds over which the condition must be met before the messages are generated. This setting ensures that messages are not generated for momentary spikes and other anomalies. In this case, the value is set to 0 because any loss of free space on a disk—especially on an SMS site server—can be detrimental and ought to be investigated.
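The threshold behavior just described can be sketched in a few lines of Python. This is only an illustration of the logic, not HealthMon's actual implementation: the class name FreeSpaceMonitor and its fields are hypothetical, and two details are assumptions that the text does not state, namely that a Critical status relaxes to Warning once free space rises back to 5 percent or more, and that the Duration value delays resets as well as new messages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FreeSpaceMonitor:
    """Hypothetical model of the Logical Disk thresholds in this example."""
    warning_below: float = 10.0        # Warning when free space drops below this percent
    warning_reset_above: float = 15.0  # Warning is reset when free space exceeds this percent
    critical_below: float = 5.0        # Critical when free space drops below this percent
    duration: float = 0.0              # seconds a condition must hold before status changes
    status: str = "OK"
    _pending: Optional[str] = None     # candidate status waiting out the duration
    _pending_since: float = 0.0

    def update(self, now: float, percent_free: float) -> str:
        """Feed one sample (time in seconds, percent free) and return the status."""
        if percent_free < self.critical_below:
            candidate = "Critical"
        elif percent_free < self.warning_below:
            candidate = "Warning"
        elif percent_free > self.warning_reset_above:
            candidate = "OK"
        else:
            # Between the warning and reset thresholds: an existing alarm is held.
            # Assumption: a Critical status relaxes to Warning here.
            candidate = "OK" if self.status == "OK" else "Warning"

        if candidate == self.status:
            self._pending = None
            return self.status
        if candidate != self._pending:
            self._pending = candidate
            self._pending_since = now
        # Assumption: the duration applies to every transition, including resets.
        if now - self._pending_since >= self.duration:
            self.status = candidate
            self._pending = None
        return self.status


monitor = FreeSpaceMonitor()  # duration 0, as in the example above
samples = [20, 9, 12, 16, 4, 7, 16]  # percent free, one sample per second
history = [monitor.update(t, pct) for t, pct in enumerate(samples)]
# history == ["OK", "Warning", "Warning", "OK", "Critical", "Warning", "OK"]
```

Note that the 12 percent sample does not clear the Warning, because the value has not yet risen above the 15 percent reset threshold. With a nonzero duration, a brief dip below 10 percent would not change the status at all.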
NOTE
Enabling the Physical Disk object requires that you first enable the disk performance counters in Performance Monitor using the DISKPERF command at a Windows NT command prompt. (See the section "Specific Objects and Counters" earlier in this chapter for details.)
Network Interface counters are disabled by default. To enable them, you should install the Windows NT 4.0 Resource Kit counters component from the Windows NT 4.0 CD or install the SNMP Service using the Network option in the Control Panel. If you install SNMP, remember to also reapply Windows NT 4.0 Service Pack 4 or later to your system.
When you enable Microsoft BackOffice objects, remember that you are simply monitoring to determine whether the application's services are running. Be sure that you have enabled the appropriate services through the application before you enable monitoring of those services—for example, be sure to install Microsoft Exchange Server before trying to monitor its services. Otherwise, you will always display critical error messages.
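To see why an enabled object for an application that is not installed always reports a critical status, consider the following minimal Python sketch. It is an illustration only: the function backoffice_status is hypothetical, the service names are the labels from Table 6-3 rather than verified Windows service names, and the Host Connection Status entry for SNA Server is omitted because it is not a service.

```python
# Services per BackOffice object, using the labels from Table 6-3.
BACKOFFICE_SERVICES = {
    "SQL Server": ["MSDTC", "MSSQL Server"],
    "IIS Server": ["IIS"],
    "Exchange Server": ["MSEXCHANGEDS", "MSEXCHANGEIS",
                        "MSEXCHANGEMTA", "MSEXCHANGESA"],
    "SNA Server": ["SNABASE"],
    "SMS": ["SMS_Executive", "SMS_Site_Component_Manager", "SMS_SQL_Monitor"],
}


def backoffice_status(enabled_objects, running_services):
    """Return a status per enabled object: OK only if every service is running."""
    running = set(running_services)
    status = {}
    for obj in enabled_objects:
        missing = [svc for svc in BACKOFFICE_SERVICES[obj] if svc not in running]
        # A service that is absent (for example, never installed) is
        # indistinguishable from a stopped one, so the object stays Critical.
        status[obj] = "Critical" if missing else "OK"
    return status


# A site server running the SMS services but without Exchange Server installed.
running = ["SMS_Executive", "SMS_Site_Component_Manager", "SMS_SQL_Monitor"]
result = backoffice_status(["SMS", "Exchange Server"], running)
# result == {"SMS": "OK", "Exchange Server": "Critical"}
```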
PLANNING
If you installed SMS 2.0 Service Pack 1, SMS Setup disables Performance Monitor counters for SQL Server 6.5 if SQL Server and SMS Provider are located on the same computer. Because of this, Performance Monitor, HealthMon, and other applications will be unable to access SQL Server counters. To enable the counters, you will need to modify the following registry key: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSSQLServer\Performance. Double-click on the parameter entry Library, and change the value to Sqlctr60.dll to enable the SQL performance counters. To disable these performance counters when you have finished your analysis, reset the Library value to "" by double-clicking on the reference to Sqlctr60.dll and then deleting it.
Making this change may result in random WMI or SMS Administrator Console errors. Restarting the SMS Administrator Console should clear up the problem. However, for this reason, you should only enable the SQL Server performance counters when it is necessary to analyze a suspected performance-related issue. SQL Server 6.5 with Service Pack 5 and SQL Server 7.0 are not affected by this change.