The default monitors that are installed during Setup fall in the category of Synchronized Monitors, which means that they are replicated to all the cluster members. If you delete or disable a monitor in this category, that action will be reflected across the cluster, provided that you do so on the cluster controller. By the same token, if you create a new monitor on the controller and then add it to the Synchronized Monitors group, your new monitor will be synchronized across the cluster. Table 9.3 summarizes the Application Center Synchronized Monitors category at the data collector level.
NOTE
As shown in Figure 9.6, the parent data group for the Synchronized Monitors is Application Center Monitors. Some, but not all, data collectors are members of a child data group.
Table 9.3 Synchronized Monitors Installed During Setup
Name | Type | Enabled | Threshold | Description |
---|---|---|---|---|
Log Agent Job Failure | Windows Event Monitor | No | # of instances collected > 0 | Looks for event identifier 208, which indicates a job failure in the Event Log for SQLAgent$MSAC. |
Log Agent Service | Service Monitor | No | Started != true | Monitors the Started, State, and Status properties of the SQLAgent$MSAC Service. |
Log Database Service | Service Monitor | No | Started != true | Monitors the Started, State, and Status properties of the MSSQL$MSAC Service. |
Log Database Size | Performance Monitor | No | Data file(s) size (KB) > 100000 | Monitors the Data File(s) Size (KB) counter for the ACLog instance of the MSSQL$MSAC:Databases object. |
Cluster Service | Service Monitor | Yes | Started != true | Monitors the Started, State, and Status properties of the ACCLUSTER Service. |
Health Monitor Action Failure | WMI Event Query | Yes | # of instances collected > 0 | Monitors the ErrorCode, ErrorDescription, and Event properties of the _ConsumerFailureEvent class in the root\CIMV2\MicrosoftHealthMonitor namespace. |
Request Forwarding Failure | WMI Event Query | Yes | Type = 1 | Monitors the Type property of the MicrosoftAC_RequestForwarding_Initialize_Event class in the root\MicrosoftApplicationCenter namespace. |
Server Offline | WMI Event Query | Yes | EventId = 4015 or 4016 | Monitors the EventId property of the MicrosoftAC_Cluster_Load-balancing_Event class in the root\MicrosoftApplicationCenter namespace. |
Synchronization Session Failure | WMI Event Query | Yes | EventId = 5037 or 5038 | Monitors the EventId property of the MicrosoftAC_Replication_Session_General_Event class in the root\MicrosoftApplicationCenter namespace. |
W3Svc (Web Service) | WMI Instance | Yes | Started != true | Monitors the Started, State, and Status properties of the W3SVC instance of the Win32_Service class in the root\CIMV2 namespace. |
Logical Disk | Performance Monitor | No | % free space < 10 | Monitors the Logical Disk object. |
Memory | Performance Monitor | Yes | Pages/sec > 500 | Monitors the Pages/sec counter for the Memory object. |
Processor | Performance Monitor | Yes | % Processor Time > 90 | Monitors the % Processor Time counter for the _Total instance of the Processor object. |
http://127.0.0.1 | HTTP Monitor | No | HTTP monitor response time > 30 seconds; HTTP monitor status code >= 400 | Monitors the Response Time and Status Code for the URL http://127.0.0.1/ with a timeout of 30 seconds. |
NOTE
All of the default monitors that are installed for Application Center also check the value of the Error Code (From WMI) property; any value other than zero trips the threshold. A nonzero error code indicates a failure in WMI, which is considered a critical failure.
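The threshold rules in Table 9.3 can be modeled as simple comparisons. The following is a minimal Python sketch, not Application Center code: the `Threshold` class and the `is_critical` and `critical_monitors` helpers are hypothetical names introduced for illustration, and only four of the performance thresholds from the table are encoded. It also assumes that a nonzero WMI error code makes a monitor critical regardless of the sampled value, as the preceding note describes.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Threshold:
    """One row of Table 9.3, reduced to a comparison (hypothetical model)."""
    monitor: str
    counter: str
    compare: Callable[[float, float], bool]
    limit: float


# Values copied from Table 9.3; the data structure itself is an assumption.
THRESHOLDS = [
    Threshold("Memory", "Pages/sec", operator.gt, 500),
    Threshold("Processor", "% Processor Time", operator.gt, 90),
    Threshold("Logical Disk", "% free space", operator.lt, 10),
    Threshold("Log Database Size", "Data file(s) size (KB)", operator.gt, 100000),
]


def is_critical(threshold: Threshold, value: float, wmi_error_code: int = 0) -> bool:
    """Return True if the sample trips the threshold or WMI reported an error."""
    if wmi_error_code != 0:
        # A nonzero Error Code (From WMI) is treated as a critical failure.
        return True
    return threshold.compare(value, threshold.limit)


def critical_monitors(samples: Dict[str, float], wmi_error_code: int = 0) -> List[str]:
    """List the sampled monitors that are critical, in table order."""
    return [
        t.monitor
        for t in THRESHOLDS
        if t.monitor in samples and is_critical(t, samples[t.monitor], wmi_error_code)
    ]
```

For example, `critical_monitors({"Memory": 600, "Processor": 50, "Logical Disk": 5})` reports the Memory and Logical Disk monitors, because 600 pages/sec exceeds 500 and 5 percent free space is below 10.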
By default, Application Center creates the Online/Offline Monitors data group. This group contains the W3SVC data collector and its threshold. When the threshold is exceeded, it triggers the AC.exe command-line action that sets the member offline. Conversely, when the collected value returns to within operational parameters, the AC.exe action sets the member back online. To configure other data collectors as online/offline monitors, copy or move them to the Online/Offline Monitors folder. (These data collectors and their associated actions are synchronized across the cluster.)
NOTE
This data group cannot be permanently removed. If it is deleted, the replication service will re-create it on the next full synchronization.
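The online/offline behavior described above amounts to a two-state toggle driven by the W3SVC threshold (Started != true). The sketch below illustrates that logic only. The `OnlineOfflineMonitor` class and its `set_offline` and `set_online` callbacks are hypothetical stand-ins for the AC.exe action; the actual command-line syntax is not shown here because it is not given in this section.

```python
from typing import Callable, Optional


class OnlineOfflineMonitor:
    """Toggle a member's online state from a service threshold (illustrative)."""

    def __init__(self, set_offline: Callable[[], None], set_online: Callable[[], None]):
        # The callbacks stand in for the AC.exe action; they are assumptions.
        self._set_offline = set_offline
        self._set_online = set_online
        self.online = True

    def evaluate(self, service_started: bool) -> Optional[str]:
        """Apply the threshold 'Started != true' and act only on a state change."""
        threshold_exceeded = service_started is not True
        if threshold_exceeded and self.online:
            self._set_offline()
            self.online = False
            return "offline"
        if not threshold_exceeded and not self.online:
            self._set_online()
            self.online = True
            return "online"
        return None  # No transition: the member is already in the right state.
```

Acting only on transitions keeps the action from firing on every evaluation while the service remains stopped, which is one plausible reading of the behavior described; the real product may evaluate and fire differently.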
If you click the Monitors node for a member in the console tree, the current state of the member's monitors is presented in the Monitors view. As you can see in Figure 9.8, the Monitors view provides both summary information (the name, status, and last modified date for each monitor) and more detailed information for a specific monitor. The detailed information, displayed in the lower part of the view, comprises the monitor's thresholds, threshold status, and threshold values (Last, Average, Minimum, and Maximum). In addition to reporting on a member's monitors, the Monitors view lets you disable or enable a monitor and force an immediate evaluation of a monitor's state.
Figure 9.8 Monitors view for a cluster member
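The four threshold values shown in the lower part of the Monitors view (Last, Average, Minimum, and Maximum) are straightforward to derive from a series of collected samples. The following Python sketch shows one way to compute them; the `ThresholdValues` type and the `summarize` function are hypothetical names, and the sampling window that Application Center actually uses is not specified in this section.

```python
from statistics import mean
from typing import NamedTuple, Sequence


class ThresholdValues(NamedTuple):
    """The four values displayed per threshold in the Monitors view."""
    last: float
    average: float
    minimum: float
    maximum: float


def summarize(samples: Sequence[float]) -> ThresholdValues:
    """Compute Last, Average, Minimum, and Maximum over collected samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return ThresholdValues(
        last=samples[-1],       # most recently collected value
        average=mean(samples),  # arithmetic mean of all samples
        minimum=min(samples),
        maximum=max(samples),
    )
```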
If you look at the Synchronization Session Failure monitor in the upper part of the details pane shown in Figure 9.8, you'll see that its current status is Collecting. By design, the default event-driven monitors might remain in a collecting state.
Because their threshold is set for a specific condition, these monitors will continue collecting until a specific EventID or Type is received. When this condition is met, the monitor is put into a Critical state and will stay there until the threshold condition no longer exists, at which point the monitor returns to normal. The collecting state should be considered Ok.
A good example of this continuous collecting state is the Request Forwarding Failure monitor. This monitor listens for events from the MicrosoftAC_RequestForwarding_Initialize_Event class, which is a container class for seven events. The first six of these events represent a failure condition; the seventh is the Request Forwarder started event. The events in this class are:
There is one threshold for this class, the property Type = 1, which is an error event. The data collector stays in the collecting state until it receives an event in this class. When an error event comes in, the monitor turns Critical until its state changes, at which time the system will automatically reset its status to Ok. After the reset, the monitor receives the MicrosoftAC_RequestForwarding_Initialize_Started_Event event, which is not an error event, and then collection resumes. The monitor will also go to Ok if the started event is received first.
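The state transitions described above can be summarized as a small state machine. This is a minimal sketch under stated assumptions, not the Health Monitor implementation: the `RequestForwardingMonitor` class, its method names, and the `State` enum are hypothetical, and any Type value other than 1 is treated as a non-error event because the source gives only the error value.

```python
from enum import Enum


class State(Enum):
    COLLECTING = "Collecting"
    CRITICAL = "Critical"
    OK = "Ok"


ERROR_TYPE = 1  # Threshold from Table 9.3: Type = 1 marks an error event.


class RequestForwardingMonitor:
    """Illustrative model of an event-driven monitor's status."""

    def __init__(self) -> None:
        # No event has arrived yet, so the monitor is still collecting.
        self.state = State.COLLECTING

    def on_event(self, event_type: int) -> State:
        """Handle an event from the request forwarding event class."""
        if event_type == ERROR_TYPE:
            self.state = State.CRITICAL
        else:
            # A non-error event, such as the started event, reads as Ok.
            self.state = State.OK
        return self.state

    def reset(self) -> State:
        """Model the automatic reset once the critical condition clears."""
        if self.state is State.CRITICAL:
            self.state = State.OK
        return self.state
```

A monitor that has received no events reports Collecting, which should be read as healthy. Calling `on_event(1)` moves it to Critical, and `reset()` returns it to Ok.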