To provide a solid foundation for operations management solutions, information technology (IT) operations teams need production-quality operational management tools that they can fully integrate into the existing enterprise IT infrastructure. Proactive operations management requires continuous monitoring of systems and network components.
In too many IT operations groups, the first step when a problem occurs is to begin collecting information to analyze the problem. The team must determine what information might be relevant and where it is located, and it often wastes considerable time searching for that information on different systems. Often, the problem is intermittent, and the information needed for analysis has disappeared until the problem resurfaces. Ask yourself the following questions about how you troubleshoot a problem:
How much time do you spend collecting data?
How many times did you not know what to look for?
How many times is the problem intermittent, and has the needed information disappeared?
How many times is the needed information on a different system?
Once you collect the data, how much time do you spend looking for related information?
Microsoft has always supplied basic monitoring and management tools with its operating systems and applications. Although these were sometimes adequate for small organizations, larger enterprises seldom used them, especially not for production environments. Until the introduction of Microsoft Operations Manager (MOM), Microsoft had traditionally left monitoring of production environments to third-party tools.
MOM is a comprehensive and scalable server monitoring solution that provides proactive, real-time monitoring and automatic problem resolution for systems running Microsoft server operating systems (Windows 2000 or later versions) and certain Windows server-based applications. This event management, performance monitoring, and reporting tool improves the performance, availability, and security of Windows-based networks and applications by continuously monitoring user actions, application software, and servers throughout the enterprise.
MOM includes a Base Management Pack containing the predefined management modules that are needed to monitor and to manage Windows and key components of a networked Windows environment, including the Windows operating system, Active Directory, Internet Information Services (IIS), Domain Name System (DNS), Windows Internet Name Service (WINS), Dynamic Host Configuration Protocol, Windows Terminal Server, Microsoft Systems Management Server, Routing and Remote Access Service, Distributed Transaction Coordinator, Message Queue, Microsoft Transaction Server, and (of course) MOM. Microsoft also provides a separate Application Management Pack that includes MOM support for other products, including Exchange 5.5, Exchange 2000, and Exchange 2003. Each Management Pack module provides complete predefined, but tailorable and extensible, support for a specific application or service.
To provide configuration flexibility and efficient management and monitoring, MOM uses a distributed, three-tier architecture. MOM includes several components and interfaces that each serve a specific function. These components and interfaces fit into one of the three tiers: the presentation layer, the business logic layer, or the data layer.
The presentation layer consists of the interfaces that provide access to the collected data and configuration functionality. These are the MOM Administrator Console, the Web Console, and MOM Reporting. The components in the business logic layer provide the MOM product functionality and include Agents, Consolidator/Agent Managers (CAMs), and Data Access Servers (DASs). The data layer consists of the SQL database along with the various data providers. These components are shown in Figure 12.1.
Figure 12.1: Microsoft Operations Manager logical model
The CAM, DAS, and database components are designed so that you can place them on separate physical servers. However, in the current MOM release, MOM only supports the following two physical models:
You can place the CAM, DAS, and database on the same physical server.
You can place the CAM and DAS on the same physical server, with the database on a separate physical server.
The first of these physical models (i.e., the one with all components on the same server) is useful only for small environments. For most production environments, you will want to place the SQL database components on a separate physical server.
An Agent is a component of the business logic layer and is the MOM service that runs on each server you want to monitor. The Agent collects data from the monitored server, applies processing rules to the collected data, and sends the data to a Consolidator.
As the Agent collects data, it performs actions defined by the processing rules. The Agent can change a state variable, consolidate multiple events into one event, execute a script or a command file, filter events, generate a Simple Network Management Protocol (SNMP) trap, send an alert based on an occurrence of an event, send an alert when a performance threshold is crossed, and/or send data to a MOM Consolidator.
The ability to generate an SNMP trap is commonly used to integrate MOM with enterprise management frameworks.
Agents temporarily store collected data in a buffer before sending it to the Consolidator. This allows Agents to continue to collect data during temporary network outages. No data will be lost as long as there is room in the buffer. At regular intervals, the Agent contacts a Consolidator and uses a guaranteed delivery mechanism to send the accumulated data. By default, MOM compresses and encrypts the data to reduce network bandwidth requirements and to increase data security. Agents also send a periodic heartbeat to the Consolidator to let the Consolidator know that the Agent is still operational. In response to the heartbeat, the Consolidator lets the Agent know whether its rules need to be updated.
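The buffering behavior described above can be illustrated with a minimal Python sketch. It is not MOM code: the class name AgentBuffer, the capacity parameter, and the send callback are all hypothetical, and the "remove only after the receiver confirms" loop is just one simple way to approximate guaranteed delivery.

```python
from collections import deque


class AgentBuffer:
    """Hypothetical stand-in for an Agent's local buffer (not MOM's API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def collect(self, item):
        # Data is kept only while there is room in the buffer.
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(item)
        return True

    def flush(self, send):
        # Remove an item only after send() confirms receipt, so a failed
        # transfer leaves the remaining data queued for the next attempt.
        sent = 0
        while self.items:
            if not send(self.items[0]):
                break
            self.items.popleft()
            sent += 1
        return sent
```

During an outage, flush() sends nothing and the buffered items remain in place; once the connection returns, the next flush() delivers them in the order they were collected.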
MOM can collect data from many different sources. Using processing rules, you can specify how MOM collects, handles, and responds to information. MOM can collect data from the following types of data provider:
Event logs. Monitored servers log events in specific Windows event logs, i.e., the Application, System, Security, DNS, File Replication, and Directory Service logs, which you can view using the Windows Event Viewer.
Application-specific log files. Some applications create their own text log files. MOM can collect data from some application-specific log files.
Timed event providers. These provide events generated by MOM at scheduled times.
Windows Management Instrumentation (WMI) event providers. These provide WMI events, such as service status changes or SNMP traps sent to the server.
Missing events. A missing event is an event that is supposed to occur within a specified time interval but does not, such as when an automated daily Exchange backup procedure fails to complete.
Performance data. MOM measures performance by sampling numeric data from performance counters and from WMI numeric values. MOM also can monitor for performance thresholds and generate an alert when the threshold value is crossed.
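Of the data providers listed above, the missing event is the least intuitive, because the trigger is the absence of data. The following Python sketch shows the idea under stated assumptions: the function name is_missing and its parameters are hypothetical, and MOM's own evaluation logic is not documented here.

```python
from datetime import datetime, timedelta


def is_missing(last_seen, window_start, window, now):
    """Return True if the expected event did not arrive within its window.

    last_seen: datetime of the most recent matching event, or None.
    window_start: when the expected interval began.
    window: timedelta allowed for the event to arrive.
    """
    deadline = window_start + window
    if now < deadline:
        return False  # still inside the window; too early to decide
    return last_seen is None or not (window_start <= last_seen <= deadline)
```

For a nightly backup expected between 01:00 and 05:00, a check at 06:00 reports a missing event unless a completion event was logged inside that interval; an event from the previous night does not count.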
The Consolidator and its associated Agent Manager are part of the business logic layer. The CAM services are considered a unit and always run on the same physical server. You can have multiple CAM servers, depending on factors such as the number of managed servers, network traffic patterns, and organizational considerations.
The primary functions of the Consolidator are to collect data sent by the Agents, perform actions specified by processing rules (e.g., running a script or notifying a system administrator of a detected condition), and forward the collected data to a DAS.
The Consolidator also serves as the Agent for the server on which the Consolidator is installed. In more complex, hierarchical MOM implementations, you can configure a Consolidator to forward alerts to another configuration group.
The Agent Manager automatically installs, configures, and uninstalls Agents on the managed servers. If a processing rule changes, the Agent Manager automatically sends the revised rule to the affected Agents, ensuring that Agents always have the latest processing rules.
The DAS provides centralized database access and query support. The DAS controls the flow of data to and from Consolidators, the MOM Administrator Console, the MOM Web Console, and the database. All requests to insert data into the database and most requests to retrieve data from the database go through the DAS. The DAS maintains data consistency, maintains logging, provides shared caching of Agent and event information, and provides pooled connections to the database.
As with the CAM services, the exact number of DAS servers depends on factors such as the number of managed servers, network traffic patterns, and organizational considerations.
The database is part of the data layer and uses Microsoft SQL Server to provide the central storage for configuration information, rules, scripts, and collected data (i.e., events, alerts, performance data) for a MOM configuration group. As with any SQL database, the MOM database contains tables, indexes, views, and stored procedures. The database also has an associated transaction log.
Collecting data serves little purpose if you cannot access the collected data. MOM provides three user interfaces to the MOM database.
Web Console. The Web Console allows you to view and monitor the data stored in the database from any Windows platform that supports Microsoft Internet Explorer. The Web Console provides preconfigured views of the collected events, alerts, computers, and performance. You also can create custom views to match your requirements. The Web Console provides read-only access, meaning that you cannot modify rules or make MOM configuration changes using the Web Console.
MOM Reporting. MOM Reporting (Figure 12.2) allows you to generate preconfigured operations reports and graphs (including HyperText Markup Language reports for viewing with an Internet browser) based on the collected data in the MOM database. The available reports depend on the Management Pack modules you have implemented but generally include operations reports, availability reports, security audit reports, capacity planning graphs, and performance analysis graphs. MOM Reporting generates the reports using a run-time version of Microsoft Access. If you have the full version of Microsoft Access, you can customize the reports to meet your specific needs.
Figure 12.2: Microsoft Operations Manager Reporting
MOM Administrator Console. The Administrator Console (Figure 12.3) runs on any Windows system and provides the central monitoring and configuration point for MOM, allowing you to view and monitor the data stored in the database and to make configuration changes. The Administrator Console consists of three Microsoft Management Console snap-ins.
Figure 12.3: Microsoft Operations Manager Administrator Console
Configuration snap-in. You use the Configuration snap-in to configure Agents, Consolidators, and Agent Managers.
Rules snap-in. You use the Rules snap-in to create new computer attributes, computer grouping rules, notification groups, processing rules, and scripts.
Monitor snap-in. The Monitor snap-in provides the functionality to create views for alerts, computer attributes, computer groups, events, and performance data.
MOM uses Windows-based groups to restrict access to these interfaces. Each MOM group corresponds to a role, and only accounts within that group can perform the tasks associated with the role.
A MOM configuration group is a collection of associated MOM business logic and data components. A MOM configuration group consists of the components shown in Figure 12.4:
Figure 12.4: Microsoft Operations Manager configuration group
One, and only one, database: The database provides a central storage location for all data collected from the configuration group. This database includes alerts, rules, scripts, and configuration data.
One or more DASs.
Up to 6 Consolidators and associated Agent Managers: Each CAM can manage up to 700 agents.
Multiple agents: Each MOM configuration group supports a theoretical limit of 4,200 agents (i.e., 6 CAMs each supporting 700 agents).
You also can implement multiple configuration groups if you need more Agents than MOM will allow in a single configuration group or to meet your specific geographic, organizational, or network bandwidth requirements.
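The sizing arithmetic can be worked through in a few lines of Python. The two limits come from the text; the function name plan_capacity is hypothetical, and the result is only a theoretical lower bound, since geography, organization, and bandwidth may call for more groups or CAMs.

```python
MAX_CAMS_PER_GROUP = 6    # limit stated in the text
MAX_AGENTS_PER_CAM = 700  # limit stated in the text


def ceil_div(n, d):
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-n // d)


def plan_capacity(agent_count):
    """Return (configuration groups, CAMs) needed at the theoretical limits."""
    agents_per_group = MAX_CAMS_PER_GROUP * MAX_AGENTS_PER_CAM  # 4,200
    groups = ceil_div(agent_count, agents_per_group)
    cams = ceil_div(agent_count, MAX_AGENTS_PER_CAM)
    return groups, cams
```

For example, 4,200 agents fit in one configuration group with 6 CAMs, whereas 5,000 agents need at least two configuration groups and 8 CAMs in total.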
Another configuration option is to implement a hierarchical MOM infrastructure. This may be particularly appropriate if you have delegated management responsibility to regional teams. The monitored servers in each region send their collected data to a Consolidator in the regional MOM configuration group where MOM makes all of the data available to the regional operations team. Using a feature known as alert forwarding, the regional Consolidator can send just the alerts to a Consolidator in the enterprise's master configuration group.
You can use the Monitor snap-in (Figure 12.5) to view database information about agents, alerts, components, computers, computer groups, events, performance, and service level exceptions. The Monitor snap-in provides the following default views.
Figure 12.5: Administrator Console - Monitor snap-in
The All Computers view shows summary information from every monitored server in the MOM configuration group. The details pane contains one entry for each server. Each entry includes an icon representing the highest currently unresolved alert for the monitored server. You can double-click a monitored server entry to view all open alerts for the server.
The All Agents view shows the same information as the All Computers view.
The All Computer Groups view shows all computer groups in the MOM configuration group. The details pane contains one entry for each computer group, including an icon identifying the highest currently unresolved alert for servers in the computer group. You can double-click a computer group entry to display a list of all servers in the group.
The All Open Alerts view shows all unresolved alerts for servers in the configuration group. This can include multiple alerts from each monitored server. The details pane contains one entry for each unresolved alert, with an icon indicating the alert severity. You can double-click an alert entry to view the alert properties. The alert properties provide information about the alert to help you determine how to resolve the reported problem. Each alert includes the alert severity, the name of the monitored server that generated the event that caused the alert, the current resolution state, the resolution history, knowledge base information about this type of alert, and custom alert fields you may have created.
The resolution state indicates the current status of your efforts to resolve the alert condition. MOM comes preconfigured with the following resolution states:
New. This indicates that the alert has not yet been addressed.
Acknowledged. This indicates that someone has read and acknowledged the alert, but no one has been assigned responsibility for the alert.
Level 1: Assigned to help desk or local support. This indicates that the help desk or local support has been assigned responsibility for the alert.
Level 2: Assigned to subject matter expert. This indicates that a subject matter expert has been assigned responsibility for the alert.
Level 3: Requires scheduled maintenance. This indicates that correcting the alert condition requires maintenance, which has been scheduled.
Level 4: Assigned to external group or vendor. This indicates that an external group has been assigned responsibility for the alert.
Resolved. This indicates that the alert has been resolved.
Except for the 'New' and 'Resolved' resolution states, you can modify or delete any predefined definitions to create resolution states that meet the needs of your organization.
The knowledge base information about this type of alert provides out-of-the-box information to help resolve the problem that caused the alert. You also can add information specific to your environment.
Each resolution state definition has an associated Service Level Agreement (SLA) duration. If an alert remains in the resolution state longer than the SLA period, MOM marks the alert as a service level exception and adds it to the All Service Level Exceptions view. For example, if your policy requires that someone acknowledge all new alerts within 5 minutes, MOM would mark any alert remaining in the New resolution state for longer than 5 minutes as a service level exception. The All Service Level Exceptions view shows all Service Unavailable alerts and all alerts that have been in their current resolution state beyond the SLA time. The details pane contains one entry for each alert, with an icon representing the alert severity. You can double-click an entry in the details pane to view the alert properties.
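The time-based part of this check can be sketched in Python as follows. Only the 5-minute value for the New state comes from the example above; the 30-minute value, the SLA dictionary, and the function name are hypothetical. The sketch does not cover Service Unavailable alerts, which the view also lists.

```python
from datetime import datetime, timedelta

# Hypothetical SLA durations per resolution state; only the 5-minute value
# for New comes from the example in the text.
SLA = {
    "New": timedelta(minutes=5),
    "Acknowledged": timedelta(minutes=30),
}


def is_service_level_exception(state, entered_state_at, now, sla=SLA):
    """Return True if an alert has stayed in `state` longer than its SLA."""
    limit = sla.get(state)
    if limit is None:
        return False  # no SLA duration defined for this state
    return now - entered_state_at > limit
```

An alert that entered the New state at 12:00 becomes a service level exception once the clock passes 12:05, and not before.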
The Recent Performance view shows performance measurements taken for each monitored server. The details pane contains one entry for each monitored server. You can double-click an entry in the details pane to view a list of performance counter values for the monitored server.
The All Other Events view shows all events from other data providers. The details pane contains one entry for each event. You select an event entry to view the properties of the event.
In addition to using the default views, you can create your own public or private views to display information about alerts, events, computer performance, performance data, computers, computer groups, or computer attributes. Private views are stored in the My Views folder and are available only to the user who created them.
MOM Management Packs provide several default public views that are stored in the Public Views folder and are available to anyone who has access to a MOM Administrator Console or Web Console.
The Components folder provides configuration information about the MOM Agents, Consolidators, and Agent Managers. The Agents folder lists all servers in the MOM configuration group on which MOM has installed an Agent. To remove an Agent from a monitored server, you can mark the agent, and the CAM will remove the Agent the next time the CAM evaluates rules. The Consolidators and Agent Managers folders list all servers in the configuration group on which you have installed a Consolidator or Agent Manager, respectively.
You can use the Configuration snap-in (Figure 12.6) to manage the global MOM configuration settings, pending agent installations, and agent managers.
Figure 12.6: Administrator Console - Configuration snap-in
The Global Settings folder allows you to configure settings that apply to components throughout the MOM configuration group. You can configure settings in the following areas:
Custom Alert Fields. Custom alert fields are fields you can create that MOM will display for any generated alerts.
Alert Resolution States. You can use these settings to delete or to modify most of the default alert resolution states or to create your own alert resolution states. For each alert resolution state, you can specify the maximum time an alert can remain in the resolution state before MOM raises a service level exception.
Electronic mail (e-mail) Server. These configuration settings allow you to specify the settings used by the Consolidator to send e-mail responses.
Web Addresses. You can use these settings to define the web addresses for the Web Console and published reports saved to your intranet.
License. The License settings contain MOM acknowledgment and copyright information and allow you to apply a new license file.
Communications. You can use these settings to specify the TCP/IP port that Agents will use when communicating with the Consolidators. By default, MOM uses port 51515 for unencrypted communications and port 1270 for encrypted communications.
Database Grooming. These settings allow you to specify when data should be automatically deleted from the MOM database.
Notification Command Format. You can use these settings to configure MOM to use a third-party paging application for paging responses.
Auditing. You can use the Auditing settings to enable or to disable auditing MOM rules and configuration changes. Auditing is central processing unit (CPU) intensive and stores a considerable amount of data in the database. However, you cannot generate Configuration Changes reports without collecting this data.
Agent Managers. The Agent Manager periodically checks for changes to computer groups, computer grouping rules, Managed Computers rules, and system configurations (e.g., adding or removing Exchange on a managed system) that might require the Agent Manager to install or to remove an agent. This scan finds systems that match the managed computer criteria, collects system attributes, evaluates computer group membership, and installs or uninstalls agents where needed. You can use the Agent Manager configuration settings to specify the time and frequency for the Agent Managers to scan for changes, the service account that Agents will use, and whether the Agent Manager automatically installs agents or first adds them to the Pending Installation list for operator approval. You also can specify how long the Agent Manager should wait before uninstalling agents that are no longer needed.
Consolidators. These configuration settings allow you to specify how often Consolidators poll for rule changes, the number of responses (e.g., scripts, e-mail, and paging) that can run simultaneously, and how the Consolidator temporarily stores data. If you are using hierarchical MOM configuration groups, you also can specify the name of the configuration group to which Consolidators will forward all alerts.
Agents. You can use these settings to specify heartbeat parameters, including how often Agents check for processing rule updates and report the availability of the managed computers, how often Agents check for service status changes, and how long Agents buffer events, data, and performance data before sending the data to the Consolidator.
You can specify whether MOM automatically installs new agents or first adds them to the Pending Installation list for operator approval (see the Agent Managers settings under Global Settings, described earlier). The Pending Installations folder lists the agent installations and uninstallations that are pending. You can approve or cancel the pending actions. For actions you approve, you can select whether MOM processes the installation or uninstallation immediately or waits until the next scheduled managed computer scan.
The Agent Managers folder lists all Agent Manager servers in the configuration group and allows you to change the configuration settings for specific Agent Managers. Using the Managed Computer rules, you can specify the servers that are managed by each Agent Manager. You can specify the schedule for managed computer scans. You can specify the service account for Agents installed by each Agent Manager and whether the Agent Manager automatically installs Agents or first adds them to the Pending Installation list for operator approval.
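The outcome of a managed computer scan can be pictured as a comparison of two sets: the servers that currently match the Managed Computer rules and the servers that already have an Agent. The Python sketch below is illustrative only; the function plan_scan and its return format are hypothetical, and it ignores the configurable waiting period before unneeded agents are uninstalled.

```python
def plan_scan(matching_rules, installed, auto_install):
    """Work out agent changes after a scan.

    matching_rules: names of servers that match the Managed Computer rules.
    installed: names of servers that already have an Agent.
    auto_install: False queues every change for operator approval.
    """
    wanted, have = set(matching_rules), set(installed)
    install = sorted(wanted - have)
    uninstall = sorted(have - wanted)
    if auto_install:
        return {"install": install, "uninstall": uninstall, "pending": []}
    # Without automatic installation, queue the changes for approval.
    pending = [("install", name) for name in install]
    pending += [("uninstall", name) for name in uninstall]
    return {"install": [], "uninstall": [], "pending": pending}
```

With automatic installation disabled, nothing changes on the managed servers until an operator approves the entries in the pending list.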
MOM is a rules-based system, and the Rules snap-in (Figure 12.7) includes the computer group rules, processing rules, and notification group rules that are the heart of MOM. Each MOM Management Pack module includes predefined computer group rules and processing rules for monitoring specific applications or environments. In addition to using the predefined rules, you can use the Rules snap-in to create new rules or modify the predefined rules to group similar systems, to process collected information, and to designate operators to be notified when specified conditions occur. You also can create or modify scripts and specify data providers that are used by the processing rules.
Figure 12.7: Administrator Console - Rules snap-in
Each Computer Group defines a collection of servers that all serve a similar function, such as all servers running Exchange. Computer groups minimize your management effort by allowing you to manage the group rather than each individual server. The Exchange 2003 Management Pack module includes a predefined computer group for monitoring Exchange 2003 servers. When MOM evaluates the computer grouping rule, it finds any servers with the specified Exchange 2003 attribute and includes the server in the computer group.
MOM uses the Managed Computer rules to generate a dynamic list of servers in each computer group. The Agent Manager periodically checks for changes to computer grouping rules (e.g., changing the selection criteria) and system configurations (e.g., adding or removing Exchange on a managed server) and, if necessary, recreates the list of servers in each computer group. MOM can then automatically install, reconfigure, or uninstall agents as required.
MOM can group servers on the basis of the server's domain, system name, or system attributes, such as the operating system version or applications that are installed on the server. You can group servers using their domain or computer names by entering the domain and computer name on the Computers tab of the Computer Group rule properties (Figure 12.8). You also can group servers using attributes that the servers have in common, such as the version of Exchange that is installed on the server. MOM typically does this by testing attributes found in Registry keys.
Figure 12.8: Administrator Console - Computer groups
The auto-discovery feature of MOM usually finds all servers that match the computer grouping rules. However, you also can manually add a server to the appropriate computer groups if necessary. You also can manually exclude specific servers even though they would normally match the computer grouping rules.
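Attribute-based grouping with manual additions and exclusions can be sketched in Python as set operations. Everything here is hypothetical: the function group_members, the attribute names, and the choice that a manual exclusion overrides a manual inclusion are assumptions made for illustration, not documented MOM behavior.

```python
def group_members(servers, rule, include=(), exclude=()):
    """Return the sorted member list of a computer group.

    servers: dict mapping server name -> dict of discovered attributes.
    rule: dict of attribute name -> required value (all must match).
    include / exclude: server names added or removed manually.
    """
    matched = {
        name
        for name, attrs in servers.items()
        if all(attrs.get(key) == value for key, value in rule.items())
    }
    # Assumption: an explicit exclusion wins over any match or inclusion.
    return sorted((matched | set(include)) - set(exclude))
```

Because membership is recomputed from the attributes each time, a server joins or leaves the group automatically when its attributes change, such as when Exchange is installed or removed.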
The Notification Groups folder defines lists of operators whom MOM will notify when a specified event or alert occurs or a threshold is crossed. An individual can belong to more than one notification group, and a notification group can be associated with multiple processing rules. Typically, you specify the same notification group for similar processing rules. For example, you probably want MOM to notify the security group for all security alerts.
For each operator to be notified, you can specify the notification method (e-mail, page, and/or external command) and schedule for notification. Operator schedules indicate the days of the week and hours of the day when MOM can reach the person by e-mail, page, or external command notification. When MOM detects a condition that warrants notification, MOM only notifies the operators who are currently available.
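Schedule-based notification amounts to filtering the notification group by day of the week and time of day. The Python sketch below assumes a hypothetical operator record with the keys name, days, start, and end; it does not handle shifts that span midnight, and it is not how MOM stores schedules.

```python
from datetime import datetime, time


def available_operators(operators, now):
    """Return the names of operators whose schedule covers `now`.

    operators: list of dicts with hypothetical keys 'name', 'days'
    (set of weekday numbers, Monday = 0), 'start' and 'end' (datetime.time).
    """
    return [
        op["name"]
        for op in operators
        if now.weekday() in op["days"] and op["start"] <= now.time() < op["end"]
    ]
```

An operator scheduled for weekdays from 08:00 to 17:00 would be notified of an alert raised on Monday at 10:00 but not of one raised at 18:00.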
The processing rules allow you to specify how MOM collects, evaluates, and responds to events, alerts, and performance data. You can create new processing rules to respond to events or to generate alerts (event rules), to filter the events passed to the Consolidator (filtering rules), to detect the absence of expected events (missing event rules), to consolidate similar events (consolidation rules), or to collect specific data (collection rules). Processing rule groups (Figure 12.9) allow you to categorize processing rules for easy management.
Figure 12.9: Administrator Console - Processing rule groups
The Exchange Management Pack module includes predefined processing rules and processing rule groups for monitoring an Exchange 2003 environment. These processing rules include specific knowledge base information that defines the purpose, features, and configuration of the Exchange processing rule group. These processing rules are automatically associated with the Exchange Computer Group so that the Exchange processing rules will only be evaluated on Exchange servers.
Event Processing Rules. You use event processing rules to specify how MOM will collect specific event information and respond to the event. The Event Processing Rules folder for each processing rule group includes all event processing rules for the processing rule group. Each event processing rule includes the following information:
Data provider. The data provider is the source of the data or events to be matched by the rule. Typical data providers include Windows event logs, application-specific log files, timed event providers, WMI event providers (for events such as SNMP traps), WMI numeric data providers, and MOM-generated events (such as when an agent heartbeat does not occur on time).
Criteria. The criteria are the event properties (e.g., the source of the event, the event ID, the event type) that MOM will compare for a match. For example, the rule to see whether the Exchange Information Store service has stopped watches for an event with a source of 'MSExchangeIS' and an ID of '1006' (Figure 12.10).
Figure 12.10: Event processing rules
Schedule. You can define the schedule for processing the rule. For most rules, information is always processed rather than processed according to a schedule. However, for missing events, you need to specify when MOM should expect the event.
Filtering Actions. When you create a new filtering rule, you can specify the action to be taken when MOM detects the event. You can choose one of three filters: a prefilter, which stops evaluating further processing rules and does not insert the event into the database; a database filter, which continues evaluating processing rules but does not insert the event into the database; or a conditional filter, which continues evaluating processing rules and inserts the event into the database only if another processing rule matches.
Event Consolidation Policy. When you create a consolidation rule, you can specify which event fields must be identical for events to be consolidated and the timeframe in which identical events must be detected to be consolidated.
Parameter Storage. When you create a collection rule, you can specify the event parameters that MOM will store in the database.
Alert. You can specify whether an alert is generated when an event match is detected. You can specify the alert severity, the alert owner, the initial resolution state, the alert source, and the alert description. You also can specify criteria to help suppress identical alerts that are detected within a short time. MOM will combine the duplicate alerts into a single alert with a count indicating the number of duplicate events.
Responses. You can define automatic responses to events, including launching scripts, generating SNMP traps, sending notifications to notification groups, executing commands or batch files, or updating state variables. You can use automatic responses to help resolve issues without requiring assistance from the operations staff.
Knowledge Base. The predefined knowledge base information provides additional information about the event. You also can add your own company knowledge base information for the event.
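The consolidation policy and the duplicate alert suppression described in the list above share one idea: records whose chosen fields are identical and that arrive within a set timeframe collapse into a single record with a count. The following Python sketch shows that idea only. The function consolidate and its record layout are hypothetical, and measuring the timeframe from the first event in each run is an assumption.

```python
from datetime import datetime, timedelta


def consolidate(events, fields, window):
    """Merge events whose `fields` are identical within `window`.

    events: list of dicts sorted by their 'time' key.
    Returns one record per run, with a count of the merged events.
    """
    merged = []
    open_runs = {}  # field values -> index of the current run in `merged`
    for event in events:
        key = tuple(event[f] for f in fields)
        idx = open_runs.get(key)
        if idx is not None and event["time"] - merged[idx]["first"] <= window:
            merged[idx]["count"] += 1
        else:
            # Start a new run for this combination of field values.
            open_runs[key] = len(merged)
            merged.append({"key": key, "first": event["time"], "count": 1})
    return merged
```

With a 60-second timeframe, two MSExchangeIS events with ID 1006 that arrive 10 seconds apart become one record with a count of 2, while a third that arrives two minutes later starts a new record.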
Alert Processing Rules. An alert processing rule allows you to specify a response for alerts that have a specific event source, that have a specific severity, and/or that were generated by specific rule groups. For example, you might create an alert processing rule to page the Mail Administrators Notification Group for all Critical Error alerts generated by the processing rules in the Exchange processing rule group. The Exchange Management Pack module includes predefined alert processing rules that you can modify to meet your requirements.
Performance Processing Rules. You can use performance processing rules to monitor servers for performance thresholds and resource usage using WMI numeric data or performance counters. For measuring rules, MOM will periodically sample the specified performance data. For threshold rules, MOM will compare the sampled value with the specified threshold and generate an alert if warranted.
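A threshold rule can be pictured as a comparison applied to each sampled value. The Python sketch below is a simplified illustration with a hypothetical function name; it assumes that an alert is raised once each time the value rises above the threshold rather than on every sample that exceeds it, which the text does not specify.

```python
def crossings(samples, threshold):
    """Return the indexes at which the sampled value rises above the
    threshold after having been at or below it."""
    alerts = []
    was_over = False
    for i, value in enumerate(samples):
        over = value > threshold
        if over and not was_over:
            alerts.append(i)  # the threshold was crossed at this sample
        was_over = over
    return alerts
```

For CPU samples of 40, 85, 92, 70, and 95 percent against a threshold of 80, this sketch raises alerts at the second and fifth samples only.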