Monitoring Active Directory (AD) Infrastructure

The danger of the "post-migration high" that comes at the completion of the migration is complacency. You might say to yourself that the migration took a lot of work and now it's done, so it's time to relax. Note that it isn't called Passive Directory, but Active Directory! This is a very dynamic environment that is in a constant state of flux. One of the troubleshooting rules I teach is "Always assume that no two DCs have the same information at the same point in time." That is, even if you see something on one DC, it won't necessarily be that way on every DC. Each DC has a writeable copy of the AD and can make changes. We then rely on replication to propagate the changes, which takes time to reach convergence. In this environment, you can't afford to sit back and wait for the users to call the help desk. That's reactive, and you'll never get caught up.

Every Administrator should be proactive in troubleshooting by actively monitoring the AD. This allows you to notice problems before they become serious. In Chapter 5, "Active Directory Logical Design," I mentioned one customer who had a domain controller that had been down for more than two years without his knowing it. Had he monitored it, he wouldn't have experienced the serious problems he called HP to solve. They could have been resolved at a much earlier time, reducing the impact to the infrastructure and probably saving a lot of money.

Microsoft has, until recently, been AWOL when it came to good monitoring tools for AD. This paved the way for a number of third-party companies, including NetPRO, NetIQ, BindView, and Quest Software to name a few, to come out with their own monitoring tools. HP has recently weighed in with OpenView for Windows (OVOW), which has the Active Directory Topology Viewer (ADTV), undoubtedly the best graphical tool on the market for monitoring and troubleshooting AD replication.

Although it is beyond the scope of this book to do a complete analysis of each tool, I have provided a brief summary of each and an in-depth look at HP's OpenView for Windows.

AD Monitoring and Management Tools

With the maturity of Windows 2000 and the shift to Windows Server 2003, there are plenty of tools to choose from, both from Microsoft (native tools) and from third-party vendors. The trick is to determine which tools meet your needs and budget. I can't tell you the one that will meet your requirements, so I've listed the native, built-in tools here, followed by a summary of third-party tools to help you manage AD.

Native Tools

Microsoft provides a pretty good set of command-line tools and snap-ins for managing or monitoring AD. One thing Microsoft did right was to expose most functions through command-line utilities that can be exploited with scripting tools or simple bat files. Most functions in AD management can be accomplished via a command line. Windows Server 2003 added several powerful built-in command-line utilities that permit object manipulation in the AD:

  • DSadd : This command adds users, groups, contacts, computers, and Organizational Units (OUs) to the AD. Options include the ability to define user attributes such as User Principal Name (UPN), address, display name, manager, pager, and so on.

  • DSget : This command displays the properties of users, groups, OUs, computers, contacts, servers, subnets, and sites. Note that the "servers" option returns properties of a domain controller (DC), while the "computer" option refers to clients and member servers.

  • DSmod : This command allows you to modify properties for multiple objects in the AD, including user, computer, contact, group, server (DC), and OU. This differs from the DSadd command in that it modifies existing properties on existing objects and can act on multiple objects at a time.

  • DSmove : This command moves a single object within a domain (such as from one OU to another), or can be used to rename the object without moving it. Note that this is only effective within a domain. The Movetree command would have to be used to move objects across domains. Unlike other DS commands in this list, DSmove is not limited to a list of object types such as user, computer, contact, and so on. Rather, the argument used is simply the distinguished name (DN) of any object in the AD.

  • DSquery : This command permits you to formulate a query to find types of objects, such as users, computers, groups, contacts, subnets, OUs, sites, and servers (DCs). In addition, a wild card (Dsquery *) can be used to form an LDAP query to find any AD object.

  • DSrm : This command is used to delete objects of any type in the AD.

These commands are well documented on the Microsoft Web site at http://www.microsoft.com/windowsxp/home/using/productdoc/en/default.asp?url=/WINDOWSXP/home/using/productdoc/en/dsadd.asp. The online help is also very descriptive and contains some examples. Simply enter the command followed by /? at a command prompt to get the help file. For instance:

 DSADD /?
 Or DSADD Computer /?
 DSMOD /?
 Or DSMOD User /?
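Because these utilities are plain command lines, they are easy to drive from a script. The following is a minimal Python sketch, not anything shipped with Windows: the helper function names are hypothetical, and the switches used (-inactive, -isgc, -hasfsmo, -filter, -attr) should be confirmed with DSQUERY /? on your own system before you rely on them. The helpers only build the argument lists; nothing runs until you pass one to run() on a machine where the DS tools are installed.

```python
import subprocess

# Role keywords accepted by "dsquery server -hasfsmo" (verify with DSQUERY SERVER /?).
FSMO_ROLES = ("schema", "name", "infr", "pdc", "rid")


def inactive_users(weeks):
    """Command line that lists users inactive for at least `weeks` weeks."""
    return ["dsquery", "user", "-inactive", str(weeks)]


def global_catalogs():
    """Command line that lists the DCs that are Global Catalog servers."""
    return ["dsquery", "server", "-isgc"]


def fsmo_holder(role):
    """Command line that finds the DC holding the given FSMO role."""
    if role not in FSMO_ROLES:
        raise ValueError("unknown FSMO role: %r" % role)
    return ["dsquery", "server", "-hasfsmo", role]


def ldap_query(ldap_filter, attrs):
    """Command line for a wildcard (dsquery *) LDAP search from the domain root."""
    return ["dsquery", "*", "domainroot", "-filter", ldap_filter, "-attr", *attrs]


def run(argv):
    """Execute one of the command lines above and return its output lines.

    Only meaningful on a Windows machine with the DS tools on the PATH.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

For example, run(fsmo_holder("pdc")) would ask dsquery which DC holds the PDC Emulator role. Keeping command construction separate from execution makes the script easy to review before it touches a production directory.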

In terms of monitoring, Windows Server 2003 out of the box provides a couple of good tools, including Replication Monitor and the event logs.

Replication Monitor

Replication Monitor has been available since Windows 2000 and, although it was not improved in Windows Server 2003, it is still a great troubleshooting and monitoring tool. It is distributed as part of the Windows Server 2003 Support Tools (located on the Windows Server 2003 CD in the /support/tools directory). Features include

  • Listing of all Global Catalog (GC) servers in the forest.

  • Listing of all Bridgehead Servers (BHS) in the forest.

  • Listing of FSMO role holders; you can click a button to see if the role holder responds.

  • Listing of metadata.

  • Summary of replication errors on all DCs in a domain; outputs to a text file.

  • Listing of unsuccessful replication attempts, and failures of Group Policy applications (and successes).

  • Status Report, which provides a comprehensive dump of replication data, including all DCs, DCs marked for deletion, replication errors, site and connection object information, and Domain Name System (DNS) status (errors and warnings). It's a report that I often request simply because it contains so much data that usually proves helpful.

Replication Monitor also displays direct and transitive replication partners, which partners are replicating various naming contexts (NCs), when Group Policy was updated (or failed), and current Update Sequence Number (USN) of each server. Replication Monitor can be used to force changes outbound from a server. It is still a very powerful tool for troubleshooting and monitoring AD replication.

Event Logs

These are the old standby and they keep getting better. A number of third-party tools are available on the market to monitor the event logs, report certain events, errors, warnings, and so on, and send e-mail or page the Admin. The Windows Server 2003 and Windows 2000 Resource Kits have utilities for this purpose as well. Windows 2000 put a lot more detail in the event descriptions, and Windows Server 2003 has improved on that by adding more problem resolution ideas and links to relevant Microsoft KB articles. The event logs are a good indication of a stable or problematic environment and offer possible solutions other than "See your Administrator." Don't overlook them.

Other Microsoft and Third-Party Tools

Microsoft MOM

Microsoft Operations Manager (MOM) is a serious entry into the management arena. It contains Management Packs for Exchange, SQL, Systems Management Server, Microsoft Identity Integration Server (MIIS), SharePoint Services, Server Resource Manager, and other Microsoft products, as well as AD.

MOM is implemented by means of a centralized server from which the Administrator can invoke the MOM console, and the data is stored on a SQL Server database. Each of the monitored servers has an agent installed that communicates with the console. You can get detailed documentation about these management packs at the Microsoft Web site at http://www.microsoft.com/mom/techinfo/productdoc/default.asp.

Originally licensed from NetIQ, the AD management pack can consolidate events, allowing you to filter, prioritize, and take actions on them based on these priorities, such as sending e-mail to an Administrator or in some cases resolving the events based on rules you define. Although it seemed slow to take off, there are a number of plug-in modules or Management Packs from vendors that make MOM attractive. In fact, Chapter 2, "Introduction to ProLiant Servers," discusses MOM plug-ins and agents for HP's Systems Insight Manager (SIM), formerly known as Compaq Insight Manager 7.

MOM includes the following features and attributes:

  • Rules-based : Admins can create their own rules for alerts and actions, such as notification of certain events or types of events, such as errors or warnings, as well as taking actions.

  • Performance monitoring : Rules can be defined to monitor thresholds of key performance counters and take action such as notification.

  • Management packs : These are preconfigured rule sets that allow for monitoring applications and services. Management packs for AD and Internet Information Services (IIS) are built into MOM. Add-ons are available for SQL, Exchange, ISA Server, security event log analysis, and other Microsoft and third-party products, such as antivirus products.

  • Local agents : Monitored servers contain a local agent to communicate with the MOM server.

  • Consoles : A Microsoft Management Console (MMC) and a Web-based console for monitoring events.

  • Reporting : Sophisticated reports.

  • WMI interoperability : With AD's WMI-based functionality, MOM takes advantage of the power and flexibility of WMI by permitting WMI-configured filters and functions to gather data.

  • SNMP interoperability : Able to monitor SNMP traps or to send those traps to third-party SNMP monitoring consoles.

  • Licensing : Licensed per server and per processor, and available in the Microsoft Select program. A 60-day evaluation license is also available.
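To make the rules-based idea concrete, here is a small Python sketch of how an event can be matched against administrator-defined rules to select actions. It is an illustration of the concept only and does not reflect MOM's actual rule format or API; the Event and Rule classes, the event source strings, and the action labels are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    source: str     # illustrative event source name
    severity: str   # "information", "warning", or "error"
    message: str


@dataclass
class Rule:
    name: str
    matches: Callable[[Event], bool]  # predicate deciding whether the rule fires
    action: str                       # hypothetical action label


def evaluate(event: Event, rules: List[Rule]) -> List[str]:
    """Return the actions of every rule that matches the event, in rule order."""
    return [rule.action for rule in rules if rule.matches(event)]


# Two example rules: e-mail on replication errors, page on any DNS problem.
RULES = [
    Rule("replication errors",
         lambda e: e.source == "NTDS Replication" and e.severity == "error",
         "email-admin"),
    Rule("any DNS problem",
         lambda e: e.source == "DNS" and e.severity in ("warning", "error"),
         "page-admin"),
]
```

Calling evaluate(Event("DNS", "warning", "zone transfer failed"), RULES) selects the paging action, while an informational DNS event selects nothing. A real product layers filtering, prioritization, and consolidation on top of this basic match-then-act loop.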

One of the cool things about the AD monitoring is MOM's capability to monitor events related to replication, DNS, Group Policy Object (GPO) processing, security, and Time Services. Perhaps the most crucial aspect of maintaining good AD health is keeping an eye on replication, which depends on DNS. Additional information is available on the Microsoft MOM home page at http://www.microsoft.com/mom.

Note

At this writing, Microsoft has just announced the new features for MOM 2005. Again, print media make it difficult to keep up on updates like this. Make sure you visit Microsoft's Web site for new features and release dates.


Quest Products

Quest Software, one of the pioneers in Windows 2000 tools, was perhaps the first to dabble in rule-based tools with the Quest ActiveRoles product. Quest actually offers a suite of three products in its Management Suite for Windows (see http://www.quest.com/microsoft/wms/). These products are ActiveRoles, Quest Reporter, and Spotlight On Active Directory.

ActiveRoles takes a holistic approach to AD management. Using native Windows 2000 and Windows Server 2003 security Access Control Lists (ACLs), ActiveRoles allows you to define administration functions for delegation in the tool and apply them logically to actual users and groups.

For instance, you could define a role of Help Desk Level1, assign rights such as Reset password, Read, and Execute, and then assign a security group to that role. You might have security groups in different parts of the company who would use this role, each with appropriate users as members. Security rights can be modified via ActiveRoles on the role, which then sets rights to those security groups without changing each individual group.

Thus, after you define the roles and have users placed in groups by the same logic used to create the roles (that is, by job function, business need, and so on), you simply associate the groups with the roles to get appropriate security.
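The role, group, and user relationship described above can be modeled in a few lines of Python. This is only a sketch of the idea and says nothing about how ActiveRoles stores or applies its data; the group names and user names are hypothetical, and only the role name and its three rights come from the example in the text.

```python
# Rights attached to each role (role name and rights taken from the text's example).
ROLE_RIGHTS = {
    "Help Desk Level1": {"Reset password", "Read", "Execute"},
}

# Security groups associated with each role (hypothetical group names).
ROLE_GROUPS = {
    "Help Desk Level1": {"HelpDesk-East", "HelpDesk-West"},
}

# Members of each security group (hypothetical users).
GROUP_MEMBERS = {
    "HelpDesk-East": {"alice"},
    "HelpDesk-West": {"bob"},
}


def effective_rights(user):
    """Union of the rights of every role whose groups contain the user."""
    rights = set()
    for role, groups in ROLE_GROUPS.items():
        if any(user in GROUP_MEMBERS.get(group, set()) for group in groups):
            rights |= ROLE_RIGHTS[role]
    return rights
```

Because rights hang off the role rather than the group, removing "Execute" from ROLE_RIGHTS["Help Desk Level1"] changes what both alice and bob can do in one step, which is the point of managing by role.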

Quest Reporter is an auditing tool that includes account management, tracking access activity (such as account lockouts), and hotfix and security patch management.

Spotlight On Active Directory features detailed graphs of replication activity, and a graphical view of the replication topology called Active Directory Topology Viewer (ADTV), shown in Figure 10.1. HP's OpenView Operations for Windows (OVOW), described later in this chapter, also has a tool that provides a graphic representation of the topology, but these tools are worlds apart. See the next section on OVOW for details.

Figure 10.1. Quest Software's Spotlight on Active Directory provides cursory monitoring tools and a graphical representation of the topology.

HP OpenView for Windows Overview

The previous tools got just a quick review, while I have dedicated several pages and an in-depth discussion to HP's OpenView Operations for Windows (OVOW), so it's obvious where my loyalties lie. Seriously, even if you lined up all the other tools, analyzed and compared every feature, and found that they all had equal feature sets, OVOW's ADTV would still blow the other tools out of the water. Here, then, is an in-depth discussion of how OVOW will help you monitor AD.

HP, which has long been dominant in the network management arena, has aggressively implemented a Windows-based systems management tool based entirely on Microsoft's WMI implementation. OVOW provides a three-tier management infrastructure for distributed Windows enterprise environments.

Within complex AD enterprise environments, OVOW provides in-depth monitoring and analysis tools for

  • Replication topology

  • Site configuration

  • FSMO locations

  • FSMO response times

  • DNS availability/response times

  • DNS consistency (proper record registration)

  • Replication consistency (proper object replication in a timely manner)

  • GC availability/response time

  • GC consistency

Additionally, OVOW monitors the health of the underlying hardware through integration with Insight Manager and OpenManage Agents, and instruments the Windows OS for health and performance metrics.

Table 10.1 highlights the three-tier components of OVOW.

Table 10.1. The Three-Tier Components of OVOW

Component: Operations Agent

  • Roles: Performance collector; performance data storage; event log collector; WMI collector; custom scripts (VBScript/Perl)

  • Where installed: All DCs and critical servers

Component: Operations Management Server

  • Roles: Collects data from distributed agents and correlates messages to application impact and root cause; generates reports on historical performance data

  • Where installed: One or more central servers

Component: Operations Console

  • Roles: Data presentation based on identity; displays AD topology via ADTV; shows root-cause events via Service Maps; at-a-glance status indication on the health of key components monitored by the agents

  • Where installed: Operations staff PCs, laptops, and PDAs (via Terminal Services, Web)


The Operations Console connects to the operations server using a role-based security mechanism based on AD login identity. If allowed, systems and applications you manage are presented to you in the Operations Console.

Troubleshooting with OVOW

OVOW brings a service-impact analysis engine to Windows enterprise management. This allows OVOW to manage services of the enterprise such as the directory service, messaging, and Internet browsing. This conceptually brings the management of the Windows enterprise up a level from base component monitoring (element management), to a holistic view of how the elements interact with each other.

This approach to modeling the AD enterprise provides valuable troubleshooting benefits. From the map in Figure 10.2, you can see that there is a yellow alert flag on the AD node in the tree. Note that because these figures are not printed in color, you can identify the flag as a triangle icon with a bang (!) in it, a standard warning symbol Windows uses. All references in this section to the "yellow flag" refer to this symbol. Expanding the tree, we see the flag on the Domains node as well as the HPDemoNet node (the name of the domain). Under that is the DC Gilligan, which has a yellow flag as well. So the error on Gilligan was rolled up the tree for a view of the overall AD health. Further, we also know that the problem scope is limited to the Washington site, because on the left side of the tree, we can trace the yellow flag to only DC Gilligan in the Washington site and no other DCs or sites. This saves us considerable time in defining the scope of the problem. Because three other DCs are in that domain, we are not in a critical state, but in a warning state (indicated by a yellow color, rather than red).

Figure 10.2. Here we see a yellow alert flag on DC Gilligan, which in turn rolls up to the domain and AD nodes providing an overall view of the health of the AD
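The roll-up behavior just described, where a flag on one DC surfaces on every ancestor node, can be sketched as a simple tree walk. The Python below is an illustration of the concept, not OVOW's implementation; the Node class, the severity names, and the three placeholder DC names (DC2 through DC4) are assumptions, while Gilligan and HPDemoNet come from the example in the text.

```python
# Ordering of severities; a parent shows the worst status found beneath it.
SEVERITY = {"normal": 0, "warning": 1, "critical": 2}


class Node:
    def __init__(self, name, status="normal", children=None):
        self.name = name
        self.status = status            # status reported for this node itself
        self.children = children or []

    def rolled_up(self):
        """Worst status among this node and all of its descendants."""
        worst = self.status
        for child in self.children:
            child_status = child.rolled_up()
            if SEVERITY[child_status] > SEVERITY[worst]:
                worst = child_status
        return worst

    def flagged_leaves(self):
        """Names of leaf nodes that are not normal, i.e. the problem scope."""
        if not self.children:
            return [self.name] if self.status != "normal" else []
        found = []
        for child in self.children:
            found.extend(child.flagged_leaves())
        return found


domain = Node("HPDemoNet", children=[
    Node("Gilligan", status="warning"),
    Node("DC2"), Node("DC3"), Node("DC4"),   # placeholder names for the other DCs
])
ad = Node("Active Directory", children=[Node("Domains", children=[domain])])
```

Here ad.rolled_up() yields "warning" even though only one leaf is flagged, and ad.flagged_leaves() narrows the scope back down to Gilligan, mirroring how you would read the tree in the console.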

We can now drill down to the root cause of the problem utilizing the root cause analysis tool. In Figure 10.3, orange warnings are at the disk level, reported by OVOW agents to the console. Further, in Figure 10.4, specific properties of the error are reporting that the disk is nearly full. Thus, you not only have a proactive warning of an impending problem, but you know what the problem is so you can easily identify an action to correct it before failure.

Figure 10.3. Here we clearly see that disk health is in question on the DC Gilligan.

Figure 10.4. The flag in the root-cause analysis maps to a specific event detected by the Operations Agent.


The general flow of things in OVOW follows a prescribed path. The agent (independently) detects a fault on a node and, after determining whether this is a new fault (duplicate suppression), sends the message to the OpenView operations server. The operations server has the Service Map model where the event is mapped to the service that it impacts. The model then calculates, based on a relationship map, the impacted elements in the enterprise. When you consider the impact of problems such as a GC failure on Exchange clients, or a DNS failure on client authentication, the ability to get warnings prior to a failure is a tremendous advantage. Being able to see the failure graphically and isolate it to a specific hardware component on a particular DC is a big benefit of OVOW. The design goal of OVOW is to relate events to service delivery.
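The flow above has two mechanical steps that are easy to picture in code: suppressing duplicate faults at the agent, and walking a relationship map to find what a fault impacts. The Python sketch below illustrates both under stated assumptions; the Agent class, the DEPENDS_ON map, and its entries are hypothetical and are not taken from OVOW's actual Service Map model.

```python
class Agent:
    """Remembers faults already reported so duplicates are not sent again."""

    def __init__(self):
        self._seen = set()

    def report(self, node, fault):
        """Return the message to send to the server, or None for a duplicate."""
        key = (node, fault)
        if key in self._seen:
            return None
        self._seen.add(key)
        return key


# Hypothetical relationship map: each service and the elements it depends on.
DEPENDS_ON = {
    "Global Catalog": ["DC Gilligan"],
    "DNS": ["DC Gilligan"],
    "Exchange clients": ["Global Catalog"],
    "Client authentication": ["DNS"],
}


def impacted(element):
    """Every service that depends, directly or indirectly, on the element."""
    result = set()
    frontier = [element]
    while frontier:
        current = frontier.pop()
        for service, dependencies in DEPENDS_ON.items():
            if current in dependencies and service not in result:
                result.add(service)
                frontier.append(service)
    return result
```

With this map, a fault on DC Gilligan reaches Exchange clients through the Global Catalog and client authentication through DNS, which is the kind of service-level impact the text describes.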

AD Analysis Tools

The OpenView agents on the DCs determine the health and status of key AD infrastructure items, and send alerts to the operations server for presentation to Operator Consoles (Web-based and MMC). However, skilled AD Administrators will then need to launch, in context with the alerts, into the actual replication topology to understand the relationship between DCs, sites, and roles.

The Active Directory Topology Viewer (ADTV)

The ADTV was originally conceived by a good friend of mine in our Compaq days, and distributed as unsupported freeware called Age of Directories. It lived that way until HP merged with Compaq and immediately saw the value as a component of OVOW. It's impossible to estimate how many times I have used this tool to solve replication problems. I have had customers send me graphic outputs of their topology using other tools, but they are 2D, black and white, and hard to interpret. Comparing ADTV to those tools is like comparing a digital PBX to two tin cans and a string.

Although ADTV has tremendous value in error reporting, its greatest value is in simply portraying the replication topology so that you can see the sites, DCs, connection objects, site links, and so on. With a 3D graphic to look at, replication misconfigurations can be identified by a chimpanzee (well, almost). I could make a living cleaning up AD replication topologies armed with a laptop and ADTV. Even if the only feature OVOW had was the ADTV, it would be worth the cost.

Figure 10.5 shows a good example of how easy it is to visualize the replication topology with ADTV. Note that the colored squares are sites, all DCs in each site are shown, and connection objects (shown as bold lines between the DCs) are shown for intersite and intrasite connections. Although this graphic doesn't show them for clarity reasons, site links are also shown. You can filter the components to show just intersite connections or just site links, and so on. Hovering over a site link will display its name and cost. Right-click on any object to display properties. This is a powerful troubleshooting tool because it shows design flaws that otherwise might go undetected until you have to start wading through event logs trying to find out what the problem is.

Figure 10.5. ADTV shows replication topology in 3D graphic format, allowing easy diagnosis of a topology misconfiguration.

Some of ADTV's features are shown in Figures 10.6 through 10.9. Figure 10.6 is a close-up of the ADTV graphic showing labels and GC flags identifying GCs. Although the figure isn't in color, the two curved lines are blue connection objects, and the straight line at the right near Li'l Buddy is a site link. Note the partition, site, and site link information in the left pane, which also lists replication errors. Right-click on the Mary Ann server to get properties. Figure 10.7 shows Mary Ann's replication partners, whereas Figure 10.8 shows the failed replication attempts and counts (this one is blank, so it's error-free). Finally, Figure 10.9 shows replication latency between DCs.

Figure 10.6. A close-up of the ADTV graphic output showing the site, DC, GC flag, site link, and connection objects. Partition, site, and site link details are shown in the left pane.

Figure 10.7. This tab of the server Properties dialog box shows the server's replication partners.

Figure 10.8. Failed replication attempts are shown here. This server has no changes that have not been replicated and no report of failed operations.

Figure 10.9. Replication latency between DCs is shown in graph format.


In addition to diagnosing topology configuration problems, ADTV can help diagnose other AD problems such as replication latency. Let's take a look at a sample problem of this nature. ADTV allows us to see the amount of time it takes for an object to replicate from one DC to another, and be represented in the GC (Figure 10.9). At first glance, the thick line appears to be an inconsistency: the DC Minnow is taking three times longer to replicate changes than any of the other DCs.

However, if we go back to Figure 10.6, the cause can be quickly determined. Minnow is in the Boston site, whereas all the other DCs are in the Washington site. Because this is a Windows 2000 domain, the intrasite replication interval is 5 minutes (300 seconds), and the intersite replication interval is 15 minutes (900 seconds). The graph therefore validates that replication is performing properly. An ever-increasing line in this graph would indicate replication failures.
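The reasoning in this example reduces to simple arithmetic: compare the observed latency with the interval expected for the site relationship, and treat a steadily climbing value as a sign of failure. The Python sketch below encodes that check using the 300-second and 900-second figures quoted above; the function names, the sample values, and the three-sample threshold for "ever-increasing" are assumptions for illustration.

```python
INTRASITE_SECONDS = 300   # 5 minutes, as quoted for this Windows 2000 domain
INTERSITE_SECONDS = 900   # 15 minutes, as quoted for this Windows 2000 domain


def expected_latency(source_site, dest_site):
    """Expected replication interval for a pair of sites, in seconds."""
    return INTRASITE_SECONDS if source_site == dest_site else INTERSITE_SECONDS


def is_steadily_increasing(samples):
    """True when at least three samples each exceed the one before."""
    return len(samples) >= 3 and all(b > a for a, b in zip(samples, samples[1:]))


def assess(source_site, dest_site, samples):
    """Classify a series of latency samples (seconds) between two DCs."""
    if not samples:
        raise ValueError("at least one latency sample is required")
    if is_steadily_increasing(samples):
        return "possible replication failure"
    if max(samples) <= expected_latency(source_site, dest_site):
        return "normal"
    return "slower than expected"
```

Applied to the example, a Washington-to-Boston latency near 900 seconds is three times the intrasite figure (900 / 300 = 3) yet still assesses as normal, because it is within the intersite interval.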

As you can see, the Windows Server 2003 AD infrastructure depends on more than just the Windows servers and the AD mechanisms; we also have to take in the underlying hardware and network components, as well as test client response time to measure service levels and delivery of service to clients. OpenView has components that plug in to operations that include

  • Network Management and automated mapping (Network Node Manager)

  • Client-side probing of response time and transactional simulation (OpenView Internet Services)

These additional features tie directly into the operations server to provide visibility of the network status (for example, the WAN link to Boston is down) in the same context as AD replication status. When a network failure such as this occurs, Administrators will know the problem is a network problem rather than an AD configuration problem. This helps avoid wasting resources troubleshooting AD when the underlying cause is a network outage.


Windows Server 2003 on Proliants. Deployment Techniques and Management Tools for System Administrators
ISBN: B004C77T6A
EAN: N/A
Year: 2004
Pages: 214
