Troubleshooting DNS, DHCP, and Active Directory Issues



Because of the interdependence of Active Directory and DNS, many DNS issues may actually seem to be Active Directory problems at first glance. The remainder of this chapter provides insight into some commonly experienced problems and troubleshooting tips to help resolve issues quickly.

Troubleshooting Internal DNS Lookup Issues

By far the most common internal network problems occur when the server or a workstation does not have DNS configured correctly. The next two examples identify the behavior seen and describe why the problem occurs.

Server Hangs at Applying Network Settings

With few exceptions, when an SBS server takes a long time to boot, and specifically appears to hang for 20 minutes or longer at the Preparing Network Connections portion of the boot process, the network cards on the server are pointing to an external server for DNS. The output of an ipconfig /all command on a server experiencing this problem might look like this:


C:\Documents and Settings\Administrator>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : sbs
   Primary Dns Suffix . . . . . . . : SmallBizCo.local
   Node Type . . . . . . . . . . . . : Unknown
   IP Routing Enabled. . . . . . . . : Yes
   WINS Proxy Enabled. . . . . . . . : Yes
   DNS Suffix Search List. . . . . . : SmallBizCo.local

Ethernet adapter Server Local Area Connection:

   Connection-specific DNS Suffix . :
   Description . . . . . . . . . . . : 3Com 3C920 Integrated Fast Ethernet Controller (3C905C-TX Compatible)
   Physical Address. . . . . . . . . : 00-08-74-40-5B-61
   DHCP Enabled. . . . . . . . . . . : No
   IP Address. . . . . . . . . . . . : 192.168.16.2
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . :
   DNS Servers . . . . . . . . . . . : 192.168.16.2
   Primary WINS Server . . . . . . . : 192.168.16.2

Ethernet adapter Network Connection:

   Connection-specific DNS Suffix . :
   Description . . . . . . . . . . . : Realtek RTL8029(AS) PCI Ethernet Adapter
   Physical Address. . . . . . . . . : 00-C0-F0-2B-7D-F9
   DHCP Enabled. . . . . . . . . . . : No
   IP Address. . . . . . . . . . . . : 10.10.1.9
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : 10.10.1.1
   DNS Servers . . . . . . . . . . . : 4.2.2.4
                                       4.2.2.3
   Primary WINS Server . . . . . . . : 192.168.16.2
   NetBIOS over Tcpip. . . . . . . . : Disabled


In this instance, when the server is in the Preparing Network Connections stage, it is attempting to register all its Active Directory information with the DNS server(s) listed on the NIC. Needless to say, the DNS servers hosted by the ISP are not going to accept DNS registrations from just any server out on the Net, and they certainly do not recognize the nonroutable DNS suffix. When the attempt is made to register the DNS information with the remote server, the remote server will not respond, and the attempt will eventually time out. Unfortunately, the Active Directory startup routines are persistent and will make multiple attempts to register this information with each DNS server before finally giving up. This process can take 20 minutes or longer, depending on how many external DNS servers are listed for each NIC.

This behavior also occurs when the internal NIC is listed as the DNS server for both NICs, but one of the NICs also has a secondary DNS server listed with an external address. The Active Directory DNS registration process is successful for the internal DNS server, but the server will attempt to register Active Directory information with each DNS server listed in the NIC configuration. The only way to avoid this situation is to have the internal IP address of the SBS server listed as the only DNS server for each network card in the server.
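The rule above (every NIC lists only the internal SBS address for DNS) can be checked mechanically against saved ipconfig /all output. The following Python sketch is illustrative only: the function names are made up for this example, and it assumes the output has the usual one-setting-per-line layout with additional DNS servers on indented continuation lines.

```python
import re

IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def dns_servers_by_adapter(ipconfig_text):
    """Map each 'Ethernet adapter' section to the DNS servers listed in it."""
    adapters = {}
    current = None
    collecting = False  # True while reading continuation lines of "DNS Servers"
    for line in ipconfig_text.splitlines():
        stripped = line.strip()
        header = re.match(r"^Ethernet adapter (.+):$", stripped)
        if header:
            current = header.group(1)
            adapters[current] = []
            collecting = False
            continue
        if current is None or not stripped:
            collecting = False
            continue
        if stripped.startswith("DNS Servers"):
            collecting = True
            value = stripped.split(":", 1)[1].strip()
            if value:
                adapters[current].append(value)
        elif collecting and IPV4.fullmatch(stripped):
            # A bare address on its own line is a further DNS server entry.
            adapters[current].append(stripped)
        else:
            collecting = False
    return adapters


def misconfigured_adapters(ipconfig_text, internal_ip):
    """Return adapters that list any DNS server other than internal_ip."""
    problems = {}
    for adapter, servers in dns_servers_by_adapter(ipconfig_text).items():
        extras = [s for s in servers if s != internal_ip]
        if extras:
            problems[adapter] = extras
    return problems
```

Run against the listing shown earlier with 192.168.16.2 as the internal address, this would flag the Network Connection adapter because of its 4.2.2.4 and 4.2.2.3 entries.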

Connect Computer Wizard Fails to Find Users and Computers

Another common error occurs when a workstation is attempting to join the SBS domain using the Connect Computer Wizard. After starting the wizard from the web browser, an error is generated that says "The list of users and computers could not be found on the server. Make sure that the Small Business Server network adapters are configured correctly." The error occurs because the client workstation is not configured correctly, not because the server is misconfigured as the error implies. Microsoft has published KB article 837369 (http://support.microsoft.com/?id=837369) on this error. The KB article also indicates that the problem is the result of the client workstation having a DNS server entry that is not the SBS server's internal IP address. Again, this problem is resolved by modifying the network settings on the workstation so that it points to the SBS server as the only server for DNS.

Note

In SBS 2003 SP1, a different error is generated in this situation (see Figure 5.8). Instead of the cryptic error described previously, the error details the exact problem and gives steps to resolve the problem.

Figure 5.8. The Connect Computer Wizard describes the most common reason for not being able to complete.



Using nslookup to Search for Internal DNS Names

Sometimes you may run across a situation where internal DNS name resolution just doesn't work correctly but with no obvious cause. Perhaps users start reporting that when they open their web browser, the Companyweb page fails to load and generates a Page Cannot Be Displayed error. Or they are suddenly unable to open a share on another server or workstation in the local network. If the problem seems to be isolated to a single machine or a small group of computers, it is unlikely that a problem exists on the SBS server, so troubleshooting should start at the workstation.

The best tool to use to troubleshoot client DNS problems is the command-line tool nslookup. This tool is installed by default on every Windows 2000, Windows XP, and Windows Server 2003 system. For this type of troubleshooting, we will use nslookup in interactive mode, which is entered by typing nslookup at a command prompt or after choosing Start, Run. In interactive mode, you are presented with a > prompt where you can enter multiple lookup commands. Type exit at the > prompt to exit the interactive mode of nslookup.

To test the DNS lookup of a local system, enter the system name at the nslookup prompt and press Enter. If you enter the name of a local workstation (jimdough01 in this example), you see a result similar to the following listing.

C:\>nslookup
Default Server:  sbs.smallbizco.local
Address:  192.168.16.2

> jimdough01
Server: sbs.smallbizco.local
Address: 192.168.16.2

Name:    jimdough01.SmallBizCo.local
Address:  192.168.16.25
 


When nslookup first enters interactive mode, it displays the name and IP address of the default DNS server being used by the client. In the preceding example, the workstation is pointing to the local SBS server, which is the correct configuration. If you see a different server listed in the initial nslookup output, you know that the default DNS server for the workstation is not set correctly, which is the likely cause of the problem. In this example, the workstation can look up the name of the workstation jimdough01 and get an IP address for the workstation.

Note

If the reverse DNS lookup zones are not properly configured, the initial response from nslookup generates the following output:

*** Can't find server name for address 192.168.16.2: Non-existent domain
Default Server:  UnKnown
Address:  192.168.16.2
 


This response does not mean that DNS lookups through the server will fail; it only means that nslookup could not find a name for the server's address. You can resolve this issue by creating the reverse lookup zone for the internal network and adding a pointer (PTR) record for the SBS server in the zone.


When nslookup queries the name of a system that is not in the DNS table of the SBS server, it generates a response similar to the following listing. In this case, you would want to check the DNS entries in the forward lookup zone for the smallbizco.local domain and see whether there is an entry present for companyweb. In this example, there is not:

> companyweb
Server:  sbs.smallbizco.local
Address:  192.168.16.2

*** sbs.smallbizco.local can't find companyweb: Non-existent domain
 


In the following example, nslookup returns an address for companyweb from the server:

> companyweb
Server:  sbs.smallbizco.local
Address:  192.168.16.2

Name:    companyweb.SmallBizCo.local
Address:  192.168.16.8
 


However, the address returned is not the same as the address of the SBS server. In this case, we know that the failure to load companyweb in the workstation's Internet browser is because the DNS record is pointing to the wrong address.
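The three outcomes shown in these listings (name missing, name resolving to the wrong address, name resolving correctly) can be told apart with a few lines of Python when you are working from saved nslookup transcripts. This is a minimal sketch with a made-up function name; it assumes one reply at a time, in the format of the listings above, where the first Address line belongs to the DNS server that answered and the last one is the answer itself.

```python
def classify_lookup(reply_text, expected_ip):
    """Classify one nslookup reply: 'missing', 'wrong-address', 'ok', or 'unknown'."""
    if "Non-existent domain" in reply_text:
        return "missing"
    addresses = []
    for line in reply_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Address:"):
            addresses.append(stripped.split(":", 1)[1].strip())
    # One Address line only means the server identified itself but gave no answer
    # (for example, the request timed out).
    if len(addresses) < 2:
        return "unknown"
    return "ok" if addresses[-1] == expected_ip else "wrong-address"
```

For the companyweb examples, the expected address is the internal IP address of the SBS server (192.168.16.2 in this chapter), so a reply ending in 192.168.16.8 is classified as wrong-address.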

As you can see from these few examples, nslookup can provide quite a bit of information on the local network with just a few commands. The next section takes a deeper look at how to use nslookup to troubleshoot DNS problems on the external network.

When http://Companyweb Resolves to www.companyweb.com

Many web browsers have internal routines they use to try to find websites when a single-label name (such as companyweb) is entered in the address bar. First the browser does a lookup on the name companyweb in the default domain. If no site is found, the browser starts guessing what the site might be by looking up the name as a domain name, starting with .com, then .net, then .org, and so on. So if the browser is set to go to http://companyweb when it starts and it cannot find a machine named companyweb in the local domain, it tries to load www.companyweb.com.
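The guessing behavior can be pictured as an ordered list of candidate names that the browser tries until one resolves. The Python sketch below is only a model of the description above, not any browser's actual code: the function names, the www. prefix on every guess, and the exact suffix order are assumptions for illustration.

```python
def guess_candidates(label, default_domain, suffixes=("com", "net", "org")):
    """List the names tried for a single-label name, in order."""
    candidates = [f"{label}.{default_domain}"]  # local domain first
    candidates += [f"www.{label}.{suffix}" for suffix in suffixes]
    return candidates


def first_resolvable(candidates, resolves):
    """Return the first candidate for which resolves(name) is true, else None."""
    for name in candidates:
        if resolves(name):
            return name
    return None
```

If the local lookup fails, as it does when the workstation points at a public DNS server, the first name that resolves is an Internet site rather than the SBS server.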

Recently this behavior was seen at a company with a heavy Macintosh population. After the SBS server was installed and all the PCs on the network were joined to the domain with the Connect Computer Wizard, the default web page was set to http://companyweb for the PCs. Wanting consistency across all platforms, the IT contact at the company set the default home page on the Macintosh workstations to be http://companyweb as well. Unfortunately, when the Mac web browsers were opened, they were redirected to http://www.companyweb.com instead.

The IT contact used nslookup on the Macintosh workstations to troubleshoot the DNS lookup problems. Even though the interface was slightly different than on the PC, he was able to determine that the Mac was not using the SBS server as the primary DNS server. He later found that the Macintosh was not set to get an IP address from the SBS DHCP server, as he had originally assumed. After he made that change, the Companyweb web page opened as expected on the Macs in the organization. (For more information on Macintosh connectivity issues with SBS, see Chapter 17, "Integrating the Macintosh into a Small Business Server 2003 Environment.")

Some modern browsers no longer use this method for guessing what a user might have meant when a single-label name was given. Instead, if a single-label name cannot be found in the local domain, the web browser redirects the term to a search engine or to the page that comes up first in the search engine's query on that term. To see this behavior in action, use the Firefox browser to go to http://companyweb when the workstation DNS is pointed to a public DNS server. The page that appears is the Microsoft KB article on how to restore Companyweb after the Intranet component has been removed from SBS.


Troubleshooting External DNS Lookup Issues

System administrators learn quickly when there is a problem with the organization's Internet connection. Users tend to be quick to complain when they cannot get to a certain website, but often that call for help is phrased as "The Internet is down!" instead of "I am having trouble reaching this one site in particular even though other sites are working fine." So the first step in troubleshooting DNS problems on the Internet is asking a few pointed questions to determine the scope of the problem.

Certain Sites Have Intermittent Connection Problems

Intermittent problems are often the most difficult to diagnose because they do not always fail or do not fail in the same way every time they are encountered. When users complain that certain sites sometimes work and sometimes do not, but the problem is limited to a particular set of sites whereas others work with no difficulty, you will first want to take a look at EDNS as the source of the problem.

EDNS, often referred to as Extended DNS (formally, Extension Mechanisms for DNS, or EDNS0), is an enhancement to the DNS query process that is enabled by default in Windows Server 2003. The EDNS specification allows DNS responses sent over UDP to exceed the 512-byte limit of standard DNS, and these larger responses can cause problems in some network configurations, typically when a firewall or router drops them.

To turn off EDNS on the server and clear the DNS cache, enter the following two commands in a command prompt on the SBS server:

dnscmd /Config /EnableEdnsProbes 0
ipconfig /flushdns
 


The first command tells the server to send standard DNS queries instead of the extended DNS queries. The second command flushes the local DNS lookup cache on the SBS server and forces new lookups on all DNS requests. This modification should resolve intermittent connection problems to specific websites.

Connections to All External Sites Fail Periodically

The other side of the Internet connectivity coin comes when all access to external sites fails intermittently. If the actual Internet connection itself is good, meaning that you can access the SBS server from the Internet or you can access certain sites by IP address, the next step is to take a look at the DNS server on the SBS server. Again, you can use the nslookup tool to help with the troubleshooting.

The following is a sample nslookup session that attempts to determine whether there is a problem with the SBS DNS server:

C:\>nslookup
Default Server:   sbs.smallbizco.local
Address:   192.168.16.2

> www.google.com
Server:  sbs.smallbizco.local
Address:  192.168.16.2

DNS request timed out.
    timeout was 2 seconds.
*** Request to sbs.smallbizco.local timed-out
> www.sams.com
Server:  sbs.smallbizco.local
Address: 192.168.16.2

DNS request timed out.
    timeout was 2 seconds.
*** Request to sbs.smallbizco.local timed-out
> companyweb
Server:  sbs.smallbizco.local
Address: 192.168.16.2

Name:    sbs.Smallbizco.local
Address: 192.168.16.2
Aliases:  companyweb.Smallbizco.local
 


After starting nslookup, two queries to well-known websites fail. The error, *** Request to sbs.smallbizco.local timed-out, seems to indicate a problem with the DNS Server service on the SBS box. However, a third query for a local name, Companyweb, succeeds, which indicates that the server is working correctly. The next step would be to use nslookup to do a direct query against the DNS server or servers listed in the Forwarders section of the DNS Management Console. The following listing shows an example transcript:

> server 65.97.168.254
DNS request timed out.
    timeout was 2 seconds.
Default Server:  [65.97.168.254]
Address:  65.97.168.254

> www.google.com
Server:     [65.97.168.254]
Address:     65.97.168.254

DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
*** Request to [65.97.168.254] timed-out
 


The first command listed in the example is used to change the DNS server that nslookup will use. In this case, we see a timeout when attempting to change the DNS server. Although this initial response is not unusual when changing DNS servers, it could be an indication that there is a problem with the remote DNS server. The second command is a lookup attempt against a well-known web address. In this case, we get two timeout responses from the request. This is a solid indication that there is a problem with the remote DNS server.

One method to confirm that the DNS servers listed as forwarders are having problems is to do a lookup against a different DNS server. When ISPs provide DNS server information for network connections, they usually provide the addresses for two servers so that in case the first server goes down, the second is available as a backup. Not all ISPs keep their DNS servers on different network segments, so if a network segment fails that prevents connections to one of the servers, it is likely that a connection to the second server will fail as well. To that end, many consultants and IT professionals keep a listing of alternate DNS servers available for use and testing. They may use DNS servers from other ISPs they have used in the past, or they may use well-known DNS servers.

The next step in this troubleshooting process is to test DNS resolution against the secondary server provided by the ISP. The same steps shown in the previous listing will verify whether the secondary server is working. If that test fails as well, try to use the DNS servers at 4.2.2.1 and 4.2.2.2, two well-known public DNS servers. Those servers generally respond to DNS lookup requests unless network routing problems prevent the client site from reaching the servers on the Net. Because these servers do respond to ping requests, a simple ping 4.2.2.1 or ping 4.2.2.2 will determine whether the servers are reachable. If these servers respond to DNS queries, the lookup problems exist with the servers specified as forwarders.

As mentioned earlier in the chapter, the DNS server on SBS does not have to have DNS forwarders specified. If no DNS servers are listed as forwarders, the DNS server uses the root hint servers for lookups. This would be one additional test that could be used to determine whether there is a problem with the forwarders. If the SBS DNS server begins processing lookups correctly when the forwarders are removed, the problem lies with the forwarders.

To resolve a problem with forwarders failing to respond to DNS queries, either remove the forwarders from the DNS Management Console and use the root hints, or configure alternate DNS forwarders in the console. The changes take effect immediately so there is no need to restart the DNS Server service, but it may be necessary to run an ipconfig /flushdns command after making the changes to clear out any bad DNS lookups from the local cache.
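The troubleshooting sequence in this section amounts to a short decision chain over four nslookup observations. The Python sketch below restates it; the function name and the result labels are invented for this example, and the check-sbs-dns-config branch (forwarders answer directly, yet lookups through the SBS server fail) is an assumption added for completeness rather than a case the text covers.

```python
def diagnose_external_dns(local_name_ok, external_via_sbs_ok, forwarder_ok, alternate_ok):
    """Turn four lookup results (True = lookup succeeded) into a next step.

    local_name_ok        lookup of an internal name through the SBS server
    external_via_sbs_ok  lookup of an Internet name through the SBS server
    forwarder_ok         lookup of an Internet name directly against a forwarder
    alternate_ok         lookup against an alternate server such as 4.2.2.1
    """
    if external_via_sbs_ok:
        return "ok"
    if not local_name_ok:
        return "check-dns-service"      # the SBS DNS Server service itself is suspect
    if forwarder_ok:
        return "check-sbs-dns-config"   # assumption: not discussed in the text
    if alternate_ok:
        return "replace-forwarders"     # remove forwarders or configure alternates
    return "check-connectivity"         # no outside DNS server is reachable
```

The session shown earlier, where companyweb resolves but www.google.com times out both through the SBS server and directly against the forwarder, leads to replace-forwarders if an alternate server answers and to check-connectivity if it does not.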

Troubleshooting DHCP Configuration Issues

In the SBS world, two main DHCP issues will crop up from time to time. The first is that the DHCP service on the SBS server will stop unexpectedly and generate errors in the event logs. This almost always occurs when a second DHCP server is activated on the internal SBS network. The SBS DHCP Server service detects the second DHCP server and shuts itself down to avoid conflicts. Unfortunately, in doing so the SBS box allows the rogue DHCP server to handle DHCP requests, usually passing on invalid configuration information. This problem often presents itself on the network as though the DHCP server is not configuring the workstations correctly. Only on further review will it become clear that the DHCP Server service on the SBS server is actually shut down and not handing out configuration information at all.

The SBS server generates two errors in the System event log at startup when it detects another DHCP server on the local network. The first error is a 1053 error from DhcpServer. The error description reads:

The DHCP/BINL service on this computer running Windows Server 2003 for
Small Business Server has encountered another server on this network with
IP Address, [IP address], belonging to the domain: .
 


The second error is a 1054 error, also from DhcpServer, reading:

The DHCP/BINL service on this computer is shutting down. See the previous
event log messages for reasons.
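When reviewing an exported System event log, the pair of errors above can be picked out and the rogue server's address pulled from the 1053 description. The following Python sketch is a rough illustration: the function name and the (source, event ID, message) tuple format are hypothetical choices for this example, and the pattern assumes the address appears after "IP Address," as in the description quoted above.

```python
import re


def rogue_dhcp_address(events):
    """Return the other DHCP server's IP if errors 1053 and 1054 are both present.

    events: iterable of (source, event_id, message) tuples (hypothetical format).
    Returns None when the pair is not found, or "unknown" when the pair is
    present but no address can be read from the 1053 message.
    """
    events = list(events)
    ids = {event_id for source, event_id, _ in events if source == "DhcpServer"}
    if not {1053, 1054} <= ids:
        return None
    for source, event_id, message in events:
        if source == "DhcpServer" and event_id == 1053:
            match = re.search(r"IP Address,\s*(\d{1,3}(?:\.\d{1,3}){3})", message)
            return match.group(1) if match else "unknown"
    return None
```

The address returned is the place to start looking for the second DHCP server, which is often a router or access point with its DHCP feature left enabled.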
 


Note

In a dual-NIC configuration, the SBS server will not complain about an active DHCP server on the external network. In some cases, the server may be configured to get its external IP address from a DHCP server. The only time it will have problems is when it identifies another DHCP server on the internal network.


The second main issue occurs when the internal network IP address is changed on the server without using the Change Server IP Address Wizard. If the IP address on the internal NIC is changed in the network card configuration directly, the DHCP scope is not updated automatically. In this case, when a workstation boots up, it will not be able to get an address from the DHCP server and will end up with an Automatic Private IP Addressing (APIPA) address in the 169.254.x.x range. This situation presents itself as a workstation no longer able to communicate with the network. Running an ipconfig /all command on the workstation and comparing that output to the output of an ipconfig /all command run on the server will reveal that the workstation and the server are on separate networks or that the workstation is looking to the wrong IP address for the SBS server. You may also see this situation if the server IP address is changed as described previously and the workstation has not had its DHCP lease renewed since the change. Again, the IP address range on the server and the workstation will be different.
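The comparison of the two ipconfig /all outputs comes down to two questions: does the workstation have an APIPA address, and is it on the same subnet as the server? A minimal Python sketch using the standard ipaddress module follows; the function names are made up for this example.

```python
import ipaddress

# APIPA addresses are drawn from 169.254.0.0/16.
APIPA_RANGE = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address):
    """True if the address is an automatically assigned 169.254.x.x address."""
    return ipaddress.ip_address(address) in APIPA_RANGE


def same_subnet(workstation_ip, server_ip, netmask):
    """True if the workstation address falls inside the server's subnet."""
    server_net = ipaddress.ip_network(f"{server_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(workstation_ip) in server_net
```

A workstation for which is_apipa returns True never reached a DHCP server; one that has a normal address but fails same_subnet is most likely still holding a lease from before the server's address was changed.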

To resolve this problem, the DHCP server settings can be modified manually, but the easier route is to run the Change Server IP Address Wizard, which will rebuild the DHCP scope automatically. When this wizard is run, however, the new DHCP scope will be set with the SBS defaults. Any customizations that had been made to the DHCP scope previously will be lost.

Troubleshooting DNS-Related Active Directory Issues

Problems with Active Directory can often be traced back to DNS configuration problems or service errors. Some of these issues have been mentioned earlier in the chapter (NIC settings pointing to an external server for DNS, for example), but a number of other errors that may seem like AD failures are really just problems with the DNS service itself. This last section of the chapter looks at a few ways to quickly recover from the DNS problems that may be causing Active Directory errors.

DNS 4004/4015 Errors

If you encounter a number of DNS 4004 or DNS 4015 errors in the event logs, the first place to check is the DNS configuration for Active Directory in the DNS Management Console. Compare the contents of the Forward Lookup Zone for the internal domain to those shown in Figure 5.1 earlier in the chapter. The main lookup zone must contain at least these four records:

Record name                 Record type
(same as parent folder)     Start of Authority (SOA)
(same as parent folder)     Name Server (NS)
(same as parent folder)     Host (A)
server name                 Host (A)

The first two records will have the internal FQDN of the server in the data field, and the last two will have the internal IP address in the data field, an example of which can be seen in Figure 5.1. If one of these records is missing or has incorrect data, the corrections can be made directly within the DNS Management Console by either adding the missing record or by editing a record and correcting any errors.
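If the zone contents have been copied out of the console by hand, the four required records can be checked with a short script. The Python sketch below is illustrative: the function name and the (name, type, data) tuple format are assumptions for this example, not an export format produced by the DNS Management Console.

```python
PARENT = "(same as parent folder)"


def missing_records(records, server_name, server_fqdn, server_ip):
    """Return the required (name, type, data) entries not found in records.

    records: list of (name, type, data) tuples, with type given as
    "SOA", "NS", or "A".
    """
    expected = [
        (PARENT, "SOA", server_fqdn),
        (PARENT, "NS", server_fqdn),
        (PARENT, "A", server_ip),
        (server_name, "A", server_ip),
    ]

    def matches(record, want):
        name, rtype, data = record
        want_name, want_type, want_data = want
        if name.lower() != want_name.lower() or rtype.upper() != want_type:
            return False
        if want_type == "A":
            # Addresses must match exactly (192.168.16.2 is not 192.168.16.25).
            return data.strip() == want_data
        # SOA and NS data fields carry extra text around the FQDN.
        return want_data.lower() in data.lower()

    return [want for want in expected if not any(matches(r, want) for r in records)]
```

An empty result means all four records are present with the expected data; anything returned is a record to add or correct in the console.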

Netlogon

Figure 5.9 shows a portion of the Active Directory Forward Lookup Zone in the DNS Management Console. As with the Forward Lookup Zone for the internal domain discussed previously, some key elements must be present for Active Directory to function properly. In Figure 5.9, the _msdcs zone contains SOA (Start of Authority) and NS (Name Server) records just like the internal domain, and both of those records point to the internal FQDN of the SBS server. The _msdcs zone also contains an alias record, which points to the FQDN of the SBS server. Under the domains zone, there is a zone for the GUID for the domain as well.

Figure 5.9. The _msdcs forward lookup zone contains records for the server and domain.


Because the DNS service relies on a database for storage of its information, it is subject to database corruption like other systems. One sign of database corruption is that the CNAME record for the server in the _msdcs lookup zone is missing. Fortunately, recovering from this database corruption is not difficult.

The Netlogon service is the component that ties the DNS service in with Active Directory. It maintains the DNS records for AD in two files located in the config directory under system32. The files are netlogon.dns and netlogon.dnb. If these files are missing when the Netlogon process starts, they will be created automatically with the proper DNS information for Active Directory. If the files are present but corrupt, the Netlogon service will start but may produce unexpected results.

The Netlogon databases can be repaired in a single command line. First, set the current directory in a command prompt to C:\Windows\system32\config. Then enter the following command:


net stop netlogon && del netlogon.* && net start netlogon
 


This stops the Netlogon service, deletes the netlogon.dnb and netlogon.dns files from the config folder, and restarts the Netlogon process. If you look in the config folder after running this command, you will find that both the netlogon.dns and netlogon.dnb files have been re-created. When you refresh the DNS Management Console display, you will find that the CNAME record for the server has been re-created if it was missing. This process also re-creates the domains zone under _msdcs if it was missing as well.

Caution

If you find that the netlogon.dns and netlogon.dnb files do not get re-created in the config folder and you see warnings in the Event log (Netlogon 5781), make sure that the DNS servers listed in the TCP/IP settings for all NICs in the server point to the internal IP address of the SBS server. If the NIC DNS settings point elsewhere, this process will not register the DNS records correctly.


netdiag and dcdiag

Another useful set of tools for diagnosing network and Active Directory issues is included in the Support Tools package. Because the Support Tools are not installed by default, the package must be installed before the netdiag and dcdiag tools can be used.

Because the output of both netdiag and dcdiag fills several screen pages, the output from each command should be redirected to a file for ease of searching. Use the following commands at the command prompt to run netdiag and dcdiag with verbose output, redirect the output to a file, and open the output file in Notepad after the command completes:


netdiag /v > netdiag.txt && netdiag.txt
dcdiag /v > dcdiag.txt && dcdiag.txt
 


When Notepad brings up the output file, search through the file for the terms "fail" and "fatal" to quickly identify problems that the tools have identified. If any problems are found, a Google search on the error messages from the output can help quickly track down the source of the problem, if the problem is not evident from the description in the file itself.
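The search for "fail" and "fatal" can also be scripted when several output files need to be reviewed. The Python sketch below is a simple illustration with a made-up function name; it does a case-insensitive substring match, so it also catches words such as "Failed" and "failure".

```python
def find_problem_lines(report_text, terms=("fail", "fatal")):
    """Return (line number, text) for every line containing one of the terms."""
    hits = []
    for number, line in enumerate(report_text.splitlines(), start=1):
        lowered = line.lower()
        if any(term in lowered for term in terms):
            hits.append((number, line.strip()))
    return hits


def problems_in_file(path):
    """Apply find_problem_lines to a saved report such as netdiag.txt."""
    with open(path, "r", errors="replace") as handle:
        return find_problem_lines(handle.read())
```

Calling problems_in_file("netdiag.txt") in the folder where the report was saved lists the lines worth reading first, with line numbers that match what Notepad shows through its Go To command.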

Running netdiag /fix also repairs corruption within the Netlogon database files. This is effectively the same as stopping the Netlogon service, deleting the Netlogon database files, and restarting Netlogon as described previously.

Best Practice: Install Support Tools

The Support Tools package contains a number of useful diagnostic tools besides netdiag and dcdiag. The Support Tools installer file SUPTOOLS.MSI is located on SBS installation CD #2 in the \SUPPORT\TOOLS folder. Double-click on the installer file and accept all the defaults through the installation process to install the package into the C:\Program Files\Support Tools folder on the SBS server.

Take some time to go through the output from the netdiag and dcdiag commands to see the type of information reported by the tools. After working with the tools for a while, you will develop an understanding of what you should and should not see in the command output that can help you quickly identify problems with the configuration of the server.