The purpose of a network audit is to accurately assess and document the current state of the network, its components, the people involved, and the human processes used. The audit, in effect, documents the purpose and priorities of the network. Without the audit, you must rely on people's memory, hearsay, and possibly out-of-date or inaccurately documented maps and databases.
Without proper documentation and an understanding of how things change in the network, you cannot reliably deploy network performance and fault management. You must determine how all devices are connected to each other, both physically and logically, and where the network components are located. From this information, you can determine which devices, ports, and connections are important for the development of your performance and fault management strategy.
When you are working with outside consultants for network design or management issues, the network audit should be the first action they initiate. Without understanding the components, people, and processes, an outside consultant cannot accurately determine the state of the network and develop a plan of action. Regardless of your company's level of documentation, the consultant must still verify that the information matches the physical reality.
Without a proper understanding of physical connectivity and the location of network components, it will take longer to isolate network problems and you stand a greater chance of mistakenly introducing faults into the network during moves, adds, and changes.
Although commercial auto-discovery and mapping tools do a good job of drawing logically connected networks, they cannot discover on which floor, building, desk, or closet the devices are located. Trace a cable under the floor or between closets at 3 a.m. and you'll never underestimate the importance of a physical map again!
When a portion of a network goes down or becomes unstable, troubleshooting the source of the outage is done through a process of fault isolation. During an outage or fault, network administrators work as quickly as possible to search out and isolate the source of the problem. In order to do so, they typically begin somewhere in the middle or at the edge of the affected area, and work to reduce the fault domain or area of affected devices. The goal is to get as much of the network operating around the fault domain as possible. With proper documentation, this goal is much easier to achieve. In addition, in a well-documented network, the network manager knows which applications and users are affected by a problem, and can proactively notify the user community.
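As a rough illustration of how documentation shrinks the fault domain, the following sketch walks a documented topology outward from a management station and separates the devices that are likely at fault from those that are merely cut off behind them. The topology, the device names, and the `isolate_fault` helper are all hypothetical; a real network would supply its own map and its own reachability results.

```python
from collections import deque

def isolate_fault(topology, station, responding):
    """Split non-responding devices into suspects and devices hidden behind them.

    topology   -- documented adjacency map: device -> list of neighbors
    station    -- device the administrator is testing from
    responding -- set of devices that currently answer (e.g. to a ping)
    """
    suspects = set()
    visited = {station}
    queue = deque([station])
    while queue:
        node = queue.popleft()
        for neighbor in topology.get(node, ()):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            if neighbor in responding:
                queue.append(neighbor)   # healthy: keep walking outward
            else:
                suspects.add(neighbor)   # first silent device on this path
    all_devices = set(topology) | {n for ns in topology.values() for n in ns}
    hidden = all_devices - visited       # unreachable only through suspects
    return suspects, hidden

# Hypothetical documented topology (assumed names, for illustration only).
topology = {
    "nms": ["core-rtr"],
    "core-rtr": ["nms", "sw-floor1", "sw-floor2"],
    "sw-floor1": ["core-rtr", "srv-mail"],
    "sw-floor2": ["core-rtr", "hub-2a"],
    "hub-2a": ["sw-floor2", "srv-files"],
    "srv-mail": ["sw-floor1"],
    "srv-files": ["hub-2a"],
}
responding = {"core-rtr", "sw-floor1", "srv-mail"}

suspects, hidden = isolate_fault(topology, "nms", responding)
print("Check first:", sorted(suspects))
print("Affected behind the fault:", sorted(hidden))
```

With a map like this, the administrator starts at the suspect device rather than at every silent one. Without the map, there is nothing to walk.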
In a poorly documented network, fault isolation becomes a game of finding a needle in a haystack. The goal of fault isolation is to reduce the affected area; how can you do so if there is no documentation? Some administrators resort to brute force by splitting networks in half or randomly plugging and unplugging connections until they narrow down the affected areas. This prolongs the time to resolution and increases the chance of introducing additional faults. It also frustrates and angers users (and your boss) as they witness flaky, intermittent service.
All in all, the effort of maintaining useful network documentation pays off most during outages, when accurate maps are what let you isolate a fault quickly.
If you have not previously documented your network as described in this chapter, you will need to begin the process by auditing the physical network and its connectivity. Through the audit, you will learn and document which devices are in your network, where they are located, how they are connected, and who is responsible for the device. This will be the starting point for your network documentation.
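One way to keep the audit honest is to record the same questions for every device: what it is, where it is, how it is connected, and who is responsible for it. The sketch below is one possible record layout, not a prescribed format; the `AuditRecord` class, its field names, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields

@dataclass
class AuditRecord:
    """One device as found during the physical audit (hypothetical layout)."""
    name: str
    device_type: str = ""   # what the device is
    building: str = ""      # where it is located
    floor: str = ""
    closet: str = ""
    owner: str = ""         # who is responsible for it
    # How it is connected: local port -> (remote device, remote port)
    links: dict = field(default_factory=dict)

def unanswered(record):
    """Return the names of the fields the audit has not filled in yet."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]

# Sample entry with assumed values; the owner has not been identified yet.
switch = AuditRecord(
    name="sw-floor2",
    device_type="switch",
    building="HQ",
    floor="2",
    closet="2A",
    links={"port1": ("core-rtr", "port4")},
)
gaps = unanswered(switch)
print(gaps)
```

A record with gaps is a reminder that the audit for that device is not finished.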
As part of the audit, you will identify those ports that are critical to the successful operation of the network. Critical ports tend to be those with routers, switches, hubs, servers, channel service unit/data service units (CSU/DSUs), and the key users (such as the CEO) connected to them.
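The rule of thumb above can be written down as a simple filter. This is only a sketch: the port dictionaries, the `attached_type` and `key_user` keys, and the `is_critical` helper are hypothetical names, and your own list of critical attachment types may differ.

```python
# Attachment types treated as critical, taken from the rule of thumb above.
CRITICAL_ATTACHMENTS = {"router", "switch", "hub", "server", "csu/dsu"}

def is_critical(port):
    """A port is critical if a key device type or a key user is attached."""
    attached = port.get("attached_type", "").lower()
    return attached in CRITICAL_ATTACHMENTS or port.get("key_user", False)

# Hypothetical audit results for one switch.
ports = [
    {"device": "sw-floor2", "port": "1", "attached_type": "router"},
    {"device": "sw-floor2", "port": "2", "attached_type": "pc"},
    {"device": "sw-floor2", "port": "3", "attached_type": "pc", "key_user": True},
    {"device": "sw-floor2", "port": "4", "attached_type": "Server"},
]

monitored = [p["port"] for p in ports if is_critical(p)]
print(monitored)
```

Note that the filter can only be as good as the audit data behind it: a port whose attachment was never recorded is silently treated as non-critical.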
Monitoring all ports and connections in a network can be overkill and cause over-management of the network. Monitoring generates traffic load on the network and sucks up network device resources (memory, CPU). Is it really necessary to monitor user ports? Probably not, although you may want to monitor the traffic performance on key user ports in order to use them as a baseline for their floor or workgroup.
If a device or port goes down and nobody cares, don't manage the port any longer. Manage only those devices and ports that are critical to the operation of the network.
In order to select which ports to monitor, you must know how your network is connected, both physically and logically. Without this information, the importance of a port cannot be determined. The network management infrastructure can become crippled with information from devices (such as user PCs) that have no impact on the operation of the network. You must determine where network devices and servers are located, how they are connected, and who is affected if they become slow or unavailable.
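To see how documented connectivity answers the question of who is affected, consider the following minimal sketch. It assumes you have recorded, for each device, which devices sit downstream of it and which users or applications it serves directly; the `affected_by` helper, both maps, and all names in them are hypothetical.

```python
def affected_by(device, downstream, served):
    """Collect every user group or application that depends on `device`.

    downstream -- device -> devices reached through it
    served     -- device -> user groups or applications it serves directly
    """
    affected = set()
    seen = set()
    stack = [device]
    while stack:
        current = stack.pop()
        if current in seen:
            continue                      # guard against loops in the map
        seen.add(current)
        affected.update(served.get(current, ()))
        stack.extend(downstream.get(current, ()))
    return affected

# Hypothetical documentation gathered during the audit.
downstream = {
    "core-rtr": ["sw-floor1", "sw-floor2"],
    "sw-floor2": ["srv-files"],
}
served = {
    "sw-floor1": ["floor 1 users"],
    "sw-floor2": ["floor 2 users"],
    "srv-files": ["file sharing"],
}

impact = affected_by("sw-floor2", downstream, served)
print(sorted(impact))
```

A device whose failure affects nobody in such a lookup is a candidate for removal from the list of managed devices.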