Hack 8. Know When to Use Packet Sniffing
Network data collectors, or "packet sniffers," create an alternative data source that has a handful of benefits, provided that you keep them maintained.
Users respond not only to a site's content, but also to its delivery, which includes factors such as speed, quality, and reliability. Together, content and delivery influence what users choose to view, how long they view it, how they navigate through the site, and ultimately whether they will return. All the compelling content in the world won't save a web site that can't deliver it well.
Many options exist to get information about what content was served. To get delivery information as well, and with it a complete picture of user behavior, you can collect data at the network level. This is commonly referred to as network collection (or using a sniffer, but Sniffer® is a registered trademark of Network General Corporation to describe its line of protocol analyzers, so we won't use that term here).
1.9.1. The Ugly, Ugly Details
Because of the design of the network layers in a computer system, the low-level details about the network packets are unavailable to the web server. This is a good thing, as it allows the web server to concentrate on serving web content. However, by the time the server sees the transaction, much of the underlying performance data is lost, or has been modified into something less useful. For instance, a web server can log when it sent some content, but it cannot know whether the client actually received it or, if the client did not, how much arrived before the client stopped the transaction.
Using collection methods such as page tagging, you can capture more granular information about page deliveries, but you cannot determine why a transaction was slow or failed. In general, application-level loggers (web logs, page tagging, server plug-ins, etc.) cannot report:
A network collector (usually just software running on off-the-shelf hardware) is a specialized packet grabber that "knows" all about web traffic. It passively watches the traffic flow across the network and keeps a record, similar to a web server log line, of what it sees. Because a network collector lives on the network, however, it can see and report what application-level loggers cannot.
Imagine an observer on a freeway overpass. If the observer is fast enough, it can count all the cars that go by. If it's really fast, it can log information like each car's color, make, model, or number of occupants. All this can happen without disrupting traffic flow (Figure 1-8).
Figure 1-8. Typical placement of a network data collector on a hub in front of your web architecture
From this viewpoint, a network collector can view traffic for all web servers on a particular network. One network collector can gather statistics for many web servers simultaneously, reducing the cost of manually administering logfiles on each web server.
This wealth of information becomes a great foundation on which to analyze a web site. The combination of content and delivery makes for powerful analysis. For example, what happens to the average number of pages a web surfer sees at a site when the server's response time goes from under two seconds to over 10 seconds? Which content is most abandoned during download? What is the relationship between users' connection speeds and page views? Is there a particular CGI program that should be tuned because it's taking too long to run?
1.9.2. How to Use a Network Collector in a Switched Environment
A network collector relies on having a machine watch the network traffic between browsers and the web server, so it must be on the network path between the two. In a shared media environment (such as a hub), the network collector sees all the traffic because every port on the hub sees traffic to and from every other port on the hub.
In the more common switched environment, each port on the switch transmits only traffic that is supposed to be for the machine (the web server) plugged into that port. Putting a network collector on a switch port means that it sees only traffic for itself, and not for the web servers. This isn't very useful! There are several approaches you can take to allow network collectors to work in a switched environment. Each has its advantages and disadvantages.
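
The forwarding difference between a hub and a switch described above can be made concrete with a toy model. This is a minimal sketch, not a capture tool: the `Frame`, `Hub`, and `Switch` classes, the port numbers, and the host names are all hypothetical, and the switch is assumed to already know which host sits on each port (a real switch also floods broadcast and unknown-destination frames, which this model ignores).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src: str      # sending host (hypothetical name)
    dst: str      # destination host (hypothetical name)
    payload: str  # stand-in for the packet contents


@dataclass
class Hub:
    """Shared media: a frame is repeated to every port except the sender's."""
    ports: dict  # port number -> attached host name

    def deliver(self, frame):
        # Every attached host other than the sender sees the frame.
        return {host for host in self.ports.values() if host != frame.src}


@dataclass
class Switch:
    """Simplified switch: a frame goes only to the port of its destination."""
    ports: dict  # port number -> attached host name

    def deliver(self, frame):
        # Assumption: the switch already knows where every host is attached.
        return {host for host in self.ports.values() if host == frame.dst}


def observed_by(device, frames, observer):
    """Return the frames that `observer` sees when `frames` cross `device`."""
    return [f for f in frames if observer in device.deliver(f)]


# Hypothetical layout: browsers arrive via a gateway on port 1.
ports = {1: "browser-gateway", 2: "webserver", 3: "collector"}
traffic = [
    Frame("browser-gateway", "webserver", "GET /index.html"),
    Frame("webserver", "browser-gateway", "200 OK"),
]

hub_seen = observed_by(Hub(ports), traffic, "collector")
switch_seen = observed_by(Switch(ports), traffic, "collector")

print(len(hub_seen))     # 2: the collector sees both directions on a hub
print(len(switch_seen))  # 0: on a plain switch port it sees nothing
```

In this model the collector on the hub observes both the request and the response, while the same collector on an ordinary switch port observes neither, which is why a switched network needs some extra arrangement before a collector is useful.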
Figure 1-10. Network data collector attached to hubs in front of each web server
1.9.3. Using a Network Collector with Encrypted (SSL) Traffic
Generally, you'd put an SSL frontend device before the web servers. That relieves the web servers of having to manage the encrypted packets. It also allows you to put a network collector behind the SSL box and in front of the web servers, where the traffic is unencrypted.
If the network layout offers no good place to monitor all incoming and outgoing traffic, a network collector won't work, and you'll forfeit delivery information. In these cases, consider using a network collector for some part of your traffic (say, a particular set of machines) in order to "spot check" some of the delivery data that's not available through other collection mechanisms.
Bob Page and Eric T. Peterson