7.8. Per-Process Network Statistics

In this section, we explore tools to monitor network usage by process. We build on DTrace to provide these tools. In previous versions of Solaris it was difficult to measure network I/O by process, just as it was difficult to measure disk I/O by process. Both of these problems have been solved with DTrace; disk I/O by process is now trivial with the io provider. However, at the time of this writing, a network provider has yet to be released. So while network-by-process measurement is possible with DTrace, it is not straightforward.[4]

7.8.1. tcptop Tool

tcptop, a DTrace-based tool from the freeware DTraceToolkit, summarizes TCP traffic by system and by process.

# tcptop 10
Sampling... Please wait.
2005 Jul 5 04:55:25, load: 1.11, TCPin: 2 Kb, TCPout: 110 Kb

 UID    PID LADDR           LPORT FADDR           FPORT      SIZE NAME
 100  20876 192.168.1.5     36396 192.168.1.1        79      1160 finger
 100  20875 192.168.1.5     36395 192.168.1.1        79      1160 finger
 100  20878 192.168.1.5     36397 192.168.1.1        23      1303 telnet
 100  20877 192.168.1.5       859 192.168.1.1       514    115712 rcp

See DTraceToolkit

The first line of the above report contains the date, the CPU load average (one minute), and two TCP statistics, TCPin and TCPout. These are from the TCP Management Information Base (MIB); they track local host traffic as well as physical network traffic. The rest of the report contains per-process data and includes fields for the UID, PID, local address (LADDR), local port (LPORT), remote address (FADDR[5]), remote port (FPORT), number of bytes transferred during the sample (SIZE), and process name (NAME). tcptop retrieves this data by tracing TCP events.
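
The per-process fields described above are easy to post-process once the report has been saved. The following is a minimal sketch, not part of the DTraceToolkit: it assumes the per-process rows are whitespace-separated with exactly the eight columns shown in the sample report, and the function names (parse_row, bytes_by_name) are hypothetical.

```python
from collections import defaultdict

# Per-process rows copied from the sample tcptop report above.
SAMPLE_ROWS = """\
100 20876 192.168.1.5 36396 192.168.1.1 79 1160 finger
100 20875 192.168.1.5 36395 192.168.1.1 79 1160 finger
100 20878 192.168.1.5 36397 192.168.1.1 23 1303 telnet
100 20877 192.168.1.5 859 192.168.1.1 514 115712 rcp
"""

# Column names as they appear in the report header.
FIELDS = ("UID", "PID", "LADDR", "LPORT", "FADDR", "FPORT", "SIZE", "NAME")
INT_FIELDS = ("UID", "PID", "LPORT", "FPORT", "SIZE")


def parse_row(line):
    """Split one tcptop per-process row into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns: {line!r}")
    row = dict(zip(FIELDS, values))
    for name in INT_FIELDS:
        row[name] = int(row[name])
    return row


def bytes_by_name(text):
    """Sum the SIZE column for each process name (NAME)."""
    totals = defaultdict(int)
    for line in text.splitlines():
        if line.strip():
            row = parse_row(line)
            totals[row["NAME"]] += row["SIZE"]
    return dict(totals)


totals = bytes_by_name(SAMPLE_ROWS)
# Print the heaviest TCP users first.
for name, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<8} {size:>8}")
```

For the sample report, the two finger processes are combined into a single 2320-byte entry, and rcp dominates the interval.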
This particular version of tcptop captures these per-process details only for connections that were established while tcptop was running, so that it could observe the handshake. Since the TCPin and TCPout fields cover all traffic, a large discrepancy between them and the per-process details may suggest that we missed observing handshakes for busy sessions.[6]
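
To picture why sessions established before tracing began go unattributed, consider a toy model in which a PID is recorded at connection setup and later packets are looked up against that record. This is only an illustration of the idea, not how tcptop is implemented: the class and method names are hypothetical, and the pre-existing session on port 22 is invented for the example.

```python
class ConnectionTracker:
    """Toy model: attribute packet bytes to the PID seen at connection setup."""

    def __init__(self):
        self.owner = {}              # (laddr, lport, raddr, rport) -> pid
        self.bytes_by_pid = {}       # pid -> bytes attributed to that process
        self.total_bytes = 0         # all traffic, like a system-wide counter
        self.unattributed_bytes = 0  # traffic on sessions we never saw connect

    def on_connect(self, conn, pid):
        """Called when a handshake is observed; remember who owns the session."""
        self.owner[conn] = pid

    def on_packet(self, conn, size):
        """Called for every packet; attribute it if the session is known."""
        self.total_bytes += size
        pid = self.owner.get(conn)
        if pid is None:
            self.unattributed_bytes += size
        else:
            self.bytes_by_pid[pid] = self.bytes_by_pid.get(pid, 0) + size


tracker = ConnectionTracker()

# Hypothetical session that was already open before tracing started.
old_session = ("192.168.1.5", 22, "192.168.1.1", 40000)
# Session whose handshake is observed (addresses and ports from the sample output).
new_session = ("192.168.1.5", 36398, "192.168.1.1", 79)

tracker.on_packet(old_session, 500)     # handshake was missed: unattributed
tracker.on_connect(new_session, 20892)  # handshake observed: PID recorded
tracker.on_packet(new_session, 54)
tracker.on_packet(new_session, 66)

print("total:", tracker.total_bytes)
print("per-process:", tracker.bytes_by_pid)
print("unattributed:", tracker.unattributed_bytes)
```

In this model the gap between the total and the per-process sum is exactly the traffic on sessions whose setup was never observed, which is the kind of discrepancy described above.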
It turns out to be quite difficult to kludge DTrace to trace network traffic by process such that it identifies all types of traffic correctly 100% of the time. Without a network provider, the events must be traced from fbt. The fbt provider is an unstable interface, meaning that probes may change for minor releases of Solaris.[7]
The greatest problem with using DTrace to trace network traffic by process is that both inbound and outbound traffic are asynchronous to the process, so we can't simply look at the on-CPU PID when the network event occurred. From user-land, when the PID is correct, there is no single way that TCP traffic is generated, such that we could simply trace it then and there. We have to contend with many other issues; for example, when tracing traffic to the telnet server, we would want to identify in.telnetd as the process responsible (principle of least surprise?). However, in.telnetd never steps onto the CPU after establishing the connection, and instead we find that telnet traffic is caused by a plethora of unlikely suspects: ls, find, date, etc. With enough D code, though, we can solve these issues with DTrace.

7.8.2. tcpsnoop Tool

The tcpsnoop tool is the companion to tcptop. It is also from the DTraceToolkit and prints TCP packet details live by process.

# tcpsnoop
  UID    PID LADDR           LPORT DR RADDR           RPORT  SIZE CMD
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79    54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79    66 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79    54 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79    56 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79    54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79   606 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79    54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79    54 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79    54 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79    54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79    54 finger
    0    242 192.168.1.5        23 <- 192.168.1.1     54224    54 inetd
    0    242 192.168.1.5        23 -> 192.168.1.1     54224    54 inetd
    0    242 192.168.1.5        23 <- 192.168.1.1     54224    54 inetd
    0    242 192.168.1.5        23 <- 192.168.1.1     54224    78 inetd
    0    242 192.168.1.5        23 -> 192.168.1.1     54224    54 inetd
    0  20893 192.168.1.5        23 -> 192.168.1.1     54224    57 in.telnetd
    0  20893 192.168.1.5        23 <- 192.168.1.1     54224    54 in.telnetd
    0  20893 192.168.1.5        23 -> 192.168.1.1     54224    78 in.telnetd
...

In the above output we can see a PID column and packet details, the result of tracking TCP traffic that has travelled on external interfaces. While running, tcpsnoop captured the details of an outbound finger command and an inbound telnet.

As with tcptop, this version of tcpsnoop examines newly connected sessions (those established while tcpsnoop has been running). This behavior can be useful because when the tcpsnoop tool is run over an existing network session (like ssh), it doesn't trace its own output.
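
Because tcpsnoop prints one line per packet, a short script can roll its output up into per-process totals by direction. The following is a minimal sketch under stated assumptions: it treats each data row as nine whitespace-separated fields in the column order shown above, and it assumes that -> marks packets sent from the local address and <- marks packets received. The function name summarize is hypothetical, and the input is the sample output from this section.

```python
from collections import defaultdict

# Rows copied from the sample tcpsnoop output above, header included.
SAMPLE = """\
UID PID LADDR LPORT DR RADDR RPORT SIZE CMD
100 20892 192.168.1.5 36398 -> 192.168.1.1 79 54 finger
100 20892 192.168.1.5 36398 <- 192.168.1.1 79 66 finger
100 20892 192.168.1.5 36398 -> 192.168.1.1 79 54 finger
100 20892 192.168.1.5 36398 -> 192.168.1.1 79 56 finger
100 20892 192.168.1.5 36398 <- 192.168.1.1 79 54 finger
100 20892 192.168.1.5 36398 <- 192.168.1.1 79 606 finger
100 20892 192.168.1.5 36398 -> 192.168.1.1 79 54 finger
100 20892 192.168.1.5 36398 <- 192.168.1.1 79 54 finger
100 20892 192.168.1.5 36398 -> 192.168.1.1 79 54 finger
100 20892 192.168.1.5 36398 -> 192.168.1.1 79 54 finger
100 20892 192.168.1.5 36398 <- 192.168.1.1 79 54 finger
0 242 192.168.1.5 23 <- 192.168.1.1 54224 54 inetd
0 242 192.168.1.5 23 -> 192.168.1.1 54224 54 inetd
0 242 192.168.1.5 23 <- 192.168.1.1 54224 54 inetd
0 242 192.168.1.5 23 <- 192.168.1.1 54224 78 inetd
0 242 192.168.1.5 23 -> 192.168.1.1 54224 54 inetd
0 20893 192.168.1.5 23 -> 192.168.1.1 54224 57 in.telnetd
0 20893 192.168.1.5 23 <- 192.168.1.1 54224 54 in.telnetd
0 20893 192.168.1.5 23 -> 192.168.1.1 54224 78 in.telnetd
...
"""


def summarize(text):
    """Return {(pid, cmd): {"out": bytes, "in": bytes}} from tcpsnoop rows."""
    totals = defaultdict(lambda: {"out": 0, "in": 0})
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and the trailing "..." marker.
        if len(fields) != 9 or fields[4] not in ("->", "<-"):
            continue
        _uid, pid, _laddr, _lport, direction, _raddr, _rport, size, cmd = fields
        # Assumption: "->" is local-to-remote (sent), "<-" is received.
        key = "out" if direction == "->" else "in"
        totals[(int(pid), cmd)][key] += int(size)
    return {proc: dict(counts) for proc, counts in totals.items()}


summary = summarize(SAMPLE)
for (pid, cmd), counts in sorted(summary.items()):
    print(f"{pid:>6} {cmd:<12} out={counts['out']:>5} in={counts['in']:>5}")
```

Keying on both PID and command name keeps inetd (PID 242) and in.telnetd (PID 20893) separate even though they share the same connection on port 23.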