7.8. Per-Process Network Statistics

In this section, we explore tools to monitor network usage by process. We build on DTrace to provide these tools.

In previous versions of Solaris, it was difficult to measure network I/O by process, just as it was difficult to measure disk I/O by process. Both of these problems have been solved with DTrace; measuring disk I/O by process is now trivial with the io provider. However, at the time of this writing, a network provider has yet to be released. So while network-by-process measurement is possible with DTrace, it is not straightforward.[4]

[4] The DTraceToolkit's TCP tools are the only ones so far to measure tcp/pid events correctly. The shortest of the tools is over 400 lines. If a net provider is released, that script might be only 12 lines.

7.8.1. tcptop Tool

tcptop, a DTrace-based tool from the freeware DTraceToolkit, summarizes TCP traffic by system and by process.

# tcptop 10
Sampling... Please wait.
2005 Jul  5 04:55:25,  load: 1.11,  TCPin:      2 Kb,  TCPout:    110 Kb

 UID    PID LADDR           LPORT FADDR           FPORT      SIZE NAME
 100  20876 192.168.1.5     36396 192.168.1.1        79      1160 finger
 100  20875 192.168.1.5     36395 192.168.1.1        79      1160 finger
 100  20878 192.168.1.5     36397 192.168.1.1        23      1303 telnet
 100  20877 192.168.1.5       859 192.168.1.1       514    115712 rcp

See DTraceToolkit


The first line of the above report contains the date, the one-minute CPU load average, and two TCP statistics, TCPin and TCPout. These come from the TCP MIB (Management Information Base); they track local host traffic as well as physical network traffic.

The rest of the report contains per-process data and includes fields for the user ID (UID), PID, local address (LADDR), local port (LPORT), remote address (FADDR[5]), remote port (FPORT), number of bytes transferred during the sample (SIZE), and process name (NAME). tcptop retrieves this data by tracing TCP events.
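Because the per-process rows are whitespace-separated columns, they are easy to post-process. The following is a minimal Python sketch, not part of the DTraceToolkit, that parses the rows from the report above and totals SIZE by process name. The function names (parse_tcptop_rows, bytes_by_name) are our own, and the sketch assumes the column layout shown in this particular version of tcptop.

```python
from collections import defaultdict

# Per-process rows copied from the tcptop report above.
REPORT = """\
 UID    PID LADDR           LPORT FADDR           FPORT      SIZE NAME
 100  20876 192.168.1.5     36396 192.168.1.1        79      1160 finger
 100  20875 192.168.1.5     36395 192.168.1.1        79      1160 finger
 100  20878 192.168.1.5     36397 192.168.1.1        23      1303 telnet
 100  20877 192.168.1.5       859 192.168.1.1       514    115712 rcp
"""

# Assumed column order, taken from the header line of the report.
FIELDS = ("uid", "pid", "laddr", "lport", "faddr", "fport", "size", "name")
NUMERIC = ("uid", "pid", "lport", "fport", "size")


def parse_tcptop_rows(text):
    """Return one dict per per-process row; header and other lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row has exactly eight columns and starts with a numeric UID.
        if len(parts) != len(FIELDS) or not parts[0].isdigit():
            continue
        row = dict(zip(FIELDS, parts))
        for key in NUMERIC:
            row[key] = int(row[key])
        rows.append(row)
    return rows


def bytes_by_name(rows):
    """Sum the SIZE column for each process name."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["name"]] += row["size"]
    return dict(totals)


rows = parse_tcptop_rows(REPORT)
totals = bytes_by_name(rows)
for name, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {size:>8}")
```

For the sample above, the two finger processes are combined into a single total, and rcp accounts for most of the bytes in the interval.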

[5] We chose the name "FADDR" after looking too long at the connection structure (struct conn_s).

This particular version of tcptop captures these per-process details only for connections that were established while tcptop was running, so that it could observe the handshake. Since the TCPin and TCPout fields cover all traffic, a large discrepancy between them and the per-process details may suggest that we missed observing handshakes for busy sessions.[6]
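To see why a missed handshake leads to missing per-process data, consider the following toy model in Python. It is only an illustration of the idea, not how tcptop is implemented: the class ConnectionTracker, its methods, and the byte counts for the second connection are all hypothetical. The model remembers which PID owned a connection when it was established and attributes later packets by looking up that record, so traffic on a connection whose setup was never seen counts toward the system-wide total but toward no process.

```python
class ConnectionTracker:
    """Toy model: attribute packet bytes to the PID seen at connection setup."""

    def __init__(self):
        self.owner = {}        # connection 4-tuple -> PID recorded at setup
        self.by_pid = {}       # PID -> bytes attributed to that process
        self.unattributed = 0  # bytes on connections whose setup was not seen
        self.total = 0         # all bytes, like the system-wide counters

    def on_connect(self, conn, pid):
        # Called only for handshakes observed while tracing is active.
        self.owner[conn] = pid

    def on_packet(self, conn, size):
        self.total += size
        pid = self.owner.get(conn)
        if pid is None:
            # No handshake was observed, so no process can be named.
            self.unattributed += size
        else:
            self.by_pid[pid] = self.by_pid.get(pid, 0) + size


tracker = ConnectionTracker()

# A connection established while tracing: the handshake is observed.
new_conn = ("192.168.1.5", 36396, "192.168.1.1", 79)
tracker.on_connect(new_conn, pid=20876)
tracker.on_packet(new_conn, 1160)

# A hypothetical connection that already existed before tracing began.
old_conn = ("192.168.1.5", 859, "192.168.1.1", 514)
tracker.on_packet(old_conn, 5000)

print("total:", tracker.total)
print("per process:", tracker.by_pid)
print("unattributed:", tracker.unattributed)
```

In this model the gap between the total and the sum of the per-process figures is exactly the traffic on sessions that predate the trace, which is the kind of discrepancy described above.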

[6] A newer version of tcptop is in development to examine all sessions regardless of connection time (and has probably been released by the time you are reading this). The new version has an additional command-line option to revert to the older behavior.

It turns out to be quite difficult to kludge DTrace to trace network traffic by process such that it identifies all types of traffic correctly 100% of the time. Without a network provider, the events must be traced from fbt. The fbt provider is an unstable interface, meaning that probes may change for minor releases of Solaris.[7]

[7] Not only can the fbt probes change, but they have done so; a recent change to the kernel has changed TCP slightly, meaning that many of the DTrace TCP scripts need updating.

The greatest problem with using DTrace to trace network traffic by process is that both inbound and outbound traffic are asynchronous to the process, so we can't simply look at the on-CPU PID when the network event occurred. From user-land, when the PID is correct, there is no one single way that TCP traffic is generated, such that we could simply trace it then and there. We have to contend with many other issues; for example, when tracing traffic to the telnet server, we would want to identify in.telnetd as the process responsible (principle of least surprise?). However, in.telnetd never steps onto the CPU after establishing the connection, and instead we find that telnet traffic is caused by a plethora of unlikely suspects: ls, find, date, etc. With enough D code, though, we can solve these issues with DTrace.

7.8.2. tcpsnoop Tool

The tcpsnoop tool is the companion to tcptop. It is also from the DTraceToolkit and prints TCP packet details live by process.

# tcpsnoop
  UID    PID LADDR           LPORT DR RADDR           RPORT  SIZE CMD
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79     66 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79     56 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79    606 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79     54 finger
    0    242 192.168.1.5        23 <- 192.168.1.1     54224     54 inetd
    0    242 192.168.1.5        23 -> 192.168.1.1     54224     54 inetd
    0    242 192.168.1.5        23 <- 192.168.1.1     54224     54 inetd
    0    242 192.168.1.5        23 <- 192.168.1.1     54224     78 inetd
    0    242 192.168.1.5        23 -> 192.168.1.1     54224     54 inetd
    0  20893 192.168.1.5        23 -> 192.168.1.1     54224     57 in.telnetd
    0  20893 192.168.1.5        23 <- 192.168.1.1     54224     54 in.telnetd
    0  20893 192.168.1.5        23 -> 192.168.1.1     54224     78 in.telnetd
...


In the above output we can see a PID column and packet details, the result of tracking TCP traffic that has travelled on external interfaces. While running, tcpsnoop captured the details of an outbound finger command and an inbound telnet.

As with tcptop, this version of tcpsnoop examines newly connected sessions (while tcpsnoop has been running). This behavior can be useful because when the tcpsnoop tool is run over an existing network session (like ssh), it doesn't trace its own output.
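Since tcpsnoop prints one line per packet, a short script can roll its output up into per-process totals. The following Python sketch is our own and not part of the DTraceToolkit; it uses a subset of the rows shown above and assumes that "->" in the DR column marks packets sent from the local address to the remote address and "<-" marks packets received. The function name summarize_tcpsnoop is hypothetical.

```python
# A subset of the tcpsnoop rows shown above.
SNOOP = """\
  UID    PID LADDR           LPORT DR RADDR           RPORT  SIZE CMD
  100  20892 192.168.1.5     36398 -> 192.168.1.1        79     54 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79     66 finger
  100  20892 192.168.1.5     36398 <- 192.168.1.1        79    606 finger
    0    242 192.168.1.5        23 <- 192.168.1.1     54224     54 inetd
    0  20893 192.168.1.5        23 -> 192.168.1.1     54224     57 in.telnetd
    0  20893 192.168.1.5        23 <- 192.168.1.1     54224     54 in.telnetd
"""


def summarize_tcpsnoop(text):
    """Return {(pid, cmd): {"out": bytes, "in": bytes}} from tcpsnoop lines."""
    summary = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row has nine columns with a direction marker in the fifth.
        if len(parts) != 9 or parts[4] not in ("->", "<-"):
            continue
        pid, direction, size, cmd = int(parts[1]), parts[4], int(parts[7]), parts[8]
        entry = summary.setdefault((pid, cmd), {"out": 0, "in": 0})
        # Assumption: "->" is local-to-remote (sent), "<-" is received.
        entry["out" if direction == "->" else "in"] += size
    return summary


summary = summarize_tcpsnoop(SNOOP)
for (pid, cmd), counts in sorted(summary.items()):
    print(f"{pid:>6} {cmd:<12} out={counts['out']:>6} in={counts['in']:>6}")
```

Keying on both PID and command name keeps inetd (PID 242) and in.telnetd (PID 20893) separate, even though both appear on the same telnet connection.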




Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris
ISBN: 0131568191
Year: 2007
Pages: 180
