7.7. Systemwide Statistics

The following tools allow us to observe network statistics, including statistics for TCP, IP, and each network interface, throughout the system.

7.7.1. netstat Command

The Solaris netstat command is the catch-all for a number of different network status programs.

$ netstat -i
Name  Mtu  Net/Dest      Address        Ipkts  Ierrs Opkts  Oerrs Collis Queue
lo0   8232 localhost     localhost      191    0     191    0     0      0
ipge0 1500 waterbuffalo  waterbuffalo   31152163 0     24721687 0     0      0
$ netstat -i 3
    input   ipge0     output       input  (Total)    output
packets errs  packets errs  colls  packets errs  packets errs  colls
31152218 0     24721731 0     0      31152409 0     24721922 0     0
$ netstat -I ipge0 -i 3
    input   ipge0     output       input  (Total)    output
packets errs  packets errs  colls  packets errs  packets errs  colls
31152284 0     24721797 0     0      31152475 0     24721988 0     0


netstat -i, mentioned earlier, prints only packet counts. We don't know whether they are big packets or small packets, so we cannot use the counts to accurately determine how utilized the network interface is. Other performance monitoring tools plot this as a "be all and end all" value; this is wrong.

Packet counts may help as an indicator of activity. A packet count of less than 100 per second can be treated as fairly idle; a worst case for Ethernet makes this around 150 Kbytes/sec (based on maximum MTU size).
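The worst-case figure above is simple arithmetic. A minimal sketch follows, assuming the 1500-byte Ethernet MTU shown in the netstat -i output and taking 1 Kbyte as 1000 bytes; the function name is ours, for illustration only.

```python
def worst_case_kbytes_per_sec(packets_per_sec, mtu_bytes=1500):
    """Upper bound on throughput if every packet were MTU-sized.

    Assumes 1 Kbyte = 1000 bytes and ignores link-level framing overhead.
    """
    return packets_per_sec * mtu_bytes / 1000.0


# 100 packets/sec, each as large as the MTU allows
print(worst_case_kbytes_per_sec(100))  # 150.0
```

Real traffic usually contains many smaller packets, so the actual byte rate at 100 packets per second is normally well below this bound.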

The netstat -i output may be much more valuable for its error counts, as discussed in Section 7.5.

netstat -s dumps various network-related counters from kstat. This shows that Kstat does track at least some details in terms of bytes.

$ netstat -s | grep Bytes
        tcpOutDataSegs      =37367847   tcpOutDataBytes      =166744792
        tcpRetransSegs      =153437     tcpRetransBytes      =72298114
        tcpInAckSegs        =25548715   tcpInAckBytes        =148658291
        tcpInInorderSegs    =35290928   tcpInInorderBytes    =3637819567
        tcpInUnorderSegs    =324309     tcpInUnorderBytes    =406912945
        tcpInDupSegs        =152795     tcpInDupBytes        =73998299
        tcpInPartDupSegs    =  7896     tcpInPartDupBytes    =5821485
        tcpInPastWinSegs    =    38     tcpInPastWinBytes    =971347352


However, the byte values above are for TCP in total, including loopback traffic that didn't travel through the network interfaces. These statistics can still be of some value, especially if large numbers of errors are observed. For more details on these and a reference table, see Section 7.9.

netstat -k on Solaris 9 and earlier dumped all kstat counters.

The output below shows that there are byte counters (rbytes64, obytes64) for the hme0 interface, which is just what we need to measure per-interface traffic. However, netstat -k was an undocumented switch that has been dropped in Solaris 10. This is fine since there are better ways to get to kstat, including the C library, which is used by tools such as vmstat.

$ netstat -k | awk '/^hme0/,/^$/'
hme0:
ipackets 70847004 ierrors 6 opackets 73438793 oerrors 0 collisions 0
defer 0 framing 0 crc 0 sqe 0 code_violations 0 len_errors 0
ifspeed 100000000 buff 0 oflo 0 uflo 0 missed 6 tx_late_collisions 0
retry_error 0 first_collisions 0 nocarrier 0 nocanput 0 allocbfail 0
runt 0 jabber 0 babble 0 tmd_error 0 tx_late_error 0 rx_late_error 0
slv_parity_error 0 tx_parity_error 0 rx_parity_error 0 slv_error_ack 0
tx_error_ack 0 rx_error_ack 0 tx_tag_error 0 rx_tag_error 0 eop_error 0
no_tmds 0 no_tbufs 0 no_rbufs 0 rx_late_collisions 0 rbytes 289601566
obytes 358304357 multircv 558 multixmt 73411 brdcstrcv 3813836
brdcstxmt 1173700 norcvbuf 0 noxmtbuf 0 newfree 0 ipackets64 70847004
opackets64 73438793 rbytes64 47534241822 obytes64 51897911909 align_errors 0
fcs_errors 0 sqe_errors 0 defer_xmts 0 ex_collisions 0 macxmt_errors 0
carrier_errors 0 toolong_errors 0 macrcv_errors 0 link_duplex 0 inits 31
rxinits 0 txinits 0 dmarh_inits 0 dmaxh_inits 0 link_down_cnt 0
phy_failures 0 xcvr_vendor 524311 asic_rev 193 link_up 1


7.7.2. kstat Command

The Solaris Kernel Statistics framework tracks network usage, and as of Solaris 8, the kstat command fetches these details (see Chapter 11). This command has a variety of options for selecting statistics and can be executed by non-root users.

The -m option for kstat matches on a module name. In the following example, we use it to display all available statistics for the networking modules.

$ kstat -m tcp
module: tcp                             instance: 0
name:   tcp                             class:    mib2
        activeOpens                     803
        attemptFails                    312
        connTableSize                   56
...
$ kstat -m ip
module: ip                              instance: 0
name:   icmp                            class:    mib2
        crtime                          3.207830752
        inAddrMaskReps                  0
        inAddrMasks                     0
...
$ kstat -m hme
module: hme                             instance: 0
name:   hme0                            class:     net
name:   hme0                            class:     net
        align_errors                    0
        allocbfail                      0
...


These commands fetch statistics for ip, tcp, and hme (our Ethernet card). The first group of statistics (others were truncated) from the tcp and ip modules states their class as mib2: These statistic groups are maintained by the TCP and IP code for MIB-II and then copied into kstat during a kstat update.

The following kstat command fetches byte statistics for our network interface, printing output every second.

$ kstat -p 'hme:0:hme0:*bytes64' 1
hme:0:hme0:obytes64     51899673435
hme:0:hme0:rbytes64     47536009231

hme:0:hme0:obytes64     51899673847
hme:0:hme0:rbytes64     47536009709
...


Using kstat in this manner is the best way to fetch network interface statistics with the tools currently shipped with Solaris. Other tools take the final step and print this data in a more meaningful way: Kbytes/sec or percent utilization. Two such tools are nx.se and nicstat.
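To illustrate that final step, here is a minimal sketch that turns two consecutive kstat -p samples into Kbytes/sec. It is not how nx.se or nicstat are implemented; the function names are ours, and it assumes the two-column "statistic value" layout shown in the kstat -p output and 1 Kbyte = 1024 bytes.

```python
def parse_kstat_p(text):
    """Parse `kstat -p` lines of the form 'module:instance:name:statistic  value'."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and anything that is not a 'key integer' pair.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def kbytes_per_sec(prev, curr, interval_secs):
    """Rate of change of each counter, in Kbytes/sec (1 Kbyte = 1024 bytes)."""
    return {key: (curr[key] - prev[key]) / 1024.0 / interval_secs
            for key in curr if key in prev}


# Two one-second samples copied from the kstat output in this section.
first = parse_kstat_p("hme:0:hme0:obytes64     51899673435\n"
                      "hme:0:hme0:rbytes64     47536009231\n")
second = parse_kstat_p("hme:0:hme0:obytes64     51899673847\n"
                       "hme:0:hme0:rbytes64     47536009709\n")

rates = kbytes_per_sec(first, second, 1)
for key in sorted(rates):
    print("%s %.2f KB/s" % (key, rates[key]))
```

Because the 64-bit counters only ever increase, a simple subtraction between samples is enough here.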

7.7.3. nx.se Tool

The SE Toolkit provides a language, SymbEL, that lets us write our own performance monitoring tools. It also contains a collection of example tools, including nx.se, which helps us calculate network utilization.

$ se nx.se 1
Current tcp RtoMin is 400, interval 1, start Sun Oct  9 10:36:42 2005
10:36:43 Iseg/s Oseg/s InKB/s OuKB/s Rst/s  Atf/s  Ret%  Icn/s  Ocn/s
tcp      841.6    4.0  74.98   0.27   0.00   0.00   0.0   0.00   0.00
Name    Ipkt/s Opkt/s InKB/s OuKB/s IErr/s OErr/s Coll% NoCP/s Defr/s
hme0     845.5  420.8 119.91  22.56  0.000  0.000   0.0   0.00   0.00
10:36:44 Iseg/s Oseg/s InKB/s OuKB/s Rst/s  Atf/s  Ret%  Icn/s  Ocn/s
tcp      584.2    5.0  77.97   0.60   0.00   0.00   0.0   0.00   0.00
Name    Ipkt/s Opkt/s InKB/s OuKB/s IErr/s OErr/s Coll% NoCP/s Defr/s
hme0     579.2  297.1 107.95  16.16  0.000  0.000   0.0   0.00   0.00


Having KB/s lets us determine how busy our network interfaces are. Other useful fields include collision percent (Coll%), no-can-puts per second (NoCP/s), and defers per second (Defr/s), which may be evidence of network saturation. nx.se also prints useful TCP statistics above the interface lines.

7.7.4. nicstat Tool

nicstat, a tool from the freeware K9Toolkit, reports network utilization and saturation by interface. It is available as a C or Perl kstat consumer.

$ nicstat 1
    Time   Int   rKb/s   wKb/s   rPk/s   wPk/s    rAvs    wAvs   %Util      Sat
10:48:30  hme0    4.02    4.39    6.14    6.36  670.73  706.50    0.07     0.00
10:48:31  hme0    0.29    0.50    3.00    4.00   98.00  127.00    0.01     0.00
10:48:32  hme0    1.35    4.23   14.00   15.00   98.79  289.00    0.05     0.00
10:48:33  hme0   67.73   19.08  426.00  207.00  162.81   94.39    0.71     0.00
10:48:34  hme0  315.22  128.91 1249.00  723.00  258.44  182.58    3.64     0.00
10:48:35  hme0  529.96   67.53 2045.00 1046.00  265.37   66.11    4.89     0.00
10:48:36  hme0  454.14   62.16 2294.00 1163.00  202.72   54.73    4.23     0.00
10:48:37  hme0   93.55   15.78  583.00  295.00  164.31   54.77    0.90     0.00
10:48:38  hme0   74.84   32.41  516.00  298.00  148.52  111.38    0.88     0.00
10:48:39  hme0    0.76    4.17    7.00    9.00  111.43  474.00    0.04     0.00

See K9Toolkit; nicstat.c or nicstat.pl


In this example output of nicstat, we can see a small amount of network traffic, peaking at 4.89% utilization.

The following are the switches available from version 0.98 of the Perl version of nicstat.

$ nicstat -h
USAGE: nicstat [-hsz] [-i int[,int...]] | [interval [count]]
   eg, nicstat               # print a 1 second sample
       nicstat 1             # print continually every 1 second
       nicstat 1 5           # print 5 times, every 1 second
       nicstat -s            # summary output
       nicstat -i hme0       # print hme0 only


The utilization measurement is based on the current throughput divided by the maximum speed of the interface (if available through kstat). The saturation measurement is a value that reflects errors due to saturation if kstat found any.

This method for calculating utilization does not account for other per-packet costs, such as Ethernet preamble. These costs are generally minor, and we assume they do not greatly affect the utilization value.
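The utilization calculation can be sketched as follows. This is not taken from the nicstat source: the formula (read plus write throughput divided by interface speed, with 1 Kb = 1024 bytes) is an assumption that happens to reproduce the %Util column in the sample output above, and the function name is ours. The interface speed is the ifspeed value of 100000000 bits/sec that kstat reports for hme0.

```python
def percent_util(read_kb_per_sec, write_kb_per_sec, ifspeed_bits_per_sec):
    """Combined read + write throughput as a percentage of interface speed.

    Assumes 1 Kb = 1024 bytes and ignores per-packet costs such as the
    Ethernet preamble.
    """
    bits_per_sec = (read_kb_per_sec + write_kb_per_sec) * 1024 * 8
    return 100.0 * bits_per_sec / ifspeed_bits_per_sec


# The 10:48:35 sample from the nicstat output, on a 100 Mbits/sec interface.
print("%.2f" % percent_util(529.96, 67.53, 100000000))  # 4.89
```

Summing both directions is a conservative choice: on a full-duplex link each direction has the full interface speed available, so the true utilization of either direction alone would be lower.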

7.7.5. SNMP

It's worth mentioning that useful data is also available in SNMP, which is used by software such as MRTG (a popular freeware network utilization plotter). A full install of Solaris 10 provides Net-SNMP, putting many of the commands under /usr/sfw/bin.

Here we demonstrate the use of snmpget to fetch interface statistics.

$ snmpget -v1 -c public localhost ifOutOctets.2 ifInOctets.2
IF-MIB::ifOutOctets.2 = Counter32: 10016768
IF-MIB::ifInOctets.2 = Counter32: 11932165


The .2 corresponds to our primary interface. These values are the outbound and inbound bytes. In Solaris 10 a full description of the IF-MIB statistics can be found in /etc/sma/snmp/mibs/IF-MIB.txt.

Other software products fetch and present data from the IF-MIB, which is a valid and desirable approach for monitoring network interface activity. Solaris 10's Net-SNMP supports SNMPv3, which provides the User-based Security Model (USM) for creating user accounts and encrypted sessions, and the View-based Access Control Model (VACM) for restricting users to only the statistics they need. When configured, they greatly enhance the security of SNMP. For information on each, see snmpusm(1M) and snmpvacm(1M).

Net-SNMP also provides a version of netstat called snmpnetstat. Besides the standard output using -i, snmpnetstat has a -o option to print octets (bytes) instead of packets.

$ snmpnetstat -v1 -c public -i localhost
Name      Mtu Network   Address        Ipkts Ierrs Opkts Oerrs Queue
lo0      8232 loopback  localhost       6639     0  6639     0     0
hme0     1500 192.168.1 titan         385635     0 86686     0     0
hme0:1   1500 192.168.1 192.168.1.204      0     0     0     0     0
$
$ snmpnetstat -v1 -c public -o localhost
Name    Network   Address        Ioctets   Ooctets
lo0     loopback  localhost            0         0
hme0    192.168.1 titan          98241462 55500788
hme0:1  192.168.1 192.168.1.204        0         0


Input bytes (Ioctets) and output bytes (Ooctets) can be seen. Now all we need is an interval for this information to be of real value.

# snmpnetstat -v1 -c public -I hme0 -o localhost 10
    input   (hme0)    output        input  (Total)    output
packets errs  packets errs  colls   packets errs  packets errs  colls
386946  0     88300   0     0       395919  0     97273   0     0
452     0     797     0     0       538     0     883     0     0
0       0     0       0     0       0       0     0       0     0
0       0     0       0     0       0       0     0       0     0
844     0     1588    0     0       952     0     1696    0     0
0       0     0       0     0       0       0     0       0     0
0       0     0       0     0       0       0     0       0     0
548     0     965     0     0       656     0     1073    0     0
0       0     0       0     0       0       0     0       0     0
0       0     0       0     0       0       0     0       0     0
^C


Even though we provided the -o option, by also providing an interval (10 seconds), we caused the snmpnetstat command to revert to printing packet counts. Also, the statistics that SNMP uses are only updated every 30 seconds. Future versions of snmpnetstat may correctly print octets with intervals.
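Until then, octet rates can be derived by sampling the IF-MIB counters ourselves. The sketch below shows only the arithmetic; fetching the samples (for example with snmpget) is left out. It relies on the fact that ifInOctets and ifOutOctets are Counter32 values, which wrap to zero after 2^32 - 1. The function names and the second sample value are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32


def counter32_delta(previous, current):
    """Difference between two Counter32 samples, allowing for a single wrap."""
    return (current - previous) % COUNTER32_MODULUS


def octets_per_sec(previous, current, interval_secs):
    """Average byte rate between two samples taken interval_secs apart."""
    return counter32_delta(previous, current) / float(interval_secs)


# First value is the ifOutOctets.2 sample shown earlier; the second is a
# hypothetical reading taken 30 seconds later.
print(octets_per_sec(10016768, 10316768, 30))   # 10000.0

# A counter that wrapped between samples still yields a sensible delta.
print(counter32_delta(4294967000, 704))         # 1000
```

Given the 30-second update period noted above, sampling more often than that adds nothing. On a busy, fast interface a 32-bit byte counter can also wrap more than once between samples, which this sketch cannot detect.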

7.7.6. checkcable Tool

Sometimes network performance problems can be caused by incorrect auto-negotiation that selects a lower speed or duplex. There is a way to retrieve the settings that a particular network card has chosen, but there is not one way that works for all cards. It usually involves poking around with the ndd command and using a lookup table for your particular card to decipher the output of ndd.

Consistent data for network cards should be available from Kstat, and Sun does have a standard in place. However, many of the network drivers were written before the standard existed, and some were written by third-party companies. The state of consistent Kstat data for network cards is improving and at some point in the future should boil down to a few well-understood one-liners of the kstat command, such as: kstat -p | grep <interfacename>.

In the meantime, it is not always that easy. Some data is available from kstat, much of it from ndd. The following example demonstrates fetching ndd data for an hme card.

# ndd /dev/hme link_status
1
# ndd /dev/hme link_speed
1
# ndd /dev/hme link_mode
1


These numbers indicate a connected or unconnected cable (link_status), the current speed (link_speed), and the duplex (link_mode). What 1 or some other number means depends on the card. The available ndd variables for this card can be listed with ndd -get /dev/hme \? (the -get is optional).

SunSolve has Infodocs to explain what these numbers mean for various cards. If you have mainly one type of card at your site, you eventually remember what the numbers mean. As a very general rule, "1" is often good, "0" is often bad; so "0" for link_mode probably means half duplex.

The checkcable tool, available from the K9Toolkit, deciphers many card types for you.[3] It uses both kstat and ndd to retrieve the network settings because not all the data is available to either kstat or ndd.

[3] checkcable is Perl, which can be read to see supported cards and contribution history.

# checkcable
Interface    Link Duplex  Speed  AutoNEG
hme0          UP   FULL    100        ON
# checkcable
Interface    Link Duplex  Speed  AutoNEG
hme0         DOWN   FULL    100       ON


The first output has the hme0 interface as link-connected (UP), full duplex, 100 Mbits/sec, and auto-negotiation on; the second output was with the cable disconnected. The speed and duplex must match the settings on the switch port for the network link to function correctly.

There are still some cards that checkcable is unable to view. The state of card statistics is slowly getting better; eventually, checkcable will not be needed to translate these numbers.

7.7.7. ping Tool

ping is the classic network probe tool; it uses ICMP messages to test the response time of round-trip packets.

$ ping -s mars
PING mars: 56 data bytes
64 bytes from mars (192.168.1.1): icmp_seq=0. time=0.623 ms
64 bytes from mars (192.168.1.1): icmp_seq=1. time=0.415 ms
64 bytes from mars (192.168.1.1): icmp_seq=2. time=0.464 ms
^C
----mars PING Statistics----
3 packets transmitted, 3 packets received, 0% packet loss
round-trip (ms)  min/avg/max/stddev = 0.415/0.501/0.623/0.11


So we discover that mars is up and that it responds within 1 millisecond. Solaris 10 enhanced ping to print three decimal places for the times. ping is handy to see if a host is up, but that's about all.
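The round-trip summary line can be recomputed from the individual replies, which is handy when post-processing saved ping output. The following is a sketch with a helper name of our own. Using the sample (n - 1) standard deviation is an observation that matches the 0.11 printed above, not a documented property of ping.

```python
import re
import statistics


def ping_times_ms(output):
    """Extract the 'time=... ms' values from `ping -s` reply lines."""
    return [float(t) for t in re.findall(r"time=([0-9.]+) ms", output)]


sample = """64 bytes from mars (192.168.1.1): icmp_seq=0. time=0.623 ms
64 bytes from mars (192.168.1.1): icmp_seq=1. time=0.415 ms
64 bytes from mars (192.168.1.1): icmp_seq=2. time=0.464 ms"""

times = ping_times_ms(sample)
print("min/avg/max/stddev = %.3f/%.3f/%.3f/%.2f" % (
    min(times), statistics.mean(times), max(times), statistics.stdev(times)))
```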

7.7.8. traceroute Tool

traceroute sends a series of UDP packets with an increasing TTL, and by watching the ICMP time-exceeded replies, we can discover the hops to a host (assuming the hops actually decrement the TTL):

$ traceroute www.sun.com
traceroute: Warning: Multiple interfaces found; using 260.241.10.2 @ hme0:1
traceroute to www.sun.com (209.249.116.195), 30 hops max, 40 byte packets
 1  tpggate (260.241.10.1)  21.224 ms  25.933 ms  25.281 ms
 2  172.31.217.14 (172.31.217.14)  49.565 ms  27.736 ms  25.297 ms
 3  syd-nxg-ero-zeu-2-gi-3-0.tpgi.com.au (220.244.229.9)  25.454 ms  22.066 ms  26.237 ms
 4  syd-nxg-ibo-l3-ge-0-2.tpgi.com.au (220.244.229.132)  42.216 ms *  37.675 ms
 5  220-245-178-199.tpgi.com.au (220.245.178.199)  40.727 ms  38.291 ms  41.468 ms
 6  syd-nxg-ibo-ero-ge-1-0.tpgi.com.au (220.245.178.193)  37.437 ms  38.223 ms  38.373 ms
 7  Gi11-2.gw2.syd1.asianetcom.net (202.147.41.193)  24.953 ms  25.191 ms  26.242 ms
 8  po2-1.gw1.nrt4.asianetcom.net (202.147.55.110)  155.811 ms  169.330 ms  153.217 ms
 9  Abovenet.POS2-2.gw1.nrt4.asianetcom.net (203.192.129.42)  150.477 ms  157.173 ms *
10  so-6-0-0.mpr3.sjc2.us.above.net (64.125.27.54)  240.077 ms  239.733 ms  244.015 ms
11  so-0-0-0.mpr4.sjc2.us.above.net (64.125.30.2)  224.560 ms  228.681 ms  221.149 ms
12  64.125.27.102 (64.125.27.102)  241.229 ms  235.481 ms  238.868 ms
13  * *^C


The times may provide some idea of where a network bottleneck is. We must also remember that networks are dynamic and that this may not be the permanent path to that host (and could even change as traceroute executes).

7.7.9. snoop Tool

The power to capture and inspect network packets live from the interface is provided by snoop, an indispensable tool. When network events don't seem to be working, it can be of great value to verify that the packets are actually arriving in the first place.

snoop places a network device in "promiscuous mode" so that all network traffic, addressed to this host or not, is captured. You ought to have permission to be sniffing network traffic, as often snoop displays traffic contents, including user names and passwords.

# snoop
Using device /dev/hme (promiscuous mode)
     jupiter -> titan        TCP D=22 S=36570 Ack=1602213819 Seq=1929072366 Len=0 Win=49640
       titan -> jupiter      TCP D=36570 S=22 Push Ack=1929072366 Seq=1602213819 Len=128 Win=49640
     jupiter -> titan        TCP D=22 S=36570 Ack=1602213947 Seq=1929072366 Len=0 Win=49640
...


The most useful options include the following: don't resolve hostnames (-r), change the device (-d), output to a capture file (-o), input from a capture file (-i), print semi-verbose (-V, one line per protocol layer), print full-verbose (-v, all details), and send packets to /dev/audio (-a). Packet filter syntax can also be applied.

By using output files, you can try different options when reading them (-v, -V). Moreover, outputting to a file incurs less CPU overhead than the default live output.

7.7.10. TTCP

Test TCP (TTCP) is a freeware tool that tests the throughput between two hosts. It needs to be run on both the source and destination; a Java version of TTCP runs on many different operating systems. Beware: it floods the network with traffic to perform its test.

The following is run on one host as a receiver. The options used here made the test run for a reasonable duration, around 60 seconds.

$ java ttcp -r -n 65536
Receive: buflen= 8192  nbuf= 65536 port= 5001

Then the following was run on the second host as the transmitter:

$ java ttcp -t jupiter -n 65536
Transmit: buflen= 8192  nbuf= 65536 port= 5001
Transmit connection:
  Socket[addr=jupiter/192.168.1.5,port=5001,localport=46684].
Transmit: 536870912 bytes in 46010 milli-seconds = 11668.57 KB/sec (93348.56 Kbps).


This example shows that the speed between these hosts for this test is around 11.7 megabytes per second.
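The figures in the Transmit line follow directly from the options used. A short sketch of the arithmetic, assuming (as the numbers in the output imply) that K means 1000 here; the function name is ours.

```python
def ttcp_rates(nbuf, buflen, elapsed_ms):
    """Return (total bytes, KB/sec, Kbps) for a TTCP run, with K = 1000."""
    total_bytes = nbuf * buflen
    # Bytes per millisecond is numerically the same as KB per second.
    kb_per_sec = total_bytes / float(elapsed_ms)
    return total_bytes, kb_per_sec, kb_per_sec * 8


# -n 65536 buffers of the default 8192 bytes, sent in 46010 milliseconds.
total, kb, kbps = ttcp_rates(65536, 8192, 46010)
print("%d bytes, %.2f KB/sec, %.2f Kbps" % (total, kb, kbps))
# 536870912 bytes, 11668.57 KB/sec, 93348.56 Kbps
```

At roughly 93 Mbits/sec, this run is close to the limit of a 100 Mbits/sec interface such as the hme0 card used in earlier examples.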

It is not uncommon for people to test the speed of their network by transferring a large file around. This may be better than it sounds; any test is better than none.

7.7.11. pathchar Tool

After writing traceroute, Van Jacobson wrote pathchar, an amazing tool that identifies network bottlenecks. It operates like traceroute, but rather than printing response time to each hop, it prints bandwidth between each pair of hops.

# pathchar 192.168.1.1
pathchar to 192.168.1.1 (192.168.1.1)
 doing 32 probes at each of 64 to 1500 by 32
 0 localhost
 |    30 Mb/s,   79 us (562 us)
 1 neptune.drinks.com (192.168.2.1)
 |    44 Mb/s,   195 us (1.23 ms)
 2 mars.drinks.com (192.168.1.1)
2 hops, rtt 547 us (1.23 ms), bottleneck  30 Mb/s, pipe 7555 bytes


This tool works by sending "shaped" traffic over a long interval and carefully measuring the response times. It doesn't flood the network like TTCP does.

Binaries for pathchar can be found on the Internet, but the source code has yet to be released. Some open source versions, based on the ideas from pathchar, are in development.

7.7.12. ntop Tool

ntop sniffs network traffic and issues comprehensive reports through a web interface. It is very useful, so long as you can (and are allowed to) snoop the traffic of interest. It is driven from a web browser aimed at localhost:3000.

# ntop
ntop v.1.3.1 MT [sparc-sun-solaris2.8] listening on [hme0,hme0:0,hme0:1].
Copyright 1998-2000 by Luca Deri <deri@ntop.org>
Get the freshest ntop from http://www.ntop.org/
Initialising...
Loading plugins (if any)...
WARNING: Unable to find the plugins/ directory.
Waiting for HTTP connections on port 3000...
Sniffying...


7.7.13. NFS Client Statistics: nfsstat -c

$ nfsstat -c
Client rpc:
Connection oriented:
calls      badcalls   badxids    timeouts   newcreds   badverfs   timers
202499     0          0          0          0          0          0
cantconn   nomem      interrupts
0          0          0
Connectionless:
calls      badcalls   retrans    badxids    timeouts   newcreds   badverfs
0          0          0          0          0          0          0
timers     nomem      cantsend
0          0          0
Client nfs:
calls     badcalls  clgets    cltoomany
200657    0         200657    7
Version 2: (0 calls)
null     getattr  setattr  root     lookup   readlink read     wrcache
0 0%     0 0%     0 0%     0 0%     0 0%     0 0%     0 0%     0 0%
write    create   remove   rename   link     symlink  mkdir    rmdir
0 0%     0 0%     0 0%     0 0%     0 0%     0 0%     0 0%     0 0%
readdir  statfs
0 0%     0 0%
Version 3: (0 calls)
null        getattr     setattr     lookup      access      readlink
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
read        write       create      mkdir       symlink     mknod
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
remove      rmdir       rename      link        readdir     readdirplus
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
fsstat      fsinfo      pathconf    commit
0 0%        0 0%        0 0%        0 0%


Client statistics printed include retransmissions (retrans), unmatched replies (badxids), and timeouts. See nfsstat(1M) for verbose descriptions.

7.7.14. NFS Server Statistics: nfsstat -s

The server version of nfsstat prints a screenful of statistics to pick through. Of interest are the badcalls values and the per-operation counts.

$ nfsstat -s
Server rpc:
Connection oriented:
calls      badcalls   nullrecv   badlen     xdrcall    dupchecks  dupreqs
5897288    0          0          0          0          372803     0
Connectionless:
calls      badcalls   nullrecv   badlen     xdrcall    dupchecks  dupreqs
87324      0          0          0          0          0          0
...
Version 4: (949163 calls)
null                compound
3175 0%             945988 99%
Version 4: (3284515 operations)
reserved            access              close               commit
0 0%                72954 2%            199208 6%           2948 0%
create              delegpurge          delegreturn         getattr
4 0%                0 0%                16451 0%            734376 22%
getfh               link                lock                lockt
345041 10%          6 0%                101 0%              0 0%
locku               lookup              lookupp             nverify
101 0%              145651 4%           5715 0%             171515 5%
open                openattr            open_confirm        open_downgrade
199410 6%           0 0%                271 0%              0 0%
putfh               putpubfh            putrootfh           read
914825 27%          0 0%                581 0%              130451 3%
readdir             readlink            remove              rename
5661 0%             11905 0%            15 0%               201 0%
renew               restorefh           savefh              secinfo
30765 0%            140543 4%           146336 4%           277 0%
setattr             setclientid         setclientid_confirm verify
23 0%               26 0%               26 0%               10 0%
write               release_lockowner   illegal
9118 0%             0 0%                0 0%
...
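When picking through the operation counts, it can help to rank them by share of the total. The sketch below does this for a handful of the Version 4 counters shown above; the function name is ours. The percentages are truncated to whole numbers because that reproduces the figures in this output (putfh is 27.85% of operations but is printed as 27%), which is an inference from the sample rather than documented behavior.

```python
def operation_shares(counts, total_operations):
    """Return (name, count, percent) tuples, busiest first.

    Percentages are truncated to whole numbers, which matches the sample
    nfsstat output above.
    """
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, count, count * 100 // total_operations)
            for name, count in ranked]


# A few of the Version 4 operation counters from the nfsstat -s output.
sample = {"getattr": 734376, "putfh": 914825, "getfh": 345041,
          "read": 130451, "write": 9118}

for name, count, percent in operation_shares(sample, 3284515):
    print("%-10s %8d %3d%%" % (name, count, percent))
```

On this server, putfh, getattr, and getfh dominate, while read and write together account for only a few percent of operations.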





Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris
ISBN: 0131568191
Year: 2007
Pages: 180
