15.7 Network Performance


This section concludes our look at performance monitoring and tuning on Unix systems. It contains a brief introduction to network performance, a very large topic whose full treatment is beyond the scope of this book. Consult the work by Musumeci and Loukides for further information.

15.7.1 Basic Network Performance Monitoring

The netstat -s command, which displays network statistics, is a good place to start when examining network performance. You can limit the display to a single network protocol via the -p option, as in this example from an HP-UX system:

$ netstat -s -p tcp                                Output shortened.
tcp:
    178182 packets sent
        111822 data packets (35681757 bytes)
        30 data packets (3836 bytes) retransmitted
        66363 ack-only packets (4332 delayed)
    337753 packets received
        89709 acks (for 35680557 bytes)
        349 duplicate acks
        0 acks for unsent data
        284726 packets (287618947 bytes) received in-sequence
        0 completely duplicate packets (0 bytes)
        3 packets with some dup. data (832 bytes duped)
        11 out of order packets (544 bytes)
        5 packets received after close

The output gives statistics since the last boot.[32]

[32] Or most recent counter reset, if supported.

Network operations are proceeding nicely on this system. Counters such as the retransmitted data packets and the duplicate acks are the ones that would indicate transmission problems if their values rose to an appreciable percentage of the total network traffic.
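As a rough sanity check, you can compute those counters as percentages of total traffic. This sketch hard-codes the values from the sample output above for illustration; on a real system you would parse them out of netstat -s. Healthy networks show rates well under 1%.

```shell
# Counters taken from the sample netstat -s output above (hard-coded
# here purely for illustration).
packets_sent=178182
retransmitted=30
packets_received=337753
duplicate_acks=349

# awk handles the floating-point percentages
retrans_pct=$(awk -v r="$retransmitted" -v s="$packets_sent" \
    'BEGIN { printf "%.3f", 100 * r / s }')
dupack_pct=$(awk -v d="$duplicate_acks" -v p="$packets_received" \
    'BEGIN { printf "%.3f", 100 * d / p }')

echo "retransmitted: ${retrans_pct}% of packets sent"
echo "duplicate acks: ${dupack_pct}% of packets received"
```

Both rates here are around a tenth of a percent or less, consistent with the "proceeding nicely" verdict above.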

More detailed network performance data can be determined via the various network monitoring tools we considered in Section 8.6.

15.7.2 General TCP/IP Network Performance Principles

Good network performance depends on a combination of several components working properly and efficiently. Performance problems can arise in many places and take many forms. These are among the most common:

  • Network interface problems, including insufficient speed and high error rates due to failing or misconfigured hardware. This sort of problem shows up as poor performance and/or many errors on a particular host.

    Network adapters, hubs, switches, and network devices in general seldom fail all at once, but rather produce increasing error rates and/or degrading performance over time. These metrics should be monitored regularly to spot problems before they become severe. Degradation can also occur due to aging drop cables.

    Hardware device setup errors, including half/full duplex mismatches, cause high error and collision rates and result in hideous performance.

  • Overloaded servers can also produce poor network response. A server can fall short in several ways: too much traffic for its interface to handle, too little memory for the network workload (or an incorrect memory configuration), and insufficient disk I/O bandwidth. The server's performance will need to be investigated to determine which of these factors are relevant (and hence where attention should be focused).

  • Insufficient network bandwidth for the workload. You can recognize such situations by slow response and/or significant timeouts on systems throughout the local network, symptoms that are not alleviated by adding another server system. The best solution to such problems is to use high-performance switches. If that is not possible, another, much less desirable, solution is to divide the network into multiple subnets that separate systems requiring distinct network resources from one another.

All of these problem types are best addressed by correcting or replacing hardware and/or reallocating resources rather than by configuration-level tuning.
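The regular monitoring recommended above can be automated simply. The following sketch flags interfaces whose input-error rate exceeds 1% of input packets; the sample data and the column layout are assumptions for illustration, since the actual fields of netstat -i output vary by Unix version.

```shell
# Sample netstat -i style data (layout assumed; real column positions
# vary by Unix version).
sample='Name  Mtu   Ipkts   Ierrs  Opkts   Oerrs  Coll
lan0  1500  991482  12     848273  3      1029
lan1  1500  201883  8461   178231  2      411'

flagged=$(echo "$sample" | awk 'NR > 1 && $3 > 0 {
    pct = 100 * $4 / $3                 # input errors as % of input packets
    if (pct > 1.0)
        printf "%s: %.1f%% input errors\n", $1, pct
}')
echo "$flagged"
```

Run periodically (e.g., from cron) against live output, a check like this catches the gradual degradation described above before it becomes severe.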

15.7.2.1 Two TCP parameters

TCP operations are controlled by a very large number of parameters. Most of them should not be modified by nonexperts. In this subsection, we'll consider two that are most likely to produce significant improvements with little risk.

  • The maximum segment size (MSS) determines the largest "packet" size that the TCP protocol will transmit across the network. (The actual size will be 40 bytes larger due to the IP and TCP headers.) Larger segments result in fewer transmissions to transfer a given amount of data and usually provide correspondingly better performance on Ethernet networks.[33] For Ethernet networks, the maximum allowed size, 1460 bytes (1500 minus 40), is usually appropriate.[34]

    [33] Note that this will often not be the case for slow network links, especially for applications that are very sensitive to network transmission latencies.

    [34] When is it inappropriate? When the headers are larger than the minimum and using a size this large causes packet fragmentation and its resultant overhead. For example, a value of 1200-1300 is more appropriate when, say, the PPP over Ethernet protocol is used, as would be the case on a web server accessed by cable modem users.

  • Socket buffer sizes. When an application sends data across the network via the TCP protocol, it is first placed in a buffer. From there, the protocol will divide it as needed and create segments for transmission. Once the buffer is full, the application generally must wait for the entire buffer to be transmitted and acknowledged before it is allowed to queue additional data.

    On faster networks, a larger buffer size can improve application performance. The tradeoff here is that each buffer consumes memory, so the system must have sufficient available memory resources to accommodate all of the buffers for (at least) the usual network load. For example, using read and write socket buffers of 32 KB for each of 500 network connections would require approximately 32 MB of memory on the network server (32 x 2 x 500). This would not be a problem on a dedicated network server but might be an issue on busy, general-purpose systems.

    On current systems with reasonable memory sizes and no other applications with significant memory requirements, socket buffer sizes of 48 to 64 KB are usually reasonable.
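The memory tradeoff in the 500-connection example above is simple arithmetic: total socket buffer memory is buffer size times two (one read and one write buffer) times the number of connections.

```shell
# The memory arithmetic from the example above.
buf_kb=32
connections=500
total_kb=$(( buf_kb * 2 * connections ))    # 32000 KB, i.e., roughly 32 MB
echo "${total_kb} KB total socket buffer memory"
```

Redoing this calculation with your own buffer size and expected connection count is a quick way to decide whether a proposed increase is safe on a given server.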

Table 15-7 lists the relevant parameters for each of our Unix versions, along with the commands that may be used to modify them.

Table 15-7. Important TCP parameters

Version        Command                                    Socket Buffers [default in KB]    MSS [default in bytes]

AIX            no -o param=value                          tcp_sendspace [16]                tcp_mssdflt [512]
                                                          tcp_recvspace [16]

FreeBSD        sysctl param=value                         net.inet.tcp.sendspace [32]       net.inet.tcp.mssdflt [512]
               (also /etc/sysctl.conf)                    net.inet.tcp.recvspace [64]

HP-UX          ndd -set /dev/tcp param value              tcp_recv_hiwater_def [32]         tcp_mss_def [536]
               (also /etc/rc.config.d/nddconf)            tcp_xmit_hiwater_def [32]

Linux          echo "value" > /proc/sys/net/core/file     rmem_max [64]                     not tunable
(2.4 kernel)   echo "values" > /proc/sys/net/ipv4/file    wmem_max [64]
               (holds 3 values: min, default, max)        tcp_rmem [~85]
                                                          tcp_wmem [16]

Solaris        ndd -set /dev/tcp param value              tcp_recv_hiwat [48]               tcp_mss_def_ipv4 [512]
                                                          tcp_xmit_hiwat [48]

Tru64          sysconfig -r inet param=value              tcp_sendspace [60]                tcp_mssdflt [60]
               (also /etc/sysconfigtab)                   tcp_recvspace [60]
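One practical wrinkle: although Table 15-7 lists defaults in KB, the commands themselves generally take sizes in bytes, so convert before setting. The sketch below uses the Solaris tcp_recv_hiwat parameter from the table; the ndd line is shown commented out because actually running it requires root privileges on a Solaris system.

```shell
# Convert a desired buffer size from KB to the byte value the tuning
# command expects.
want_kb=48
want_bytes=$(( want_kb * 1024 ))
echo "setting receive buffer to $want_bytes bytes"

# On Solaris (as root), the actual command would then be:
# ndd -set /dev/tcp tcp_recv_hiwat $want_bytes
```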

The remaining sections will consider performance issues associated with two important network subsystems: DNS and NFS.

15.7.3 DNS Performance

DNS performance is another item that is easiest to affect at the planning stage. The key issues with DNS are:

  • Sufficient server capacity to service all of the clients

  • Balancing the load among the available servers

At the moment, the latter is best accomplished by specifying different name server orderings within the /etc/resolv.conf files on groups of client systems. It is also helpful to provide at least one DNS server on each side of slow links.

Careful placement of forwarders can also be beneficial. At larger sites, a two-tiered forwarding hierarchy may help to channel external queries through specific hosts and reduce the load on other internal servers.

Finally, use separate servers to handle internal and external DNS queries. Not only are there performance benefits for internal users, but it is also the best security practice.

DNS itself can also provide a very crude sort of load balancing via the use of multiple A records in a zone file, as in this example:

docsrv    IN   A   192.168.10.1
          IN   A   192.168.10.2
          IN   A   192.168.10.3

These records define three servers with the hostname docsrv. Successive queries for this name will receive each IP address in turn.[35]

[35] Actually, each query will receive each IP address as the first entry in the list that is returned. Most clients pay attention only to the top entry.

This technique is most effective when the operations requested from the servers are all essentially equivalent, so that a simple round-robin distribution of them is appropriate. It will be less successful when requests vary greatly in size or resource requirements. In such cases, manually assigning servers to the various clients will work better. You can do so by editing the nameserver entries in /etc/resolv.conf.
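The round-robin behavior can be pictured with a small simulation (this is not real DNS, just an illustration of the rotation): successive queries see the address list rotated by one position, and most clients use only the first entry they receive.

```shell
# The three A records from the docsrv example above.
addrs=(192.168.10.1 192.168.10.2 192.168.10.3)
n=${#addrs[@]}

first_answer() {    # address a typical client uses on query number $1
    echo "${addrs[$1 % n]}"
}

for q in 0 1 2 3; do
    echo "query $q -> $(first_answer $q)"
done
```

After n queries the rotation wraps around, which is why the load spreads evenly only when the requests themselves are roughly equivalent.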

15.7.4 NFS Performance

The Network File System is a very important Unix network service, so we'll complete our discussion of performance by considering some of its performance issues.

Monitoring NFS-specific network traffic and performance is done via the nfsstat command. For example, the following command lists NFS client statistics:

$ nfsstat -rc
Client rpc:
tcp:
      calls    badxids   badverfs   timeouts   newcreds
          0          0          0          0          0
    ...
udp:
      calls    badxids   badverfs   timeouts   newcreds    retrans
     302241          7          0          3          0          0
   badcalls     timers      waits
          7         22          0

This system performs NFS operations using the UDP protocol (the traditional method), so the TCP values are all 0. The most important items to consider in this report are the following:

timeouts

Operations that failed because the server failed to respond in time. Such operations must be repeated.

badxids

Duplicate replies received for operations that were retransmitted (indicating a "false positive" timeout).

If either of these values is appreciable, there is probably an NFS bottleneck somewhere. If badxids is within a factor of, say, 6-7 of timeouts, then the responsiveness of the remote NFS server is the source of the client's performance problems. On the other hand, if there are many more timeouts than badxids, then general network congestion is to blame.
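That heuristic can be sketched as a small decision rule. The counts below are hard-coded from the sample nfsstat -rc output above, and the factor-of-7 threshold follows the rough guideline in the text rather than any precise limit.

```shell
# Counts from the sample client report above (hard-coded for illustration).
timeouts=3
badxids=7

if [ "$timeouts" -eq 0 ]; then
    diagnosis="no timeouts: no NFS bottleneck indicated"
elif [ "$badxids" -gt 0 ] && [ $(( timeouts / badxids )) -le 7 ]; then
    diagnosis="badxids near timeouts: suspect NFS server responsiveness"
else
    diagnosis="timeouts far exceed badxids: suspect network congestion"
fi
echo "$diagnosis"
```

Of course, with absolute counts this small relative to 302,241 calls, neither value is "appreciable" and no action would actually be needed.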

The nfsstat command's -s option is used to obtain NFS server statistics:

$ nfsstat -s
Server nfs:
      calls   badcalls    badprog    badproc    badvers    badargs
      59077          0          0          0          0          0
 unprivport   weakauth
          0          0
Server nfs V2: (54231 out of 59077 calls)
       null    getattr    setattr       root     lookup   readlink       read
     0   0%    30   0%    12   0%     0   0%    68   0%     0   0% 30223  55%
    wrcache      write     create     remove     rename       link    symlink
     0   0% 23776  43%     4   0%     4   0%     0   0%     0   0%     0   0%
      mkdir      rmdir    readdir     statfs
     1   0%     0   0%    42   0%    71   0%
Server nfs V3: (4846 out of 59077 calls)
       null    getattr    setattr     lookup     access   readlink       read
     0   0%   366   7%     0   0%  3096  63%   711  14%     0   0%     0   0%
      write     create      mkdir    symlink      mknod     remove      rmdir
     0   0%     0   0%     0   0%     0   0%     0   0%     0   0%     0   0%
     rename       link    readdir   readdir+     fsstat     fsinfo   pathconf
     0   0%     0   0%    47   0%   345   7%   166   3%    12   0%   103   2%
     commit
     0   0%

The first section of the report gives overall NFS server statistics. The remainder of the report serves to break down NFS operations by type. This server supports both NFS Versions 2 and 3, so we see values in both of the final two sections of the report.

15.7.4.1 NFS Version 3 performance improvements

Many Unix systems now provide NFS Version 3 instead of or in addition to Version 2. NFS Version 3 offers benefits in several areas, including reliability, security, and performance. The following are the most important improvements it provides:

  • TCP versus UDP: Traditionally, NFS uses the UDP transport protocol. NFS Version 3 uses TCP as its default transport protocol.[36] Doing so provides NFS operations with both flow control and packet-level retransmission. By contrast, when using UDP, any network failure requires that the entire operation be repeated. Thus, using TCP often results in smaller performance hits when there are problems.

    [36] Some NFS Version 2 implementations can also optionally use TCP instead of UDP.

  • Two-phase writes: Previously, NFS write operations were performed synchronously, meaning that a client had to wait for each write operation to be completed before starting another one. Under NFS Version 3, write operations are performed in two parts:

    • The client queues a write request, which the server acknowledges immediately. Additional write operations can be queued once the acknowledgement is received.

    • The client commits the write operation (possibly after some intermediate modifications), and the server commits it to disk, or requests its retransmission if the data is no longer available (e.g., if there was an intervening system crash).

  • The maximum data block size is increased (the previous limit was 8 KB). The actual maximum value is determined by the transport protocol; for TCP, it is 32 KB. In addition to reducing the number of packets, a larger block size can result in fewer disk seeks and faster sequential file access. The effect is especially noticeable on high-speed networks.
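The effect of the larger block size is easy to quantify with back-of-the-envelope arithmetic: the number of read operations needed to transfer a file drops proportionally.

```shell
# Reads needed to transfer a 1 MB file at the old 8 KB limit versus the
# 32 KB maximum NFS Version 3 allows over TCP.
file_kb=1024
ops_8k=$(( file_kb / 8 ))
ops_32k=$(( file_kb / 32 ))
echo "8 KB blocks: $ops_8k reads; 32 KB blocks: $ops_32k reads"
```

A fourfold reduction in operations means correspondingly fewer round trips and fewer opportunities for seeks, which is where the sequential-access speedup comes from.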

15.7.4.2 NFS performance principles

The following points are important to keep in mind with respect to NFS server performance, especially in the planning stages:

  • Mounting NFS filesystems in the background (i.e., with the bg option) will speed up boots.

  • Use an appropriate number of NFS daemon processes. The usual rule of thumb is two per expected simultaneous client process. Conversely, if there are idle NFS daemons on a server, you can reduce their number and release their (albeit small) memory resources.

  • Very busy NFS servers will benefit from a multiprocessor computer. CPU resources are almost never an issue for NFS, but the context switches generated by very large numbers of clients can be significant.

  • Don't neglect the usual system memory and disk I/O performance considerations, including the size of the buffer cache, filesystem fragmentation, and data distribution across disks.

  • NFS searches remote directories sequentially, entry by entry, so avoid remote directories with large numbers of files.

  • Remember that not every task is appropriate for remote files. For example, compiling a program such that the object files are written to a remote filesystem will run very slowly indeed. In general, source files may be remote, but object files and executables should be created on the local system. For best network performance, avoid writing large amounts of data to remote files (although you may choose to sacrifice disk and network I/O performance in order to use the CPU resources of a fast remote system).

Resources for You

After all of this discussion of system resources, it's worth spending a little time considering ones for yourself. Resources for system administrators come in many varieties: books and magazines, web sites and news groups, conferences and professional organizations, and humor and fun (all work and no play won't do anything positive for your performance).

Here are some of my favorites:

  • An excellent Unix internals book: UNIX Internals: The New Frontier by Uresh Vahalia (Prentice-Hall).

  • Sys Admin magazine, http://www.sysadminmag.com

  • Useful web sites: http://www.ugu.com, http://www.lwn.net, http://www.slashdot.com (the last for news and rumors).

  • LISA: an annual conference for system administrators run by Usenix and Sage (see http://www.usenix.org/events).

  • UNIX Hater's Handbook, ed. Simson Garfinkel, Daniel Weise, and Steve Strassmann (IDG Books). This is still the funniest book I've read in a long time. You can expect to waste a few hours at work if you start reading it there, because you won't be able to put it down.



Essential System Administration, Third Edition
ISBN: 0596003439
Year: 2002