Identifying Resources | Hacking Ubuntu: Serious Hacks Mods and Customizations (ExtremeTech)

Your system has a lot of different resources that can be used by processes. These resources include CPU processing time, disk space, disk I/O, RAM, graphic memory, and network traffic. Fortunately, there are ways to measure each of these resources.

What's Up, /proc?

Linux provides a virtual file system that is mounted in the /proc directory. This directory lists system resources and running processes. For example:

 $ ls -F /proc
 1/     3910/  4133/  4351/     bus/          iomem      partitions
 1642/  3930/  4135/  4352/     cmdline       ioports    pmu/
 1645/  3945/  4137/  4363/     cpuinfo       irq/       scsi/
 1650/  3951/  4167/  4364/     crypto        kallsyms   self@
 1736/  3993/  4220/  4382/     devices       kcore      slabinfo
 1946/  4/     4224/  5/        device-tree/  key-users  stat
 2/     4009/  4237/  54/       diskstats     kmsg       swaps
 20/    4027/  4250/  55/       dma           loadavg    sys/
 3/     4057/  4270/  56/       driver/       locks      sysrq-trigger
 3310/  4072/  4286/  57/       execdomains   mdstat     sysvipc/
 3333/  4073/  4299/  6/        fb            meminfo    tty/
 3335/  4081/  4347/  651/      filesystems   misc       uptime
 3356/  4091/  4348/  apm       fs/           modules    version
 3402/  4092/  4349/  asound/   ide/          mounts@    vmstat
 3904/  4127/  4350/  buddyinfo interrupts    net/       zoneinfo

Each numbered directory corresponds to a running process; the number is the process ID. In each directory, you will find the process's command line and environment. The non-numeric entries are provided by the kernel and device drivers, and they show system resources. For example, /proc/iomem shows the hardware I/O map and /proc/cpuinfo provides information about the system CPUs.

Although /proc is useful for debugging, applications should be careful when depending on it. In particular, everything is dynamic: process directories may appear and vanish quickly and some resources constantly change.
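One way to cope with that churn is to treat every read from /proc as something that may fail. The following is a minimal sketch, not a replacement for ps: it walks the numbered directories and prints each process ID with its command line, quietly skipping any process that exits during the scan. The function name list_procs and its optional ROOT argument (which defaults to /proc) are illustrative choices, not standard tools.

```shell
# list_procs [ROOT]: print "PID command-line" for each numbered directory.
# ROOT is a hypothetical convenience argument; it defaults to /proc.
list_procs() {
    root="${1:-/proc}"
    for dir in "$root"/[0-9]*; do
        # The process may exit between the glob and the read, so a failed
        # read is not an error: discard the message and move on.
        cmd=$( { tr '\0' ' ' < "$dir/cmdline"; } 2>/dev/null ) || continue
        # Kernel threads have an empty cmdline; skip them.
        [ -n "$cmd" ] && printf '%s %s\n' "${dir##*/}" "$cmd"
    done
    return 0
}
```

The arguments in cmdline are separated by NUL bytes, which is why tr converts them to spaces. Running list_procs with no arguments scans the live /proc.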

Measuring CPU

The CPU load can be measured in a couple of ways. The uptime command provides a simple summary. It lists three values: the load averages over the last 1 minute, 5 minutes, and 15 minutes. The load is the average number of processes that are running or waiting to run. If you have one CPU and the load is less than 1.0, then you are not consuming all of the CPU resources. A load of 2.0 on one CPU means all resources are being consumed and you would need twice as many CPUs to eliminate the wait time. If you have two CPUs, then a load of 2.0 indicates that both processors are operating at maximum capacity. Although a load of 1.0 won't seem sluggish, a load of 5.0 can be noticeable because commands may need to wait a few moments before being processed.
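Because the meaning of a load value depends on how many CPUs you have, it can help to divide one by the other. This is a small sketch under two assumptions: the 1-minute load is the first field of /proc/loadavg, and the nproc command (from GNU coreutils) is installed. The helper name load_per_cpu is made up for this example.

```shell
# load_per_cpu LOAD NCPU: divide a load average by the number of CPUs.
# A result close to or above 1.00 suggests the CPUs are fully busy.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Live usage on Linux (commented out so the helper can be tried by hand first):
# load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

For example, load_per_cpu 2.00 2 prints 1.00, which corresponds to two processors running at capacity.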

While uptime provides a basic metric, top gives finer details. While running top, you can press 1 to see the load per CPU at the top of the screen and you can see which processes are consuming the most CPU resources. The command ps aux also shows CPU resources per process.

Measuring Disk Space

The commands df and du are used to identify disk space. The disk-free command (df, also sometimes called disk-full or disk-file system) lists every mounted partition and the amount of disk usage. The default output shows the information in blocks. You can also see the output in a human-readable form (-h), with sizes in kilobytes, megabytes, or gigabytes: df -h. The df command also allows you to specify a file or directory name. In this case, it will show the disk usage for the partition containing the file (or directory). For example, to see how much space is available on the partition containing the current directory, use:

 $ df .     # default output
 Filesystem           1K-blocks      Used Available Use% Mounted on
 /dev/hda1            154585604  72737288  73995748  50% /
 $ df -h .    # human readable form
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/hda1             148G   70G   71G  50% /
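If you want to act on the Use% column from a script, the output needs to be parsed. The sketch below assumes df is run with -P, the POSIX output format that keeps each file system on a single line. The function name use_percent and the 90 percent threshold in the usage line are arbitrary choices for illustration.

```shell
# use_percent: read `df -P` output on stdin and print the Use% value
# of the first listed file system as a bare number.
use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example: warn when the partition holding the current directory is nearly full.
# pct=$(df -P . | use_percent)
# [ "$pct" -gt 90 ] && echo "Disk is ${pct}% full"
```

Fed the /dev/hda1 line shown above, use_percent prints 50.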

You can also use the System Monitor (System → Administration → System Monitor) to graphically show the df results (see Figure 7-2).

Figure 7-2: System Monitor showing available disk space

The disk-usage (du) command shows disk usage by directory. When used by itself, it will display the disk space in your current directory and every subdirectory. If you specify a directory, then it starts there instead. To see the biggest directories, you can use a command like du | sort -rn | head. This will sort all directories by size and display the top 10 biggest directories. Finally, you can use the -s parameter to stop du from listing the sizes from every subdirectory. When I am looking for disk hogs in my directory, I usually use du -s * | sort -rn | head. This lists the directories in size order. I can then enter the biggest directory and repeat the command until I find the largest files.
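The disk-hog search can be wrapped in a small function so that it is easy to repeat one level down. This is only a sketch: the name biggest is invented here, and -k is added so that sizes are always reported in kilobytes and sort -n compares like units. As with du -s *, hidden files (names beginning with a dot) are not matched by the * pattern.

```shell
# biggest [DIR] [N]: list the N largest entries directly inside DIR.
# Defaults: the current directory and 10 entries. Sizes are in kilobytes.
biggest() {
    dir="${1:-.}"
    n="${2:-10}"
    # -s prints one total per entry instead of descending into the listing.
    du -sk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}
```

Run biggest to see the largest entries in the current directory, then run it again with the top result as the argument, repeating until you reach the largest files.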

Tip

The du command looks at every file in every subdirectory. If you have thousands of files, then this could take a while. When looking for large directories, consider the ones that take the longest to process. If every directory takes a second to display and one directory takes a minute, then you can press Ctrl+C because you probably found the biggest directory.

Measuring Disk I/O

All processes that access a disk do so over the same I/O channel. If the channel becomes clogged with traffic, then the entire system may slow down. It is very easy for a low-CPU application to consume most of the disk I/O. While the system load will remain low, the computer will appear sluggish.

If the system seems to be running slowly, you can use iostat (sudo apt-get install sysstat) to check the performance (see Listing 7-2). Besides showing the system load, the I/O metrics from each device are displayed. I usually use iostat with the watch command in order to identify devices that seem overly active.

 watch --interval 0.5 iostat

Listing 7-2: Installing and Using iostat

 $ sudo apt-get install sysstat  # install iostat
 $ iostat
 Linux 2.6.15-26-686 (chutney)   09/30/2006
 avg-cpu:  %user   %nice %system %iowait  %steal    %idle
            0.22    0.00    0.13    0.10    0.00    99.55
 Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
 hda               1.98        19.46        23.12    2805161     3332264
 hdb               0.18         1.39         0.46     200903       66000
 sda               0.01         0.06         0.01       8612        1536
 sdb               0.01         0.07         0.01       9518        1536
 md0               0.01         0.11         0.01      15450        1392
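When there are many devices, picking out the busiest one by eye gets tedious. The sketch below assumes the column layout shown in Listing 7-2, where the second column of the device table is tps (transfers per second). The function name busiest_device is hypothetical.

```shell
# busiest_device: read iostat output on stdin and print the device
# with the highest tps value, followed by that value.
busiest_device() {
    awk '
        BEGIN       { max = -1 }
        /^Device/   { in_table = 1; next }   # header row starts the device table
        NF == 0     { in_table = 0; next }   # a blank line ends it
        in_table && $2 ~ /^[0-9.]+$/ {
            if ($2 + 0 > max) { max = $2 + 0; dev = $1; tps = $2 }
        }
        END         { if (dev != "") print dev, tps }
    '
}
```

For the listing above, iostat | busiest_device would print hda 1.98. Keep in mind that the first report from iostat covers the time since the system booted, so it reflects long-term averages rather than current activity.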

After finding which device is active, you can identify where the device is mounted using the mount command:

 $ mount
 /dev/hda1 on / type ext3 (rw,errors=remount-ro)
 proc on /proc type proc (rw)
 /sys on /sys type sysfs (rw)
 varrun on /var/run type tmpfs (rw)
 varlock on /var/lock type tmpfs (rw)
 udev on /dev type tmpfs (rw)
 devpts on /dev/pts type devpts (rw,gid=5,mode=620)
 devshm on /dev/shm type tmpfs (rw)

Now that you know which device is active and where it is used, you can use lsof to identify which processes are using the device. For example, if device hda is the most active and it is mounted on /, then you can use lsof / to list every process that has files open on that file system. If a raw device is being used, then you can specify all devices with lsof /dev or a single device (for example, hda) using lsof /dev/hda.
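The lsof output can be long, so it may help to summarize it by command name. This sketch assumes the usual lsof layout, with a header line followed by rows whose first column is the command. The function name count_by_command is made up for the example.

```shell
# count_by_command: read lsof output on stdin and print the number of
# open files per command, largest count first.
count_by_command() {
    awk 'NR > 1 { print $1 }' | sort | uniq -c | sort -rn
}
```

A typical use would be lsof / | count_by_command | head. A high count only shows which programs hold many files open; it does not measure how much data they read or write.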

Note

Unfortunately, there is no top-like command for disk I/O. You can narrow down the list of suspected applications using lsof, but you cannot identify which application is consuming most of the disk resources.

Measuring Memory Usage

RAM is a limited resource on the system. If your applications allocate all available RAM, then the kernel will begin swapping memory to disk. Although swap space can allow you to run massively large applications, swap is also very slow compared to just using RAM. There are a couple of ways to view swap usage. The command swapon -s will list the available swap space and show the usage. There is usually a little swap space used, but if it is very full then you either need to allocate more swap space, install more RAM, or find out what is consuming the available RAM. The System Monitor (System → Administration → System Monitor) enables you to graphically view the available memory usage and swap space and identify if it is actively being used (see Figure 7-3).

Figure 7-3: The System Monitor displaying CPU, memory, swap, and network usage
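Swap usage can also be read from /proc/meminfo, which is handy in scripts. The following sketch assumes the SwapTotal and SwapFree fields that the Linux kernel reports there, both in kilobytes. The function name swap_used_kb is illustrative.

```shell
# swap_used_kb: read /proc/meminfo-style text on stdin and print the
# amount of swap in use, in kilobytes (SwapTotal minus SwapFree).
swap_used_kb() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free  = $2 }
        END                { print total - free }
    '
}
```

On a running system, the usage is swap_used_kb < /proc/meminfo.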

To identify which applications are consuming memory, use the top or ps aux commands. Both of these commands show memory allocation per process. In addition, the pmap command can show you memory allocations for specific process IDs.

Measuring Video Memory

The amount of memory on your video card will directly impact your display. If you have an old video card with 256 KB of RAM, then the best you can hope for is 800x600 with 16 colors. Most high-end video cards today have upwards of 128 MB of RAM, allowing monster resolutions like 1280x1024 with 32-bit color. More memory also eases animation for games and desktops. While one set of video memory holds the main picture, other memory sections can act as layers for animated elements.

There is no simple way to determine video memory. If you have a PCI video card, then the command lspci -v will show you all PCI cards (including your video card) and all memory associated with each card. For example:

 $ lspci -v | more
 0000:01:00.0 VGA compatible controller: nVidia Corporation NV18 [GeForce4 MX 4000 AGP 8x] (rev c1) (prog-if 00 [VGA])
         Subsystem: Jaton Corp: Unknown device 0000
         Flags: bus master, 66MHz, medium devsel, latency 248, IRQ 177
         Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
         Memory at f0000000 (32-bit, prefetchable) [size=128M]
         Expansion ROM at fbee0000 [disabled] [size=128K]
         Capabilities: <available only to root>

This listing shows an NVIDIA NV18 video card with 128 MB of video RAM.
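To pull that figure out without paging through every device, the lspci output can be filtered. This sketch follows the reading used here, treating the prefetchable memory region of the VGA controller as the video RAM; on some hardware that region may not match the installed memory exactly. The function name vga_memory is hypothetical.

```shell
# vga_memory: read `lspci -v` output on stdin and print the size of each
# prefetchable memory region that belongs to a VGA compatible controller.
vga_memory() {
    awk '
        # A device header starts in the first column; detail lines are indented.
        /^[^ \t]/ { vga = ($0 ~ /VGA compatible controller/) }
        vga && /Memory at/ && /, prefetchable\)/ {
            split($0, parts, "size=")
            size = parts[2]
            sub(/\].*/, "", size)    # drop the closing bracket
            print size
        }
    '
}
```

With the card shown above, lspci -v | vga_memory would print 128M.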

Tip

On large supercomputers, lspci not only shows what is attached but also where. For example, if you have eight network cards then it can identify which slot each card is in. This is extremely useful for diagnostics in a mission-critical environment with fail-over hardware support. One example is to use (lspci -t ; lspci -v) | less to show the bus tree and each item's details.

Measuring Network Throughput

Just as disk I/O can create a performance bottleneck, so can network I/O. While some applications poll the network for data and increase CPU load when the network is slow, most applications just wait until the network is available and do not impact the CPU's load.

If the computer seems sluggish when accessing the network, then you can check the network performance using netstat -i inet:

 $ netstat -i inet
 Kernel Interface table
 Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR     TX-OK TX-ERR TX-DRP TX-OVR Flg
 eth0   1500 0    338386      0      0      0   737350      0      0       0 BMRU
 lo    16436 0       786      0      0      0      786      0      0       0 LRU
 vmnet  1500 0         0      0      0      0      465      0      0       0 BMRU
 vmnet  1500 0         0      0      0      0      465      0      0       0 BMRU

This shows the amount of traffic on each network interface as well as any network errors, dropped packets, and overruns. This also shows the name of the network interface (for example, eth0). When checking network usage, I usually use netstat with the watch command so I can see network usage over time:

 watch --interval 0.5 netstat -i inet
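When you only care about trouble, the interface table can be reduced to the interfaces that report errors, drops, or overruns. The sketch below finds those columns by their header names (RX-ERR, TX-DRP, and so on) rather than by position, since the exact set of columns can vary between netstat versions. The function name iface_problems is invented for this example.

```shell
# iface_problems: read `netstat -i` output on stdin and print each
# interface that has a non-zero error, drop, or overrun counter.
iface_problems() {
    awk '
        $1 == "Iface" {
            # Remember which columns hold the ERR, DRP, and OVR counters.
            for (i = 1; i <= NF; i++)
                if ($i ~ /-(ERR|DRP|OVR)$/) name[i] = $i
            seen_header = 1
            next
        }
        seen_header {
            bad = ""
            for (i = 1; i <= NF; i++)
                if ((i in name) && $i + 0 > 0) bad = bad " " name[i] "=" $i
            if (bad != "") print $1 ":" bad
        }
    '
}
```

Running netstat -i | iface_problems prints nothing when every counter is zero, as in the table above. An interface with problems would appear as a line such as eth0: RX-DRP=3.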

Tip

The netstat -i inet command shows the number of packets from every interface. You can also use ifconfig (for example, ifconfig eth0) to see more detail; ifconfig shows the number of packets and number of bytes from a particular network interface.

The netstat -t and netstat -u commands allow you to see which network connections are active. The -t option shows TCP traffic, and -u shows UDP traffic. There are many other options including --protocol=ip to show all IP (IPv4) connections, and IPv6 connections are listed with --protocol=ip6.

To identify which processes are using the network, you can use lsof. The -i4 parameter shows which processes have IPv4 connections, -i6 displays IPv6, -i tcp lists TCP, and -i udp displays applications with open UDP sockets:

 $ lsof -i4 -n  # show network processes and give IP addresses as numbers
 COMMAND  PID  USER   FD   TYPE DEVICE SIZE NODE NAME
 ssh     8699 mark     3u  IPv4 120398       TCP 10.3.1.5:41525->10.3.1.3:ssh (ESTABLISHED)
 ssh     8706 mark     3u  IPv4 120576       TCP 10.3.1.5:41526->10.3.7.245:ssh (ESTABLISHED)