Workloads have a tendency to consume all available memory. Linux provides reasonably efficient access to physical memory, and it also provides access to potentially huge amounts of "virtual" memory. Virtual memory is usually little more than an operating system's capability to offload less frequently used data to disk storage while presenting the illusion that the system has an enormous amount of physical memory. Unfortunately, accessing offloaded memory can be ten or a hundred times more expensive in terms of application latency. Those high latencies can impact application response times dramatically if the wrong memory is paged out to disk, or if the application's active memory footprint is larger than physical memory. Many performance problems are caused by insufficient memory, which triggers system swapping. Thus, it is useful to have tools that monitor memory utilization: for example, how memory is consumed per process or per thread, and how memory is consumed by kernel data structures, along with their counts and sizes. As with CPU utilization, understanding how both the system and individual processes are behaving is key to tracking down any performance problems caused by memory shortages.

/proc/meminfo and /proc/slabinfo

Linux provides facilities to monitor the utilization of overall system memory resources under the /proc file system, namely /proc/meminfo and /proc/slabinfo. These two files capture the state of physical memory. A partial display of /proc/meminfo is as follows:

MemTotal:      8282420 kB
MemFree:       7942396 kB
Buffers:         46992 kB
Cached:         191936 kB
SwapCached:          0 kB
HighTotal:     7470784 kB
HighFree:      7232384 kB
LowTotal:       811636 kB
LowFree:        710012 kB
SwapTotal:      618492 kB
SwapFree:       618492 kB
Mapped:          36008 kB
Slab:            36652 kB

MemTotal gives the total amount of physical memory in the system, whereas MemFree gives the total amount of unused memory. Buffers corresponds to the buffer cache for I/O operations.
Cached corresponds to the memory used for caching files read from disk. SwapCached represents the amount of memory that has been swapped out and is still held in the swap cache. SwapTotal represents the amount of disk space reserved for swapping. If an IA32-based system has more than 1GB of physical memory, HighTotal is nonzero; HighTotal corresponds to the physical memory above roughly 860MB. LowTotal is the memory directly usable by the kernel. Mapped corresponds to files that are memory-mapped. Slab corresponds to the memory used for kernel data structures.

By capturing /proc/meminfo periodically, you can establish a pattern of memory utilization. With the aid of simple scripts and graphics tools, the pattern can also be summarized visually.

To understand kernel memory consumption, examine /proc/slabinfo. A partial display of /proc/slabinfo is as follows:

tcp_bind_bucket      56   224   32   2   2   1
tcp_open_request     16    58   64   1   1   1
inet_peer_cache       0     0   64   0   0   1
secpath_cache         0     0   32   0   0   1
flow_cache            0     0   64   0   0   1

The first column lists the names of the kernel data structures. Taking tcp_bind_bucket as an example, there is a total of 224 tcp_bind_bucket objects, 56 of which are active. Each object takes up 32 bytes. Two slabs contain at least one active object, a total of two slabs are allocated, and one page is allocated for each slab. This information highlights data structures that merit closer attention, such as those with larger counts or sizes. Thus, by capturing meminfo and slabinfo together, you can begin to understand which elements of the operating system are consuming the most memory. If the values of LowFree or HighFree are relatively small (or smaller than usual), the system may be handling more requests for memory than usual, which may lead to a reduction in overall performance or application response times.
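The arithmetic above (object count times object size) can be applied across a whole slabinfo listing to rank kernel caches by estimated memory consumption. The following sketch is illustrative, not from the text; it parses slabinfo-style lines laid out like the sample above (a real /proc/slabinfo also carries header lines and extra columns, which a production script would need to skip):

```python
# Rank kernel slab caches by estimated memory consumed (num_objs * objsize).
# Column layout follows the sample listing shown above:
# name, active_objs, num_objs, objsize, active_slabs, num_slabs, pages_per_slab.
SAMPLE_SLABINFO = """\
tcp_bind_bucket 56 224 32 2 2 1
tcp_open_request 16 58 64 1 1 1
inet_peer_cache 0 0 64 0 0 1
"""

def rank_slab_caches(text):
    caches = []
    for line in text.splitlines():
        fields = line.split()
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        caches.append((name, num_objs * objsize))  # bytes consumed (estimate)
    # Largest consumers first.
    return sorted(caches, key=lambda c: c[1], reverse=True)

for name, total in rank_slab_caches(SAMPLE_SLABINFO):
    print(f"{name:20s} {total:6d} bytes")
```

Run periodically, a script like this makes growth in a particular cache stand out immediately.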
ps

To find out how memory is used within a particular process, use ps for an overview of memory use per process:

$ ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY  STAT START  TIME COMMAND
root         1  0.0  0.0  1528  528 ?    S    15:24  0:00 init [2]
root         2  0.0  0.0     0    0 ?    SN   15:24  0:00 [ksoftirqd/0]
root         3  0.0  0.0     0    0 ?    S<   15:24  0:00 [events/0]
root         4  0.0  0.0     0    0 ?    S<   15:24  0:00 [khelper]
root         5  0.0  0.0     0    0 ?    S<   15:24  0:00 [kacpid]
root        48  0.0  0.0     0    0 ?    S<   15:24  0:00 [kblockd/0]
root        63  0.0  0.0     0    0 ?    S    15:24  0:00 [pdflush]
root        64  0.0  0.0     0    0 ?    S    15:24  0:00 [pdflush]

The output of the ps aux command shows the total percentage of system memory that each process consumes, as well as its virtual memory footprint (VSZ) and the amount of physical memory that the process is currently using (RSS). You can also use top(1) to sort the process listing interactively to see which processes are consuming the most memory and how that consumption changes as the system runs.

After you have identified a few processes of interest, you can investigate their specific memory allocations by looking at the layout of each process's virtual address space. /proc/pid/maps, where pid is the process ID of a particular process as found through ps(1) or top(1), contains all mappings of the process's address space and their sizes. Each map shows the address range that is allocated, the permissions on the pages, and the location of the backing store associated with that address range (if any). /proc/pid/maps is not a performance tool per se; however, it provides insight into how memory is allocated. For example, for performance purposes, you can confirm whether a certain amount of shared memory is allocated between 1GB and 2GB in the virtual address space, and the map can be used to examine its utilization. The following output is for process ID 3162:
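As a sketch of the kind of analysis a maps listing supports, the following script sums the sizes of the address ranges to estimate how much address space the mappings cover, optionally restricted by permission flags. It is not from the text, and the sample mapping lines are hypothetical, for illustration only:

```python
# Sum the sizes of the address ranges in /proc/<pid>/maps-style lines.
# Each line begins with "start-end" in hexadecimal, followed by permissions.
# The sample lines below are hypothetical, for illustration only.
SAMPLE_MAPS = """\
08048000-0804c000 r-xp 00000000 03:01 12345 /usr/bin/example
0804c000-0804d000 rw-p 00003000 03:01 12345 /usr/bin/example
40000000-40100000 rw-s 00000000 00:04 67890 /SYSV00000000
"""

def total_mapped_bytes(text, perms_filter=None):
    total = 0
    for line in text.splitlines():
        addr_range, perms = line.split()[:2]
        # Optionally count only mappings whose permission string contains
        # a given flag, e.g. "s" for shared mappings.
        if perms_filter and perms_filter not in perms:
            continue
        start, end = (int(x, 16) for x in addr_range.split("-"))
        total += end - start
    return total

print(total_mapped_bytes(SAMPLE_MAPS))       # all mappings
print(total_mapped_bytes(SAMPLE_MAPS, "s"))  # shared mappings only
```

Pointing the same parser at a live /proc/pid/maps file lets you confirm, for instance, how much shared memory a process has mapped and where it sits in the address space.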
vmstat

vmstat was introduced in the section on CPU utilization. However, its primary purpose is to monitor memory availability and swapping activity, and it also provides an overview of I/O activity. vmstat can be used to help find unusual system activity, such as high page faults or excessive context switches, that can lead to a degradation in system performance. A sample of the vmstat output is as follows:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free  buff  cache  si  so     bi  bo   in   cs us sy id wa
18  8      0 5626196  3008 122788   0   0 330403 454 2575 4090 91  8  1  0
18 15      0 5625132  3008 122828   0   0 328767 322 2544 4264 91  8  0  0
17 12      0 5622004  3008 122828   0   0 327956 130 2406 3998 92  8  0  0
22  2      0 5621644  3008 122828   0   0 327892 689 2445 4077 92  8  0  0
23  5      0 5621616  3008 122868   0   0 323171 407 2339 4037 92  8  1  0
21 14      0 5621868  3008 122868   0   0 323663  23 2418 4160 91  9  0  0
22 10      0 5625216  3008 122868   0   0 328828 153 2934 4518 90  9  1  0

The memory-related data reported by vmstat includes the following:
For I/O-intensive workloads, you can monitor bi and bo for the transfer rate and in for the interrupt rate. You can monitor swpd, si, and so to see whether the system is swapping; if it is, you can check the swapping rate. Perhaps the most common use is monitoring CPU utilization through the us, sy, id, and wa columns. If wa is large, examine the I/O subsystem; you might conclude that more I/O controllers and disks are needed to reduce the I/O wait time.
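The swap check described above is easy to automate. The sketch below is illustrative, not from the text; it parses vmstat-style data rows (header lines already stripped, as would be needed when piping live `vmstat 5` output) and flags samples where si or so is nonzero, meaning the system is actively swapping:

```python
# Flag vmstat samples where the system is swapping (si or so nonzero).
# Column order follows the vmstat output shown earlier:
# r b swpd free buff cache si so bi bo in cs us sy id wa
# The second sample line is hypothetical, for illustration only.
SAMPLE_VMSTAT = """\
18 8 0 5626196 3008 122788 0 0 330403 454 2575 4090 91 8 1 0
2 0 81920 512000 3008 122788 120 350 10403 454 1575 2090 40 5 50 5
"""

def swapping_samples(text):
    flagged = []
    for i, line in enumerate(text.splitlines()):
        fields = [int(f) for f in line.split()]
        si, so = fields[6], fields[7]  # pages swapped in/out per interval
        if si > 0 or so > 0:
            flagged.append((i, si, so))
    return flagged

for idx, si, so in swapping_samples(SAMPLE_VMSTAT):
    print(f"sample {idx}: si={si}, so={so} -- system is swapping")
```

A loop like this, fed by periodic vmstat samples, turns an occasional manual check into a continuous early warning for memory pressure.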