6.18. Swap Space

In this section we look at how swap is allocated and then discuss the statistics used for monitoring swap. We refer to swap space as seen by the processes as virtual swap space and real (disk or file) swap space as physical swap space.

6.18.1. Swap Allocation

Swap space allocation goes through distinct stages: reserve, allocate, and swap-out. When you first create a segment, you reserve virtual swap space; when you first touch and allocate a page, you "allocate" virtual swap space for that page; then, if you encounter a memory shortage, you can "swap out" a page to swap space. Table 6.6 summarizes the swap states.

Table 6.6. Swap Space Allocation States

| State | Description |
|---|---|
| Reserved | Virtual swap space is reserved for an entire segment. Reservation occurs when a segment is created with private/read/write access. The reservation represents the virtual size of the area being created. |
| Allocated | Virtual swap space is allocated when the first physical page is assigned to it. At that point, a swapfs vnode and offset are assigned against the anon slot. |
| Swapped out (used swap) | When a memory shortage occurs, a page may be swapped out by the page scanner. Swap-out happens when the page scanner calls swapfs_putpage for the page in question. The page is migrated to physical (disk or file) swap. |


Swap space is reserved each time a heap segment is created. The amount of swap space reserved is the entire size of the segment being created. Swap space is also reserved if there is a possibility of anonymous memory being created. For example, mapped file segments that are mapped MAP_PRIVATE (like the executable data segment) reserve swap space because at any time they could create anonymous memory during a copy-on-write operation.

You should reserve virtual swap space up-front so that swap space allocation assignment is done at the time of request, rather than at the time of need. That way, an out-of-swap-space error can be reported synchronously during a system call. If you allocated swap space on demand during program execution rather than when you called malloc(), the program could run out of swap space during execution and have no simple way to detect the out-of-swap-space condition. For example, in the Solaris kernel, we fail a malloc() request for memory as it is requested rather than when it is needed later, to prevent processes from failing during seemingly normal execution. (This strategy differs from that of operating systems such as IBM's AIX, where lazy allocation is done. If the resource is exhausted during program execution, then the process is sent a SIGDANGER signal.)
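The difference between failing at request time and failing at time of need can be sketched with a toy accounting model. This is only an illustration of the reserve-then-allocate idea described above, not the kernel's implementation; the class name `VirtualSwapPool` and the page counts are hypothetical.

```python
class VirtualSwapPool:
    """Toy model of reserve/allocate accounting (hypothetical, not kernel code)."""

    def __init__(self, total_pages):
        self.total = total_pages
        self.reserved = 0
        self.allocated = 0

    def reserve(self, pages):
        # Fail synchronously, at request time, if the pool cannot cover it.
        if self.reserved + pages > self.total:
            raise MemoryError("out of swap space")
        self.reserved += pages

    def allocate(self, pages):
        # First touch consumes part of an existing reservation, so it
        # never fails for lack of swap in this model.
        if self.allocated + pages > self.reserved:
            raise ValueError("allocation exceeds reservation")
        self.allocated += pages


pool = VirtualSwapPool(total_pages=1000)   # hypothetical pool size
pool.reserve(800)                          # segment creation succeeds
pool.allocate(50)                          # touching pages later always fits
try:
    pool.reserve(300)                      # refused immediately, at request time
    refused = False
except MemoryError:
    refused = True
```

In this model the only operation that can report an out-of-swap condition is `reserve`, which corresponds to the system call that creates the segment.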

The swapfs file system includes all available pageable memory as virtual swap space in addition to the physical swap space. That way, you can "reserve" virtual swap space and "allocate" swap space when you first touch a page. When you reserve swap, you reserve virtual swap space from swapfs rather than disk space. Disk swap pages are allocated only when a page is paged out.

With swapfs, the amount of virtual swap space available is the amount of available unlocked, pageable physical memory plus the amount of physical (disk) swap space available. If you were to run without swap space, then you could reserve as much virtual memory as there is unlocked pageable physical memory available on the system. This would be fine, except that often virtual memory requirements are greater than physical memory requirements, and this case would prevent you from using all the available physical memory on the system.

For example, a process may reserve 100 Mbytes of memory and then allocate only 10 Mbytes of physical memory. The process's physical memory requirement would be 10 Mbytes, but it had to reserve 100 Mbytes of virtual swap, thus using 100 Mbytes of virtual swap allocated from available real memory. If we ran such a process on a 128-Mbyte system, we would likely start only one of these processes before we exhausted our swap space. If we added more virtual swap space by adding a disk swap device, then we could reserve against the additional space, and we would likely get 10 or so of the equivalent processes in the same physical memory.
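The arithmetic behind this example can be sketched as follows. The sketch assumes, for simplicity, that all 128 Mbytes of memory are unlocked and pageable (in practice somewhat less is), and the 1024-Mbyte disk swap device is a hypothetical size chosen for illustration.

```python
def max_processes(pageable_mb, disk_swap_mb, reserve_mb=100, touched_mb=10):
    """Estimate how many identical processes fit (simplified model)."""
    # Reservations are made against virtual swap: pageable memory plus disk swap.
    virtual_swap_mb = pageable_mb + disk_swap_mb
    by_reservation = virtual_swap_mb // reserve_mb
    # Touched pages must fit in physical memory to avoid paging.
    by_memory = pageable_mb // touched_mb
    return min(by_reservation, by_memory)


no_disk = max_processes(128, 0)         # limited by reservations
with_disk = max_processes(128, 1024)    # hypothetical 1024-Mbyte swap device
```

Without disk swap the reservation limit is reached after a single process; with the extra device the limit moves close to what physical memory can hold.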

The process data segment is another good example of a requirement for larger virtual memory than for physical memory. The process data segment is mapped MAP_PRIVATE, which means that we need to reserve virtual swap for the whole segment, but we allocate physical memory only for the few pages that we write to within the segment. The amount of virtual swap required is far greater than the physical memory allocated to it, so if we needed to swap pages out to the swap device, we would need only a small amount of physical swap space.

If we had the ideal process that had all of its virtual memory backed by physical memory, then we could run with no physical swap space. Usually, we need something like 0.5 to 1.5 times memory size for physical swap space. It varies, of course, depending on the virtual-to-physical memory ratio of the application. Another consideration is system size. A large multiprocessor Sun Server with 512GB of physical memory is unlikely to require 1TB of swap space. For very large systems with a large amount of physical memory, configured swap can potentially be less than total physical memory. Again, the actual amount of virtual memory required to meet performance goals will be workload dependent.

6.18.2. Swap Statistics

The amount of anonymous memory in the system is recorded by the anon accounting structures. The anon layer keeps track in the k_anoninfo structure of how anonymous pages are allocated. The k_anoninfo structure, shown below, is defined in the include file vm/anon.h.

    struct k_anoninfo {
            pgcnt_t ani_max;         /* total reservable slots on phys disk swap */
            pgcnt_t ani_free;        /* # of unallocated phys and mem slots */
            pgcnt_t ani_phys_resv;   /* # of reserved phys (disk) slots */
            pgcnt_t ani_mem_resv;    /* # of reserved mem slots */
            pgcnt_t ani_locked_swap; /* # of swap slots locked in reserved */
                                     /* mem swap */
    };
                                                            See sys/anon.h


The k_anoninfo structure keeps count of the number of slots reserved on physical swap space and against memory. This information populates the data used for the swapctl system call. The swapctl() system call provides the data for the swap command and uses a slightly different data structure, the anoninfo structure, shown below.

    struct anoninfo {
            pgcnt_t ani_max;
            pgcnt_t ani_free;
            pgcnt_t ani_resv;
    };
                                                            See sys/anon.h


The anoninfo structure exports the swap allocation information in a platform-independent manner.

6.18.3. Swap Summary: swap -s

The swap -s command output, shown below, summarizes information from the anoninfo structure.

    $ swap -s
    total: 108504k bytes allocated + 13688k reserved = 122192k used, 114880k available


The output of swap -s can be somewhat misleading because its labels do not match the terms defined above. The output is really telling us that 122,192 Kbytes of virtual swap space have been reserved (the "used" figure), of which 108,504 Kbytes are allocated to pages that have been touched and 13,688 Kbytes are reserved but not yet allocated; 114,880 Kbytes are free. This information reflects the stages of swap allocation, shown in Figure 6.5. Remember, we reserve swap as we create virtual memory, and then part of that swap is allocated when real pages are assigned to the address space. The balance of swap space remains unused.
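As a quick check of this reading, the four figures can be pulled out of the sample line and recombined. This is a minimal sketch that assumes the output format shown above; the variable names are our own.

```python
import re

line = ("total: 108504k bytes allocated + 13688k reserved = "
        "122192k used, 114880k available")

# The four "<number>k" fields appear in a fixed order on the line.
allocated, reserved_unallocated, used, available = (
    int(n) for n in re.findall(r"(\d+)k", line)
)

# "used" is the total reservation: touched pages plus untouched reservations.
total_reserved = allocated + reserved_unallocated
# Reserved plus free gives the total virtual swap, in Kbytes.
total_virtual = used + available
```

Here `total_reserved` equals the "used" figure, and `total_virtual` is the size of the whole virtual swap pool in Kbytes.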

Figure 6.5. Swap Allocation States


6.18.4. Listing Physical Swap Devices: swap -l

The swap -l command lists the physical swap devices and their levels of physical allocation.

    $ swap -l
    swapfile             dev  swaplo blocks   free
    /dev/dsk/c0t0d0s0    136,0      16 1049312 782752


The blocks and free columns are in units of 512-byte disk blocks (sectors). This example shows that some of our physical swap slice has been used.
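Converting the block counts to Mbytes makes the usage easier to read. The following sketch applies the 512-byte block size to the figures from the sample output; the helper name `blocks_to_mbytes` is our own.

```python
BLOCK_BYTES = 512  # swap -l reports 512-byte disk blocks


def blocks_to_mbytes(blocks):
    """Convert a count of 512-byte blocks to Mbytes."""
    return blocks * BLOCK_BYTES / (1024 * 1024)


blocks, free = 1049312, 782752       # from the swap -l sample output
used_blocks = blocks - free
total_mb = blocks_to_mbytes(blocks)
used_mb = blocks_to_mbytes(used_blocks)
```

For this device that works out to roughly 130 Mbytes used out of about 512 Mbytes.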

6.18.5. Determining Swapped-Out Threads

The pageout scanner will send clusters of pages to the swap device. However, if it can't keep up with demand, the swapper swaps out entire threads. The number of threads swapped out is reported by either the kthr:w column from vmstat or the swpq-sz column from sar -q.

The following example is the same system from the previous swap -l example but it has experienced a dire memory shortage in the past and has swapped out entire threads.

    $ vmstat 1 2
     kthr      memory          page            disk          faults      cpu
     r b w   swap  free  re  mf pi po fr de sr dd dd f0 s3   in   sy   cs us sy id
     0 0 13 423816 68144  3  16  5  0  0  0  1  0  0  0  0   67   36  136  1  0 98
     0 0 67 375320 43040  0   6  0  0  0  0  0  0  0  0  0  406  354  137  1  0 99
    $ sar -q 1

    SunOS mars 5.9 Generic_118558-05 sun4u    03/12/2006

    05:05:36 runq-sz %runocc swpq-sz %swpocc
    05:05:37     0.0       0    67.0      99


Our system currently has 67 threads swapped out to the physical swap device. The sar command has also provided a %swpocc column, which reports the percent swap occupancy. This is the percentage of time that threads existed on the swap device (the 99% shown here, rather than 100%, is a rounding error) and is more useful for much longer sar intervals.
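If you want to track this value from a script, the w column can be extracted from captured vmstat output. The following is a small sketch that assumes the column layout shown in the sample above; the function name is hypothetical. Note that the first data line of vmstat is a summary since boot, so only later lines reflect the current interval.

```python
sample = """\
 r b w   swap  free  re  mf pi po fr de sr dd dd f0 s3   in   sy   cs us sy id
 0 0 13 423816 68144  3  16  5  0  0  0  1  0  0  0  0   67   36  136  1  0 98
 0 0 67 375320 43040  0   6  0  0  0  0  0  0  0  0  0  406  354  137  1  0 99
"""


def swapped_out_threads(vmstat_text):
    """Return the kthr:w value from each data line of vmstat output."""
    lines = vmstat_text.strip().splitlines()
    header = lines[0].split()
    w_index = header.index("w")          # position of the w column
    return [int(row.split()[w_index]) for row in lines[1:]]


w_values = swapped_out_threads(sample)
current = w_values[-1]                   # last line is the latest interval
```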

6.18.6. Monitoring Physical Swap Activity

To determine if the physical swap devices are currently busy with I/O transactions, we can use the iostat command in the regular manner. We just need to remember that we are looking at the swap slice, not a file system slice.

    $ iostat -xnPz 1
    ...
                      extended device statistics
        r/s   w/s  kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
        0.0  27.0   0.0 3452.3  2.1  0.7   78.0   24.9  32  34 c0t0d0s1
                      extended device statistics
        r/s   w/s  kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
        1.0   0.0   8.0    0.0  0.0  0.0   39.6   36.3   4   4 c0t0d0s0
        0.0  75.1   0.0 9609.3  8.0  1.9  107.1   24.7  88  95 c0t0d0s1
                      extended device statistics
        r/s   w/s  kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
        0.0  61.0   0.0 7686.7  5.4  1.4   88.3   23.6  65  73 c0t0d0s1
    ...


Physical memory was quickly exhausted on this system, causing a large number of pages to be written to the physical swap device, c0t0d0s1.
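One way to read these figures is to divide kw/s by w/s to get the average size of each write to the swap slice. The sketch below does this for the three c0t0d0s1 samples; interpreting the result as evidence of page clustering is our own inference rather than something iostat reports directly.

```python
# (w/s, kw/s) pairs for c0t0d0s1 from the three iostat samples above
samples = [(27.0, 3452.3), (75.1, 9609.3), (61.0, 7686.7)]

# Average Kbytes per write in each interval
avg_write_kb = [kw / w for w, kw in samples]
```

Each interval averages a little under 128 Kbytes per write, which is consistent with pages being written to swap in clusters rather than one at a time.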

Swap activity due to the swapping out of entire threads can be viewed with sar -w. The vmstat -S command prints similar swapping statistics.

6.18.7. MemTool prtswap

In the following example, we use the prtswap script in MemTool to list the states of swap to find out where the swap is allocated from. We then use the prtswap command without the -l option for just a summary of the swap allocations.

    # prtswap -l
    Swap Reservations:
    --------------------------------------------------------------------------
    Total Virtual Swap Configured:                            767MB =
    RAM Swap Configured:                                          255MB
    Physical Swap Configured:                              +      512MB

    Total Virtual Swap Reserved Against:                      513MB =
    RAM Swap Reserved Against:                                      1MB
    Physical Swap Reserved Against:                        +      512MB

    Total Virtual Swap Unresv. & Avail. for Reservation:      253MB =
    Physical Swap Unresv. & Avail. for Reservations:                0MB
    RAM Swap Unresv. & Avail. for Reservations:            +      253MB

    Swap Allocations: (Reserved and Phys pages allocated)
    --------------------------------------------------------------------------
    Total Virtual Swap Configured:                            767MB
    Total Virtual Swap Allocated Against:                     467MB

    Physical Swap Utilization: (pages swapped out)
    --------------------------------------------------------------------------
    Physical Swap Free (should not be zero!):                 232MB =
    Physical Swap Configured:                                     512MB
    Physical Swap Used (pages swapped out):                -      279MB
                                                                See MemTool


    # prtswap
    Virtual Swap:
    ---------------------------------------------------------------
    Total Virtual Swap Configured:                            767MB
    Total Virtual Swap Reserved:                              513MB
    Total Virtual Swap Free: (programs will fail if 0)        253MB

    Physical Swap Utilization: (pages swapped out)
    ---------------------------------------------------------------
    Physical Swap Configured:                                 512MB
    Physical Swap Free (programs will be locked in if 0):     232MB
                                                                See MemTool
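The relationships between the prtswap -l figures can be checked with simple arithmetic. This sketch only recombines the numbers printed in the sample output; it does not reproduce how prtswap itself derives them. The 1-Mbyte differences from the reported 253MB and 232MB are presumably due to rounding to whole Mbytes.

```python
# Figures in Mbytes, copied from the prtswap -l sample output
ram_configured, phys_configured = 255, 512
ram_reserved, phys_reserved = 1, 512
phys_used = 279

total_configured = ram_configured + phys_configured   # reported as 767MB
total_reserved = ram_reserved + phys_reserved         # reported as 513MB
unreserved = total_configured - total_reserved        # reported as 253MB
phys_free = phys_configured - phys_used               # reported as 232MB
```

Note that all of the physical swap is reserved against, yet only part of it holds swapped-out pages, which is the distinction between reservation and physical use that this section draws.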


The prtswap script uses the anonymous accounting structure members to establish how swap space is allocated and uses the availrmem counter, the swapfsminfree reserve, and the swap -l command to find out how much swap is used. Table 6.7 shows the anonymous accounting variables stored in the kernel.

Table 6.7. Swap Accounting Information

| Field | Description |
|---|---|
| k_anoninfo.ani_max | The total number of reservable slots on physical (disk-backed) swap. |
| k_anoninfo.ani_phys_resv | The number of physical (disk-backed) reserved slots. |
| k_anoninfo.ani_mem_resv | The number of memory reserved slots. |
| k_anoninfo.ani_free | Total number of unallocated physical slots + the number of reserved but unallocated memory slots. |
| availrmem | The amount of unreserved memory. |
| swapfsminfree | The swapfs reserve that won't be used for memory reservations. |


6.18.8. Display of Swap Reservations with pmap

The -S option of pmap describes the swap reservations for a process. The amount of swap space reserved is displayed for each mapping within the process. Swap reservations are reported as zero for shared mappings since they are accounted for only once systemwide.

    sol9$ pmap -S 15492
    15492:  ./maps
     Address  Kbytes    Swap Mode   Mapped File
    00010000       8       - r-x--  maps
    00020000       8       8 rwx--  maps
    00022000   20344   20344 rwx--    [ heap ]
    03000000    1024       - rw-s-  dev:0,2 ino:4628487
    04000000    1024    1024 rw---  dev:0,2 ino:4628487
    05000000    1024     512 rw--R  dev:0,2 ino:4628487
    06000000    1024    1024 rw---    [ anon ]
    07000000     512     512 rw--R    [ anon ]
    08000000    8192       - rwxs-    [ dism shmid=0x5]
    09000000    8192       - rwxs-    [ dism shmid=0x4]
    0A000000    8192       - rwxs-    [ dism shmid=0x2]
    0B000000    8192       - rwxsR    [ ism shmid=0x3]
    FF280000     680       - r-x--  libc.so.1
    FF33A000      32      32 rwx--  libc.so.1
    FF390000       8       - r-x--  libc_psr.so.1
    FF3A0000       8       - r-x--  libdl.so.1
    FF3B0000       8       8 rwx--    [ anon ]
    FF3C0000     152       - r-x--  ld.so.1
    FF3F6000       8       8 rwx--  ld.so.1
    FFBFA000      24      24 rwx--    [ stack ]
    -------- ------- -------
    total Kb   50464   23496


You can use the swap reservation information to estimate the amount of virtual swap used by each additional process. Each process consumes virtual swap from a global virtual swap pool. Global swap reservations are reported by the avail field of the swap(1M) command.
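To total the reservations from captured pmap -S output, sum the Swap column and skip mappings shown with a dash. The sketch below does this for the mappings of process 15492 listed above; it assumes the column layout shown there, and the function name is our own.

```python
pmap_rows = """\
00010000       8       - r-x--  maps
00020000       8       8 rwx--  maps
00022000   20344   20344 rwx--    [ heap ]
03000000    1024       - rw-s-  dev:0,2 ino:4628487
04000000    1024    1024 rw---  dev:0,2 ino:4628487
05000000    1024     512 rw--R  dev:0,2 ino:4628487
06000000    1024    1024 rw---    [ anon ]
07000000     512     512 rw--R    [ anon ]
08000000    8192       - rwxs-    [ dism shmid=0x5]
09000000    8192       - rwxs-    [ dism shmid=0x4]
0A000000    8192       - rwxs-    [ dism shmid=0x2]
0B000000    8192       - rwxsR    [ ism shmid=0x3]
FF280000     680       - r-x--  libc.so.1
FF33A000      32      32 rwx--  libc.so.1
FF390000       8       - r-x--  libc_psr.so.1
FF3A0000       8       - r-x--  libdl.so.1
FF3B0000       8       8 rwx--    [ anon ]
FF3C0000     152       - r-x--  ld.so.1
FF3F6000       8       8 rwx--  ld.so.1
FFBFA000      24      24 rwx--    [ stack ]
"""


def total_swap_kbytes(rows):
    """Sum the Swap column (third field); '-' means no reservation."""
    total = 0
    for row in rows.splitlines():
        swap = row.split()[2]
        if swap != "-":
            total += int(swap)
    return total


reserved_kb = total_swap_kbytes(pmap_rows)
```

The result matches the Swap total printed by pmap, 23496 Kbytes, and gives a per-process figure to multiply by the number of additional processes you expect to run.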

It is important to stress that while you should consider virtual reservations, you must not confuse them with physical allocations (which is easy to do since many commands just describe them as "swap"). For example:

    # pmap -S 236
    236:    /usr/lib/nfs/nfsmapid
     Address  Kbytes    Swap Mode    Mapped File
    00010000      24       - r-x--   nfsmapid
    00026000       8       8 rwx--   nfsmapid
    00028000    7768    7768 rwx--     [ heap ]
    ...
    FF3EE000       8       8 rwx--   ld.so.1
    FFBFE000       8       8 rw---     [ stack ]
    -------- ------- -------
    total Kb   10344    8272


Process ID 236 (nfsmapid) has a total Swap reservation of 8 Mbytes. Now we list the state of our physical swap devices on this system:

    $ swap -l
    swapfile             dev  swaplo blocks   free
    /dev/dsk/c0t0d0s1   136,9      16 2097632 2097632


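No physical swap has been used: the free column equals the blocks column, even though nfsmapid alone holds a virtual swap reservation of about 8 Mbytes.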




Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris
ISBN: 0131568191
Year: 2007
Pages: 180