Tracking Swap in the Kernel Structures

   

In Figure 7-7, we see the relationship between the various kernel-resident data structures required to manage the swap space.

Figure 7-7. Swap Kernel Structures



As we see in the diagram, the hambone is connected to the knee bone. Swap management starts with entries in the swap device table, swdevt, and the file system device table, fswdevt. Pointers from the two priority arrays are used to order the searching of the swap devices. A swap device is broken down into individual swap chunks. The kernel tunable maxswapchunks sets the systemwide maximum number of swap chunks and therefore determines the size of the swap table. Each swap chunk is sized by the kernel tunable swchunk, which defaults to 2 MB, enough room for 512 page-outs of 4 KB each.
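The chunk arithmetic above can be checked with a short C sketch. It assumes 4 KB pages (the 4096 used in the limit calculation at the end of this section); the helper names are illustrative only and are not kernel symbols.

```c
/* Assumption: 4 KB pages, matching the 4096 used later in this section. */
#define PAGE_BYTES 4096UL

/* Pages that fit in one swap chunk of the given size (hypothetical helper). */
unsigned long pages_per_chunk(unsigned long chunk_bytes)
{
    return chunk_bytes / PAGE_BYTES;
}

/* Swap table entries needed to cover a device, rounding a partial
 * trailing chunk up to a whole one (hypothetical helper). */
unsigned long chunks_for_device(unsigned long long device_bytes,
                                unsigned long chunk_bytes)
{
    return (unsigned long)((device_bytes + chunk_bytes - 1) / chunk_bytes);
}
```

With the default 2 MB swchunk, pages_per_chunk(2UL * 1024 * 1024) gives 512, and a 5 MB swap area would occupy three swap table entries.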

Individual pages have entries in the chunk's swap map. This map also contains a linked free list for its pages. The swap maps are pointed to by entries in the system swap table, swaptab. The swap table is filled in the order the swap devices were enabled but is searched according to the order implied by the priority table pointers. If two devices share a priority, their chunks are searched in round-robin fashion.
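The priority-ordered, round-robin search can be sketched in C as follows. This is a simplified illustration and not the kernel's code: the structures carry only the fields the sketch needs (named after Listings 7.1 and 7.3), pick_swap_device is a hypothetical helper, and it assumes that a lower array index is searched first and that each sw_next list is NULL-terminated. It also rotates per device, whereas the kernel rotates among chunks.

```c
#include <stddef.h>

#define NSWPRI 11               /* number of priority levels */

/* Simplified stand-in for swdev_t. */
struct swdev {
    int           sw_priority;  /* priority of this device */
    int           sw_nfpgs;     /* free pages on this device */
    struct swdev *sw_next;      /* next device at the same priority */
};

/* Simplified stand-in for struct devpri. */
struct devpri {
    struct swdev *first;        /* first device at this priority */
    struct swdev *curr;         /* next device to allocate from */
};

/* Return a device with free pages, or NULL if every device is full.
 * Devices sharing a priority are handed out in rotation. */
struct swdev *pick_swap_device(struct devpri pri[NSWPRI])
{
    for (int p = 0; p < NSWPRI; p++) {
        struct swdev *start = pri[p].curr ? pri[p].curr : pri[p].first;
        struct swdev *dev = start;

        if (dev == NULL)
            continue;           /* no devices at this priority */
        do {
            /* Wrap from the end of the list back to the first device. */
            struct swdev *next = dev->sw_next ? dev->sw_next : pri[p].first;

            if (dev->sw_nfpgs > 0) {
                pri[p].curr = next;   /* the peer goes first next time */
                return dev;
            }
            dev = next;
        } while (dev != start);
    }
    return NULL;
}
```

Advancing curr after every successful pick is what spreads allocations across devices of equal priority; a lower-priority device is reached only when every device ahead of it has no free pages.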

Let's examine these structures in Listings 7.1 through 7.6.

Listing 7.1. q4> fields struct devpri
 These structures are maintained in the array swdev_pri[NSWPRI] (default value is 11)

 Pointer to the first swap device at this priority
  0 0 4 0 * first
 Pointer to next device at this priority to allocate from
  4 0 4 0 * curr

Listing 7.2. q4> fields struct fspri
 These structures are maintained in the array swfs_pri[NSWPRI] (default value is 11)

 Pointer to first file system swap at this priority
  0 0 4 0 * first
 Pointer to next swap area at this priority to allocate from
  4 0 4 0 * curr

Listing 7.3. q4> fields swdev_t
 The swap device number
  0 0 4 0 int  sw_dev
 The swap device flags (i.e. SW_ENABLE)
  4 0 4 0 int  sw_flags
 The Kbyte (DEV_BSIZE) offset to the beginning of the swap area on the disk device
  8 0 4 0 long sw_start
 Number of blocks on the device
 12 0 4 0 long sw_nblksavail
 Number of blocks enabled for swap
 16 0 4 0 long sw_nblksenabled
 Number of free pages
 20 0 4 0 int  sw_nfpgs
 Swap priority for this device
 24 0 4 0 int  sw_priority
 First swap table entry for this device
 28 0 4 0 int  sw_head
 Last swap table entry for this device
 32 0 4 0 int  sw_tail
 Pointer to next swap device sharing the same priority
 36 0 4 0 *    sw_next

Listing 7.4. q4> fields fswdev_t
 Pointer to next file system swap area with the same priority
  0 0   4 0 *         fsw_next
 The status flags
  4 0   4 0 int       fsw_flags
 Number of free swap pages
  8 0   4 0 int       fsw_nfpgs
 Number of blocks allocated
 12 0   4 0 long      fsw_allocated
 Minimum number of preallocated blocks
 16 0   4 0 u_long    fsw_min
 The block allocation limit
 20 0   4 0 u_long    fsw_limit
 The block reservation limit (File System swap equivalent to minimum free space)
 24 0   4 0 u_long    fsw_reserve
 Priority for this file system swap space
 28 0   4 0 int       fsw_priority
 Pointer to the vnode for the file system's mount point
 32 0   4 0 *         fsw_vnode
 The underlying file system's block size
 36 0   4 0 u_int     fsw_bsize
 This swap space's first swap table entry
 40 0   2 0 short     fsw_head
 This swap space's last swap table entry
 42 0   2 0 short     fsw_tail
 The directory path name for the underlying file system's mount point
 44 0 256 0 char[256] fsw_mntpoint

Listing 7.5. q4> fields swpt_t
 Index to the first free swapmap array entry
  0 0 2 0 short st_free
 Index of next chunk for same device or file system swap area
  2 0 2 0 short st_next
 Status flags (ST_INDEL|ST_FREE|ST_INUSE)
  4 0 4 0 int   st_flags
 Pointer to swap device
  8 0 4 0 *     st_dev
 Pointer to swap file system
 12 0 4 0 *     st_fsp
 Device of file system chunk vnode
 16 0 4 0 *     st_vnode
 Number of free pages on the device
 20 0 4 0 int   st_nfpgs
 Pointer to the swap map's starting address
 24 0 4 0 *     st_swpmp

Listing 7.6. q4> fields swpm_t
 Number of kthreads using this page
 0 0 2 0 u_short sm_ucnt
 Index to first free entry in this swap map
 2 0 2 0 short   sm_next
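To illustrate how st_free and sm_next could thread a free list through a chunk's swap map, here is a minimal C sketch. It is not the kernel implementation: the structures are cut-down stand-ins for swpt_t and swpm_t, the function names and the -1 end-of-list sentinel are assumptions, st_nfpgs is treated as a per-chunk count, and the chunk is assumed to hold the default 512 pages.

```c
#define CHUNK_PAGES 512         /* pages per chunk with the default 2 MB swchunk */
#define NO_FREE     (-1)        /* assumed end-of-list sentinel */

/* Stand-in for swpm_t (Listing 7.6). */
struct swapmap {
    unsigned short sm_ucnt;     /* users of this page; 0 means free */
    short          sm_next;     /* index of the next free entry */
};

/* Stand-in for the parts of swpt_t (Listing 7.5) used here. */
struct swaptab {
    short           st_free;    /* index of the first free swap map entry */
    int             st_nfpgs;   /* free pages in this chunk */
    struct swapmap *st_swpmp;   /* this chunk's swap map */
};

/* Thread every entry of the map onto the chunk's free list. */
void chunk_init(struct swaptab *st, struct swapmap *map)
{
    for (int i = 0; i < CHUNK_PAGES; i++) {
        map[i].sm_ucnt = 0;
        map[i].sm_next = (short)(i + 1 < CHUNK_PAGES ? i + 1 : NO_FREE);
    }
    st->st_free = 0;
    st->st_nfpgs = CHUNK_PAGES;
    st->st_swpmp = map;
}

/* Pop a page off the free list; returns its index or NO_FREE. */
int chunk_alloc_page(struct swaptab *st)
{
    int idx = st->st_free;

    if (idx == NO_FREE)
        return NO_FREE;
    st->st_free = st->st_swpmp[idx].sm_next;
    st->st_swpmp[idx].sm_ucnt = 1;
    st->st_nfpgs--;
    return idx;
}

/* Drop one reference; relink the page when the last user is gone. */
void chunk_free_page(struct swaptab *st, int idx)
{
    struct swapmap *e = &st->st_swpmp[idx];

    if (e->sm_ucnt == 0 || --e->sm_ucnt > 0)
        return;                 /* already free, or still in use */
    e->sm_next = st->st_free;
    st->st_free = (short)idx;
    st->st_nfpgs++;
}
```

Because the links are 2-byte indexes rather than pointers, each swap map entry stays at 4 bytes, which matches the sizes shown in Listing 7.6.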

As a final bit of discussion, do you see the sleight-of-hand trick played by the disk block descriptor data? How can the 28-bit field in the dbd point to a specific device and an offset on the device? Device numbers are 32 bits long by themselves, and the block address on a modern disk may be quite large. The smoke and mirrors employed here are several levels of indirection. The upper half of the dbd data, dbd_swptb, points to the appropriate swap table entry. There we pick up st_dev, which identifies the swap device, and st_swpmp, the pointer to this chunk's swap map. The lower half of the dbd data, dbd_swpmp, is then the page offset into the swap chunk.

This means that currently no more than 2^14 swap chunks may be configured on a system and that each chunk may hold only 2^14 pages at most. If we do the math, this limits the maximum device swap space to:

2^14 * 2^14 * 4096 or 2^40 or 1 TB
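The two-level lookup and the resulting limit can be sketched in C. The 14/14 split follows from the 28-bit field and the 2^14 limits stated above; the exact bit positions, the struct, and the function names are assumptions made for illustration, not the kernel's definitions.

```c
#include <stdint.h>

#define DBD_SWPTB_BITS 14       /* upper half: swap table index */
#define DBD_SWPMP_BITS 14       /* lower half: swap map index */
#define PAGE_BYTES     4096ULL  /* page size used in the calculation above */

/* Decoded form of the 28-bit dbd data (hypothetical type). */
struct dbd_swap {
    unsigned swptb;             /* which swap chunk (swaptab entry) */
    unsigned swpmp;             /* which page within that chunk */
};

/* Pack a chunk index and a page index into 28 bits. */
uint32_t dbd_encode(unsigned swptb, unsigned swpmp)
{
    return ((uint32_t)swptb << DBD_SWPMP_BITS) | swpmp;
}

/* Split the 28-bit value back into its two 14-bit halves. */
struct dbd_swap dbd_decode(uint32_t dbd_data)
{
    struct dbd_swap d;

    d.swptb = (dbd_data >> DBD_SWPMP_BITS) & ((1u << DBD_SWPTB_BITS) - 1);
    d.swpmp = dbd_data & ((1u << DBD_SWPMP_BITS) - 1);
    return d;
}

/* Largest swap space two 14-bit indexes can address: 2^14 * 2^14 * 4096. */
unsigned long long max_device_swap_bytes(void)
{
    return (1ULL << DBD_SWPTB_BITS) * (1ULL << DBD_SWPMP_BITS) * PAGE_BYTES;
}
```

The byte offset of a page inside its chunk is then swpmp * PAGE_BYTES, and the device itself is found through the swap table entry rather than being stored in the dbd.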



HP-UX 11i Internals
ISBN: 0130328618
EAN: 2147483647
Year: 2006
Pages: 167
