The Linux Page Tables



The Linux system maintains a page table for each process in physical memory and accesses the actual page tables via the identity-mapped kernel segment. Page tables in Linux cannot be paged out to swap space. This implies that a process that allocates a large range of address space could potentially saturate the memory subsystem, because the page tables themselves would use up all the available memory. On a similar note, because hundreds of processes are active simultaneously on a system, the combined size of all the page tables could potentially use all available memory. The large memory subsystems available on contemporary systems render this scenario unusual, but it still reflects a capacity planning issue that should be addressed. Keeping the page tables in physical memory simplifies the kernel design, because there is no need to deal with nested page faults.

The per-process page table layout is based on a multilayer tree consisting of three levels. The first layer consists of the page global directory (pgd), the second layer consists of the page middle directory (pmd), and the third layer consists of the page table entries (pte). Normally, each directory node occupies one page frame and contains a fixed number of entries. Entries in the pgd and pmd directories are either not present or they point to a directory in the next layer of the tree. The pte entries represent the leaf nodes of the tree, and they contain the actual page table entries.

Because the page table layout in Linux resembles a multilayer tree, the space requirements are proportional to the virtual address space actually in use, not to the maximum size of the virtual address space. Further, because Linux manages memory as a set of page frames, the fixed-size-node approach does not require the large, physically contiguous region of memory that a linear page table implementation requires.

While processing a virtual-to-physical page translation, the virtual address is decomposed into multiple sections. The sections utilized for a page-table lookup operation are the pgd, pmd, and pte indexes (see Figure 9-3). A lookup operation is initiated via the page-table pointer that is stored in the mm structure. The page-table pointer references the global directory, which is the root of the page-table tree. The pgd index identifies the entry that contains the address of the middle directory. The combination of that address plus the pmd index allows the system to locate the address of the pte directory. Expanding the mechanism by one more layer identifies the pte of the page to which the virtual address actually maps. The pte allows the system to calculate the address of the physical page frame, and by utilizing the offset component of the virtual address, the exact location within that frame can be identified.

The multilayer-based approach to implementing the page tables in Linux represents a platform-independent solution: the middle directory can be collapsed into the global directory if the entire tree structure is not necessary to support a certain implementation. This is the approach taken in an IA-32 environment, where the size of the page middle directory is set to 1. In other words, in IA-32, the 32-bit virtual address is decomposed into 10 bits for the page directory index, 10 bits for the page table entry index, and the remaining 12 bits for the offset.

The address translation process is a joint effort between hardware, the memory management unit (MMU), and software, the kernel. The kernel communicates with the MMU to identify the virtual pages to be mapped onto physical pages for each user address space, and the MMU has the capability to notify the kernel of any error conditions in the process.
The most common error condition is the page fault, where the kernel has to retrieve the required page from secondary storage. Other error conditions typically involve page protection violations. From a physical address perspective, Linux differentiates among different memory zones (ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM), where each zone has different characteristics. Most memory allocation happens in ZONE_NORMAL, whereas ZONE_HIGHMEM represents the physical addresses greater than 896MB. (See the later section "VM Tunables" for more information on memory zones.)

Figure 9-3. Address resolution process.


The next section introduces, from a performance perspective, some of the new features in Linux 2.6. Because the VM system impacts every other subcomponent in the system, this discussion extends into the CPU and I/O subsystem components covered in separate chapters.




    Performance Tuning for Linux Servers
    ISBN: 0137136285
    Year: 2006
    Pages: 254
