11.1. Kernel Virtual Memory Layout

The kernel, just like a process, uses virtual memory and uses the memory management unit (MMU) to translate its virtual memory addresses into physical pages. The kernel has its own address space and corresponding virtual memory layout. The kernel's address space is constructed of address space segments, using the standard Solaris memory architecture framework.

Most of the kernel's memory is nonpageable, or "wired down." The reason is that the kernel requires its memory to complete operating system tasks that could affect other memory-related data structures and, if the kernel had to take a page fault while performing a memory management task (or any other task that affected pages of memory), a deadlock could occur. Solaris does, however, allow some deadlock-safe parts of the Solaris kernel to be allocated from pageable memory, which is used mostly for the lightweight process thread stacks.

Kernel memory consists of a variety of mappings from physical memory (physical memory pages) to the kernel's virtual address space, and memory is allocated by a layered series of kernel memory allocators.

Two segment drivers handle the creation and management of the majority of kernel mappings. Nonpageable kernel memory is mapped with the segkmem kernel segment driver and pageable kernel memory with the segkp segment driver. On platforms that support it, the critical and frequently used portions of the kernel are mapped from large (4-Mbyte) pages to maximize the efficiency of the hardware TLB.

11.1.1. Kernel Address Space

The kernel virtual memory layout differs from platform to platform, mostly based on the platform's MMU architecture. On x86, and platforms earlier than the sun4u, the kernel uses the top 256 Mbytes or 512 Mbytes of a common virtual address space, shared by the process and kernel (see Section 9.4).
Sharing the kernel address space with the process address space limits the amount of usable kernel virtual address space to 256 Mbytes and 512 Mbytes, respectively, which is a substantial limitation on some of the older platforms (e.g., the SPARCcenter 2000). On sun4u platforms, the kernel has its own virtual address space context and consequently can be much larger. The sun4u kernel address space is 4 Gbytes on 32-bit kernels and spans the full 64-bit address range on 64-bit kernels.

The kernel virtual address space contains the following major mappings:
The layout of the kernel's virtual memory address space is mostly platform specific, and as a result, the placement of each mapping is different on each platform. For reference, we show the sun4u 64-bit kernel address space map in Figure 11.1.

Figure 11.1. Solaris 10 sun4u 64-Bit Kernel Address Space
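
The limits described in Section 11.1.1 for a shared address space follow from simple arithmetic, sketched below. The sketch assumes a 32-bit (4-Gbyte) virtual address space with the kernel occupying the topmost 256 or 512 Mbytes. The function name kernel_base is illustrative and is not taken from the Solaris source, and the computed addresses are not claimed to match any particular platform.

```python
MB = 1 << 20

def kernel_base(kernel_mbytes, va_bits=32):
    """Lowest kernel address when the kernel owns the top of a shared space.

    Assumption: the kernel occupies the highest kernel_mbytes of a
    2**va_bits address space; the process uses everything below it.
    """
    return (1 << va_bits) - kernel_mbytes * MB

for mbytes in (256, 512):
    base = kernel_base(mbytes)
    # Everything below the kernel base is available to the process.
    print(f"kernel {mbytes} MB: base=0x{base:08x}, process space={base // MB} MB")
```

Every Mbyte given to the kernel in this arrangement is taken from the process, which is why a separate kernel address space context, as on sun4u, removes the trade-off.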
11.1.2. Kernel Text and Data Segments

The kernel text and data segments are created when the kernel core is loaded and executed. The text segments contain the instructions, and the data segment contains the initialized variables from the kernel/unix image file, which is loaded at boot time by the kernel bootstrap loader.

The kernel text and data are mapped into the kernel address space by the Open Boot PROM, before general startup of the kernel, to allow the base kernel code to be loaded and executed. Shortly after the kernel loads, the kernel creates the kernel address space and the segkmem kernel memory driver creates segments for kernel text and kernel data.

On systems that support large pages, the kernel creates a large translation mapping for the first 4 megabytes of the kernel text and data segments and then locks that mapping into the MMU's TLB. Mapping the kernel into large pages greatly reduces the number of TLB entries required for the kernel's working set and has a dramatic impact on general system performance. Performance was increased by as much as 10 percent, for two reasons:
On SPARC platforms, we also put the trap table at the start of the kernel text (which resides on one large page).

11.1.3. Virtual Memory Data Structures

The kernel keeps most of the virtual memory data structures required for the platform's HAT implementation in a portion of the kernel data segment and a separate memory segment. The data structures and allocation location are typically those summarized in Table 11.1.
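
To put a number on the TLB saving described in Section 11.1.2, the sketch below counts the TLB entries needed to map the first 4 Mbytes of kernel text and data. The 8-Kbyte base page size is an assumption made for this example (it is not stated in this section), and the function name is illustrative.

```python
KB = 1 << 10
MB = 1 << 20

def tlb_entries_needed(region_bytes, page_bytes):
    """Number of TLB entries needed to map a region with a single page size."""
    return -(-region_bytes // page_bytes)  # ceiling division

region = 4 * MB        # the first 4 Mbytes of kernel text and data
base_page = 8 * KB     # assumed base page size, for illustration only
large_page = 4 * MB    # the large page size given in the text

small = tlb_entries_needed(region, base_page)
large = tlb_entries_needed(region, large_page)
print(f"{small} entries with 8-Kbyte pages, {large} with one 4-Mbyte page")
```

Under these assumptions, a single locked large-page entry replaces several hundred base-page entries that would otherwise compete for TLB slots.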
11.1.4. UltraSPARC Kernel Nucleus

Required on sun4u kernel implementations is a core area of memory that can be accessed without missing in the TLB. This memory area is necessary because the sun4u SPARC implementation uses a software TLB replacement mechanism to fill the TLB, and hence we require all the TLB miss handler data structures to be available during a TLB miss. As we discuss in Section 12.2, the TLB is filled from a software buffer, known as the translation storage buffer (TSB), of the TLB entries; all the data structures needed to handle a TLB miss and to fill the TLB from the TSB must be available with wired-down TLB mappings. To accommodate this requirement, SPARC V8 and SPARC V9 implement a special core of memory, known as the nucleus. On sun4u systems, the nucleus is the kernel text, kernel data, and the additional "large TSB" area, all of which are allocated from large pages.

11.1.5. Loadable Kernel Module Text and Data

The kernel loadable modules require memory for their executable text and data. On sun4u, up to 256 Kbytes of module text and data are allocated from the same segment as the kernel text and data, after which the module text and data are loaded from the general kernel allocation area: the kernel map segment. The location of kernel module text and data is shown in Table 11.2.
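
The spillover rule just described can be sketched as a small placement routine. This is a toy model, not the kernel's loader. It assumes a single 256-Kbyte budget, modules placed in load order, no module split across the two areas, and that once one module spills, all later modules go to the kernel map as well. The module names and sizes are hypothetical.

```python
KB = 1 << 10
MODULE_BUDGET = 256 * KB   # "up to 256 Kbytes" on sun4u, per the text

def place_modules(modules, budget=MODULE_BUDGET):
    """Assign each (name, size) pair to 'kernel text/data' or 'kernel map'.

    Assumptions (not confirmed by the text): one shared budget, load-order
    placement, and no return to the first area after the first spill.
    """
    placement = {}
    remaining = budget
    spilled = False
    for name, size in modules:
        if not spilled and size <= remaining:
            placement[name] = "kernel text/data"
            remaining -= size
        else:
            spilled = True
            placement[name] = "kernel map"
    return placement

# Hypothetical modules, in load order.
demo = [("mod_a", 100 * KB), ("mod_b", 120 * KB), ("mod_c", 64 * KB), ("mod_d", 8 * KB)]
result = place_modules(demo)
for name, area in result.items():
    print(f"{name}: {area}")
```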
We can see which modules fit into the kernel text and data by looking at the module load addresses with the modinfo command.

    # modinfo
     Id Loadaddr   Size Info Rev Module Name
      5 1010c000   4b63   1   1  specfs (filesystem for specfs)
      7 10111654   3724   1   1  TS (time sharing sched class)
      8 1011416c    5c0   -   1  TS_DPTBL (Time sharing dispatch table)
      9 101141c0  29680   2   1  ufs (filesystem for ufs)
    . . . .
     97 10309b38   28e0  52   1  shmsys (System V shared memory)
     97 10309b38   28e0  52   1  shmsys (32-bit System V shared memory)
     98 1030bc90    43c   -   1  ipc (common ipc code)
     99 78096000   3723  18   1  ffb (ffb.c 6.42 Aug 11 1998 11:20:45)
    100 7809c000   f5ee   -   1  xfb (xfb driver 1.2 Aug 11 1998 11:2)
    102 780c2000   1eca   -   1  bootdev (bootdev misc module)

Using the modinfo command, we can see on a sun4u system that the initial modules are loaded from the kernel-text large page. (Address 0x1030bc90 lies within the kernel-text large page, which starts at 0x10000000.)

On 64-bit platforms, we have an additional segment for the spillover kernel text and data. The reason for having the segment is that the address at which the module text is loaded must be within a 32-bit offset from the kernel text. That's because the 64-bit kernel is compiled with the ABS32 flag so that the kernel can fit all instruction addresses within a 32-bit register. The ABS32 instruction mode provides a significant performance increase and allows the 64-bit kernel to provide similar performance to the 32-bit kernel. Because of that, a separate kernel heap mapping (segkmem32) within a 32-bit offset of the kernel text is used for spillover module text and data.

Solaris does allow some portions of the kernel to be allocated from pageable memory. That way, data structures directly related to process context can be swapped out with the process during a process swap-out operation. Pageable memory is restricted to those structures that are not required by the kernel when the process is swapped out:
Pageable memory is allocated and swapped by the seg_kp segment and is only swapped out to its backing store when the memory scheduler (swapper) is activated. (See Section 10.3.6.)

11.1.6. The Kernel Address Space and Segments

The kernel address space is represented by the address space pointed to by the system object, kas. The segment drivers manage the manipulation of the segments within the kernel address space (see Figure 11.2).

Figure 11.2. Kernel Address Space

The full list of segment drivers the kernel uses to create and manage kernel mappings is shown in Table 11.3. The majority of the kernel segments are manually calculated and placed for each platform, with the base address and offset hard-coded into a platform-specific header file. See Appendix A for a complete reference of platform-specific kernel allocation and address maps.
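
The relationship between an address space, its segments, and their drivers can be illustrated with a toy model. The sketch below is not the kernel's data structures: the Seg and AddressSpace classes are invented for illustration, and the bases and sizes of the "kmap" and "kpageable" segments are hypothetical. Only the 4-Mbyte kernel-text large page at 0x10000000 and the driver names segkmem and segkp come from the text.

```python
from bisect import bisect_right
from dataclasses import dataclass

MB = 1 << 20

@dataclass(frozen=True)
class Seg:
    name: str     # illustrative segment name
    base: int     # starting virtual address
    size: int     # length in bytes
    driver: str   # segment driver that manages the mappings

class AddressSpace:
    """Toy stand-in for kas: non-overlapping segments sorted by base."""

    def __init__(self, segs):
        self.segs = sorted(segs, key=lambda s: s.base)
        self.bases = [s.base for s in self.segs]

    def find(self, addr):
        """Return the segment containing addr, or None if it is unmapped."""
        i = bisect_right(self.bases, addr) - 1
        if i >= 0:
            seg = self.segs[i]
            if addr < seg.base + seg.size:
                return seg
        return None

kas = AddressSpace([
    Seg("ktext", 0x10000000, 4 * MB, "segkmem"),      # large page from Section 11.1.5
    Seg("kmap", 0x78000000, 64 * MB, "segkmem"),      # hypothetical base and size
    Seg("kpageable", 0x7C000000, 32 * MB, "segkp"),   # hypothetical base and size
])

for addr in (0x1030BC90, 0x78096000, 0x20000000):
    seg = kas.find(addr)
    print(hex(addr), "->", f"{seg.name} ({seg.driver})" if seg else "unmapped")
```

With these made-up placements, the lookup places 0x1030bc90 from the modinfo listing inside the kernel-text large page, consistent with the observation in Section 11.1.5.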