11.1. Kernel Virtual Memory Layout

The kernel, just like a process, uses virtual memory and uses the memory management unit (MMU) to translate its virtual memory addresses into physical pages. The kernel has its own address space and corresponding virtual memory layout. The kernel's address space is constructed of address space segments, using the standard Solaris memory architecture framework.

Most of the kernel's memory is nonpageable, or "wired down." The kernel needs this memory to complete operating system tasks that can affect other memory-related data structures. If the kernel had to take a page fault while performing a memory management task (or any other task that affects pages of memory), a deadlock could occur. Solaris does, however, allow some deadlock-safe parts of the kernel to be allocated from pageable memory, which is used mostly for lightweight process thread stacks.

Kernel memory consists of a variety of mappings from physical memory (physical memory pages) to the kernel's virtual address space, and memory is allocated by a layered series of kernel memory allocators. Two segment drivers handle the creation and management of the majority of kernel mappings. Nonpageable kernel memory is mapped with the segkmem kernel segment driver and pageable kernel memory with the segkp segment driver. On platforms that support it, the critical and frequently used portions of the kernel are mapped from large (4-Mbyte) pages to maximize the efficiency of the hardware TLB.

11.1.1. Kernel Address Space

The kernel virtual memory layout differs from platform to platform, mostly based on the platform's MMU architecture. On x86, and platforms earlier than the sun4u, the kernel uses the top 256 Mbytes or 512 Mbytes of a common virtual address space, shared by the process and kernel (see Section 9.4). Sharing the kernel address space with the process address space limits the amount of usable kernel virtual address space to 256 Mbytes and 512 Mbytes, respectively, which is a substantial limitation on some of the older platforms (e.g., the SPARCcenter 2000). On sun4u platforms, the kernel has its own virtual address space context and consequently can be much larger. The sun4u kernel address space is 4 Gbytes on 32-bit kernels and spans the full 64-bit address range on 64-bit kernels.
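To make the shared-address-space limits concrete, the following sketch works out where a kernel that occupies the top 256 or 512 Mbytes of a 32-bit virtual address space would begin. It is plain arithmetic under the assumption of a 4-Gbyte shared space; the helper name `kernel_base` is ours, and the real base address on any given platform is set in platform-specific headers and may differ.

```python
# Illustrative arithmetic only: kernel_base() is a hypothetical helper,
# not a Solaris interface.

MBYTE = 1 << 20
ADDRESS_SPACE_32 = 1 << 32  # 4 Gbytes of shared virtual address space


def kernel_base(kernel_size, space=ADDRESS_SPACE_32):
    """Lowest address of a kernel region occupying the top kernel_size bytes."""
    return space - kernel_size


for size_mb in (256, 512):
    base = kernel_base(size_mb * MBYTE)
    print(f"top {size_mb} Mbytes: kernel would start at {base:#010x}, "
          f"leaving {base // MBYTE} Mbytes below it for the process")
```

Everything below the computed base is available to the process, which is why enlarging the kernel's share directly shrinks the process address space on these platforms.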

The kernel virtual address space contains the following major mappings:

  • The kernel text and data (mappings of the kernel binary)

  • The kernel 64-bit heap (data structures, caches, etc.)

  • A 32-bit heap, for module text and data (64-bit kernels only)

  • Critical virtual memory data structures (TSB, etc.)

  • A place for mapping the file system cache (segmap)

The layout of the kernel's virtual memory address space is mostly platform specific, and as a result, the placement of each mapping is different on each platform. For reference, we show the sun4u 64-bit kernel address space map in Figure 11.1.

Figure 11.1. Solaris 10 sun4u 64-Bit Kernel Address Space


11.1.2. Kernel Text and Data Segments

The kernel text and data segments are created when the kernel core is loaded and executed. The text segments contain the instructions, and the data segment contains the initialized variables from the kernel/unix image file, which is loaded at boot time by the kernel bootstrap loader.

The kernel text and data are mapped into the kernel address space by the Open Boot PROM, before general startup of the kernel, to allow the base kernel code to be loaded and executed. Shortly after the kernel loads, the kernel creates the kernel address space and the segkmem kernel memory driver creates segments for kernel text and kernel data.

On systems that support large pages, the kernel creates a large translation mapping for the first 4 megabytes of the kernel text and data segments and then locks that mapping into the MMU's TLB. Mapping the kernel into large pages greatly reduces the number of TLB entries required for the kernel's working set and has a dramatic impact on general system performance. Performance was increased by as much as 10 percent, for two reasons:

  • The time spent in TLB miss handlers for kernel code was reduced to almost zero.

  • The number of TLB entries used by the kernel was dramatically reduced, leaving more TLB entries for user code and reducing the amount of time spent in TLB miss handlers for user code.
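The reduction in TLB entries can be estimated with a short calculation. The sketch below assumes an 8-Kbyte base page size (the sun4u base page size; other platforms differ) and compares it with a single 4-Mbyte large page. The function name is ours and exists only for illustration.

```python
# Illustrative arithmetic: how many TLB entries are needed to map a region.

KBYTE = 1 << 10
MBYTE = 1 << 20

BASE_PAGE = 8 * KBYTE    # assumed base page size (8 Kbytes on sun4u)
LARGE_PAGE = 4 * MBYTE   # large page size from the text


def tlb_entries_needed(region_size, page_size):
    """Number of TLB entries needed to map region_size bytes (rounded up)."""
    return -(-region_size // page_size)  # ceiling division


region = 4 * MBYTE
small = tlb_entries_needed(region, BASE_PAGE)
large = tlb_entries_needed(region, LARGE_PAGE)
print(f"{small} entries with 8-Kbyte pages, {large} entry with a 4-Mbyte page")
```

Under these assumptions, the first 4 Mbytes of the kernel would otherwise compete for hundreds of TLB entries; a single locked large-page entry frees those entries for user code.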

On SPARC platforms, we also put the trap table at the start of the kernel text (which resides on one large page).

11.1.3. Virtual Memory Data Structures

The kernel keeps most of the virtual memory data structures required for the platform's HAT implementation in a portion of the kernel data segment and a separate memory segment. The data structures and allocation location are typically those summarized in Table 11.1.

Table 11.1. Virtual Memory Data Structures

| Platform | Data Structures | Location |
|---|---|---|
| sun4u | The Translation Storage Buffer (TSB). The HAT mapping blocks (HME), one for every page-sized virtual address mapping. (See Section 12.2.) | Allocated initially from the kernel data-segment large page, and overflows into another large-page, mapped segment, just above the kernel data segment. |
| amd64 | Page Tables, Page Structures | Allocated in the kernel data-segment large page. |
| x86 | Page Tables, Page Structures | Allocated from a separate VM data structure's segment. |


11.1.4. UltraSPARC Kernel Nucleus

The sun4u kernel implementation requires a core area of memory that can be accessed without missing in the TLB. This area is necessary because the sun4u SPARC implementation uses a software TLB replacement mechanism to fill the TLB, so all the TLB miss handler data structures must be available during a TLB miss. As we discuss in Section 12.2, the TLB is filled from a software buffer of TLB entries, known as the translation storage buffer (TSB); all the data structures needed to handle a TLB miss and to fill the TLB from the TSB must be available with wired-down TLB mappings. To accommodate this requirement, SPARC V8 and SPARC V9 implement a special core of memory, known as the nucleus. On sun4u systems, the nucleus is the kernel text, kernel data, and the additional "large TSB" area, all of which are allocated from large pages.
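The software TLB fill described above can be modeled in a few lines. This is a toy model, not the sun4u miss handler: the class and field names are hypothetical, the TSB is assumed to be direct mapped and indexed by the low bits of the virtual page number, 8-Kbyte pages are assumed, and a plain dictionary stands in for the HAT's full mapping structures that the slow path would consult.

```python
# Toy model of a software-filled TLB backed by a TSB. All names and sizes
# here are assumptions for illustration, not Solaris definitions.

PAGE_SHIFT = 13      # assumed 8-Kbyte base pages
TSB_ENTRIES = 512    # hypothetical TSB size (must be a power of two)


class ToyMMU:
    def __init__(self, page_table):
        self.page_table = page_table      # vpn -> pfn; stands in for the HAT mappings
        self.tsb = [None] * TSB_ENTRIES   # each slot holds (vpn_tag, pfn) or None
        self.tlb = {}                     # vpn -> pfn; stands in for the hardware TLB
        self.tsb_hits = 0
        self.tsb_misses = 0

    def translate(self, vaddr):
        """Translate a virtual address, filling the TLB in software on a miss."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn not in self.tlb:
            self.tlb[vpn] = self._tlb_miss(vpn)
        return (self.tlb[vpn] << PAGE_SHIFT) | offset

    def _tlb_miss(self, vpn):
        """Miss handler: try the TSB first, then fall back to the slow path."""
        slot = vpn & (TSB_ENTRIES - 1)
        entry = self.tsb[slot]
        if entry is not None and entry[0] == vpn:
            self.tsb_hits += 1
            return entry[1]
        self.tsb_misses += 1
        pfn = self.page_table[vpn]        # KeyError models an unmapped address
        self.tsb[slot] = (vpn, pfn)       # cache the translation in the TSB
        return pfn


mmu = ToyMMU({0x10: 0x200})
paddr = mmu.translate((0x10 << PAGE_SHIFT) | 0x123)
print(hex(paddr), mmu.tsb_hits, mmu.tsb_misses)
```

In this model the miss handler touches only `self.tsb` on the fast path. The real handler has the same constraint: it cannot itself take a TLB miss, which is why the TSB and the handler's code and data must sit behind wired-down mappings in the nucleus.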

11.1.5. Loadable Kernel Module Text and Data

The kernel loadable modules require memory for their executable text and data. On sun4u, up to 256 Kbytes of module text and data are allocated from the same segment as the kernel text and data, after which the module text and data are loaded from the general kernel allocation area: the kernel map segment. The location of kernel module text and data is shown in Table 11.2.

Table 11.2. Kernel Loadable Module Allocation

| Platform | Module Text and Data Allocation |
|---|---|
| sun4u 64 bit | Up to 256 Kbytes of kernel module text and data are loaded from the same large pages as the kernel text and data. The remainder are loaded from the 32-bit kernel map segment, a segment that is specifically for module text and data. |
| amd64 | Up to 256 Kbytes of kernel module text and data are loaded from the same large pages as the kernel text and data. The remainder are loaded from an additional segment, shared by HAT data structures and module text/data. |
| x86 | Up to 256 Kbytes of kernel module text and data are loaded from the same large pages as the kernel text and data. The remainder are loaded from an additional segment, shared by HAT data structures and module text/data. |
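The placement policy in Table 11.2 can be sketched as a simple budget with a fallback. This is a minimal model under stated assumptions: the class name and the first-come, first-fit behavior are ours, the module sizes are made up, and the real loader tracks text and data separately and is not described in this detail by the text.

```python
# Toy placement policy: fill a fixed budget inside the kernel's large pages,
# then spill over to a separate segment. Names and behavior are assumptions.

KBYTE = 1 << 10
NUCLEUS_MODULE_LIMIT = 256 * KBYTE   # limit quoted in the text


class ModulePlacer:
    def __init__(self, limit=NUCLEUS_MODULE_LIMIT):
        self.remaining = limit

    def place(self, size):
        """Return where a module of the given size would be placed."""
        if size <= self.remaining:
            self.remaining -= size
            return "nucleus"
        return "spillover"


placer = ModulePlacer()
for name, size in (("mod_a", 100 * KBYTE), ("mod_b", 100 * KBYTE),
                   ("mod_c", 100 * KBYTE)):
    print(name, placer.place(size))
```

With these made-up sizes, the first two modules fit within the 256-Kbyte budget and the third is directed to the spillover segment.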


We can see which modules fit into the kernel text and data by looking at the module load addresses with the modinfo command.

# modinfo
 Id Loadaddr   Size Info Rev Module Name
  5 1010c000   4b63   1   1  specfs (filesystem for specfs)
  7 10111654   3724   1   1  TS (time sharing sched class)
  8 1011416c    5c0   -   1  TS_DPTBL (Time sharing dispatch table)
  9 101141c0  29680   2   1  ufs (filesystem for ufs)
                .
                .
                .
                .
 97 10309b38   28e0  52   1  shmsys (System V shared memory)
 97 10309b38   28e0  52   1  shmsys (32-bit System V shared memory)
 98 1030bc90    43c   -   1  ipc (common ipc code)
 99 78096000   3723  18   1  ffb (ffb.c 6.42 Aug 11 1998 11:20:45)
100 7809c000   f5ee   -   1  xfb (xfb driver 1.2 Aug 11 1998 11:2)
102 780c2000   1eca   -   1  bootdev (bootdev misc module)


Using the modinfo command, we can see on a sun4u system that the initial modules are loaded from the kernel-text large page. (Address 0x1030bc90 lies within the kernel-text large page, which starts at 0x10000000.)
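The same check can be automated. The sketch below parses a few lines copied from the modinfo output above and tests each load address against the 4-Mbyte large page starting at 0x10000000. The function names are ours, and the check looks only at the load address, not at where the module ends.

```python
# Classify modinfo load addresses by whether they fall inside the
# kernel-text large page. Helper names are hypothetical.

MBYTE = 1 << 20
KERNEL_TEXT_BASE = 0x10000000    # start of the kernel-text large page (from the text)
LARGE_PAGE_SIZE = 4 * MBYTE      # large page size given earlier in the section

MODINFO_SAMPLE = """\
  5 1010c000   4b63   1   1  specfs (filesystem for specfs)
  9 101141c0  29680   2   1  ufs (filesystem for ufs)
 98 1030bc90    43c   -   1  ipc (common ipc code)
 99 78096000   3723  18   1  ffb (ffb.c 6.42 Aug 11 1998 11:20:45)
"""


def in_kernel_text_page(addr):
    """True if addr lies inside the kernel-text large page."""
    return KERNEL_TEXT_BASE <= addr < KERNEL_TEXT_BASE + LARGE_PAGE_SIZE


def classify(modinfo_text):
    """Map module name -> True if its load address is in the large page."""
    result = {}
    for line in modinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].isdigit():
            continue                      # skip headers and blank lines
        loadaddr = int(fields[1], 16)     # Loadaddr column is hexadecimal
        result[fields[5]] = in_kernel_text_page(loadaddr)
    return result


for module, inside in classify(MODINFO_SAMPLE).items():
    print(f"{module:8s} {'kernel-text large page' if inside else 'elsewhere'}")
```

For the sample lines, specfs, ufs, and ipc fall inside the large page, while ffb at 0x78096000 does not.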

On 64-bit platforms, we have an additional segment for the spillover kernel text and data. The reason for having the segment is that the address at which the module text is loaded must be within a 32-bit offset from the kernel text. That's because the 64-bit kernel is compiled with the ABS32 flag so that the kernel can fit all instruction addresses within a 32-bit register. The ABS32 instruction mode provides a significant performance increase and allows the 64-bit kernel to provide similar performance to the 32-bit kernel. Because of that, a separate kernel heap mapping (segkmem32) within a 32-bit offset of the kernel text is used for spillover module text and data.
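The constraint can be illustrated with a one-line check. The sketch below reads "fits within a 32-bit register" as "the address is below 2^32", which is our interpretation of the text. The first two addresses come from the modinfo output above; the third is a made-up 64-bit heap address used only for contrast.

```python
# Which addresses could be held in a 32-bit register without truncation?
# fits_abs32() is a hypothetical helper, not a kernel function.

def fits_abs32(addr):
    """True if addr is representable as an unsigned 32-bit value."""
    return 0 <= addr < (1 << 32)


for addr in (0x1030bc90, 0x78096000, 0x30000000000):
    print(f"{addr:#x}: {'ok' if fits_abs32(addr) else 'out of 32-bit range'}")
```

Module text placed in the general 64-bit kernel heap would fail this check, which is the reason the text gives for a dedicated spillover heap close to the kernel text.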

Solaris does allow some portions of the kernel to be allocated from pageable memory. That way, data structures directly related to process context can be swapped out with the process during a process swap-out operation. Pageable memory is restricted to those structures that are not required by the kernel when the process is swapped out:

  • Lightweight process stacks

  • The TNF Trace buffers

  • Special pages, such as the page of memory that is shared between user and kernel for scheduler preemption control

Pageable memory is allocated and swapped by the seg_kp segment and is only swapped out to its backing store when the memory scheduler (swapper) is activated. (See Section 10.3.6.)

11.1.6. The Kernel Address Space and Segments

The kernel address space is represented by the address space pointed to by the system object, kas. The segment drivers manage the manipulation of the segments within the kernel address space (see Figure 11.2).

Figure 11.2. Kernel Address Space


The full list of segment drivers the kernel uses to create and manage kernel mappings is shown in Table 11.3. The majority of the kernel segments are manually calculated and placed for each platform, with the base address and offset hard-coded into a platform-specific header file. See Appendix A for a complete reference of platform-specific kernel allocation and address maps.

Table 11.3. Solaris Kernel Memory Segment Drivers

| Segment | Function |
|---|---|
| seg_kmem | Allocates and maps nonpageable kernel memory pages. |
| seg_kp | Allocates, maps, and handles page faults for pageable kernel memory. |
| seg_nf | Nonfaulting kernel memory driver. |
| seg_map | Maps the file system cache into the kernel address space. |
| seg_kpm | Maps physical memory into the kernel address space, on 64-bit platforms. Allows fast access to file system page cache. |
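The relationship between the kernel address space and its segments can be sketched as a sorted collection of address ranges, each tagged with the driver that manages it. This is a simplified model, not the kernel's data structures: the `Segment` and `AddressSpace` classes are ours, and every base address and size other than the kernel-text base is invented for illustration (real values come from platform-specific headers).

```python
# Toy model of an address space as a sorted list of segments, with a
# lookup that finds the segment (and driver) covering an address.

import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int
    size: int
    driver: str


class AddressSpace:
    def __init__(self, segments):
        self.segments = sorted(segments, key=lambda s: s.base)
        self._bases = [s.base for s in self.segments]

    def segment_at(self, addr):
        """Return the segment containing addr, or None if it is unmapped."""
        i = bisect.bisect_right(self._bases, addr) - 1
        if i >= 0:
            seg = self.segments[i]
            if addr < seg.base + seg.size:
                return seg
        return None


# Hypothetical layout for illustration only.
kas = AddressSpace([
    Segment(0x10000000, 4 << 20, "seg_kmem"),        # kernel text large page
    Segment(0x10400000, 4 << 20, "seg_kmem"),        # kernel data (assumed placement)
    Segment(0x2a100000000, 1 << 30, "seg_kp"),       # pageable kernel memory
    Segment(0x2a750000000, 256 << 20, "seg_map"),    # file system cache mappings
])

seg = kas.segment_at(0x1030bc90)
print(seg.driver if seg else "unmapped")
```

A fault or mapping request on a kernel address would then be dispatched to the driver of whichever segment covers it. That is the role the segment drivers in Table 11.3 play for kas.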





Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)
ISBN: 0131482092
Year: 2004
Pages: 244