Section 8.8. New Features of the Virtual Memory Implementation



The Solaris virtual memory system was originally derived from BSD UNIX. Since then, the major architectural changes have been the union of files and virtual memory to provide a unified cache, and the object layering of VM into modules, which maximizes the commonality of the code across multiple platforms and devices.

During the development of Solaris, many features have been added to the virtual memory system, building upon the underlying framework:

  • File system cache scalability improvements. Historically, the file system cache could be quite intrusive on application performance because of the paging pressure caused by file system reads and writes. Beginning with Solaris 8, the file system cache was lowered in priority and made cyclic, so that file system reads and writes consume only the available free memory and the cache pages against itself.

    A new page mapping facility minimizes the overhead of accessing pages during file system I/O. By using the 64-bit address space (on SPARC and x64 architectures), the kernel creates a permanent mapping of all physical pages into its address space (SEGKPM), eliminating the need to map and unmap pages for each I/O.

  • Utilization of large MMU pages. As Moore's law marches on, memory sizes have effectively doubled every 18 months. The virtual memory system has scaled from an original design center of around one megabyte to one terabyte today. To enable performance to scale, MMUs typically support more than one page size, and the largest page size has scaled approximately with physical memory size. MMU page sizes on the first SPARC processors were 4 Kbytes, and the largest available now is 256 Mbytes.

    The kernel text and some kernel data are placed on large MMU pages when possible. Beginning with Solaris 2.6, some types of shared memory (specifically ISM, used by Oracle, Sybase, and others) are configured to use large pages when available. A generic framework, Multiple Page Size Support (MPSS), was introduced in Solaris 9 to allow applications to leverage different MMU page sizes.

  • Support for non-uniform memory access (NUMA) architectures. Many high-end systems now have federated, non-uniform memory locality groups. By definition, all processors in an SMP system need shared access to the system memory. However, as SMP systems grow to larger processor counts, their memory systems often have to accept higher memory latencies, giving good system throughput at the expense of per-processor performance. Alternatively, the memory system can be broken into clusters of processors and memory, with fast access to memory "close" to the processor and slower access to memory that resides in another group. This approach is the basis for NUMA architectures.

    Beginning with Solaris 9, the Solaris virtual memory system provides Memory Placement Optimization (MPO), built on the concept of locality groups (lgroups), which allows the kernel to place memory allocations close to the processors that are likely to use them. Applications can provide hints to the kernel about the intended relationship between memory and threads, and the kernel uses these hints to optimize scheduling and page allocation accordingly.

  • Dynamic reconfiguration. Dynamic reconfiguration allows hardware components (including physical memory boards) to be added to and removed from the system while it is online. The virtual memory system has been enhanced to maximize the amount of memory that can be added to and removed from the system.

    Memory can be added dynamically, resulting in new pages being added to the system's free list for immediate consumption by applications. To facilitate dynamic removal of memory, the kernel has facilities for dynamically freeing or relocating pages if they are being used by applications. Because kernel pages are sometimes non-relocatable, they are confined to a "kernel cage," which allows all but one board to be dynamically removed from the system. Beginning with Solaris 10, all but a small component of the kernel is restricted to the cage.

    Event hooks are also provided to pre-notify applications of physical memory capacity changes. These hooks are provided by the resource configuration manager (RCM) and provide a scriptable interface to notify interested applications. At the time of writing, Oracle 9 is one such application. Using Oracle's dynamic SGA feature, Oracle can be configured to automatically grow and shrink according to memory capacity changes.

  • Modern memory allocators. Modern memory allocators have been added to the kernel. Beginning with Solaris 2.4, the kernel memory allocator was replaced with the "Slab Allocator." The new allocator provides efficient allocation of memory objects with minimal fragmentation. The allocator optimizes for SMPs by providing distinct reuse caches for each processor in the system, minimizing the amount of cross-processor memory sharing traffic. Beginning with Solaris 8, the kernel also uses a universal resource allocator (vmem). The vmem allocator manages allocations of arbitrary resources, represented by sets of integers. It replaces the older resource map allocators and also serves as a back end to the slab allocator to manage kernel virtual memory.
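
The idea of managing an arbitrary resource as sets of integers, as described for vmem above, can be illustrated with a small sketch. It assumes a first-fit policy over an address-ordered free list with merging of adjacent free ranges. It is not the Solaris vmem implementation or its interface, and every name in it (span_t, arena_t, arena_add, arena_alloc, arena_free) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy allocator; NOT the Solaris vmem interface. */

typedef struct span {
    unsigned long start;    /* first integer in this free range */
    unsigned long size;     /* number of integers in the range */
    struct span *next;      /* next free span, ordered by start */
} span_t;

typedef struct arena {
    span_t *free_list;      /* address-ordered list of free spans */
} arena_t;

/*
 * Return [start, start + size) to the arena, merging with adjacent
 * free spans. No overlap or double-free checking is done (sketch only).
 * Returns 0 on success, -1 if bookkeeping memory cannot be allocated.
 */
int arena_free(arena_t *a, unsigned long start, unsigned long size)
{
    span_t **pp = &a->free_list;
    span_t *prev = NULL;

    assert(size > 0);
    while (*pp != NULL && (*pp)->start < start) {
        prev = *pp;
        pp = &(*pp)->next;
    }
    span_t *next = *pp;

    if (prev != NULL && prev->start + prev->size == start) {
        prev->size += size;                 /* merge with predecessor */
        if (next != NULL && prev->start + prev->size == next->start) {
            prev->size += next->size;       /* and with successor */
            prev->next = next->next;
            free(next);
        }
        return 0;
    }
    if (next != NULL && start + size == next->start) {
        next->start = start;                /* merge with successor */
        next->size += size;
        return 0;
    }
    span_t *s = malloc(sizeof (*s));        /* isolated range: new span */
    if (s == NULL)
        return -1;
    s->start = start;
    s->size = size;
    s->next = next;
    *pp = s;
    return 0;
}

/* Seeding an arena with a range is the same operation as freeing it. */
int arena_add(arena_t *a, unsigned long start, unsigned long size)
{
    return arena_free(a, start, size);
}

/*
 * Allocate `size` consecutive integers using first fit.
 * On success stores the first integer in *out and returns 0;
 * returns -1 if no free span is large enough.
 */
int arena_alloc(arena_t *a, unsigned long size, unsigned long *out)
{
    assert(size > 0);
    for (span_t **pp = &a->free_list; *pp != NULL; pp = &(*pp)->next) {
        span_t *s = *pp;
        if (s->size < size)
            continue;
        *out = s->start;                    /* carve from the front */
        s->start += size;
        s->size -= size;
        if (s->size == 0) {                 /* span consumed: unlink it */
            *pp = s->next;
            free(s);
        }
        return 0;
    }
    return -1;
}
```

For example, an arena seeded with arena_add(&a, 100, 50) hands out 100 for a first request of 20 integers and 120 for a following request of 30. Freeing both ranges merges them back into a single span of 50 starting at 100. The integers could stand for virtual addresses, process IDs, or any other numbered resource, which is the generality the text attributes to vmem.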




Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)
ISBN: 0131482092
Year: 2004
Pages: 244
