8.14. Task Working Set Detection and Maintenance

The kernel uses physical memory as a cache for virtual memory. When new pages are to be brought in because of page faults, the kernel may need to decide which pages to reclaim from among those that are currently in physical memory. For an application, the kernel should ideally keep in memory those pages that would be needed very soon. In the utopian operating system, the kernel would know ahead of time the pages an application references as it runs. Several algorithms that approximate such optimal page replacement have been researched.

Another approach uses the Principle of Locality, on which the Working Set Model is based. As described in the paper titled "Virtual Memory,"[19] locality can be informally understood as a program's affinity for a subset of its pages, where this set of favored pages changes membership slowly. This gives rise to the working set, informally defined as the set of "most useful" pages for a program. The Working Set Principle establishes the rule that a program may run if and only if its working set is in memory, and a page may not be removed if it is a member of a running program's working set. Studies have shown that keeping a program's working set resident in physical memory typically allows it to run with acceptable performance (that is, without causing an unacceptable number of page faults).
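To make the notion of a working set more concrete, the following minimal sketch computes the size of a working set over a page reference string. It assumes the common window-based formalization, in which the working set at time t with window tau is the set of distinct pages touched by the last tau references. The function name, the MAX_PAGES bound, and the reference-string representation are illustrative assumptions and are not part of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bound on page numbers; not a kernel constant. */
#define MAX_PAGES 64

/*
 * Return the number of distinct pages among the last `tau` references
 * ending at index `t` (inclusive) of the reference string `refs`.
 * Page numbers at or above MAX_PAGES are ignored in this sketch.
 */
size_t working_set_size(const unsigned *refs, size_t t, size_t tau)
{
    bool seen[MAX_PAGES] = { false };
    size_t count = 0;
    /* Clamp the window so that it never starts before index 0. */
    size_t start = (t + 1 > tau) ? t + 1 - tau : 0;

    assert(tau > 0);

    for (size_t i = start; i <= t; i++) {
        unsigned page = refs[i];
        if (page < MAX_PAGES && !seen[page]) {
            seen[page] = true;
            count++;
        }
    }
    return count;
}
```

A program that keeps touching the same few pages yields a small, stable result for successive values of t, which is the slowly changing membership that the Principle of Locality describes.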
8.14.1. The TWS Mechanism

The Mac OS X kernel includes an application-profiling mechanism that can construct per-user, per-application working set profiles, save the corresponding pages in a designated directory, and attempt to load them when the application is executed by that user. We will call this mechanism TWS, for task working set (the various functions and data structures in its implementation have the tws prefix in their names). TWS is integrated with the kernel's page-fault-handling mechanism: it is called when there is a page fault. The first time an application is launched in a given user context, TWS captures the initial working set and stores it in a file in the /var/vm/app_profile/ directory. Several aspects of the TWS scheme contribute to performance.
8.14.2. TWS Implementation

Given a user with user ID U, TWS stores application profiles for that user as two files in /var/vm/app_profile/: #U_names and #U_data, where #U is the hexadecimal representation of U. The names file is a simple database that contains a header followed by profile elements, whereas the data file contains the actual working sets. The profile elements in the names file point to the working sets in the data file.

    // bsd/vm/vm_unix.c

    // header for the "names" file
    struct profile_names_header {
        unsigned int number_of_profiles;
        unsigned int user_id;
        unsigned int version;
        off_t        element_array;
        unsigned int spare1;
        unsigned int spare2;
        unsigned int spare3;
    };

    // elements in the "names" file
    struct profile_element {
        off_t        addr;
        vm_size_t    size;
        unsigned int mod_date;
        unsigned int inode;
        char         name[12];
    };

The kernel maintains a global profile cache data structure containing an array of global profiles, each of whose entries contains profile file information for one user.

    // bsd/vm/vm_unix.c

    // meta information for one user's profile
    struct global_profile {
        struct vnode *names_vp;
        struct vnode *data_vp;
        vm_offset_t   buf_ptr;
        unsigned int  user;
        unsigned int  age;
        unsigned int  busy;
    };

    struct global_profile_cache {
        int                   max_ele;
        unsigned int          age;
        struct global_profile profiles[3]; // up to 3 concurrent users
    };
    ...
    struct global_profile_cache global_user_profile_cache = {
        3,
        0,
        {
            { NULL, NULL, 0, 0, 0, 0 },
            { NULL, NULL, 0, 0, 0, 0 },
            { NULL, NULL, 0, 0, 0, 0 }
        }
    };

Let us use the readksym.sh script to read the contents of global_user_profile_cache. We can see from the output shown in Figure 8-27 that the three global per-user slots are occupied by the user IDs 0x1f6 (502), 0, and 0x1f5 (501).

Figure 8-27. Reading the contents of the TWS subsystem's global user profile cache
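The relationship between the names file and the data file can be illustrated with a small user-space sketch. It assumes that the profile elements have already been read into an array, and it looks up the element for a given executable to obtain the offset and size of its working set in the data file. The structure is a stand-in for the kernel's struct profile_element with assumed field widths, and both the function find_profile() and its matching criteria (name, inode, and modification date) are hypothetical; the kernel's actual lookup logic may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * User-space stand-in for the kernel's struct profile_element.
 * Field widths are assumptions; the kernel uses off_t and vm_size_t.
 */
struct profile_element_sketch {
    int64_t  addr;      /* offset of the working set in the data file */
    uint32_t size;      /* size of the working set in bytes */
    uint32_t mod_date;  /* modification date of the executable */
    uint32_t inode;     /* inode number of the executable */
    char     name[12];  /* executable name; may lack a terminating NUL */
};

/*
 * Hypothetical lookup: return the element whose name, inode, and
 * modification date all match, or NULL if no such element exists.
 */
const struct profile_element_sketch *
find_profile(const struct profile_element_sketch *elems, size_t count,
             const char *name, uint32_t inode, uint32_t mod_date)
{
    for (size_t i = 0; i < count; i++) {
        if (elems[i].inode == inode &&
            elems[i].mod_date == mod_date &&
            strncmp(elems[i].name, name, sizeof(elems[i].name)) == 0) {
            return &elems[i];
        }
    }
    return NULL;
}
```

Given a matching element, a reader of the data file would seek to the element's addr and read size bytes to obtain the saved working set. Comparing the inode and modification date along with the name is one plausible way to reject a stale profile after the executable has been replaced.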
Most of the TWS functionality is implemented in osfmk/vm/task_working_set.c and bsd/vm/vm_unix.c. The former uses functions implemented by the latter for dealing with profile files.
As shown in Figure 8-6, the task structure's dynamic_working_set field is a pointer to a tws_hash structure [osfmk/vm/task_working_set.h]. This pointer is initialized during task creation, specifically by task_create_internal(), which calls task_working_set_create() [osfmk/vm/task_working_set.c]. Conversely, when the task is terminated, the working set is flushed (by task_terminate_internal()) and the corresponding hash entry is destroyed (by task_deallocate()).

    // osfmk/kern/task.c

    kern_return_t
    task_create_internal(task_t parent_task, boolean_t inherit_memory,
                         task_t *child_task)
    {
        ...
        new_task->dynamic_working_set = 0;
        task_working_set_create(new_task, TWS_SMALL_HASH_LINE_COUNT,
                                0, TWS_HASH_STYLE_DEFAULT);
        ...
    }

task_working_set_create() calls tws_hash_create() [osfmk/vm/task_working_set.c] to allocate and initialize a tws_hash structure. As shown in Figure 8-28, execve() saves the executable's name for the TWS mechanism. Before a Mach-O executable is loaded, the Mach-O image activator calls tws_handle_startup_file() [osfmk/vm/task_working_set.c] to preheat the task if possible.

Figure 8-28. TWS-related processing during the execve() system call
tws_handle_startup_file() first calls bsd_read_page_cache_file() [bsd/vm/vm_unix.c] to read the appropriate page cache file. If the read attempt succeeds, the existing profile is read by a call to tws_read_startup_file(). If the read attempt fails because no profile was found for the application, a new profile is created by calling tws_write_startup_file(), which in turn calls task_working_set_create(). The working set information is later written to disk by a call to tws_send_startup_info(), which calls bsd_write_page_cache_file().

The rest (and most) of the TWS activity occurs during page-fault handling. The mechanism is specifically invoked on a page fault, which allows it to monitor the application's fault behavior. vm_fault() [osfmk/vm/vm_fault.c], the page-fault handler, calls vm_fault_tws_insert() [osfmk/vm/vm_fault.c] to add page-fault information to the current task's working set. vm_fault_tws_insert() is provided with a VM object and an offset within it, using which it performs a hash lookup in the tws_hash data structure pointed to by the task's dynamic_working_set field. This way, it determines whether the object/offset pair needs to be inserted in the hash and whether doing so needs the cached working set to be expanded. Moreover, vm_fault_tws_insert() returns a Boolean value to its caller indicating whether the page cache files need to be written. If so, vm_fault() calls tws_send_startup_info() to write the files through an eventual call to bsd_write_page_cache_file().

vm_fault() may also call vm_fault_page() [osfmk/vm/vm_fault.c], which finds the resident page for the virtual memory specified by the given VM object and offset. In turn, vm_fault_page() may need to call the appropriate pager to retrieve the data. Before it issues a request to the pager, it calls tws_build_cluster() to add up to 64 pages from the working set to the request. This allows a single large request to be made to the pager.
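The lookup-then-insert decision made on each fault can be sketched in user space as follows. This is a toy open-addressing table keyed by an object/offset pair, not the kernel's tws_hash implementation: every name here (ws_table, ws_lookup_or_insert(), the result codes, and the SKETCH_SLOTS size) is hypothetical, and the hash function and 4KB page shift are assumptions chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SLOTS 8  /* deliberately tiny; not the kernel's sizing */

struct ws_entry {
    const void *object;  /* stands in for a VM object pointer */
    uint64_t    offset;  /* page-aligned offset within the object */
    bool        used;
};

struct ws_table {
    struct ws_entry slots[SKETCH_SLOTS];
    size_t          count;
};

enum ws_result { WS_PRESENT, WS_INSERTED, WS_NEEDS_EXPANSION };

/* Mix the object address with the page number (assumes 4KB pages). */
static size_t ws_hash(const void *object, uint64_t offset)
{
    uint64_t h = (uint64_t)(uintptr_t)object ^ (offset >> 12);
    return (size_t)(h % SKETCH_SLOTS);
}

/*
 * Look up the object/offset pair using linear probing. If the pair is
 * absent, insert it into the first free slot found. If the table is
 * full, report that the working set would have to be expanded.
 */
enum ws_result
ws_lookup_or_insert(struct ws_table *t, const void *object, uint64_t offset)
{
    size_t start = ws_hash(object, offset);

    for (size_t i = 0; i < SKETCH_SLOTS; i++) {
        struct ws_entry *e = &t->slots[(start + i) % SKETCH_SLOTS];
        if (!e->used) {
            e->object = object;
            e->offset = offset;
            e->used = true;
            t->count++;
            return WS_INSERTED;
        }
        if (e->object == object && e->offset == offset)
            return WS_PRESENT;
    }
    return WS_NEEDS_EXPANSION;
}
```

A caller in the role of the fault handler would invoke ws_lookup_or_insert() with a zero-initialized table for every faulting object/offset pair and treat WS_NEEDS_EXPANSION as the cue to grow the cached working set.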