6.2. Mach
Let us briefly review our discussion of Mach from Chapters 1 and 2. Mach was designed as a communications-oriented operating system kernel with full multiprocessing support. Various types of operating systems could be built upon Mach. It aimed to be a microkernel in which traditional operating system services such as file systems, I/O, memory managers, networking stacks, and even operating system personalities were meant to reside in user space, with a clean logical and modular separation between them and the kernel. In practice, releases of Mach prior to release 3 had monolithic implementations. Release 3 (a project started at Carnegie Mellon University and continued by the Open Software Foundation) was the first true microkernel version of Mach: BSD ran as a user-space task in this version. The Mach portions of xnu were originally based on Open Group's Mach Mk 7.3 system, which in turn was based on Mach 3. xnu's Mach contains enhancements from MkLinux and work done on Mach at the University of Utah. Examples of the latter include the migrating thread model, wherein the thread abstraction is further decoupled into an execution context and a schedulable thread of control with an associated chain of contexts.
In this chapter, we will discuss basic Mach concepts and programming abstractions. We will look at some of these concepts in more detail in the next three chapters in the context of process management, memory management, and interprocess communication (IPC).
In this book, Mach-related programming examples are presented to demonstrate the internal working of certain aspects of Mac OS X. However, Apple does not support the direct use of most Mach-level APIs by third-party programs. Consequently, you are advised against using such APIs in software you distribute.
6.2.1. Kernel Fundamentals
Mach provides a virtual machine interface to higher layers by abstracting system hardware, a scenario that is common among many operating systems. The core Mach kernel is designed to be simple and extensible: it provides an IPC mechanism that is the building block for many services offered by the kernel. In particular, Mach's IPC features are unified with its virtual memory subsystem, which leads to various optimizations and simplifications.
The 4.4BSD virtual memory system was based on the Mach 2.0 virtual memory system, with updates from newer versions of Mach.
Mach has five basic abstractions from a programmer's standpoint: tasks, threads, ports, messages, and memory objects.
Besides providing the basic kernel abstractions, Mach represents various other hardware and software resources as port objects, allowing manipulation of such resources through its IPC mechanism. For example, Mach represents the overall computer system as a host object, a single physical CPU as a processor object, and one or more groups of CPUs in a multiprocessor system as processor set objects.
6.2.1.1. Tasks and Threads
Mach divides the traditional Unix abstraction of a process into two parts: a task and a thread. As we will see in Chapter 7, the terms thread and process have context-specific connotations in the Mac OS X user space, depending on the environment. Within the kernel, a BSD process, which is analogous to a traditional Unix process, is a data structure with a one-to-one mapping with a Mach task. A Mach task has the following key features.
A thread is the actual executing entity in Mach: it is a point of control flow in a task. It has the following features.
To sum up, a task is passive, owns resources, and is a basic unit of protection. Threads within a task are active, execute instructions, and are basic units of control flow. A single-threaded traditional Unix process is analogous to a Mach task with only one thread, whereas a multithreaded Unix process is analogous to a Mach task with many threads.
A task is considerably more expensive to create or destroy than a thread. Whereas every thread has a containing task, a Mach task is not related to its creating task, unlike Unix processes. However, the kernel maintains process-level parent-child relationships in the BSD process structures. Nevertheless, we may consider a task that creates another task to be the parent task and the newly created task to be the child task. During creation, the child inherits certain aspects of the parent, such as registered ports, exception and bootstrap ports, audit and security tokens, shared mapping regions, and the processor set. Note that if the parent's processor set has been marked inactive, the child is assigned to the default processor set.
Once a task is created, anyone with a valid task identifier (and thus the appropriate rights to a Mach IPC port) can perform operations on the task. A task can send its identifier to other tasks in an IPC message, if it so desires.
6.2.1.2. Ports
A Mach port is a multifaceted abstraction. It is a kernel-protected unidirectional IPC channel, a capability, and a name. Traditionally in Mach, a port is implemented as a message queue with a finite length.
Besides Mach ports, Mac OS X provides many other types of IPC mechanisms, both within the kernel and in user space. Examples of such mechanisms include POSIX and System V IPC, multiple notification mechanisms, descriptor passing, and Apple Events. We will examine several IPC mechanisms in Chapter 9.
The port abstraction, along with associated operations (the most fundamental being send and receive), is the basis for communication in Mach. A port has kernel-managed capabilities, or rights, associated with it. A task must hold the appropriate rights to manipulate a port. For example, rights determine which task can send messages to a given port or which task may receive messages destined for it. Several tasks can have send rights to a particular port, but only one task can hold receive rights to a given port.
In the object-oriented sense, a port is an object reference. Various abstractions in Mach, including data structures and services, are represented by ports. In this sense, a port acts as a protected access provider to a system resource. You access objects such as tasks, threads, or memory objects[1] through their respective ports. For example, each task has a task port that represents that task in calls to the kernel. Similarly, a thread's point of control is accessible to user programs through a thread port. Any such access requires a port capability, which is the right to send or receive messages to that port, or rather, to the object the port represents. In particular, you perform operations on an object by sending messages to one of its ports.[2] The object holding receive rights to the port can then receive the message, process it, and possibly perform an operation requested in the message. The following are two examples of this mechanism.
Since a port is a per-task resource, all threads within a task automatically have access to the task's ports. A task can allow other tasks to access one or more of its ports. It does so by passing port rights in IPC messages to other tasks. Moreover, a thread can access a port only if the port is known to the containing task: there is no global, system-wide port namespace.
Several ports may be grouped together in a port set. All ports in a set share the same queue. Although there still is a single receiver, each message contains an identifier for the specific port within the port set on which the message was received. This functionality is similar to the Unix select() system call.
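The port set behavior described above can be sketched with a small toy model in portable C. This is not Mach code: toy_port_set, toy_msg, toy_set_send(), and toy_set_receive() are hypothetical names invented for illustration, and the model assumes that a send on a full queue simply fails, which is a simplification. It shows only the two properties stated in the text: member ports share one finite queue, and each received message identifies the port on which it arrived.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the Mach headers. */

#define TOY_QUEUE_LEN 4            /* the shared queue has a finite length */

typedef struct {
    int port_name;                 /* member port the message was sent to */
    int payload;                   /* stand-in for the message body */
} toy_msg;

typedef struct {
    toy_msg queue[TOY_QUEUE_LEN];  /* one queue shared by all member ports */
    size_t head;                   /* index of the oldest queued message */
    size_t count;                  /* number of queued messages */
} toy_port_set;

/* Queue a message sent to member port `port_name`.
 * Returns 0 on success, or -1 if the queue is full (an assumption of
 * this sketch, not a statement about real Mach behavior). */
int toy_set_send(toy_port_set *set, int port_name, int payload)
{
    assert(set != NULL);
    if (set->count == TOY_QUEUE_LEN)
        return -1;
    size_t tail = (set->head + set->count) % TOY_QUEUE_LEN;
    set->queue[tail] = (toy_msg){ port_name, payload };
    set->count++;
    return 0;
}

/* Dequeue the oldest message, whichever member port it arrived on.
 * Returns 0 on success, or -1 if no message is queued. */
int toy_set_receive(toy_port_set *set, toy_msg *out)
{
    assert(set != NULL && out != NULL);
    if (set->count == 0)
        return -1;
    *out = set->queue[set->head];
    set->head = (set->head + 1) % TOY_QUEUE_LEN;
    set->count--;
    return 0;
}
```

A single receiver that calls toy_set_receive() in a loop can dispatch on the port_name field of each message, much as a select() loop dispatches on whichever descriptor became ready.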
Note that a port can be used to send messages in only one direction. Therefore, unlike a BSD socket, a port does not represent an end point of a bidirectional communication channel. If a request message is sent on a certain port and the sender needs to receive a reply, another port must be used for the reply.
As we will see in Chapter 9, a task's IPC space includes mappings from port names to the kernel's internal port objects, along with rights for these names. A Mach port's name is an integer, conceptually similar to a Unix file descriptor. However, Mach ports differ from file descriptors in several ways. For example, a file descriptor may be duplicated multiple times, with each descriptor being a different number referring to the same open file. If multiple port rights are similarly opened for a particular port, the port names will coalesce into a single name, which would be reference-counted for the number of rights it represents. Moreover, other than certain standard ports such as registered, bootstrap, and exception ports, Mach ports are not inherited implicitly across the fork() system call.
6.2.1.3. Messages
Mach IPC messages are data objects that threads exchange with each other to communicate. Typical intertask communication in Mach, including between the kernel and user tasks, occurs using messages. A message may contain actual inline data or a pointer to out-of-line (OOL) data. OOL data transfer is an optimization for large transfers, wherein the kernel allocates a memory region for the message in the receiver's virtual address space, without making a physical copy of the message. The shared memory pages are marked copy-on-write (COW).
A message may contain arbitrary program data, copies of memory ranges, exceptions, notifications, port capabilities, and so on. In particular, the only way to transfer port capabilities from one task to another is through messages.
Mach messages are transferred asynchronously.
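To make the earlier contrast between port names and file descriptors concrete, the following is a minimal sketch in portable C of a per-task name table in which repeated send rights to the same port coalesce into one reference-counted name. It is a toy model, not the kernel's data structure: toy_ipc_space, toy_entry, toy_insert_send_right(), and toy_send_refs() are hypothetical names, and using a slot index as the port name is an assumption made purely for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the Mach headers. */

#define TOY_SPACE_SLOTS 8

typedef struct {
    int in_use;           /* nonzero once this name has been allocated */
    const void *port;     /* identity of the underlying port object */
    unsigned send_refs;   /* how many send rights this one name represents */
} toy_entry;

typedef struct {
    toy_entry slots[TOY_SPACE_SLOTS];  /* the slot index serves as the name */
} toy_ipc_space;

/* Record one more send right to `port` in `space`.
 * If the task already has a name for this port, the existing name is
 * reused and its reference count is incremented (the names coalesce).
 * Returns the name, or -1 if the toy table is full. */
int toy_insert_send_right(toy_ipc_space *space, const void *port)
{
    assert(space != NULL && port != NULL);
    int free_slot = -1;
    for (int name = 0; name < TOY_SPACE_SLOTS; name++) {
        toy_entry *e = &space->slots[name];
        if (e->in_use && e->port == port) {
            e->send_refs++;            /* same port: same name, one more ref */
            return name;
        }
        if (!e->in_use && free_slot < 0)
            free_slot = name;          /* remember the first unused name */
    }
    if (free_slot < 0)
        return -1;
    space->slots[free_slot] = (toy_entry){ 1, port, 1 };
    return free_slot;
}

/* Number of send rights represented by `name`, or 0 if it is unused. */
unsigned toy_send_refs(const toy_ipc_space *space, int name)
{
    assert(space != NULL && name >= 0 && name < TOY_SPACE_SLOTS);
    return space->slots[name].in_use ? space->slots[name].send_refs : 0;
}
```

Inserting a right to the same port twice yields the same name with a count of 2, whereas duplicating a file descriptor would have produced a second, different number for the same open file.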
Even though only one task can hold receive rights to a port, multiple threads within a task may attempt to receive messages on a port. In such a case, only one of the threads will succeed in receiving a given message.
6.2.1.4. Virtual Memory and Memory Objects
Mach's virtual memory (VM) system can be cleanly separated into machine-independent and machine-dependent parts. For example, address maps, memory objects, share maps, and resident memory are machine-independent, whereas the physical map (pmap) is machine-dependent. We will discuss VM-related abstractions in detail in Chapter 8. Features of Mach's VM design include the following.
A memory object is a container for data (including file data) that is mapped into the address space of a task. It serves as a channel for providing memory to tasks. Mach traditionally allows a memory object to be managed by a user-mode external memory manager, wherein the handling of page faults and page-out data requests can be performed in user space. An external pager can also be used to implement networked virtual memory. This external memory management (EMM) feature of Mach is not used in Mac OS X. xnu provides basic paging services in the kernel through three pagers: the default (anonymous) pager, the vnode pager, and the device pager.
The default pager handles anonymous memory, that is, memory with no explicitly designated pager. It is implemented in the Mach portion of the kernel. With help from the dynamic_pager user-space application,[3] which manages on-disk backing-store (or swap) files, the default pager pages to swap files on a normal file system.
Swap files reside under the /var/vm/ directory by default. The files are named swapfileN, where N is the swap file's number. The first swap file is called swapfile0.
The vnode pager is used for memory-mapped files. Since the Mac OS X VFS is in the BSD portion of the kernel, the vnode pager is implemented in the BSD layer. The device pager is used for non-general-purpose memory. It is implemented in the Mach layer but used by the I/O Kit.
6.2.2. Exception Handling
A Mach exception is a synchronous interruption of a program's execution that occurs due to the program itself. The causes for exceptions can be erroneous conditions such as executing an illegal instruction, dividing by zero, or accessing invalid memory. Exceptions can also be caused deliberately, such as during debugging, when a debugger breakpoint is hit.
xnu's Mach implementation associates an array of exception ports with each task and another with each thread within a task. Each such array has as many slots as there are exception types defined for the implementation, with slot 0 being invalid. All of a thread's exception ports are set to the null port (IP_NULL) when the thread is created, whereas a task's exception ports are inherited from those of the parent task. The kernel allows a programmer to get or set individual exception ports for both tasks and threads. Consequently, a program can have multiple exception handlers. A single handler may also handle multiple exception types. Typical preparation for exception handling by a program involves allocation of one or more ports to which the kernel will send exception notification messages. The port can then be registered as an exception port for one or more types of exceptions for either a thread or a task. The exception handler code typically runs in a dedicated thread, waiting for notification messages from the kernel.
Exception handling in Mach can be viewed as a metaoperation consisting of several suboperations.
The thread that causes an exception is called the victim thread, whereas the thread that runs the exception handler is called the handler thread. When a victim causes (raises) an exception, the kernel suspends the victim thread and sends a message to the appropriate exception port, which may be either a thread exception port (more specific) or a task exception port (if the thread has not set an exception port). Upon receiving (catching) the message, the handler thread processes the exception, an operation that may involve fixing the victim's state, arranging for it to be terminated, logging an error, and so on. The handler replies to the message, indicating whether the exception was processed successfully (cleared). Finally, the kernel either resumes the victim or terminates it.
A thread exception port is typically relevant for error handling. Each thread may have its own exception handlers that process exceptions corresponding to errors that affect only individual threads. A task exception port is typically relevant for debugging. A debugger can attach to a task by registering one of its own ports as the debugged task's exception port. Since a task inherits its exception ports from the creating task, the debugger will also be able to control child processes of the debugged program. Moreover, exception notifications for all threads that have no registered exception port will be sent to the task exception port. Recall that a thread is created with null exception ports and, correspondingly, with no default handlers. Therefore, this works well in the general case. Even when a thread does have valid exception ports, the corresponding exception handlers can forward exceptions to the task exception port.
We will look at a programming example of Mach exception handling in Chapter 9.
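The choice of exception port described in this section can be summarized with a small toy model in portable C. It is a sketch under stated assumptions, not kernel code: toy_task, toy_thread, and the toy_* functions are hypothetical names, the number of exception types is arbitrary, and ports are modeled as plain integers with 0 standing in for the null port. The model covers only what the text states: a new thread starts with null exception ports, a new task inherits its creator's exception ports, and the thread's port is preferred over the task's. It leaves out message delivery, the handler's reply, and everything else the real kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the Mach headers. */

#define TOY_EXC_TYPES 4   /* arbitrary; slot 0 is treated as invalid */
#define TOY_PORT_NULL 0   /* stand-in for the null port */

typedef struct {
    int exc_ports[TOY_EXC_TYPES];   /* per-task exception ports */
} toy_task;

typedef struct {
    toy_task *task;                 /* containing task */
    int exc_ports[TOY_EXC_TYPES];   /* per-thread exception ports */
} toy_thread;

/* A new task inherits the exception ports of its creator, if any. */
toy_task toy_task_create(const toy_task *parent)
{
    toy_task child = { { TOY_PORT_NULL } };
    if (parent != NULL)
        child = *parent;
    return child;
}

/* A new thread starts with every exception port set to the null port. */
toy_thread toy_thread_create(toy_task *task)
{
    toy_thread thread = { task, { TOY_PORT_NULL } };
    return thread;
}

/* Pick the port that would be notified for `exc_type`: the thread's own
 * port if it has set one, otherwise the containing task's port. */
int toy_exception_destination(const toy_thread *victim, int exc_type)
{
    assert(victim != NULL && victim->task != NULL);
    assert(exc_type > 0 && exc_type < TOY_EXC_TYPES);
    if (victim->exc_ports[exc_type] != TOY_PORT_NULL)
        return victim->exc_ports[exc_type];
    return victim->task->exc_ports[exc_type];
}
```

In this model, a debugger that registers port 100 in slot 1 of a task is also selected for threads of any task later created from it, until one of those threads sets its own port for that exception type.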