11.3. The Implementation of Disk Devices

Although the file system layer in the Mac OS X kernel sees storage devices as BSD devices, the I/O Kit ultimately drives these devices. Figure 11-8 shows the relevant portion of the I/O Kit stack on a system with two serial ATA (SATA) disks.

Figure 11-8. An I/O Kit stack depicting a disk device and its partitions

The IOATABlockStorageDriver is a client of the I/O Kit ATA family and a member of the storage family. In the I/O Kit, the actual storage on a storage device is represented by an I/O Media object (IOMedia), an instance of which can abstract several types of random access devices, both real and virtual, such as whole disks, disk partitions, and disk supersets (for example, RAID volumes).
Apple's implementation of software RAID (AppleRAID) combines multiple block devices to construct an I/O Kit storage stack yielding a single virtual device. When I/O is performed to the virtual device, the RAID implementation calculates the offsets on the specific physical devices to which the I/O must be dispatched.

An I/O Media object acts as a channel for all I/O that goes to the storage underlying it. As we saw in Chapter 10, Mac OS X also supports I/O Media Filter objects, which are subclasses of IOMedia and can be inserted between I/O Media objects and their clients, thereby routing all I/O through the filter object as well.

The IOMediaBSDClient class, which is implemented as part of the IOStorageFamily I/O Kit family, is the entity in charge of making storage devices appear as BSD-style block and character devices. In particular, as disks and partitions appear in the I/O Kit, IOMediaBSDClient calls the device file system (devfs) to dynamically add the corresponding block and character device nodes. Similarly, when a device is removed, IOMediaBSDClient calls devfs to remove the corresponding BSD nodes. The block and character device function tables (the traditional Unix-style bdevsw and cdevsw structures) are also part of the IOMediaBSDClient implementation (Figure 11-9).

Figure 11-9. The Mac OS X block and character device switch structures
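The node bookkeeping described above can be pictured with a small user-space model. The following Python sketch is not kernel code: the names DevSwitchEntry, ToyDevfs, media_appeared(), and media_removed() are hypothetical, and the real bdevsw and cdevsw structures are C function tables with more entry points than this model shows. It assumes only the usual Mac OS X naming, in which a medium's block node is /dev/diskN and its character (raw) node is /dev/rdiskN.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class DevSwitchEntry:
    """Toy stand-in for one bdevsw- or cdevsw-style function table."""
    d_open: Callable[[str], str]
    d_close: Callable[[str], str]
    d_io: Callable[[str, int, int], str]  # strategy (block) or read/write (char)


def _make_entry(kind: str) -> DevSwitchEntry:
    # Each "entry point" just reports what it was asked to do.
    return DevSwitchEntry(
        d_open=lambda node: f"{kind} open {node}",
        d_close=lambda node: f"{kind} close {node}",
        d_io=lambda node, offset, length: f"{kind} io {node} @{offset}+{length}",
    )


BLOCK_SWITCH = _make_entry("block")  # hypothetical analogue of a bdevsw row
CHAR_SWITCH = _make_entry("char")    # hypothetical analogue of a cdevsw row


class ToyDevfs:
    """Maps device node paths to the function table that services them."""

    def __init__(self) -> None:
        self.nodes: Dict[str, DevSwitchEntry] = {}

    def media_appeared(self, bsd_name: str) -> None:
        # One medium yields two nodes: a block node and a raw character node.
        self.nodes[f"/dev/{bsd_name}"] = BLOCK_SWITCH
        self.nodes[f"/dev/r{bsd_name}"] = CHAR_SWITCH

    def media_removed(self, bsd_name: str) -> None:
        self.nodes.pop(f"/dev/{bsd_name}", None)
        self.nodes.pop(f"/dev/r{bsd_name}", None)

    def io(self, path: str, offset: int, length: int) -> str:
        # Dispatch through whichever table the node was registered with.
        return self.nodes[path].d_io(path, offset, length)
```

In this model, calling media_appeared("disk0s2") registers both /dev/disk0s2 and /dev/rdisk0s2, and a later media_removed("disk0s2") removes both, mirroring the add and remove calls that IOMediaBSDClient makes to devfs.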
Let us see an example of how I/O propagates from the file system to a disk device. Figure 11-10 is partially derived from Figure 8-52, which showed an overview of a page-in operation. In Figure 11-10, we follow the path of a typical read request destined for an HFS Plus volume residing on an ATA device.

Figure 11-10. A typical read request's journey to a disk device

Note in Figure 11-10 that cluster_io() and related routines represent the typical I/O path in the kernel: one that goes through the unified buffer cache. Although not shown in the figure, before issuing the I/O through the file system's strategy routine, cluster_io() calls the file system's VNOP_BLOCKMAP() operation to map file offsets to disk offsets. Eventually, the strategy routine of the block device, dkstrategy(), is called. dkstrategy() calls dkreadwrite(), which sends the I/O down the I/O Kit stack. In this example, the device is an ATA device. When the I/O eventually reaches the IOBlockStorageDriver class, the latter will choose the appropriate ATA commands and flags to perform the actual transfer.
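The mapping of file offsets to disk offsets that precedes the strategy call can be illustrated outside the kernel. The following Python sketch is a simplified model, not the HFS Plus implementation: the Extent type, the blockmap() function, and the 4096-byte allocation block size are assumptions made for illustration. Given a file's extents, it translates a file offset into a byte offset on the device and reports how many contiguous bytes follow, which is roughly the information a blockmap operation hands back to the cluster layer.

```python
from typing import List, NamedTuple, Tuple

BLOCK_SIZE = 4096  # assumed allocation block size, in bytes


class Extent(NamedTuple):
    start_block: int  # first allocation block of the extent on the device
    block_count: int  # number of contiguous allocation blocks in the extent


def blockmap(extents: List[Extent], file_offset: int, size: int) -> Tuple[int, int]:
    """Return (device_offset, run) for a request at file_offset.

    run is the number of bytes, at most size, that are contiguous on the
    device starting at device_offset.
    """
    if file_offset < 0 or size <= 0:
        raise ValueError("file_offset must be >= 0 and size must be > 0")

    extent_start = 0  # file offset at which the current extent begins
    for ext in extents:
        extent_len = ext.block_count * BLOCK_SIZE
        if file_offset < extent_start + extent_len:
            within = file_offset - extent_start
            device_offset = ext.start_block * BLOCK_SIZE + within
            run = min(size, extent_len - within)
            return device_offset, run
        extent_start += extent_len

    raise ValueError("file offset lies beyond the last extent")
```

A caller in this model would issue one I/O for the returned run and call blockmap() again for the remainder, so a request that spans two extents becomes two device transfers.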
Note that Mac OS X does not use explicit disk scheduling. In particular, I/O requests are not explicitly reordered, although the non-I/O Kit parts of the kernel may defer a request in order to combine several requests into a single large request.
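The difference between combining requests and reordering them can be made concrete with a short sketch. The coalesce() function and its max_size limit below are hypothetical and stand in for clustering logic that is considerably more involved in the kernel. The model merges a request into its predecessor only when the two are adjacent on the device, and it never changes the order in which requests arrived.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (device byte offset, length in bytes)


def coalesce(requests: List[Request], max_size: int) -> List[Request]:
    """Merge each request into its predecessor when it starts exactly where
    the predecessor ends and the merged length stays within max_size.

    Arrival order is preserved; nothing is sorted or reordered.
    """
    merged: List[Request] = []
    for offset, length in requests:
        if merged:
            prev_offset, prev_length = merged[-1]
            adjacent = prev_offset + prev_length == offset
            if adjacent and prev_length + length <= max_size:
                merged[-1] = (prev_offset, prev_length + length)
                continue
        merged.append((offset, length))
    return merged
```

Note that a request is left alone if it is adjacent only to an earlier, non-neighboring request in the queue: merging it would require moving it ahead of the requests in between, which is the kind of reordering a disk scheduler would perform.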