11.1. Disks and Partitions

A common medium for storing user data is a hard[1] disk drive. The storage space on a disk is divided at the hardware level into fundamental units called sectors. In a typical hard drive, each sector holds 512 bytes[2] of user data. A sector may also hold some additional data used internally by the drive, such as data for error correction and synchronization. A disk may also have some number of spare sectors that are not exposed through its interface. If a regular sector goes bad, the disk can attempt to transparently replace it with a spare one. Modern drives deprecate the geometric Cylinder-Head-Sector (CHS) addressing model for accessing a sector. The preferred model is Logical Block Addressing (LBA), in which addressable storage on a drive appears as a linear sequence of sectors.

[1] Given that it uses rigid platters, a hard disk was originally so called to distinguish it from a floppy disk.

[2] In contrast with disk drives, optical drives commonly use a sector size of 2KB.

The program in Figure 11-1 uses disk I/O control (ioctl) operations to retrieve and display basic information about a disk device on Mac OS X.

Figure 11-1. Using ioctl operations to display information about a disk device

// diskinfo.c

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/disk.h>

#define PROGNAME "diskinfo"

void
cleanup(char *errmsg, int retval)
{
    perror(errmsg);
    exit(retval);
}

#define TRY_IOCTL(fd, request, argp) \
    if ((ret = ioctl(fd, request, argp)) < 0) { \
        close(fd); cleanup("ioctl", ret); \
    }

int
main(int argc, char **argv)
{
    int       fd, ret;
    u_int32_t blockSize;
    u_int64_t blockCount;
    u_int64_t maxBlockRead;
    u_int64_t maxBlockWrite;
    u_int64_t capacity1000, capacity1024;
    dk_firmware_path_t fwPath;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <raw disk>\n", PROGNAME);
        exit(1);
    }

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0)
        cleanup("open", 1);

    TRY_IOCTL(fd, DKIOCGETFIRMWAREPATH, &fwPath);
    TRY_IOCTL(fd, DKIOCGETBLOCKSIZE, &blockSize);
    TRY_IOCTL(fd, DKIOCGETBLOCKCOUNT, &blockCount);
    TRY_IOCTL(fd, DKIOCGETMAXBLOCKCOUNTREAD, &maxBlockRead);
    TRY_IOCTL(fd, DKIOCGETMAXBLOCKCOUNTWRITE, &maxBlockWrite);

    close(fd);

    capacity1024  = (blockCount * blockSize) / (1ULL << 30ULL);
    capacity1000  = (blockCount * blockSize) / (1000ULL * 1000ULL * 1000ULL);

    printf("%-20s = %s\n", "Device", argv[1]);
    printf("%-20s = %s\n", "Firmware Path", fwPath.path);
    printf("%-20s = %llu GB / %llu GiB\n", "Capacity",
           capacity1000, capacity1024);
    printf("%-20s = %u bytes\n", "Block Size", blockSize);
    printf("%-20s = %llu\n", "Block Count", blockCount);
    printf("%-20s = { read = %llu blocks, write = %llu blocks }\n",
           "Maximum Request Size", maxBlockRead, maxBlockWrite);

    exit(0);
}

$ gcc -Wall -o diskinfo diskinfo.c
$ sudo ./diskinfo /dev/rdisk0
Device               = /dev/rdisk0
Firmware Path        = first-boot/@0:0
Capacity             = 250 GB / 232 GiB
Block Size           = 512 bytes
Block Count          = 488397168
Maximum Request Size = { read = 2048 blocks, write = 2048 blocks }

The two capacity numbers listed in Figure 11-1 are calculated using the metric and computer definitions of the giga prefix. The metric definition (1 gigabyte = 10^9 bytes; abbreviated as GB) yields a larger number for the same disk than the traditional computer science definition (1 gigabyte = 2^30 bytes; abbreviated as GiB).


A disk may be logically divided into one or more partitions, which are sets of contiguous blocks that can be thought of as subdisks. A partition may contain an instance of a file system. Such an instance (a volume) is essentially a structured file residing on the partition. Besides user data, its contents include data structures that facilitate organization, retrieval, modification, access control, and sharing of the user data, while hiding the disk's physical structure from the user. A volume has its own block size that is usually a multiple of the disk block size.

The mount command, when invoked without any arguments, prints the list of currently mounted file systems. Let us determine the partition that corresponds to the root file system.[3]

[3] You can retrieve information about mounted file systems programmatically by calling the getmntinfo() library function.

$ mount
/dev/disk0s10 on / (local, journaled)
...


In this case, the root file system is on the tenth partition (or slice) of disk0. This system uses the Apple partitioning scheme (see Section 11.1.1), and therefore, the partition table for disk0 can be viewed using the pdisk command, which is a partition table editor for this scheme.

Raw Devices

/dev/rdisk0 is the raw device corresponding to disk0. A disk or disk slice can be accessed through either its block device (such as /dev/disk0 or /dev/disk0s9) or the corresponding character device (such as /dev/rdisk0 or /dev/rdisk0s9). When data is read from or written to a disk's block device, it goes through the operating system's buffer cache. In contrast, with a character device, data transfers are raw, without the involvement of the buffer cache. I/O to raw disk devices requires that the size of an I/O request be a multiple of the disk's block size and that the request's offset be aligned to the block size.

Many systems have historically provided raw devices so that programs for partitioning disks, creating file systems, and repairing existing file systems can do their job without invalidating the buffer cache. An application may also wish to implement its own buffering for the data it reads directly from disk into memory, in which case it is wasteful to have the data cached by the system as well. Using the raw interface to modify data that is already present in the buffer cache could lead to undesirable results. It can also be argued that raw devices are unnecessary because low-level file system utilities are rarely run. Moreover, mmap() is an alternative to reading directly from raw devices.

Sometimes a block device is referred to as a cooked device in contrast to a raw device.


A similar list of disk0's partitions can be obtained using the diskutil command.

$ diskutil list disk0
/dev/disk0
   #:                   type name               size      identifier
   0: Apple_partition_scheme                    *12.0 GB  disk0
   1:    Apple_partition_map                    31.5 KB   disk0s1
...
   8:          Apple_Patches                    256.0 KB  disk0s8
   9:              Apple_HFS Macintosh HD       11.9 GB   disk0s10


11.1.1. The Apple Partitioning Scheme

The disk in Figure 11-2 uses the Apple partitioning scheme, with a specific partition layout called UNIVERSAL HD, which includes several legacy partitions. Let us analyze the pdisk output for disk0.

  • There are 11 partitions on this disk.

  • The first partition (disk0s1) is the partition map that contains partitioning-related metadata. The metadata consists of partition map entries, each of which describes one partition. The map is 63 blocks in size, with each block being 512 bytes.

  • Partition numbers 2 (disk0s2) through 7 (disk0s7) are Mac OS 9 driver partitions. Historically, block device drivers could be loaded from several places: the ROM, a USB or FireWire device, or a special partition on a fixed disk. To support multiple operating systems or other features, a disk may have one or more device drivers installed, each in its own partition. The partitions named Apple_Driver43 in this example contain SCSI Manager 4.3. Note that neither Mac OS X nor the Classic environment uses these Mac OS 9 drivers.

  • Partition number 8 (disk0s8) is a patch partition: a metadata partition that can contain patches to be applied to the system before it can boot.

  • Partition number 9 consists of free space: a partition whose type is Apple_Free.

  • Partition number 10 (disk0s10) is a data partition. In this example, the disk has only one data partition that contains an HFS Plus (or HFS) file system.

  • The trailing free space constitutes the last partition.

Figure 11-2. Listing a disk's partitions

$ sudo pdisk /dev/rdisk0 -dump
Partition map (with 512 byte blocks) on '/dev/rdisk0'
 #:                type name                   length   base     ( size )
 1: Apple_partition_map Apple                      63 @ 1
 2:      Apple_Driver43*Macintosh                  56 @ 64
 3:      Apple_Driver43*Macintosh                  56 @ 120
 4:    Apple_Driver_ATA*Macintosh                  56 @ 176
 5:    Apple_Driver_ATA*Macintosh                  56 @ 232
 6:      Apple_FWDriver Macintosh                 512 @ 288
 7:  Apple_Driver_IOKit Macintosh                 512 @ 800
 8:       Apple_Patches Patch Partition           512 @ 1312
 9:          Apple_Free                        262144 @ 1824     (128.0M)
10:           Apple_HFS Apple_HFS_Untitled_1 24901840 @ 263968   ( 11.9G)
11:          Apple_Free                            16 @ 25165808

Device block size=512, Number of Blocks=25165824 (12.0G)
...

A variant partition layout called UNIVERSAL CD would have partitions containing ATAPI drivers and SCSI Manager for CD.


Figure 11-3 shows details of the Apple partitioning scheme. Although the disk shown in this case has a simpler layout, with no patch or driver partitions, the on-disk data structures follow similar logic regardless of the number and types of partitions.

Figure 11-3. A disk partitioned using the Apple partitioning scheme


The first physical block's first two bytes are set to 0x4552 ('ER'), which is the Apple partitioning scheme signature. The next two bytes represent the disk's physical block size. The total number of blocks on the disk is contained in the next four bytes. We can use the dd command to examine the contents of these bytes for disk0.

$ sudo dd if=/dev/disk0 of=/dev/stdout bs=8 count=1 2>/dev/null | hexdump
0000000 4552 0200 0180 0000
...


We see that the block size is 0x200 (512), and that the disk has 0x1800000 (25165824) 512-byte blocks.

The next 63 512-byte blocks constitute the partition map. Each block represents a single partition map entry that describes a partition. Each map entry contains 0x504D ('PM') as its first two bytes, followed by information that includes the partition's starting offset, size, and type.

The pdisk command lets you view, edit, and create Apple partitions, both interactively and otherwise. Another command-line tool, diskutil, uses the Mac OS X Disk Management framework to let you modify, verify, and repair disks. You can also use the GUI-based Disk Utility application (/Applications/Utilities/Disk Utility.app) to manage disks, partitions, and volumes. Disk Utility allows creation of up to 16 partitions on a disk. One could create as many partitions as would fit in a given partition map (say, using pdisk). However, some programs may not be able to handle more than 16 partitions properly.

11.1.2. PC-Style Partitioning

In contrast with the Apple partitioning scheme, PC partitions may be primary, extended, or logical, with at most four primary partitions allowed on a disk. The first 512-byte sector of a PC disk, the master boot record (MBR), has its space divided as follows: 446 bytes for bootstrap code, 64 bytes for four partition table entries of 16 bytes each, and 2 bytes for a signature. Therefore, the size of a PC partition table is rather limited, which in turn limits the number of primary partitions. However, one of the primary partitions may be an extended partition. An arbitrary number of logical partitions can be defined within an extended partition. The Mac OS X command-line program fdisk can be used to create and manipulate PC-style partitions.

11.1.3. GUID-Based Partitioning

We discussed GUID-based partitioning in Section 4.16.4.4 in the context of the Extensible Firmware Interface (EFI). x86-based Macintosh computers use the GUID-based scheme instead of the Apple partitioning scheme. Although these computers support the Apple partitioning scheme, they can boot only from a disk partitioned using the GUID-based scheme.

Figure 4-23 shows the structure of a GPT-partitioned disk. The gpt command-line program can be used on Mac OS X to initialize a disk with a GUID Partition Table (GPT) and to manipulate partitions within it. Section 11.4.4 provides an example of using gpt. The diskutil command also works with GPT disks.

$ diskutil list disk0 # GPT disk
/dev/disk0
   #:                   type name               size      identifier
   0:  GUID_partition_scheme                    *93.2 GB  disk0
   1:                    EFI                    200.0 MB  disk0s1
   2:              Apple_HFS Mini HD            92.8 GB   disk0s2





Mac OS X Internals: A Systems Approach
ISBN: 0321278542
EAN: 2147483647
Year: 2006
Pages: 161
Authors: Amit Singh