Background | Volume Analysis

Volume Concepts

Volume systems are built on two central concepts. One is to assemble multiple storage volumes into a single storage volume, and the other is to divide a storage volume into independent partitions. The terms "partition" and "volume" are frequently used together, but I am going to make a distinction.

A volume is a collection of addressable sectors that an Operating System (OS) or application can use for data storage. The sectors in a volume need not be consecutive on a physical storage device; they need only give the impression that they are. A hard disk is an example of a volume that is located in consecutive sectors. A volume may also be the result of assembling and merging smaller volumes.

General Theory of Partitions

One of the concepts in a volume system is to create partitions. A partition is a collection of consecutive sectors in a volume. By definition, a partition is also a volume, which is why the terms are frequently confused. I will refer to the volume in which a partition is located as the partition's parent volume. Partitions are used in many scenarios, including the following:

Some file systems have a maximum size that is smaller than hard disks.
Many laptops use a special partition to store memory contents when the system is put to sleep.
UNIX systems use different partitions for different directories to minimize the impact of file system corruption.
IA32-based systems that have multiple operating systems, such as Microsoft Windows and Linux, may require separate partitions for each operating system.

Consider a Microsoft Windows system with one hard disk. The hard disk volume is partitioned into three smaller volumes, and each has a file system. Windows assigns the names C, D, and E to these volumes. We can see this in Figure 4.1.

Figure 4.1. An example hard disk volume is organized into three partitions, which are assigned volume names.

Each operating system and hardware platform typically uses a different partitioning method. We will cover the different implementations in Chapter 5, "PC-based Partitions," and Chapter 6, "Server-based Partitions," but we will examine the basic components here. The common partition systems have one or more tables, and each table entry describes a partition. The data in the entry will have the starting sector of the partition, the ending sector of the partition (or the length), and the type of partition. Figure 4.2 shows an example table with three partitions.

Figure 4.2. A basic table with entries for the start, end, and type of each partition.
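As a rough illustration of the table described above, the following Python sketch models each entry as a start sector, an end sector, and a type. The class name, the helper function, and the sector values are hypothetical and are not taken from Figure 4.2 or from any real partition system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionEntry:
    start: int   # first sector of the partition (address in the parent volume)
    end: int     # last sector of the partition, inclusive
    ptype: str   # descriptive type; nonessential to the layout

    @property
    def length(self) -> int:
        # Some partition systems store a length instead of an ending sector.
        return self.end - self.start + 1


# Hypothetical table with three partitions (illustrative values only).
table = [
    PartitionEntry(start=0, end=99, ptype="FAT"),
    PartitionEntry(start=100, end=299, ptype="NTFS"),
    PartitionEntry(start=300, end=599, ptype="Linux"),
]


def find_partition(entries, sector):
    """Return the index of the entry that contains `sector`, or None."""
    for index, entry in enumerate(entries):
        if entry.start <= sector <= entry.end:
            return index
    return None
```

Under these assumed values, find_partition(table, 150) selects the second entry, and a sector beyond 599 belongs to no partition.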

The purpose of a partition system is to organize the layout of a volume; therefore, the only essential data are the starting and ending location for each partition. A partition system cannot serve its purpose if those values are corrupt or non-existent. All other fields, such as a type and description, are nonessential and could be false.
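Because the starting and ending locations are the essential data, a simple sanity check of those values can be sketched as follows. This is a minimal example that assumes entries are given as (start, end) pairs with inclusive ends; the function name and the specific checks are illustrative and are not a procedure described in this chapter.

```python
def check_layout(entries, volume_sectors):
    """Return a list of problems found in (start, end) partition entries.

    `entries` holds inclusive sector ranges; `volume_sectors` is the number
    of sectors in the parent volume. Both are assumptions of this sketch.
    """
    problems = []
    for i, (start, end) in enumerate(entries):
        if start < 0 or end < start:
            problems.append(f"entry {i}: invalid range {start}-{end}")
        elif end >= volume_sectors:
            problems.append(f"entry {i}: extends past the end of the volume")

    # Compare neighbours in start order. If any two valid ranges overlap,
    # at least one neighbouring pair overlaps, though not every overlapping
    # pair is necessarily reported.
    valid = sorted((s, e, i) for i, (s, e) in enumerate(entries) if 0 <= s <= e)
    for (s1, e1, i1), (s2, e2, i2) in zip(valid, valid[1:]):
        if s2 <= e1:
            problems.append(f"entries {i1} and {i2} overlap")
    return problems
```

An empty result means only that the ranges are plausible; it says nothing about whether the nonessential fields, such as the type, are truthful.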

In most cases, the first and last sectors of a partition do not contain anything that identifies them as the border sectors. This is similar to how most property lines are not marked. A surveyor and documents are typically needed to identify the exact property lines, and the partition data structures are the equivalent of the survey documents. When the partition system structures are missing, the partition boundaries can sometimes be guessed using knowledge of what was stored inside the partition. This is analogous to guessing property boundaries based on the landscape.
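One way to picture guessing boundaries from the contents is to scan every sector for a value that is known to appear at a fixed position at the start of a file system. The sketch below assumes 512-byte sectors and uses a made-up four-byte signature; real file systems have their own signature values and offsets, which are not given here.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


def find_signature_sectors(image, signature, offset_in_sector=0):
    """Return the sector numbers whose bytes at `offset_in_sector` match."""
    hits = []
    for sector in range(len(image) // SECTOR_SIZE):
        pos = sector * SECTOR_SIZE + offset_in_sector
        if image[pos:pos + len(signature)] == signature:
            hits.append(sector)
    return hits


# Synthetic 8-sector image with a made-up signature at the start of
# sectors 2 and 5 (purely illustrative data).
MADE_UP_SIGNATURE = b"FSYS"
image = bytearray(8 * SECTOR_SIZE)
for sector in (2, 5):
    begin = sector * SECTOR_SIZE
    image[begin:begin + len(MADE_UP_SIGNATURE)] = MADE_UP_SIGNATURE

candidate_starts = find_signature_sectors(bytes(image), MADE_UP_SIGNATURE)
```

Each hit is only a candidate for the start of a partition and would need to be confirmed by examining the data that follows it.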

Note that a partition system depends on the operating system and not on the type of interface on the hard disk. Therefore, a Windows system uses the same partition system regardless of whether the disk uses an AT Attachment interface (ATA/IDE) or a Small Computer Systems Interface (SCSI).

Usage of Volumes in UNIX

UNIX systems typically do not use volumes the same way a Microsoft Windows system does. This section is intended for users who are not familiar with UNIX, and it provides a brief overview of how volumes are used in UNIX. A UNIX system administration book should be consulted for more details.

In UNIX, the user is not presented with several "drives", such as C: and D:. Instead, the user is presented with a series of directories that start at the root directory, or /. The subdirectories of / are either subdirectories in the same file system, or they are mounting points for new file systems and volumes. For example, a CD-ROM would be given the E: drive in Windows, but it may be mounted at /mnt/cdrom in Linux. This allows the user to change drives by changing directories, and in many cases the user is unaware that they have done so. Figure 4.3 shows how hard disk and CD volumes are accessed in Windows and UNIX.

Figure 4.3. Mount points of two volumes and a CD-ROM in (A) Microsoft Windows and (B) a typical UNIX system.
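The way a path leads to a volume in UNIX can be sketched as a lookup for the longest mount point that is a prefix of the path. The mount table, volume names, and function below are hypothetical, and the sketch assumes absolute paths that are already normalized; it is not how any particular kernel implements mounting.

```python
# Hypothetical mount table: mount point -> volume name.
mounts = {
    "/": "disk1-partition1",
    "/home": "disk1-partition2",
    "/mnt/cdrom": "cdrom",
}


def resolve_volume(path, mount_table):
    """Return the volume whose mount point is the longest prefix of `path`."""
    best = ""
    for mount_point in mount_table:
        # Compare whole path components so "/home" does not match "/homework".
        prefix = mount_point if mount_point.endswith("/") else mount_point + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > len(best):
            best = mount_point
    return mount_table[best] if best else None
```

With this table, a path under /home resolves to the second partition, while /etc/passwd falls through to the volume mounted at the root directory.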

To minimize the impact of drive corruption and to improve efficiency, UNIX typically partitions each disk into several volumes. A volume for the root directory (/) stores basic information, a separate volume may exist for the users' home directories (/home/), and applications may be located in their own volume (/usr/). All systems are unique and may have a completely different volume and mounting scheme. Some systems use only one large volume for the root directory and do not segment the system.

General Theory of Volume Assembly

Larger systems use volume assembly techniques to make multiple disks look like one. One motivation for this is to add redundancy in case a disk fails. If data are being written to more than one disk, a backup copy exists if one disk fails. Another motivation is to make it easier to add more storage space. Volume spanning works by combining the total storage space of multiple volumes so that one large volume is created. Additional disks can be added to the larger volume with no impact on the existing data. We will cover these techniques in Chapter 7, "Multiple Disk Volumes."
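Volume spanning can be pictured as simple concatenation: an address in the large volume is mapped to one member volume and a sector within it. The sketch below assumes that members are joined end to end in order, which is only the simplest possible layout; the function name and the sizes used are hypothetical.

```python
def locate_in_span(address, member_sizes):
    """Map an address in a spanned volume to (member index, sector in member).

    `member_sizes` lists the number of sectors in each member volume, in the
    order in which they are concatenated (an assumption of this sketch).
    """
    if address < 0:
        raise ValueError("address must not be negative")
    for index, size in enumerate(member_sizes):
        if address < size:
            return index, address
        address -= size
    raise ValueError("address is past the end of the spanned volume")
```

For two members of 100 and 200 sectors, address 150 in the spanned volume maps to sector 50 of the second member. Appending a third size to the list extends the address range without changing where any existing address maps, which mirrors how disks can be added with no impact on existing data.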

Let's look at a quick example. Figure 4.4 shows an example of two hard disk volumes with a total of three partitions. Partition 1 is assigned a volume name of C: and a hardware device processes partitions 2 and 3. The hardware device outputs one large volume, and that is organized into two partitions, which are given volume names. Note that in this case the hardware device does not provide increased reliability, only a larger volume.

Figure 4.4. A volume system that merges two partitions into one volume and partitions it.

Sector Addressing

In Chapter 2, we discussed how to address a sector. The most common method is to use its LBA address, which is a number that starts at 0 at the first sector of the disk. This address is the physical address of a sector.

A volume is a collection of sectors, and we need to assign an address to them. A logical volume address is the address of a sector relative to the start of its volume. Note that because a disk is a volume, the physical address is the same as a logical volume address for the disk volume. The starting and ending locations of partitions are typically described using logical volume addresses.

When we start to talk about the contents of a partition, there is another layer of logical volume addresses. These addresses are relative to the start of the partition and not the start of the disk or parent volume. We will differentiate these by preceding the word volume with "disk" or "partition." If a sector is not allocated to a partition, it will not have a logical partition volume address. Figure 4.5 shows an example where there are two partitions and unpartitioned space in between. The first partition starts in sector 0, so the logical partition volume addresses in it are the same as the logical disk volume addresses. The second partition starts in physical sector 864, so the logical disk volume addresses of its sectors are 864 larger than their logical partition volume addresses.

Figure 4.5. The logical partition volume address is relative to the start of the partition while the logical disk volume address is relative to the start of the disk.
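The relationship between the two kinds of addresses is a fixed offset, which the following Python sketch makes explicit. The starting sectors 0 and 864 come from the example above; the ending sectors and the function names are assumptions made only for illustration.

```python
def partition_to_disk(partition_start, partition_address):
    """Convert a logical partition volume address to a logical disk volume address."""
    return partition_start + partition_address


def disk_to_partition(partitions, disk_address):
    """Convert a logical disk volume address to (partition index, partition address).

    `partitions` is a list of (start, end) inclusive ranges in disk addresses.
    Returns None for a sector that is not allocated to any partition.
    """
    for index, (start, end) in enumerate(partitions):
        if start <= disk_address <= end:
            return index, disk_address - start
    return None


# Starts follow the example (0 and 864); the ends are assumed values.
partitions = [(0, 499), (864, 1363)]
```

Here disk_to_partition(partitions, 864) yields (1, 0), because sector 864 is the first sector of the second partition, and an address in the unpartitioned gap, such as 600, yields None.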