As we've seen, the basic Unix file storage unit is the disk partition. Filesystems are created on disk partitions, and all of the separate filesystems are combined into a single directory tree. The initial parts of this section discuss the process by which a physical disk becomes one or more filesystems on a Unix system, treating the topic at a conceptual level. Later subsections discuss the mechanics of adding a new disk to the various operating systems we are considering.
10.3.1 Defining Disk Partitions
Traditionally, the Unix operating system organizes disks into fixed-size partitions, whose sizes and locations are determined when the disk is first prepared (as we'll see). Unix treats disk partitions as logically independent devices, each of which is accessed as if it were a physically separate disk. For example, one physical disk may be divided into four partitions, each of which holds a separate filesystem. Alternatively, a physical disk may be configured to contain only one partition comprising its entire capacity.
Many Unix implementations allow several physical disks to be combined into a single logical device or partition upon which you can build a filesystem. Systems offering a logical volume manager carry this trend to its logical conclusion, allowing multiple physical disks to be combined into a single logical disk, which can then be divided into logical partitions. AIX uses only an LVM and does not use traditional partitions at all.
Physically, a disk consists of a vertical stack of equally spaced circular platters. Reading and writing is done by a stack of heads that move in and out along the radius as the platters spin around at high speed. The basic idea is not so different from an audio turntable (I hope you've seen one), although both sides of the platters can be accessed at once.
Partitions consist of subcylinders of the disk: specific ranges of distance from the spindle (the vertical center of the stack of platters): e.g., from one inch to two inches, to make up an arbitrary example. Thus, a disk partition uses the same sized and located circular section on all the platters in the disk drive. In this way, disks are divided vertically, through the platters, not horizontally.
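The capacity implied by a disk's geometry follows directly from this cylinder/head/sector model: total sectors = cylinders x heads x sectors per track. A minimal sketch with illustrative geometry values (the numbers are assumptions, not those of any particular drive):

```shell
# Total capacity = cylinders x heads x sectors-per-track x 512-byte sectors.
# Geometry values here are purely illustrative.
cylinders=522; heads=255; sectors_per_track=63
sectors=$((cylinders * heads * sectors_per_track))
echo "$sectors sectors"          # 8385930
echo "$((sectors / 2048)) MB"    # roughly 4 GB (2048 sectors per MB)
```

A partition, then, is simply a contiguous range of those cylinders.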
Partitions can be defined as part of adding a new disk. In some versions of Unix, default disk partitions are defined in advance by the operating system. These default definitions provide some amount of flexibility by defining more than one division scheme for the physical disk.
Figure 10-3 depicts a BSD-style partition scheme. Each drawing corresponds to a different disk layout: one way of dividing up the disk. The various cylinders graphically represent each partition's location on the disk. The solid black area at the center of each disk indicates the part of the disk that cannot be accessed, containing the bad block list and other disk data.
Figure 10-3. Sample disk partitioning scheme
Readers who prefer numeric to graphical representations can consider the numeric partitioning scheme in Table 10-4, which illustrates the same point.
Seven different partitions are defined for the disk, named by letters from a to g. Three drawings are needed to display all seven partitions because some of them are defined to occupy the same disk locations.
Traditionally, the c partition comprised the entire disk, including the forbidden area; this is why the c partition was never used under standard BSD. However, on most current systems using this sort of naming convention, you can use the c partition to build a filesystem that uses the entire disk. Check the documentation if you're unsure about the conventions on your system.
The other six defined partitions are a, b, and d through g. However, it is not possible to use them all at one time, because some of them include the same physical areas of the disk. Partitions d and e occupy the same space as partition g in the sample layout. Hence, a disk will use either partitions d and e, or partition g, but not both. Similarly, the a and b partitions use the same area of the disk as partition f, and partitions f and g use the same area as partition c.
This disk layout, then, offers three different ways of using the disk, divided into one, two, or four partitions, each of which may hold a filesystem or be used as a swap partition. Some disk partitioning schemes offer even more alternative layouts of the disk. Flexibility is designed in to meet the needs of different systems.
This flexibility also has the following consequence: nothing prevents you from using a disk drive inconsistently. For example, nothing prevents you from mounting /dev/disk2d and /dev/disk2g from the same disk. However, this will have catastrophic consequences, because these two partitions overlap. Best practice is to modify partitions in a standard layout that you will not be using so that they have zero length (or delete them).
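The conflict rule can be stated mechanically: two partitions clash whenever their sector ranges intersect. A minimal sketch, with invented start/size values standing in for the overlapping d and g partitions:

```shell
# Invented sector ranges: "d" covers [0,10000), "g" covers [5000,15000)
d_start=0;    d_size=10000
g_start=5000; g_size=10000
d_end=$((d_start + d_size)); g_end=$((g_start + g_size))
# Two half-open ranges intersect iff each one starts before the other ends
if [ "$d_start" -lt "$g_end" ] && [ "$g_start" -lt "$d_end" ]; then
    echo "overlap: never mount both"
fi
```

Any pair of partitions passing this test must never be in use at the same time.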
These days, the following partition naming conventions generally apply:
10.3.2 Adding Disks
In this section, we'll begin by examining the general process of adding a disk to a Unix system and then go on to consider the commands and procedures for the various operating systems. The following list outlines the steps needed to make a new disk accessible to users:
The processes used to handle these activities will be discussed in the sections that follow.
As usual, planning should precede implementation. Before performing any of these operations, the system administrator must decide how the disk will be used: which partitions will have filesystems created on them and what files (types of files) will be stored in them. The layout of your filesystems can influence your system's performance significantly. You should therefore take some care in planning the structure of your filesystem.
For best performance, heavily used filesystems should each have their own disk drive, and they should not share a disk with a swap partition. Preferably, heavily used filesystems should be located on drives attached to different controllers. This setup balances the load between disk drives and disk controllers. These issues are discussed in more detail in Section 15.5. Coming up with the optimal layout may require consulting with other people: the database administrator, software developers, and so on.
We now turn to the mechanics of adding a new disk. We'll begin by considering aspects of the process that are common to all systems. The subsequent subsections discuss adding a new SCSI disk to each of the various Unix versions we are considering.
10.3.2.1 Preparing and connecting the disk
There are two main types of disks in wide use today: IDE disks and SCSI disks. IDE disks are low-cost devices developed for the microcomputer market, and they are generally used on PC-based Unix systems. SCSI disks are generally used on (non-Intel) Unix workstations and servers from the major hardware vendors. IDE disks generally do not perform as well as SCSI disks (claims made by ATA-2 drive vendors notwithstanding).
IDE disks are easy to attach to the system, and the manufacturer's instructions are generally good. When you add a second disk drive to an IDE controller, you will usually need to perform some minor reconfiguration for both the existing and new disks. One disk must be designated as the master device and the other as the slave device; generally, the existing disk becomes the master and the new disk is the slave.
The master/slave setting for a disk is specified by means of a jumper on the disk drive itself, and it is almost always located on the same face of the disk as the bus and power connector sockets. Consult the documentation for the disk you are using to determine the jumper location and proper setting. Doing so on the new drive is easy because you can do it before you install the disk. Remember to check the existing drive's configuration as well, because single drives are often left unjumpered by the manufacturer. Note that the master/slave setting is not an operational definition; the two disks are treated equally by the operating system.
SCSI disks are in wide use in both PC-based systems and traditional Unix computers. When performance counts, use SCSI disks: high-end SCSI subsystems are many times faster than the best EIDE-based ones. They are also considerably more expensive.
SCSI disks may be internal or external. These disks are designated by a number ranging from 0 to 6 known as their SCSI ID (the SCSI ID 7 is used by the controller itself). Normal SCSI adapters thus support up to seven devices, each of which must be assigned a unique SCSI ID; wide SCSI controllers support up to 15 devices (ID 7 is still used for the controller). SCSI IDs are generally set via jumpers on internal devices and via a thumbwheel or push button counter on external devices. Keep in mind that when you change the ID setting of a SCSI disk, the device must generally be power-cycled before the change will take effect.
On rare occasions, the ID display setting on an external SCSI disk will not match what is actually being set. When this happens, the counter is either attached incorrectly (backwards) or faulty (the SCSI ID does not change even though the counter does). When you are initially configuring a device, check the controller's power-on message to determine whether all devices are being recognized and to determine the actual SCSI ID assignments being used. Once again, these problems are rare, but I have seen two examples of the former and one example of the latter in my career.
SCSI disks come in many varieties; the current offerings are summarized in Table 10-5. You should be aware of the distinction between normal and differential SCSI devices. In the latter type, there are two physical wires for each signal within the bus, and such devices use the voltage difference between the two wires as the signal value. This design reduces noise on the bus and allows for longer total cable lengths. Special cables and terminators are needed for such SCSI devices (as well as adapter support), and you cannot mix differential and normal devices. Differential signaling has used two forms over the years, high voltage differential (HVD) and low voltage differential (LVD); the two forms cannot be mixed. The most recent standards employ the latter exclusively.
Table 10-5 can also serve as a simple history of SCSI. It shows the progressively faster speeds these devices have been able to obtain. Speed-ups come from a combination of a faster bus speed and using more bits for the bus (the "wide" devices). The most recent SCSI standards are all 16 bits, and the term "wide" has been dropped from the name because there are no "narrow" devices from which they need to be distinguished.
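The speed-up arithmetic is simple: peak transfer rate is the bus clock times the bus width in bytes. A rough sketch using one illustrative generation (an Ultra2 Wide bus, which clocks at 40 MHz over a 16-bit bus):

```shell
# Peak rate (MB/s) = bus clock (MHz) x bus width (bytes).
# Ultra2 Wide: 40 MHz clock, 16-bit ("wide") bus = 2 bytes per transfer.
clock_mhz=40
width_bytes=2
echo "$((clock_mhz * width_bytes)) MB/s"   # 80 MB/s
```

Doubling either factor, as successive standards did, doubles the quoted rate.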
There are a variety of connectors that you will encounter on SCSI devices. These are the most common:
Figure 10-4 illustrates these connector types (shown in the external versions).
Figure 10-4. SCSI connectors
From left to right, Figure 10-4 shows a Centronics connector, two versions of the 50-pin mini-micro connector, and a DB-25 connector. 68-pin connectors look very similar to these 50-pin mini-micro connectors; they are simply wider. Figure 10-5 depicts the pin numbering schemes for these connectors.
Figure 10-5. SCSI connector pinouts
You can purchase cables that use any combination of these connectors and adapters to convert between them.
The various SCSI devices on a system are connected in a daisy chain (i.e., serially, in a single line). The first and last devices in the SCSI chain must be terminated for proper operation. For example, when the SCSI chain is entirely external, the final device will have a terminator attached and the SCSI adapter itself will usually provide termination for the beginning of the chain (check its documentation to determine whether this feature must be enabled or not). Similarly, when the chain is composed of both internal and external devices, the first device on the internal portion of the SCSI bus will have termination enabled (for example, via a jumper on an internal disk), and the final external device will again have a terminator attached.
Termination consists of regulating the voltages across the various lines comprising the SCSI bus. Terminators prevent the signal reflection that would occur on an open end. There are several different types of SCSI terminators:
A few SCSI devices have built-in terminators that you select or deselect via a switch. External boxes containing multiple SCSI disks also often include termination. Check the device characteristics for your devices to determine if such features are present.
Be aware that filesystems on SCSI disks are not guaranteed to survive a change of controller model (although they usually will); the standard does not specify that they must be interoperable. Thus, if you move a SCSI disk containing data from one system to another system with a different kind of SCSI controller, there's a chance you will not be able to access the existing data on the disk and will have to reformat it. Similarly, if you need to change the SCSI adapter in a computer, it is safest to replace it with another of the same model.
Having said this, I will note that I do move SCSI disks around fairly often, and I've only seen one failure of this kind. It's rare, but it does happen.
Once the disk is attached to the system, you are ready to configure it. The discussion that follows assumes that the new disk to be added is connected to the computer and is ready to accept partitions. These days, disks seldom if ever require low-level formatting, so we won't pay much attention to this process.
Before turning to the specific procedures for various operating systems, we'll look at the general issue of creating special files.
10.3.2.2 Making special files
Before filesystems can be created on a disk, thespecial files for the desired disk partitions must exist. Sometimes, they are already on the system when you go to look for them. On many systems, the boot process automatically creates the appropriate special files when it detects new hardware.
Otherwise, you'll have to create them yourself. Special files are created with the mknod command. mknod has the following syntax:
# mknod name c|b major minor
The first argument is the filename, and the second argument is the letter c or b, depending on whether you're making the character or block special file. The other two arguments are the major and minor device numbers for the device. These numbers serve to identify the proper device driver to the kernel. The major device number indicates the general device type (disk, serial line, etc.), and the minor device number indicates the specific member within that class.
These numbers are highly implementation-specific. To determine the numbers you need, use the ls -l command on some existing special files for disk partitions; the major and minor device numbers will appear in the size field. For example:
$ cd /dev/dsk; ls -l c1d*                                Major, minor device numbers.
brw-------   1 root   root    0, 144 Mar 13 19:14 c1d1s0
brw-------   1 root   root    0, 145 Mar 13 19:14 c1d1s1
brw-------   1 root   root    0, 146 Mar 13 19:14 c1d1s2
...
brw-------   1 root   root    0, 150 Mar 13 19:14 c1d1s6
brw-------   1 root   root    0, 151 Mar 13 19:14 c1d1s7
brw-------   1 root   root    0, 160 Mar 13 19:14 c1d2s0
brw-------   1 root   root    0, 161 Mar 13 19:14 c1d2s1
...
$ cd /dev/rdsk; ls -l c1d1*
crw-------   1 root   root    3, 144 Mar 13 19:14 c1d1s0
crw-------   1 root   root    3, 145 Mar 13 19:14 c1d1s1
...
In this example, the numbering pattern is pretty clear: block special files for disks on controller 1 have major device number 0; the corresponding character special files have major device number 3. The minor device number of the same partition on successive disks differs by 16. So if you want to make the special files for partition 2 on disk 3, you take the minor number of partition 2 on disk 2 (162), add 16 to get 178, and use the following mknod commands:
# mknod /dev/dsk/c1d3s2 b 0 178
# mknod /dev/rdsk/c1d3s2 c 3 178
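The minor-number arithmetic for this hypothetical numbering scheme can be written out explicitly (the base value 144 and the stride of 16 come from the listing above; other systems will differ):

```shell
# Scheme from the listing above: disk 1, partition 0 has minor number 144,
# partitions count up by 1, and each successive disk adds 16.
disk=3; partition=2
minor=$((144 + 16 * (disk - 1) + partition))
echo "$minor"   # 178, the value used in the mknod commands
```

Always verify the pattern with ls -l on your own system before creating files this way.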
Except on Linux and FreeBSD systems, be sure to make both the block and character special files.
On many systems, the /dev directory includes a shell script named MAKEDEV which automates running mknod. It takes the base name of the new device as an argument and creates the character and block special files defined for it. For example, the following command creates the special files for a SCSI disk under Linux:
# cd /dev
# ./MAKEDEV sdb
The command creates the special files /dev/sdb0 through /dev/sdb16.
10.3.2.3 Adding a disk under FreeBSD
The first step is to attach the disk to the system and then reboot. FreeBSD should detect the new disk. You can check the boot messages or the output of the dmesg command to ensure that it has:
da1 at adv0 bus 0 target 2 lun 0
da1: <SEAGATE ST15150N 0017> Fixed Direct Access SCSI-2 device
da1: 10.000MB/s transfers (10.000MHz, offset 15), Tagged Queueing Enabled
da1: 4095MB (8388315 512 byte sectors: 255H 63S/T 522C)
FreeBSD disk partitioning is a bit more complex than for the other operating systems we are considering. It is a two-part process. First, the disk is divided into physical partitions, which BSD calls slices. One or more of these is assigned to FreeBSD. The FreeBSD slice is then itself subdivided into partitions. The latter are where filesystems actually get built.
The fdisk utility is used to divide a disk into slices. Here we create a single slice comprising the entire disk:
# fdisk -i /dev/da1
******* Working on device /dev/da1 *******
...
Information from DOS bootblock is:
The data for partition 1 is:
<UNUSED>
Do you want to change it? [n] y
Supply a decimal value for "sysid (165=FreeBSD)"  165
Supply a decimal value for "start"
Supply a decimal value for "size"  19152
Explicitly specify beg/end address ? [n] n
sysid 165,(FreeBSD/NetBSD/386BSD)
    start 0, size 19152 (9 Meg), flag 0
        beg: cyl 0/ head 0/ sector 1;
        end: cyl 18/ head 15/ sector 63
Are we happy with this entry? [n] y
The data for partition 2 is:
<UNUSED>
Do you want to change it? [n] n
...
Do you want to change the active partition? [n] n
Should we write new partition table? [n] y
The disklabel command creates FreeBSD partitions within the FreeBSD slice:
# disklabel -r -w da1 auto
The auto parameter says to create a default layout for the slice. You can preview what disklabel will do by adding the -n option.
Once you have created a default label (division), you can edit it by running disklabel -e. This command starts an editor session from which you can modify the partitioning (using the editor specified in the EDITOR environment variable).
disklabel is a very cranky utility, and often fails with the message:
disklabel: No space left on device
The message is completely spurious. This happens more often with larger disks than with smaller ones. If you encounter this problem, try running sysinstall and selecting Configure and then Label from its menus. This form of the utility can usually be coaxed to work, but even it will not accept all valid partition sizes. Caveat emptor.
Once you have made partitions, you create filesystems using the newfs command, as in this example:
# newfs /dev/da1a
/dev/da1a: 19152 sectors in 5 cylinders of 1 tracks, 4096 sectors
        9.4MB in 1 cyl groups (106 c/g, 212.00MB/g, 1280 i/g)
super-block backups (for fsck -b #) at:
 32
The following options can be used to customize the newfs operation:
The tunefs command can be used to modify the values of -m and -o for an existing filesystem (using the same option letters). Similarly, -n can be used to enable/disable soft updates for an existing filesystem (it takes enable or disable as its argument).
Finally, we run fsck on the new filesystem:
# fsck /dev/da1a
** /dev/da1a
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1 files, 1 used, 4682 free (18 frags, 583 blocks, 0.4% fragmentation)
In this instance, fsck finishes very quickly.
The growfs command can be used to increase the size of an existing filesystem, as in this example:
# growfs /dev/da1a
By default, the filesystem is increased to the size of the underlying partition. You can specify a specific new size with the -s option if you want to.
10.3.2.4 Adding a disk under Linux
After attaching the disk to the system, it should be detected when the system is booted. You can use the dmesg command to display boot messages. Here are some sample messages from a very old, but still working, Intel-based Linux system:
scsi0 : at 0x0388 irq 10 options CAN_QUEUE=32 CMD_PER_LUN=2 ...
scsi0 : Pro Audio Spectrum-16 SCSI
scsi : 1 host.
Detected scsi disk sda at scsi0, id 2, lun 0
scsi : detected 1 SCSI disk total.
The messages indicate that this disk is designated as sda.
If necessary, create the device special files for the disk (needed only when you have many, many disks). For example, these commands create the special files used to access the sixteenth SCSI disk:
# cd /dev; ./MAKEDEV sdp
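The device name follows from simple letter arithmetic: SCSI disks are named sda, sdb, and so on, so the Nth disk gets the Nth letter of the alphabet. A quick sketch of why disk 16 is sdp:

```shell
# The Nth SCSI disk gets the Nth letter: ASCII 'a' is 97, so disk N
# maps to the character with code 96 + N.
n=16
letter=$(printf "\\$(printf '%03o' $((96 + n)))")   # 96 + 16 = 112 = 'p'
echo "sd$letter"   # sdp
```

The same pattern continues past 26 disks with two-letter suffixes (sdaa, sdab, ...) on modern systems.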
Assuming we have our special files all in order, we will use fdisk or cfdisk (a screen-oriented version) to divide the disk into partitions (we'll be creating two partitions). The following commands will start these utilities:
# fdisk /dev/sda
# cfdisk /dev/sda
The available subcommands for these utilities are listed in Table 10-6.
cfdisk is often more convenient to use because the partition table is displayed continuously, and we'll use it here. cfdisk subcommands always operate on the current (highlighted) partition. Thus, in order to create a new partition, move the highlight to the line corresponding to Free Space and press n.
You first need to select either a primary or a logical (extended) partition. PC disk partitions are of two types: primary and extended. A disk may contain up to four partitions. Both partition types are a physical subset of the total disk. Extended partitions may be further subdivided into units known as logical partitions (or drives) and thereby provide a means for dividing a physical disk into more than four pieces.
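The four-partition limit reflects the layout of the PC master boot record: a 512-byte sector holding boot code, four 16-byte partition entries starting at byte offset 446, and a two-byte signature at offset 510 (these offsets are the standard MBR layout, not something specific to cfdisk). A quick check of the arithmetic:

```shell
# MBR layout: partition table of four 16-byte entries at offset 446,
# followed immediately by the 2-byte 0x55AA signature.
entries=4; entry_size=16; table_offset=446
sig_offset=$((table_offset + entries * entry_size))
echo "$sig_offset"   # 510: four entries fill the space exactly
```

There is simply no room in the sector for a fifth entry, which is why extended/logical partitions exist.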
Next, cfdisk prompts for the partition information:
Primary or logical [pl]: p
Size (in MB): 110
If you'd rather enter the size in a different set of units, use the u subcommand (units cycle among MB, sectors, and cylinders). Once these prompts are answered, you will be asked if you want the partition placed at the beginning or the end of the free space (if there is a choice).
Use the same procedure to create a second partition, and then activate the first partition with the b subcommand. Then, use the t subcommand to change the partition types of the two partitions. The most commonly needed type codes are 6 for Windows FAT16, 82 for a Linux swap partition, and 83 for a regular Linux partition.
Here is the final partition table (output has been simplified):
cfdisk 2.11i

Disk Drive: /dev/hde
Size: 3228696576 bytes
Heads: 128   Sectors per Track: 63   Cylinders: 782

 Name       Flags     Part Type   FS Type      Size (MB)
 --------------------------------------------------------------
 /dev/sda1  Boot      Primary     Linux        110.0
 /dev/sda2            Primary     Linux         52.5
                      Pri/Log     Free Space     0.5
(Yes, those sizes are small; I told you it was an old system.)
At this point, I reboot the system. In general, when I've changed the partition layout of the disk (in other words, done anything other than change the types assigned to the various partitions), I always reboot PC-based systems. Friends and colleagues accuse me of being mired in an obsolete Windows superstition by doing so and argue that this is not really necessary. However, many Linux utility writers (see fdisk) and filesystem designers (see mkreiserfs) agree with me.
Next, use the mkfs command to create a filesystem on the Linux partition. mkfs has been streamlined in the Linux version and requires little input:
# mkfs -t ext3 -j /dev/sda1
This command creates a journaled ext3 filesystem, the current default filesystem type for many Linux distributions. The ext3 filesystem is a journaled version of the ext2 filesystem, which was used on Linux systems for several years and is still in wide use. In fact, ext3 filesystems are backward-compatible and can be mounted in ext2 mode.
If you want to customize mkfs's operation, the following options can be used:
Once the filesystem is built, run fsck:
# fsck -f -y /dev/sda1
The -f option is necessary to force fsck to run even though the filesystem is clean. The new filesystem may now be mounted and entered into /etc/fstab.
The tune2fs command may be used to list and alter fields within the superblock of an ext2 filesystem. Here is an example of its display output (shortened):
# tune2fs -l /dev/sdb1
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      filetype sparse_super
Filesystem state:         not clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              253952
Block count:              507016
Reserved block count:     25350
Free blocks:              30043
Free inodes:              89915
First block:              0
Block size:               4096
Last mount time:          Thu Apr  4 11:28:19 2002
Last write time:          Wed May 22 10:00:36 2002
Mount count:              1
Maximum mount count:      20
Last checked:             Thu Apr  4 11:28:01 2002
Check interval:           15552000 (6 months)
Next check after:         Tue Oct  1 12:28:01 2002
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
The check-related items in the list (Mount count through Next check after) indicate when fsck will check the filesystem even if it is clean. The Linux version of fsck for ext3 filesystems checks the filesystem if either the maximum number of mounts without a check has been exceeded or the maximum time interval between checks has expired (20 mounts and 6 months in the preceding output; the check interval is given in seconds).
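That decision can be sketched as a simple either/or test, using the counters from the tune2fs -l output above (the numeric timestamp for the last check is an assumed value for illustration):

```shell
# fsck forces a full check of a clean filesystem if EITHER limit is hit.
mount_count=1; max_mount_count=20
check_interval=15552000              # 6 months, in seconds
last_checked=1017912481              # assumed Unix time of the last check
now=$((last_checked + 86400))        # pretend it is one day later
if [ "$mount_count" -gt "$max_mount_count" ] ||
   [ "$now" -gt $((last_checked + check_interval)) ]; then
    decision="full check"
else
    decision="skip"
fi
echo "$decision"   # skip: neither limit has been exceeded
```

With the counters shown, a check would not be forced until either the 21st mount or six months after the last check, whichever comes first.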
tune2fs's -i option may be used to specify the maximum time interval between checks in days, and the -c option may be used to specify the maximum number of mounts between checks. For example, the following command disables the time-between-checks function and sets the maximum number of mounts to 25:
# tune2fs -i 0 -c 25 /dev/sdb1
Setting maximal mount count to 25
Setting interval between check 0 seconds
Another useful option to tune2fs is -m, which allows you to change the percentage of filesystem space held in reserve. The -u and -g options allow you to specify the user and group ID (respectively) allowed to access the reserved space.
You can convert an ext2 filesystem to ext3 with a command like this one:
# tune2fs -j /dev/sdb2
Existing ext2 and ext3 filesystems can be resized using the resize2fs command, which takes the filesystem and new size (in 512-byte blocks) as parameters. For example, the following commands will change the size of the specified filesystem to 200,000 blocks:
# umount /dev/sdc1
# e2fsck -f /dev/sdc1
e2fsck 1.23, 15-Aug-2001 for EXT2 FS 0.5b, 95/08/09
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/1: 11/247296 files (0.0% non-contiguous), 15979/493998 blocks
# resize2fs -p /dev/sdc1 200000
resize2fs 1.23 (15-Aug-2001)
Begin pass 1 (max = 1)
Extending the inode table     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 10)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/sdc1 is now 200000 blocks long.
The -p option says to display a progress bar as the operation runs. Naturally, the size of the underlying disk partition or logical volume (discussed later in this chapter) will need to be increased beforehand.
Increasing the size of a filesystem is always safe. If you want the new size to be the same as the size of the underlying disk partition (as is virtually always the case), you can omit the size parameter from the resize2fs command. To decrease the size of a filesystem, perform the resize2fs operation first, and then use fdisk or cfdisk to decrease the size of the underlying partition. Note that data loss is always possible, even likely, when decreasing the size of a filesystem, because no effort is made to migrate data within the filesystem prior to shortening it.
10.3.2.4.1 The Reiser filesystem
Some Linux distributions also offer the Reiser filesystem, designed by Hans Reiser (see http://www.reiserfs.org). The commands to create a Reiser filesystem are very similar:
# mkreiserfs /dev/sdb3

<-------------mkreiserfs, 2001------------->
reiserfsprogs 3.x.0k-pre9

mkreiserfs: Guessing about desired format..
mkreiserfs: Kernel 2.4.10-4GB is running.
13107k will be used
Block 16 (0x2142) contains super block of format 3.5 with standard journal
Block count: 76860
Bitmap number: 3
Blocksize: 4096
Free blocks: 68646
Root block: 8211
Tree height: 2
Hash function used to sort names: "r5"
Objectid map size 2, max 1004
Journal parameters:
        Device [0x0]
        Magic [0x18bbe6ba]
        Size 8193 (including journal header) (first block 18)
        Max transaction length 1024
        Max batch size 900
        Max commit age 30
Space reserved by journal: 0
Correctness checked after mount 1
Fsck field 0x0
ATTENTION: YOU SHOULD REBOOT AFTER FDISK!
ALL DATA WILL BE LOST ON '/dev/hdf2'!
Continue (y/n):y
Initializing journal - 0%....20%....40%....60%....80%....100%
Syncing..ok

ReiserFS core development sponsored by SuSE Labs (suse.com)
Journaling sponsored by MP3.com.
To learn about the programmers and ReiserFS, please go to http://namesys.com
Have fun.

# reiserfsck -x /dev/sdb3

<-------------reiserfsck, 2001------------->
reiserfsprogs 3.x.0k-pre9

Will read-only check consistency of the filesystem on /dev/hdf2
Will fix what can be fixed w/o --rebuild-tree
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes):Yes
13107k will be used
###########
reiserfsck --check started at Wed May 22 11:36:07 2002
###########
Replaying journal..
No transactions found
Checking S+tree..ok
Comparing bitmaps..ok
Checking Semantic tree...ok
No corruptions found
There are on the filesystem:
        Leaves 1
        Internal nodes 0
        Directories 1
        Other files 0
        Data block pointers 0 (zero of them 0)
###########
reiserfsck finished at Wed May 22 11:36:19 2002
###########
Reiser filesystems may be resized with the resize_reiserfs -s command. They can also be resized when they are mounted. The latter operation uses a command like the following:
# mount -o remount,resize=200000 /dev/sdc1
This command changes the size of the specified filesystem to 200,000 blocks. Once again, increasing the size of a filesystem is always safe, while decreasing it requires great care to avoid data loss.
10.3.2.5 Adding a disk under Solaris
In this section, we add a SCSI disk (SCSI ID 2) to a Solaris system.
After attaching the device, boot the system with boot -r, which tells the operating system to look for new devices and create the associated special files and links into the /devices tree. The new disk should be detected when the system is booted (output simplified):
sd2 at esp0: target 2 lun 0
corrupt label - wrong magic number
Vendor 'QUANTUM', product 'CTS160S', 333936 512 byte blocks
The warning message about a corrupt label comes because no valid Sun label (a vendor-specific disk header block that Sun uses) has been written to the disk yet. If you miss the messages during the boot, use the dmesg command.
We now label the disk and then create partitions on it (which Solaris sometimes calls slices). Solaris uses the format utility for these tasks. Previously, it was often necessary to tell format about the characteristics of your disk. These days, however, the utility knows about most kinds of disks, which makes adding a new disk much simpler.
Here is the command used to start format and write a generic label to the disk (if it is unlabeled):
# format /dev/rdsk/c0t2d0s2             Partition 2 = the entire disk.
selecting /dev/rdsk/c0t2d0s2
[disk formatted, no defect list found]
FORMAT MENU:
...                                     Menu is printed here.
format> label                           Write generic disk label.
Ready to label disk, continue? y
Once the disk label is written, we can set up partitions. We'll be dividing this disk into two partitions. We use the partition subcommand to define them:
format> partition
PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        ...
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        quit
partition> 0                         Redefine partition 0.
Enter partition id tag[unassigned]: root          Specifies partition use.
Enter partition permission flags[wm]: wm          Read-write, mountable.
Enter new starting cyl: 0
Enter partition size[0b, 0c, 0e, 0.00mb, 0.00gb]: 5.00gb
...
partition> 1
Enter partition id tag[unassigned]:
Enter partition permission flags[wm]: wm
Enter new starting cyl: 10403
Enter partition size[0b, 0c, 0e, 0.00mb, 0.00gb]: 7257c
...
partition> print                     Print partition table.
Current partition table (unnamed):
Total disk cylinders available: 17660 + 2 (reserved cylinders)
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 - 10402        5.00GB    (10403/0/0) 10486224
  1 unassigned    wm   10403 - 17659        3.49GB    (7257/0/0)   7315056
  2 unassigned    wm       0                0         (0/0/0)            0
...
  7 unassigned    wm       0                0         (0/0/0)            0
We define two partitions here, 0 and 1. In the first case, we specify a starting cylinder number of 0 and the partition size in GB. In the second case, we specify a starting cylinder and the length in cylinders. We took a look at the partition table between issuing these two commands to find these numbers.
The partition ID tag is a label specifying the intended use of the partition. Partition 0 will be used for the root filesystem and is labeled accordingly.
The permission flags are usually one of wm (read-write and mountable) and wu (read-write and not mountable). The latter is used for swap partitions.
Once the partitions are defined, we write a label to the disk using the label subcommand:
partition> label
Ready to label disk, continue? y
partition> quit
format> quit
The partition submenu also has a name subcommand, which allows a custom partition table to be named and saved; it can be applied to a new disk with the select subcommand on the same menu.
Now, we create filesystems on the new disk partitions with the newfs command:
# newfs /dev/rdsk/c0t2d0s0
newfs: construct a new file system /dev/rdsk/c0t2d0s0: (y/n)? y
/dev/rdsk/c0t2d0s0: 10486224 sectors in 10403 cylinders of 16 tracks, 63 sectors
        5120.2MB in 119 cyl groups (88 c/g, 43.31MB/g, 5504 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 88800, 177568, 266336, 355104, 443872, 532640, 621408, 710176, ...
The prudent course of action is to print out this list and store it somewhere for safekeeping, in case both the primary superblock and the backup at address 32 get corrupted.
Finally, we run fsck on the new filesystem:
# fsck -y /dev/rdsk/c0t2d0s0
** /dev/rdsk/c0t2d0s0
** Last Mounted on
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
2 files, 9 used, 5159309 free (13 frags, 644912 blocks, 0.0% fragmentation)
This process is repeated for the other disk partition.
You can customize the parameters for the new filesystem using these options to newfs:
The -N option to newfs may be used to have the command display all of the parameters it would pass to mkfs (the utility that does the actual work) without actually building the filesystem.
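For example, this command previews the parameters that would be used for the partition created above, without modifying the disk:

# newfs -N /dev/rdsk/c0t2d0s0

This is a safe way to check block size, cylinder group, and superblock backup settings before committing to them.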
10.3.2.6 AIX, HP-UX, and Tru64
These operating systems use a logical volume manager (LVM) by default. Adding disks to these systems is considered during the LVM discussion later in this chapter.
10.3.2.7 Remaking an existing filesystem
Occasionally, it may be necessary to reconfigure a disk. For example, you might want to select another layout, using a different set of partitions. You might want to change the value of a filesystem parameter, such as its block size. Or you might want to add an additional swap partition or get rid of an unneeded one. Sometimes, these operations require that you recreate the existing filesystems.
Recreating a filesystem will destroy all the existing data in the filesystem, so it is essential to perform a full backup first (and to verify that the tapes are readable; see Chapter 11). For example, the following commands may be used to reconfigure a filesystem with a 4K block size under Linux:
# umount /chem                       Dismount filesystem.
# dump 0 /dev/sda1                   Backup.
# restore -t                         Check tape is OK!
# mke2fs -b 4096 -j /dev/sda1        Remake filesystem.
# mount /chem                        Mount new filesystem.
# cd /chem; restore -r               Restore files.
A very cautious administrator would make two copies of the backup tape.
10.3.3 Logical Volume Managers
This section looks at logical volume managers (LVMs). The LVM is the only disk management facility under AIX, and the corresponding facilities are also used by default under HP-UX and Tru64. Linux and Solaris 9 also offer LVM facilities. As usual, we'll begin this section with a conceptual overview of logical volume managers and then move on to the specifics for the various operating systems.
When dealing with an LVM, you will do well to forget everything you know about disk partitions under Unix. Not only is a completely different vocabulary employed, but some Unix terms like partition also are used with completely different meanings. However, once you get past the initial obstacles, the LVM point of view is very clear and sensible, and it is superior to the standard Unix approach to handling disks. A willing suspension of disbelief will come in very handy at first.
In general, an LVM brings the following benefits:
10.3.3.1 Disks, volume groups, and logical volumes
To begin at the beginning, there are disks: real, material, solid objects that hurt your toe if they fall on it. However, such disks must be initialized and made into physical volumes before they may be used by the LVM. When they are made part of a volume group (defined in a moment), these disks are divided into allocable units of space known as physical partitions (AIX) or physical extents (HP-UX and Tru64). The default size for these units is generally 4 MB. Note that these partitions/extents are units of disk storage only; they have nothing to do with traditional Unix disk partitions.
A volume group is a named collection of disks. Volume groups can also include collections of disks accessed as a single hardware unit (e.g., a RAID array). Volume groups allow filesystems to span physical disks (although it is not required that they do so). Paradoxically, the volume group is the LVM equivalent of the Unix physical disk: that entity which can be split into subunits called logical volumes, each of which holds a single filesystem. Unlike Unix disk partitions, volume groups are infinitely flexible in how they may be divided into filesystems.
HP-UX allows volume groups to be subdivided into sets of disks called physical volume groups (PVGs). These groups of disks are accessed through separate controllers and/or buses, and the facility is designed to support high-availability systems by reducing the number of potential single points of hardware failure.
Logical volumes are the entities on which filesystems reside; they may also be used as swap devices, as dump devices, for storing boot programs, and by application programs in raw mode (analogously to a raw-mode disk partition). They consist of some number of fixed physical partitions (disk chunks) generally located arbitrarily within a volume group (although some implementations optionally allow specific physical volumes to be requested when a logical volume is created or extended). Hence, logical volumes may be any size that is a multiple of the physical partition size for their volume group. They may be easily increased in size after creation while the operating system is running. Logical volumes may also be shrunk (although not without consequences to any filesystem they may contain).
Logical volumes are composed of logical partitions (AIX) or logical extents (HP-UX). Many times, physical and logical partitions are identical (or at least map one-to-one). However, logical volumes have the capability of storing redundant copies of all data, if desired; from one to two additional copies of each data block may be stored. When only a single copy of the data is stored, one logical partition corresponds to one physical partition. If two copies are stored, one logical partition corresponds to two physical partitions: one original and one mirror. Similarly, in a doubly mirrored logical volume, each logical partition corresponds to three physical partitions.
The main LVM data storage entities are illustrated in Figure 10-6 (representing an AIX system). The figure shows how three physical disks are combined into a single volume group (named chemvg). The separate disks composing it are suggested via shading.
Figure 10-6. Logical volume managers illustrated
Three user logical volumes are then defined from chemvg. Two of them, chome and cdata, store a single copy of their data using physical partitions from three separate disks. cdata is a striped logical volume, writing data to all three disks in parallel; it uses identically sized sections from each physical disk. chome illustrates the way that a filesystem can be spread across multiple physical disks, even noncontiguously in the case of hdisk3.
The other logical volume, qsar, is a mirrored logical volume. It contains an equal number of physical partitions from all three disks; it stores three copies of its data (each on a separate disk), and one physical partition per disk is used for each of its logical partitions.
Once a logical volume exists, you can build a filesystem on it and mount it normally. At any point in its lifetime, a filesystem's size may be increased as long as there are free physical partitions within its volume group. There need not initially be any free logical partitions within its logical volume. Generally, both the logical volume and filesystem are resized using a single command.
Some operating systems can also reduce the size of an existing logical volume. If this operation is performed on a mounted filesystem, and the new size of the logical volume is still at least a little larger than the existing filesystem, it can be accomplished without losing any data. Under any other conditions, data loss is very, very likely indeed. This technique is not for the fainthearted.
Currently, there is no easy way to decrease the size of a filesystem under AIX or FreeBSD, even if there is unused space within the filesystem. If you want to make a filesystem smaller, you need to back up the current files (and verify that the tape is readable!), delete the filesystem and its logical volume, create a new, smaller logical volume and filesystem, and then restore the files. The freed logical partitions can then be allocated as desired within their volume group; they can be added to an existing logical volume, used to make a new logical volume and filesystem, used in a new or existing paging space, or held in reserve.
Table 10-7 lists the LVM-related terminology used by the various Unix operating systems.
10.3.3.2 Disk striping
Disk striping is an option that is increasingly available as an extension to Unix, especially on high-performance systems. Striping combines one or more physical disks (or disk partitions) into a single logical disk, viewed like any other filesystem device by the rest of Unix. Disk striping is used to increase I/O performance at least as often as it is used to create very large filesystems spanning more than one physical disk. Striped disks split I/O operations across the physical disks in the stripe, performing them in parallel, and are thus able to achieve significant performance improvements over a single disk (although not always the nearly linear speedups that are sometimes claimed). Striping is especially effective for single-process transfer rates to a very large file and for processes performing a large number of I/O operations. Disk striping performance is discussed in detail in Section 15.5.
Special-purpose striped-disk devices are available from many vendors. In addition, many Unix systems offer software disk-striping. They provide utilities for configuring physical disks into a striped device, and the striping itself is done by the operating system, at the cost of some additional overhead.
The following general considerations apply to software striped-disk configurations:
Software disk-striping is generally accomplished via the LVM or similar facility.
10.3.3.3 Disk mirroring and RAID
Another approach to combining multiple disks into a single logical device is RAID (or Redundant Array of Inexpensive Disks). In general, RAID devices are designed for increased data integrity and availability (via redundant copies), not for improved performance (RAID 0 is an exception).
There are at least six defined RAID levels that differ in how the multiple disks within the unit are organized. Most available hardware RAID devices support some combination of the following levels (level 2 is not used in practice). Table 10-8 summarizes the available RAID levels.
Figure 10-7 illustrates RAID 5 in action, using five disks.
Figure 10-7. The RAID 5 data distribution scheme
There are also some hybrid RAID levels:
Both these levels use a minimum of four disks.
Most hardware RAID devices connect to standard SCSI or SCSI-2 controllers. Many systems also offer software RAID facilities within their LVM (as we shall see).
The following considerations apply to all software RAID implementations:
AIX defines the root volume group, rootvg, automatically when the operating system is installed. Here is a typical setup:
# lsvg rootvg                        Display volume group attributes.
VOLUME GROUP:   rootvg               VG IDENTIFIER:  0000018900004c0...
VG STATE:       active               PP SIZE:        32 megabyte(s)
VG PERMISSION:  read/write           TOTAL PPs:      542 (17344 MB)
MAX LVs:        256                  FREE PPs:       69 (2208 MB)
LVs:            11                   USED PPs:       473 (15136 MB)
OPEN LVs:       10                   QUORUM:         2
TOTAL PVs:      1                    VG DESCRIPTORS: 2
STALE PVs:      0                    STALE PPs:      0
ACTIVE PVs:     1                    AUTO ON:        yes
MAX PPs per PV: 1016                 MAX PVs:        32
LTG size:       128 kilobyte(s)      AUTO SYNC:      no
HOT SPARE:      no
# lsvg -l rootvg                     List logical volumes in a volume group.
rootvg:
LV NAME    TYPE     LPs  PPs  PVs  LV STATE      MOUNT POINT
hd5        boot     1    1    1    closed/syncd  N/A
hd6        paging   16   16   1    open/syncd    N/A
hd8        jfs2log  1    1    1    open/syncd    N/A
hd4        jfs2     1    1    1    open/syncd    /
hd2        jfs2     49   49   1    open/syncd    /usr
hd9var     jfs2     3    3    1    open/syncd    /var
hd3        jfs2     1    1    1    open/syncd    /tmp
hd1        jfs2     1    1    1    open/syncd    /home
hd10opt    jfs2     1    1    1    open/syncd    /opt
lg_dumplv  sysdump  32   32   1    open/syncd    N/A
Adding a new disk under AIX follows the same basic steps as for other Unix systems, although the commands used to perform them are quite different. Once you've attached the device to the system, reboot it. Usually, AIX will discover new devices at boot time and automatically create special files for them. Defined disks have special filenames like /dev/hdisk1. The cfgmgr command may be used to search for new devices between boots; it has no arguments.
The lsdev command will list the disks present on the system:
$ lsdev -C -c disk
hdisk0 Available 00-00-0S-0,0 1.0 GB SCSI Disk Drive
hdisk1 Available 00-00-0S-2,0 Other SCSI Disk Drive
...
The new disk must then be made part of a volume group. To create a new volume group, use the mkvg command:
# mkvg -y "chemvg" hdisk5 hdisk6
This command creates a volume group named chemvg consisting of the disks hdisk5 and hdisk6. mkvg's -s option can be used to specify the physical partition size in MB: from 1 to 1024 (4 is the default). The value must be a power of 2.
After a volume group is created, it must be activated with the varyonvg command:
# varyonvg chemvg
Thereafter, the volume group will be activated automatically at each boot time. Volume groups are deactivated with varyoffvg; all of their filesystems must be dismounted first.
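For example, to take the chemvg volume group offline (the mount point /chem1 is hypothetical; repeat the umount for every filesystem in the group):

# umount /chem1                      Dismount each of the group's filesystems.
# varyoffvg chemvg                   Deactivate the volume group.

A subsequent varyonvg chemvg brings the group back online.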
A new disk may be added to an existing volume group with the extendvg command. For example, the following command adds the disk hdisk4 to the volume group named chemvg:
# extendvg chemvg hdisk4
The following other commands operate on volume groups:
Logical volumes are created with the mklv command, which has the following basic syntax:
mklv -y "lvname" volgrp n [disks]
lvname is the name of the logical volume, volgrp is the volume group name, and n is the number of logical partitions. For example, the command:
# mklv -y "chome" chemvg 64
makes a logical volume in the chemvg volume group consisting of 64 logical partitions (256 MB) named chome. The special files /dev/chome and /dev/rchome will automatically be created by mklv.
The mklv command has many other options, which allow the administrator as much control over how the logical volume maps to physical disks as desired, down to the specific physical partition level. However, the default settings work very well for most applications.
The following commands operate on logical volumes:
A small logical volume in each volume group is used for logging and other disk management purposes. Such logical volumes are created automatically by AIX and have names like lvlog00.
Once the logical volumes have been created, you can build filesystems on them. AIX has a version of mkfs, but crfs is a much more useful command for creating filesystems. There are two ways to create a filesystem:
The second way is faster, but the logical volume name AIX chooses is quite generic (lv00 for the first one so created, and so on), and the size must be specified in 512-byte blocks rather than in logical partitions (which default to 4 MB units).
The crfs command is used to create a filesystem. The following basic form may be used to create a filesystem:
crfs -v jfs2 -g vgname -a size=n -m mt-pt -A yesno -p prm
The options have the following meanings:
For example, the following command creates a new filesystem in the chemvg volume group:
# crfs -v jfs2 -g chemvg -a size=50000 -a frag=1024 -m /organic2 -A yes # mount /organic2
The new filesystem will be mounted at /organic2 (automatically at boot time), is 25 MB in size, and uses a fragment size of 1024 bytes. A new logical volume will be created automatically, and the filesystem will be entered into /etc/filesystems. The initial mount must be done by hand.
The -d option is used to create a filesystem on an existing logical volume:
# crfs -v jfs2 -d chome -m /inorganic2 -A yes
This command creates a filesystem on the logical volume we created earlier. The size and volume group options are not needed in this case.
The chfs command may be used to increase the size of a filesystem. For example, the following command increases the size of the /inorganic2 filesystem (and of its logical volume chm00) created above:
# chfs -a size=+50000 /inorganic2
An absolute or relative size may be specified for the size parameter (in 512-byte blocks). The size of a logical volume may be increased with the extendlv command, but it has no effect on filesystem size.
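As a sketch, the following command grows the chome logical volume created earlier by 8 logical partitions (32 MB at the default 4 MB partition size); note that any filesystem on it would still need to be grown separately with chfs:

# extendlv chome 8                   Add 8 logical partitions to chome.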
The following commands operate on AIX jfs and jfs2 filesystems:
10.3.3.4.1 Replacing a failed disk
When you need to remove a disk from the system, most likely due to a hardware failure, there are two considerations to keep in mind:
The following commands remove hdisk4 from the LVM configuration (the volume group chemvg2 and the logical volume chlv2 holding the /chem2 filesystem are used as an example):
# umount /chem2                      Unmount filesystem.
# rmfs /chem2                        Repeat for all affected filesystems.
# rmlvcopy chlv2 2 hdisk4            Remove mirrors on hdisk4.
# chps -a n paging02                 Don't activate paging space at next boot.
# shutdown -r now                    Reboot the system.
# chpv -v r hdisk4                   Make physical disk unavailable.
# reducevg chemvg2 hdisk4            Remove disk from volume group.
# rmdev -l hdisk4 -d                 Remove definition of disk.
When the replacement disk is added to the system, it will be detected, and devices will be created for it automatically.
10.3.3.4.2 Getting information from the LVM
AIX provides many commands and options for listing information about LVM entities. Table 10-9 attempts to make it easier to figure out which one to use for a given task.
10.3.3.4.3 Disk striping and disk mirroring
A striped logical volume is created by specifying mklv's -S option, indicating the stripe size, which must be a power of 2 from 4K to 128K. For example, this command creates a 500 MB logical volume striped across two disks consisting of a total of 125 logical partitions, each 4 MB in size:
# mklv -y cdata -S 64K chemvg 125 hdisk5 hdisk6
Note that the disk names are required on the mklv command when creating a striped logical volume.
Multiple data copies (mirroring) may be specified with the -c option, which takes the number of copies as its argument (the default is 1). For example, the following command creates a two-way mirrored logical volume:
# mklv -c 2 -s s -w y biovg 500 hdisk2 hdisk3
The command specifies two copies, a super strict allocation policy (forces each mirror to a separate physical disk, which are listed), and specifies that write synchronization take place during each I/O operation (which reduces I/O performance but guarantees data synchronization).
An entire volume group can also be mirrored. This is configured using the mirrorvg command.
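As a sketch, the following command mirrors every logical volume in the chemvg volume group (the group's disks must contain enough free physical partitions to hold the second copies):

# mirrorvg -c 2 chemvg               Two total copies of each logical volume.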
Finally, the -a option is used to request placement of the new logical volume within a general region of the disk. For example, this command requests that the logical volume be placed in the center portion of the disk to as great an extent as possible:
# mklv -y chome -ac chemvg 64
Disks are divided into five regions named as follows (beginning at the outer edge): edge, middle, center, inner-middle, and inner-edge. The middle region is the default, and the other available arguments to -a are accordingly e, im, and ie.
AIX does not provide general software RAID, although one can use mirrors and stripes to achieve the same functionality as RAID 0, 1, and 1+0.
HP-UX provides another version of an LVM that is used by default. The vg00 volume group holds the system files, which are divided into several logical volumes:
# vgdisplay vg00                     Display volume group attributes.
--- Volume groups ---                Output shortened.
VG Name                 /dev/vg00
VG Write Access         read/write
VG Status               available
Max LV                  255
Cur LV                  8
Open LV                 8
Max PV                  16
Cur PV                  1
Act PV                  1
Max PE per PV           2500
PE Size (Mbytes)        4
Total PE                2169
Alloc PE                1613
Free PE                 556
Total Spare PVs         0
Total Spare PVs in use  0
# bdf                                Output shows mounted logical volumes.
Filesystem          kbytes    used   avail %used Mounted on
/dev/vg00/lvol3     143360   22288  113567   16% /
/dev/vg00/lvol1      83733   32027   43332   42% /stand
/dev/vg00/lvol7    2097152  419675 1572833   21% /var
/dev/vg00/lvol6    1048576  515524  499746   51% /usr
/dev/vg00/lvol5      65536    1128   60386    2% /tmp
/dev/vg00/lvol4    2097152  632916 1372729   32% /opt
/dev/vg00/lvol8      20480    1388   17900    7% /home
The process of creating a volume group begins by designating the component disks (or disk partitions) as physical volumes, using the pvcreate command:
# pvcreate /dev/rdsk/c2t0d0
Next, a directory and character special file must be created in /dev for the volume group:
# mkdir /dev/vg01
# mknod /dev/vg01/group c 64 0x010000
The major number is always 64, and the minor number is of the form 0x0n0000, where n varies from 0 to 9 and must be unique across all volume groups (I assign them in order).
The volume group may now be created with the vgcreate command, which takes the volume group directory in /dev and the component disks as its arguments:
# vgcreate /dev/vg01 /dev/dsk/c2t0d0
vgcreate's -s option may be used to specify an alternate physical extent size (in megabytes). The default of 4 may be too small for large disks. You can add an additional volume to an existing volume group with the vgextend command.
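For example, this sketch adds a hypothetical disk to the existing vg01 volume group; the disk must first be designated a physical volume with pvcreate:

# pvcreate /dev/rdsk/c3t0d0          Initialize physical volume (raw device).
# vgextend /dev/vg01 /dev/dsk/c3t0d0 Add it to the volume group (block device).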
The vgcreate and vgextend commands also have a -g option, which allows you to define named subsets of the disks in the volume group, known as physical volume groups, as in this example that creates two physical volume groups in the vg01 volume group:
# vgcreate /dev/vg01 -g groupa /dev/dsk/c2t2d0 /dev/dsk/c2t4d0
# vgextend /dev/vg01 -g groupb /dev/dsk/c1t0d0 /dev/dsk/c1t1d0
The file /etc/lvmpvg holds the physical volume group data, and it may be edited directly rather than running vgcreate:
VG  /dev/vg01
PVG groupa /dev/dsk/c2t0d0 /dev/dsk/c2t4d0
PVG groupb /dev/dsk/c1t0d0 /dev/dsk/c1t1d0
Once the volume group is created, the lvcreate command may be used to create a logical volume. For example, the following command creates a 200 MB logical volume named chemvg:
# lvcreate -n chemvg -L 200 /dev/vg01
If the specified size is not an even multiple of the extent size (4 MB), the size is rounded up to the nearest multiple.
If the new logical volume is to be used for the root or boot filesystem or as a swap space, you must run the lvlnboot command with its -r, -b, or -s option (respectively). The command takes the logical volume device as its argument:
# lvlnboot -r -s /dev/vg01/swaplv
The -r option will create a combined boot/root volume if the specified logical volume is the first one on the physical volume.
Once a logical volume is built, a filesystem may be built upon it. For example:
# newfs /dev/vg01/rchemvg
The logical volume name is concatenated to the volume group directory in /dev to form the special filenames referring to the logical volume; note that newfs uses the raw device. The new filesystem may then be mounted and entered into the filesystem configuration file.
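For example, assuming a hypothetical mount point /chem, the new filesystem might be mounted and made permanent like this (the fstab entry is a sketch; the fields are device, mount point, filesystem type, options, backup frequency, and fsck pass):

# mkdir /chem
# mount /dev/vg01/chemvg /chem

with a corresponding /etc/fstab entry of the form:

/dev/vg01/chemvg /chem vxfs delaylog 0 2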
You can customize the parameters for a new VxFS filesystem using these options to newfs:
Other commands that operate on LVM entities are listed below:
10.3.3.5.1 Displaying LVM information
The following commands display information about LVM entities:
10.3.3.5.2 Disk striping and mirroring
The LVM is also used to perform disk striping and disk mirroring on HP-UX systems. For example, the following command creates a 200 MB logical volume named cdata with one mirrored copy:
# lvcreate -n cdata -L 200 -m 1 -s g /dev/vg01
The -s g option specifies that the mirrors must be placed into different physical volume groups.
Under HP-UX, disk striping occurs at the logical volume level. The following command creates a 400 MB four-way striped logical volume, using a stripe width of 64 KB:
# lvcreate -n tyger -L 400 -i 4 -I 64 /dev/vg01
The -i option specifies the number of stripes (disks) and can be no larger than the total number of disks in the volume group; -I specifies the stripe size in KB, and its valid range is powers of 2 from 4 to 64.
Most HP-UX versions do not provide software RAID.
Tru64 provides two facilities which have many of the characteristics of a logical volume manager:
We'll consider each of them in separate subsections.
The AdvFS defines the following entities:
Unlike other LVMs, under the AdvFS, domains and filesets (physical storage and directory trees) are independent, and either one can be modified without affecting the other (as we'll see).
The AdvFS facility is used by default on Tru64 systems. It defines two domains and several filesets:
# showfdmn root_domain               Describe this domain.
               Id              Date Created  LogPgs  Version  Domain Name
3a535b22.000c47c0  Wed Jan  3 12:02:26 2001     512        4  root_domain

  Vol  512-Blks   Free  % Used  Cmode  Rblks  Wblks  Vol Name
   1L    524288  95680     82%     on    256    256  /dev/disk/dsk0a
# mountlist -v                       List mounted filesets.
  root_domain#root                   Root filesystem.
  usr_domain#usr                     Mounted at /usr.
  usr_domain#var                     Mounted at /var.
# showfsets usr_domain               List filesets in a domain.
usr                                  Output shortened.
    Id           : 3a535b27.0005a120.1.8001
    Files        : 43049, SLim=0, HLim=0
    Blocks (512) : 1983812, SLim=0, HLim=0
    Quota Status : user=off group=off
var
    Id           : 3a535b27.0005a120.2.8001
    Files        : 1800, SLim=0, HLim=0
    Blocks (512) : 34954, SLim=0, HLim=0
    Quota Status : user=off group=off
You can create a new domain with the mkfdmn command:
# mkfdmn /dev/disk/dsk1c chem_domain
This command creates the chem_domain domain consisting of the specified volume (here, a disk partition). If you have the AdvFS Utilities installed, you can add volumes to a domain with the addvol command, as in this example, which adds a second disk partition to the chem_domain domain:
# addvol /dev/disk/dsk2c chem_domain
# balance chem_domain
You can similarly remove a volume from a domain with the rmvol command. The balance command is typically run after either one; it has the effect of balancing disk space usage among the various volumes in the domain to improve performance.
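For example, this sketch removes the second disk partition from the domain again; the filesets' data on that volume is migrated to the domain's remaining volumes, which must have enough free space to hold it:

# rmvol /dev/disk/dsk2c chem_domain
# balance chem_domain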
Once a domain has been created, you can create filesets within it. This process creates an entity that is effectively a relocatable filesystem; a fileset is ready to accept files as soon as it is created (no mkfs step is required), and its contents can be moved to a different physical disk location in its domain if required.
The following commands create two filesets within our domain and mount them on two existing directories immediately afterward:
# mkfset chem_domain bronze
# mkfset chem_domain silver
# mount chem_domain#bronze /bronze
# mount chem_domain#silver /silver
The fileset is referred to by appending its name to the domain name, separated by a number sign (#). Note that we don't have to specify any actual disk locations (and indeed we cannot do so). These matters are handled by the AdvFS itself.
The rmfset command may be used to remove a fileset from a domain. The renamefset command may be used to change the name of a fileset, as in this example:
# renamefset chem_domain lead gold
The AdvFS offers some limited disk striping facilities as part of its optional utilities package. A file can be striped by creating it with the stripe command:
# stripe -n 2 sulfur
This command creates the file sulfur as a two-way striped file. The file must be created before any data is placed into it. More complex striping of entire volumes can be done with the Logical Storage Manager described in the next subsection.
The Tru64 Logical Storage Manager is designed to support advanced disk features such as disk striping and fault tolerance. It is a layered product which must be added to the basic Tru64 operating system.
Under the LSM, a whole new set of terminology comes into play:
For the most common cases, you need only worry about disk groups and volumes; plexes are taken care of automatically by the LSM. In the remainder of this section, we'll look briefly at some simple examples of LSM configuration. Consult the documentation for full details.
The voldiskadd command is used to create a new disk group. This command takes the disks to be added to the group as its arguments:
# voldiskadd dsk3 dsk4 .. .
It is an interactive tool which will prompt you for the additional information it needs, including the disk group name (we'll use dg1 in our examples) and the use for each disk (data or spare).
If you later want to place additional disks into a disk group, you use the voldg command, as in this example, which adds several more disks to dg1:
# voldg -g dg1 adddisk dsk9 dsk10 dsk11
Volumes are generally created with the volassist command. For example, the following command creates a volume consisting of a concatenated plex named chemvol, essentially a logical volume comprised of space from multiple disks on which a filesystem can be built:
# volassist -g dg1 make chemvol 2g dsk3 dsk4
The volume is created using the dg1 disk group, using the specified disks (the disk list is optional). Its size is 2 GB.
We'll go on to make this a mirrored plex, using these commands:
# volassist -g dg1 mirror chemvol init=active layout=nolog dsk5 dsk6
# volassist addlog chemvol dsk7
The first command adds a mirror to the chemvol volume (we've again chosen to specify which disks to use). The second command adds the required logging area to the volume.
We can create a striped plex in a similar way:
# volassist -g dg1 make stripevol 2g layout=stripe nstripe=2 dsk3 dsk4
This command creates a two-way striped volume named stripevol.
The following command will create a 3 GB RAID 5 volume:
# volassist -g dg1 make raidvol 3g layout=raid5 nstripe=5 disks
For both striped andRAID 5 volumes, you can also use the stripeunit attribute (following nstripe) to specify the stripe size.
Disk groups containing mirrored or RAID 5 volumes should include designated hot spare disks. The following commands designate dsk9 as a hot spare for our disk group:
# voledit -g dg1 set spare=on dsk9
# volwatch -s firstname.lastname@example.org
The volwatch command enables automatic hot spare replacement (-s), and its argument is the email address to which to send notifications when these events occur.
Once an LSM volume is created, it can be placed within an AdvFS domain and used for creating filesets.
The following commands are useful in obtaining information about LSM entities:
Finally, the volsave command is used to save the LSM metadata to a disk file, which can then be backed up. The default location for these files is /usr/var/lsm/db, but you can specify an alternate location using the command's -d option. The files themselves are given names of the form LSM.n.host, where n is a 14-digit encoding of the date and time. The volrestore command will restore the saved data should it ever be necessary.
Solaris 9 introduces a logical volume manager as part of the standard operating system. This facility was available as an add-on product with earlier versions of Solaris (although there have been some changes with respect to previous versions; see the documentation for details).
The Solaris Volume Manager supports striping, mirroring, RAID 5, soft partitions (the ability to divide any disk into more than four partitions), and some other features. The Volume Manager must be initialized before its first use, using commands like these:
# metadb -a -f c0t0d0s7      Create initial state database replicas.
# metadb -a -c 2 c1t3d0s2    Add replicas on this slice.
We are now ready to create volumes. We will look briefly at some simple examples in the remainder of this section.
The Solaris Volume Manager uses fixed names for volumes of the form dn, where n is an integer from 0 to 127. Thus, the maximum number of volumes is 128. The metainit command does most of the work of creating and configuring volumes.
The following command will create a concatenated volume consisting of three disks:
# metainit d1 3 1 c1t1d0s2 1 c1t2d0s2 1 c1t3d0s2
The parameters are the volume name, the number of components (always greater than one for a concatenated volume), and then three pairs, each consisting of the number of component disks (always 1 here) followed by the desired disk. When the command completes, the volume d1 can be treated as if it were a single disk partition.
You can expand an existing filesystem using a similar command, as in this example, which expands the /docs filesystem (originally on c0t0d0s6):
# umount /docs
# metainit d10 2 1 c0t0d0s6 1 c2t3d0s2    Add additional disk space.
# vi /etc/vfstab                          Change the filesystem's device to /dev/md/[r]dsk/d10.
# mount /docs
# growfs -M /docs /dev/md/rdsk/d10        Increase the filesystem size to the volume size.
The following command will create a striped volume:
# metainit d2 1 2 c1t1d0s2 c2t2d0s2 -i 64k
The parameters following the volume name indicate that we are creating a single striped volume with two component disks, using a stripe size (interlace value) of 64 KB (-i).
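To make the interlace idea concrete, here is a small Python sketch (illustrative only, not Volume Manager code; the disk names simply echo the example above) of how a logical byte offset maps onto the component disks of a striped volume:

```python
# Illustrative sketch: locate a logical byte offset within a two-disk
# striped volume using a 64 KB interlace (stripe) size.
INTERLACE = 64 * 1024          # 64 KB, as in the -i 64k example
DISKS = ["c1t1d0s2", "c2t2d0s2"]

def locate(offset: int) -> tuple[str, int]:
    """Return (disk, offset-on-that-disk) for a logical byte offset."""
    stripe = offset // INTERLACE          # which stripe unit overall
    disk = DISKS[stripe % len(DISKS)]     # units rotate across the disks
    on_disk = (stripe // len(DISKS)) * INTERLACE + offset % INTERLACE
    return disk, on_disk

# The first 64 KB lands on the first disk, the next 64 KB on the second,
# and the third unit wraps back to the first disk at offset 64 KB:
print(locate(0))           # ('c1t1d0s2', 0)
print(locate(64 * 1024))   # ('c2t2d0s2', 0)
print(locate(128 * 1024))  # ('c1t1d0s2', 65536)
```

Successive stripe-sized chunks thus alternate between the disks, which is what allows large sequential transfers to engage both spindles at once.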
You can mirror volumes using metainit's -m option, followed by the metattach command, as in this example:
# metainit d20 -f 1 1 c0t3d0s2    Create the volume to be mirrored.
# umount /dev/dsk/c0t3d0s2
# metainit d21 1 1 c2t1d0s2       Create a volume to be used as the mirror.
# metainit d22 -m d20             Specify the volume to be mirrored.
# vi /etc/vfstab                  Modify entry to point to the mirror volume (d22).
# mount /dev/md/dsk/d22           Remount filesystem.
# metattach d22 d21               Add a mirror.
In this case, we add a mirror to an existing filesystem. We use the -f option on the first metainit command to force a volume to be created from an existing filesystem.
Other volume types (concatenated, striped, etc.) can also be mirrored, using just the final two commands.
You can specify the read and write policies for mirrored volumes using the metaparam command, as in this example:
# metaparam -r geometric -w parallel d22
The -r option specifies the read policy, one of roundrobin (successive read operations go to each disk in turn, which is the default), first (all reads go to the first disk), and geometric (read operations are divided between the component disks by assigning specific disk regions to each one). The geometric read policy can minimize seek times by confining disk head movement to a subset of the disk, which can produce measurable performance improvements for I/O that is seek time-limited (e.g., randomly accessed data, such as a database).
The -w parameter specifies the write policy, one of parallel (write to all disks at the same time, which is the default) and serial. The latter might be used to improve performance when both mirrors are on the same busy disk controller.
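The difference between the read policies can be sketched as follows (a conceptual Python illustration for a two-way mirror; this is an assumption-laden model, not how metaparam is actually implemented):

```python
# Conceptual model of mirrored-volume read policies (not Solaris code).
from itertools import count

NBLOCKS = 1_000_000   # hypothetical volume size in blocks
SIDES = 2             # two-way mirror

_rr = count()         # counter for round-robin alternation

def pick_side(policy: str, block: int) -> int:
    """Return which mirror side (0 or 1) services a read of `block`."""
    if policy == "roundrobin":   # successive reads alternate sides
        return next(_rr) % SIDES
    if policy == "first":        # all reads go to the first side
        return 0
    if policy == "geometric":    # each side owns a contiguous disk region
        return min(block * SIDES // NBLOCKS, SIDES - 1)
    raise ValueError(policy)

# Under geometric, reads in the low half of the volume go to side 0 and
# the high half to side 1, confining each disk's head movement.
print(pick_side("geometric", 100))      # 0
print(pick_side("geometric", 900_000))  # 1
```

The geometric case makes clear why the policy helps seek-limited workloads: each mirror's heads stay within half the block range.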
The following command will create a RAID 5 volume:
# metainit d30 -r disks -i 96k
This creates a RAID 5 volume using a stripe size of 96 KB. The default stripe size is 16 KB, and it must range from 8 KB to 100 KB.
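The fault tolerance that RAID 5 provides rests on XOR parity, which can be sketched in a few lines of Python (a conceptual illustration of the technique, not Volume Manager code):

```python
# RAID 5 parity sketch: the parity chunk is the XOR of the data chunks
# in a stripe, so any single lost chunk can be rebuilt from the others.
def parity(chunks: list[bytes]) -> bytes:
    """XOR together equal-length chunks."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]   # 4 data chunks in a stripe
p = parity(data)                              # parity chunk for the stripe

# Simulate losing the third disk and rebuilding its chunk from the
# surviving data chunks plus the parity chunk:
survivors = [data[0], data[1], data[3], p]
rebuilt = parity(survivors)
print(rebuilt == data[2])   # True
```

This is why a RAID 5 stripe of n disks holds n-1 disks' worth of data: one chunk per stripe is consumed by parity.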
You can replace a failed RAID 5 component volume using the metareplace command, as in this example:
# metareplace -e d30 c2t5d0s2
Alternatively, you can define a hot spare pool from which disks can be taken as needed for all RAID 5 devices. For example, these commands create a pool named hsp001 and designate it for use with RAID 5 device d30:
# metainit hsp001 c3t1d0s2 c3t2d0s2
# metaparam -h hsp001 d30
You can modify the disks in a hot spare pool using the metahs command and its -a (add), -r (replace), and -d (delete) options.
The last Volume Manager feature we'll consider is soft partitions. Soft partitions are simply logical partitions (subsets) of a disk. For example, the following command creates a volume consisting of 2 GB from the specified disk:
# metainit d7 -p c2t6d0s2 2g
When used with a new disk, you can add the -e option to the command. This causes the disk to be repartitioned so that all but 4 MB is in slice 0 (the 4 MB is in slice 7 and is used to hold a state database replica). For example, this command performs that repartitioning and then assigns 3 GB of slice 0 to volume d8:
# metainit d8 -p -e c2t5d0s2 3g
Once volumes are created, you can create a UFS filesystem on them using newfs as usual. You can also remove any volume with the metaclear command, which takes the desired volume as its argument. Naturally, any data on the volume will be lost.
The following commands are useful for obtaining information about the Volume Manager and individual volumes:
Linux systems can use both a logical volume manager and software disk striping and RAID, although the two facilities are separate. They are compatible, however; for example, RAID volumes can be used as components in the logical volume manager.
The Linux Logical Volume Manager (LVM) project has been in existence for several years (its homepage is http://www.sistina.com/products_lvm.htm), and support for the LVM is merged into the Linux 2.4 kernel. Conceptually, the LVM allows you to combine and divide physical disk partitions in a completely flexible manner. The resulting filesystems are dynamically resizable. The current version of the LVM supports up to 99 volume groups and 256 logical volumes. The maximum logical volume size is currently 256 GB.
The logical volume manager is included in some recent Linux distributions (for example, SuSE Linux 6.4 and later). If it is not included in yours, installing it is quite straightforward:
The LVM package includes a large number of administrative utilities, each of which is designed to create or manipulate a specific type of LVM entity. For example, the commands vgcreate, vgdisplay, vgchange, and vgremove create, display information about, modify the characteristics of, and delete a volume group (respectively). You can also back up and restore the volume group configurations with vgcfgbackup and vgcfgrestore, change the size of a volume group with vgextend (increase its size by adding disk space to it) and vgreduce (decrease its size), divide and combine volume groups (vgsplit and vgmerge), move a volume group between computer systems (vgexport and vgimport), search all local disks for volume groups (vgscan), and rename a volume group (vgrename). (Many of these commands are similar to their HP-UX equivalents.)
There are similar commands for other LVM entities:
Let's look at some of these commands in action as we create a volume group and some logical volumes and then build filesystems on them.
The first step is to set the partition type of the desired disk partitions to 0x8E. We use fdisk for this task; here is the process for the first disk partition:
# fdisk /dev/sdb
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): 8e
Command (m for help): w
The first time we use the LVM, we need to run vgscan to initialize the facility (among other things, it creates the /etc/lvmtab file). Next, we designate the disk partitions as physical volumes by specifying the desired disk partitions as command arguments to the pvcreate command (/dev/sdc2 is the second partition we will be using in our volume group):
# pvcreate /dev/sdb1 /dev/sdc2
pvcreate -- reinitializing physical volume
pvcreate -- physical volume "/dev/sdb1" successfully created
...
We are now ready to create a volume group, which we will name vg1:
# vgcreate vg1 /dev/sdb1 /dev/sdc2
vgcreate -- INFO: using default physical extent size 4 MB
vgcreate -- INFO: maximum logical volume size is 255.99 Gigabyte
vgcreate -- doing automatic backup of volume group "vg1"
vgcreate -- volume group "vg1" successfully created and activated
This command creates the vg1 volume group using the two specified disk partitions. In doing so, it creates/updates the ASCII configuration file /etc/lvmtab (which holds the names of the system's volume groups) and places a binary configuration file into two subdirectories of /etc: lvmtab.d/vg1 and lvmconf/vg1.conf (the latter directory will also store old binary configuration files for this volume group, reflecting changes to its characteristics and components).
The vgcreate command also creates the special file /dev/vg1/group, which can be used to refer to the volume group as a device.
Now we can create two 800 MB logical volumes:
# lvcreate -L 800M -n chem_lv vg1
lvcreate -- doing automatic backup of "vg1"
lvcreate -- logical volume "/dev/vg1/chem_lv" successfully created
# lvcreate -L 800M -n bio_lv -r 8 -C y vg1
lvcreate -- doing automatic backup of "vg1"
lvcreate -- logical volume "/dev/vg1/bio_lv" successfully created
We set the sizes of both logical volumes via the lvcreate command's -L option. In the case of the second logical volume, bio_lv, we also specify that the read-ahead mode chunk size is 8 sectors via -r (the amount of data returned at a time during sequential access) and specify that a contiguous logical volume be created (via the -C y option).
Once again, two new special files are created, each named after the corresponding logical volume and located under the volume group directory in /dev (here, /dev/vg1).
We can now create filesystems using the ordinary mke2fs command, specifying the logical volume as the device on which to build the new filesystem. For example, the following command creates an ext3 filesystem on the bio_lv logical volume:
# mke2fs -j /dev/vg1/bio_lv
Once built, this filesystem may be mounted as usual. You can also build a Reiser filesystem on a logical volume.
In addition to the previously mentioned commands, the LVM provides the e2fsadm command, which can be used to increase the size of a logical volume and the ext2 or ext3 filesystem it contains in a single, nondestructive operation. This utility requires the resize2fs utility (originally developed by PowerQuest as part of its PartitionMagic product and now available under the GPL at http://e2fsprogs.sourceforge.net).
Here is an example of its use; the following command adds 100 MB to the bio_lv logical volume and the filesystem that it contains:
# umount /dev/vg1/bio_lv
# e2fsadm /dev/vg1/bio_lv -L+100M
e2fsck 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/vg1/bio_lv: 11/51200 files (0.0% non-contiguous), 6476/819200 blocks
lvextend -- extending logical volume "/dev/vg1/bio_lv" to 900 MB
lvextend -- doing automatic backup of volume group "vg1"
lvextend -- logical volume "/dev/vg1/bio_lv" successfully extended
resize2fs 1.19 (13-Jul-2000)
Begin pass 1 (max = 5)
Extending the inode table     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 25)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/vg1/bio_lv is now 921600 blocks long.
e2fsadm -- ext2fs in logical volume "/dev/vg1/bio_lv" successfully extended to 900 MB
Note that the filesystem must be unmounted in order to increase its size.
To use the Linux software RAID facility, you must install the component disks, enable RAID support in the kernel and then set up the RAID configuration. You can perform the second task using a utility like make xconfig and selecting the Block Devices category from the main menu. The Multiple devices driver support item is the one that must be enabled to access all of the other RAID-related items. I recommend enabling all of them.
RAID devices use special files of the form /dev/mdn (where n is an integer), and they are defined in the /etc/raidtab configuration file. Once defined, you can create them using the mkraid command and start and stop them with the raidstart and raidstop commands. Alternatively, you can define them with the persistent superblock option, which enables automatic detection and mounting/dismounting of RAID devices by the kernel. In my view, the latter is always the best choice.
The best way to understand the /etc/raidtab file is to examine some sample entries. Here is an entry corresponding to a striped disk using two component disks, which I have annotated:
raiddev /dev/md0                       Defines RAID device 0.
    raid-level              0          RAID level.
    nr-raid-disks           2          Number of component disks.
    chunk-size              64         Stripe size (in KB).
    persistent-superblock   1          Enable the persistent superblock feature.
    device                  /dev/sdc1  Specify the first component disk...
    raid-disk               0          ...and number it.
    device                  /dev/sdd1  Same for all remaining component disks.
    raid-disk               1
If we had wanted to define a two-way mirror set instead of a stripe set, using the same disks, we would omit the chunk-size parameter and change the raid-level parameter from 0 to 1 in the first section, and the rest of the entry would remain the same.
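For concreteness, the resulting mirror entry might look something like the following (a sketch in the same raidtab format, reusing the device names from the striped example; the mirrored entry itself does not appear in the original):

```
raiddev /dev/md0
    raid-level              1          RAID level 1 is mirroring.
    nr-raid-disks           2
    persistent-superblock   1
    device                  /dev/sdc1
    raid-disk               0
    device                  /dev/sdd1
    raid-disk               1
```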
We can set up a RAID 0+1 disk, a mirrored striped disk, in this way:
raiddev /dev/md0
    ...                               Set up the first striped disk.
raiddev /dev/md1
    ...                               Set up the second striped disk.
raiddev /dev/md2
    raid-level              1
    nr-raid-disks           2
    persistent-superblock   1
    device                  /dev/md0  The component disks are also md devices.
    raid-disk               0
    device                  /dev/md1
    raid-disk               1
The following entry defines a RAID 5 disk containing 5 component disks, as well as a spare disk to be automatically used should any of the active disks fail:
raiddev /dev/md0
    raid-level              5          Use RAID level 5.
    nr-raid-disks           5          Number of active disks in the device.
    persistent-superblock   1
    device                  /dev/sdc1  Specify the 5 component disks.
    raid-disk               0
    device                  /dev/sdd1
    raid-disk               1
    device                  /dev/sde1
    raid-disk               2
    device                  /dev/sdf1
    raid-disk               3
    device                  /dev/sdg1
    raid-disk               4
    device                  /dev/sdh1  Specify a spare disk.
    spare-disk              0
You can use multiple spare disks if you want to.
RAID devices can be used with the logical volume manager if desired.
FreeBSD provides the Vinum Volume Manager. It uses somewhat different concepts than other LVMs. Under Vinum, a drive is a physical disk partition. Disk space is allocated from drives in user-specified chunks known as subdisks. Subdisks in turn are used to define plexes, and one or more plexes make up a Vinum volume. Multiple plexes within a volume constitute mirrors.
Be prepared to be very patient when learning Vinum. It is quite inflexible in how it wants operations to be performed. Plan to learn the procedures on a safe test system.
In addition, be aware that the facility is still under development. As of this writing, only the most basic functionality is present.
To use a disk partition with Vinum, it must be prepared as follows:
Partition e is somewhat arbitrary, but it works. Note that partition c cannot be used with Vinum.
Once the drives are prepared, the best way to proceed is to create a description file that defines the Vinum entities that you want to create. Here is a file that defines a volume named big:
drive d1 device /dev/da1s1e      Define drives.
drive d2 device /dev/da2s1e
volume big                       Define volume big.
  plex org concat                Create a concatenated plex.
    sd length 500m drive d1      First 500 MB subdisk from drive d1.
    sd length 200m drive d2      Second 200 MB subdisk from drive d2.
The file first defines the drives to be used, naming them d1 and d2. Note that this operation needs to be performed only once for a given partition. Future example configurations will omit drive definitions.
The second section of the file defines the volume big as one concatenated plex (org concat). It consists of two subdisks: 500 MB of space from /dev/da1s1e and 200 MB of space from /dev/da2s1e. This disk space will be treated as a single unit.
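The way a concatenated plex lays out this space can be sketched in Python (illustrative only, not Vinum code; the sizes match the big example above):

```python
# Illustrative sketch: in a concatenated plex, logical offsets simply
# run through the subdisks in order.
MB = 1024 * 1024
subdisks = [("d1", 500 * MB), ("d2", 200 * MB)]   # (drive, subdisk length)

def find_subdisk(offset: int) -> tuple[str, int]:
    """Return (drive, offset-within-subdisk) for a logical byte offset."""
    for drive, length in subdisks:
        if offset < length:
            return drive, offset
        offset -= length                # skip past this subdisk
    raise ValueError("offset beyond end of volume")

print(find_subdisk(100 * MB))   # ('d1', 104857600): within the first subdisk
print(find_subdisk(600 * MB))   # ('d2', 104857600): 100 MB into the second
```

Unlike striping, concatenation keeps each subdisk's data contiguous, so only the crossover point involves the second drive.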
You can create these entities using the following command:
# vinum create /etc/vinum.big.conf
The final argument specifies the location of the description file.
Once the volume is created, you can create a filesystem on it:
# newfs -v /dev/vinum/big
The device is specified via the file in /dev/vinum named for the volume. The -v option tells newfs not to look for partitions on the specified device. Once newfs completes, the filesystem may be mounted. For it to be detected properly at boot time, however, the following line must be present in /etc/rc.conf:
start_vinum="YES"
This causes the Vinum kernel module to be loaded at boot time.
Here is a description file that defines a striped (RAID 0) volume:
volume fast
  plex org striped 1024k
    sd length 0 drive d1
    sd length 0 drive d2
This stripe set consists of two components. The plex line has an additional entry, the stripe size. This value must be a multiple of 512 bytes. The subdisk definitions specify a length of 0; this corresponds to all available space in the device. The actual volume can be created using the vinum create command as before.
If both of these volumes were created, then different areas of the various disk partitions would be used by each one: Vinum drives can be subdivided among different volumes. You can specify the location within the drive when a subdisk is created (see the vinum(8) manual page for details).
The following configuration file creates a mirrored volume by defining two plexes:
volume mirror
  plex org concat                First mirror.
    sd length 1000m drive d1
  plex org concat                Second mirror.
    sd length 1000m drive d2
Creating and activating the mirrored volume requires several vinum commands (the output is not shown):
# vinum create file         Create the volume.
# vinum init mirror.p1      Initialize the subdisk. Wait for the command to finish.
# vinum start mirror.p1     Activate the mirror.
When you first create a mirrored volume, the state of the second plex appears in status listings as faulty, and its component subdisk has a status of empty. The vinum init command initializes all of the component subdisks of plex mirror.p1, and the vinum start command regenerates the mirror (actually, creates it for the first time). Both of these commands start background processes to do the actual work, and you must wait for the initialization to finish before running the regeneration. You can check on their status using this command:
# vinum list
Once both of these commands have completed, you can build a filesystem and mount it.
The following description file creates a RAID 5 volume named safe:
volume safe
  plex org raid5 1024k
    sd length 0 drive d1
    sd length 0 drive d2
    sd length 0 drive d3
    sd length 0 drive d4
    sd length 0 drive d5
This volume consists of a single plex containing five subdisks. The following commands can be used to create and activate the volume:
# vinum create file         Create the volume.
# vinum init safe.p0        Initialize the subdisks.
Once again, the initialization process runs in the background, and you must wait for it to finish before creating a filesystem.
As a final example, consider this description file:
volume zebra
  plex org striped 1024k
    sd length 200m drive d1
    sd length 200m drive d2
  plex org striped 1024k
    sd length 200m drive d3
    sd length 200m drive d4
This file defines a volume named zebra, which is a striped mirrored volume (RAID 0+1). The volume consists of two striped plexes which become mirrors. The following commands are required to create and activate this volume:
# vinum create file              Create the volume.
# vinum init zebra.p0 zebra.p1   Initialize the subdisks.
# vinum start zebra.p1           Regenerate the mirror.
The following commands are useful for displaying Vinum information:
You can follow any of these commands with the name of a specific item to limit the display to its characteristics.
Here is an example of the vinum list command:
4 drives:
D d1            State: up      Device /dev/ad1s1e   Avail: 2799/2999 MB (93%)
D d2            State: up      Device /dev/ad1s2e   Avail: 2799/2999 MB (93%)
D d3            State: up      Device /dev/ad1s3e   Avail: 2799/2999 MB (93%)
D d4            State: up      Device /dev/ad1s4e   Avail: 532/732 MB (72%)

1 volumes:
V zebra         State: up      Plexes: 2     Size: 400 MB

2 plexes:
P zebra.p0 S    State: up      Subdisks: 2   Size: 400 MB
P zebra.p1 S    State: faulty  Subdisks: 2   Size: 400 MB

4 subdisks:
S zebra.p0.s0   State: up      PO: 0 B       Size: 200 MB
S zebra.p0.s1   State: up      PO: 1024 kB   Size: 200 MB
S zebra.p1.s0   State: R 16%   PO: 0 B       Size: 200 MB
S zebra.p1.s1   State: R 16%   PO: 1024 kB   Size: 200 MB
This display shows the zebra volume we defined earlier. The subdisk initialization has completed. At this moment, the regeneration operation is 16% complete.
10.3.4 Floppy Disks
On systems with floppy disk drives, Unix filesystems may also be created on floppy disks. (Before they can be used, floppy disks must, of course, be formatted.) But why bother? These days, it is usually much more convenient to use floppy disks in one of the following ways:
10.3.4.1 Floppy disk special files
Floppy disks are accessed using the following special files (the default refers to a 1.44 MB 3.5-inch diskette):
Floppy disk special files are only occasionally needed on Solaris systems, because these devices are managed by the media handling daemon (discussed later in this chapter).
10.3.4.2 Using DOS disks on Unix systems
Methods for accessing DOS disks vary widely from system to system. In this section, we'll look at formatting diskettes in DOS format and copying files to and from them on each system.
Under HP-UX, the following commands format a DOS floppy disk:
$ mediainit -v -i2 -f16 /dev/rdsk/c0t1d0
$ newfs -n /dev/rdsk/c0t1d0 ibm1440
The -n option on the newfs command prevents boot information from being written to the diskette.
HP-UX provides a number of utilities to access files on DOS diskettes: doscp, dosdf, doschmod, dosls, dosll, dosmkdir, dosrm, and dosrmdir. Here is an example using doscp:
$ doscp /dev/rdsk/c0d1s0:paper.txt paper.new
This command copies the file paper.txt from the diskette to the current HP-UX directory.
On Linux and FreeBSD systems, a similar process is used. These commands format a DOS floppy and write files to it:
The Mtools utilities are also available on Linux and FreeBSD systems (described in the next section).
AIX also provides several utilities for accessing DOS disks: dosformat, dosread, doswrite, dosdir, and dosdel. However, they provide only minimal functionality (for example, there is no wildcard support), so you'll be much happier and work more efficiently if you use the Mtools utilities.
On Solaris systems, diskettes are controlled by the volume management system and its vold daemon. This facility integrates the diskette as transparently as possible into the normal Solaris filesystem.
These commands could be used to format a diskette and create a DOS filesystem on it:
$ volcheck
$ fdformat -d -b g04
The volcheck command tells the volume management system to look for new media in the devices that it controls. The fdformat command formats the diskette, giving it a label of g04.
The following commands illustrate the method for copying files to and from diskette:
$ volcheck
$ cp ~/proposals/prop2.txt /floppy/g96
$ cp /floppy/g96/drug888.dat ./data
$ eject
The diskette is mounted in a subdirectory of /floppy named for its label (or in /floppy/unnamed_floppy if it does not have a label). Configuration of vold is discussed later in this chapter.
Tru64 provides no support for DOS diskettes, so you'll need to use the Mtools utilities, to which we will now turn.
10.3.4.3 The Mtools utilities
The Mtools package is available for all the Unix versions we are considering. It is currently maintained by David Niemi and Alain Knaff (see http://mtools.linux.lu).
The package contains a series of utilities for accessing DOS diskettes and their files, modeled after their similarly named DOS counterparts:
Here are some examples of using the Mtools utilities:
$ mdir
 Volume in drive A is GIAO24
 Directory for A:/
SILVER   DAT     79    1-29-95   9:36p
PROP43_1 TXT   2304    1-29-95   9:33p
REFCARD  DOC  73216    1-13-95   5:28p
        3 File(s)     1381376 bytes free
$ mren prop43_1.txt prop43_1.old
$ mcopy a:refcard.doc .
Copying REFCARD.DOC
$ mcopy proposal.txt a:
Copying PROPOSAL.TXT
$ mmd data2
$ mcopy gold* a:data2
Copying GOLD.DAT
Copying GOLD1.DAT
$ mcopy "a:\data\*.dat" ./data
Copying NA.DAT
Copying HG.DAT
$ mdel silver.dat
As these examples illustrate, the Mtools utilities are designed to make accessing diskettes as painless as possible. For example, they generally assume that the files being referred to are on the floppy disk. The only time you have to refer explicitly to the diskette via the a: construct is with the mcopy command, which makes sense because there is no other way to know in which direction the copy is taking place. Note also that filenames on diskette are not case-sensitive.
10.3.4.4 Stupid DOS partition tricks
On PC-based Unix systems, hard-disk DOS partitions can also be mounted within the Unix filesystem. This allows not only for copying files between Unix and the other operating systems, but also for handling the entire partition using Unix utilities. For example, suppose you decide to change the partitioning scheme on your boot disk, decreasing the size of the DOS partition (without affecting the Unix partitions). The following commands will let you do so without reinstalling DOS, Windows, or any installed software:
# mount -t msdos /dev/hda1 /mnt    Linux is used as an example.
# cd /mnt
# tar -c -f /tmp/dos.tar *
# cd /; umount /mnt
Mess with partitions and/or filesystems.
# mount -t msdos /dev/hda1 /mnt
# cd /mnt
# tar -x -f /tmp/dos.tar
# cd /; umount /mnt
You could restore only some of the files from the tar archive if that is what made sense. Many other operations along these lines are also possible: for example, moving the DOS partition from the first hard drive to the second one, copying a DOS partition between systems or across a network, and so on. There are, of course, other ways of accomplishing these same tasks, but this procedure is often much faster.
10.3.5 CD-ROM Devices
CD-ROM drives are also generally treated in a manner similar to disks. The following special files are used to access SCSI CD-ROM devices:
The following example commands all mount a CD on the various systems:
mount -o ro -v cdrfs /dev/cd0 /mnt            AIX
mount -r -t cd9660 /dev/cd0c /mnt             FreeBSD
mount -o ro -F cdfs /dev/dsk/c1t2d0 /mnt      HP-UX
mount -r -t iso9660 /dev/sonycd_31a /mnt      Linux
mount -o ro -t hsfs /dev/dsk/c0t2d0s0 /mnt    Solaris
mount -r -t cdfs /dev/disk/cdrom0c /mnt       Tru64
Entries can also be added to the filesystem configuration file for CD-ROM filesystems.
10.3.5.1 CD-ROM drives under AIX
On AIX systems, if you add a CD-ROM drive to an existing system, you'll need to create a device for it in this manner:
# mkdev -c cdrom -r cdrom1 -s scsi -p scsi0 -w 5,0
cd0 available
This command adds a CD-ROM device using SCSI ID 5.
Individual CDs are usually mounted via predefined mount points. For example, the following commands create a generic CD-ROM filesystem to be mounted on /cdrom:
# mkdir /cdrom
# crfs -v cdrfs -p ro -d cd0 -m /cdrom -A no
This filesystem will be mounted read-only and will not automatically be mounted when the system boots. A CD may now be mounted with the mount /cdrom command.
The lsfs command may be used to list all defined CD-ROM filesystems:
$ lsfs -v cdrfs
Name      Nodename  Mount Pt  VFS    Size  Options  Auto  Acct
/dev/cd0  --        /cdrom    cdrfs  --    ro       no    no
10.3.5.2 The Solaris media-handling daemon
Solaris has a similar media handling facility implemented by the vold daemon. It generally mounts CDs and diskettes in directory trees rooted at /cdrom and /floppy, respectively, creating a subdirectory named for the label on the current media (or unnamed_cdrom and unnamed_floppy for unlabeled ones).
There are two configuration files associated with the volume management facility. /etc/vold.conf specifies the devices that it controls and the filesystem types it supports:
# Volume Daemon Configuration file
#
# Database to use (must be first)
db db_mem.so

# Labels supported
label dos label_dos.so floppy
label cdrom label_cdrom.so cdrom
label sun label_sun.so floppy

# Devices to use
use cdrom drive /dev/dsk/c0t6 dev_cdrom.so cdrom0
use floppy drive /dev/diskette dev_floppy.so floppy0

# Actions
insert /vol*/dev/diskette[0-9]/* user=root /usr/sbin/rmmount
insert /vol*/dev/dsk/* user=root /usr/sbin/rmmount
eject /vol*/dev/diskette[0-9]/* user=root /usr/sbin/rmmount
eject /vol*/dev/dsk/* user=root /usr/sbin/rmmount
notify /vol*/rdsk/* group=tty /usr/lib/vold/volmissing -c

# List of file system types unsafe to eject
unsafe ufs hsfs pcfs
The section labeled Actions indicates commands to be run when various events occur (media is inserted or removed, for example). The final section lists filesystem types that must be unmounted before being removed and hence will require the user to issue an eject command.
If you want to share mounted CDs via the network, you'll need to add an entry to /etc/rmmount.conf :
# Removable Media Mounter configuration file.
#
# File system identification
ident hsfs ident_hsfs.so cdrom
ident ufs ident_ufs.so cdrom floppy
ident pcfs ident_pcfs.so floppy

# Actions
action -premount floppy action_wabi.so.1
action cdrom action_filemgr.so
action floppy action_filemgr.so

# File System Sharing
share cdrom*
share solaris_2.x* -o ro:phys
File-sharing entries are in the final section of this file. An entry is provided for sharing standard CD-ROM filesystems (mounted at /cdrom/cdrom*). The -o in the second entry in this section passes options to the share command, in this case limiting access. You can modify the provided entry for CD-ROMs if appropriate. Shared CD-ROM filesystems can be mounted by other systems using the mount command and entered into their /etc/vfstab files.