LVM mirroring is the process of maintaining multiple Physical Extents per Logical Extent. Ideally, the additional Physical Extents are on a separate disk drive (strict allocation) and that disk is connected to a separate physical interface card (PVG-strict allocation can be set up to assist with this). The default write behavior to a mirrored volume (parallel IO) assumes that you have taken these steps to alleviate a bottleneck in the mirroring configuration.

The default mirror catch-up policy is somewhat different from a performance perspective. The Mirror Write Cache (MWC) introduces additional IO to disk whenever a write to a block is undertaken. In LVM-speak, a mirrored block is known as a Logical Track Group (LTG) and is 256KB in size. (This is handy because HP-UX can merge IO on consecutive disk blocks into a single merged-IO transaction, which happens to be 256KB in size.) The use of the MWC allows for quicker recoveries after a disk failure because the MWC can be used to quickly resync only the stale extents (LTGs, actually).

There is sometimes slight confusion when we talk about the MWC being written to disk. In order to perform a quick resync after a disk failure, we need to ensure that the MWC is written to disk in case the system is rebooted. When the MWC is written to disk, it is referred to as the Mirror Consistency Record (MCR). However, if we choose not to use the MWC but still want to offer some form of recovery, the no-MWC option is known as Mirror Consistency Recovery (MCR). Was that a bad choice of name? Possibly. The way to remember it is simple:
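As a back-of-the-envelope illustration of what the MWC has to track, the number of 256KB LTGs grows with volume size. The 256KB LTG size comes from the text above; the 1000MB volume size is just a hypothetical example:

```shell
# How many 256KB Logical Track Groups (LTGs) make up a mirrored volume?
# The 1000MB volume size is a hypothetical example.
ltg_kb=256
vol_mb=1000
echo $(( vol_mb * 1024 / ltg_kb ))
```

A 1000MB volume is made up of 4000 LTGs, each of which the MWC may have to mark stale or in-sync after a failure.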
MWC: Fast resync. The disk version of the MWC is simply that: a disk-based MWC. I don't refer to it as a Mirror Consistency Recovery because that gets confusing when you mention:
MCR: Slow resync. No disk-based MWC, so we need to resync the entire volume.
No MCR: No resync at all. The data is never resynchronized; it is probably totally transient data that will be rebuilt after a reboot. We use this simply to avoid an interruption to service due to a disk failure, e.g., swap space or a scratch area for an RDBMS.

6.2.1 PVG-strict

PVG-strict is a strictness policy relating to how the mirroring of logical volumes is performed. I have seen PVG-strict used in a variety of crazy ways. The idea behind PVG-strict is to allow you to put disks in a Volume Group into what is in effect a subgroup. These subgroups are intended to house disks connected to the same interface. By using the PVG-strict allocation policy, you force the additional Physical Extents to come from a disk in a different Physical Volume Group (PVG), which must mean a disk connected to a different interface card. This is good because we are not sending mirror IO down the same interface as the IO for the original Physical Extents (see Figure 6-1).

Figure 6-1. Physical Volume Groups.

The correct setup of PVGs is entirely the responsibility of the administrator. You know what you are doing, don't you?! The other consideration is that LVM allows you to explicitly specify which disk you want to mirror onto. I have known novice administrators to complain that a volume in PVG0 on disk 0, for example, was being mirrored to PVG1 (good) but on disk 1. Someone pointed out that "it didn't really matter which disk we mirrored to, as long as it is in a different PVG." The (common) response is that "it makes my diagrams look messy." Yes, but that's not the point of PVGs. If you want "nice diagrams," then explicitly specify at the command line on which disk you want your mirror volume to reside. In my experience, I prefer to explicitly specify on which disk my data and mirrored data are housed.
Consequently, I don't use PVGs except in specific situations (distributed volumes): I simply create a logical volume on a specified physical volume and then set up the mirror(s) on specified physical volumes. It's relatively simple and makes your diagrams easy to understand. I am not going to go through an example of this simplistic case, as I believe it is trivial. I will go through an example of Physical Volume Groups because I have been privy to some horrendous configurations and want to ensure that you don't do the same. I set up a single mirrored logical volume in the /dev/vgora1 group as depicted in Figure 6-1 using PVGs. I set up the logical volume in a good configuration as well as an oh-not-that configuration. Here's an example of how to do it correctly (thinking about high availability and performance):

root@hpeos003[] mkdir /dev/vgora1
root@hpeos003[] mknod /dev/vgora1/group c 64 0x010000
root@hpeos003[] pvcreate /dev/rdsk/c0t1d0
Physical volume "/dev/rdsk/c0t1d0" has been successfully created.
root@hpeos003[] pvcreate /dev/rdsk/c0t2d0
Physical volume "/dev/rdsk/c0t2d0" has been successfully created.
root@hpeos003[] pvcreate /dev/rdsk/c0t3d0
Physical volume "/dev/rdsk/c0t3d0" has been successfully created.
root@hpeos003[] pvcreate /dev/rdsk/c4t9d0
Physical volume "/dev/rdsk/c4t9d0" has been successfully created.
root@hpeos003[] pvcreate /dev/rdsk/c4t10d0
Physical volume "/dev/rdsk/c4t10d0" has been successfully created.
root@hpeos003[] pvcreate /dev/rdsk/c4t11d0
Physical volume "/dev/rdsk/c4t11d0" has been successfully created.
root@hpeos003[]
root@hpeos003[] vgcreate /dev/vgora1 /dev/dsk/c0t[123]d0 /dev/dsk/c4t9d0 /dev/dsk/c4t10d0 /dev/dsk/c4t11d0
Increased the number of physical extents per physical volume to 17501.
Volume group "/dev/vgora1" has been successfully created.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]

That's the volume group taken care of.
Now I can create the /etc/lvmpvg file by hand. This is where I need to be careful:

root@hpeos003[] vi /etc/lvmpvg
VG /dev/vgora1
PVG PVG0
/dev/dsk/c0t1d0
/dev/dsk/c0t2d0
/dev/dsk/c0t3d0
PVG PVG1
/dev/dsk/c4t9d0
/dev/dsk/c4t10d0
/dev/dsk/c4t11d0
root@hpeos003[]
root@hpeos003[] vgdisplay -v /dev/vgora1
--- Volume groups ---
VG Name                     /dev/vgora1
VG Write Access             read/write
VG Status                   available
Max LV                      255
Cur LV                      0
Open LV                     0
Max PV                      16
Cur PV                      6
Act PV                      6
Max PE per PV               17501
VGDA                        12
PE Size (Mbytes)            4
Total PE                    104994
Alloc PE                    0
Free PE                     104994
Total PVG                   2
Total Spare PVs             0
Total Spare PVs in use      0

--- Physical volumes ---
PV Name          /dev/dsk/c0t1d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c0t2d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c0t3d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c4t9d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c4t10d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c4t11d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

--- Physical volume groups ---
PVG Name         PVG0
PV Name          /dev/dsk/c0t1d0
PV Name          /dev/dsk/c0t2d0
PV Name          /dev/dsk/c0t3d0

PVG Name         PVG1
PV Name          /dev/dsk/c4t9d0
PV Name          /dev/dsk/c4t10d0
PV Name          /dev/dsk/c4t11d0
root@hpeos003[]

This looks fine; each PVG is made up of disks on separate interface cards. Let's create a PVG-strict logical volume of 1000MB on c0t1d0:

root@hpeos003[] lvcreate -s g -n db /dev/vgora1
Logical volume "/dev/vgora1/db" has been successfully created with character device "/dev/vgora1/rdb".
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[] lvextend -l 1 /dev/vgora1/db /dev/dsk/c0t1d0
Logical volume "/dev/vgora1/db" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]

I know I haven't created the full 1000MB yet. The reason is that I create the volume with one extent, set up the mirroring, and then extend the volume to its correct size. In this way, the initial mirroring has to mirror only one extent.

root@hpeos003[] lvextend -m 1 /dev/vgora1/db
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vgora1/db" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] lvdisplay -v /dev/vgora1/db
--- Logical volumes ---
LV Name                /dev/vgora1/db
VG Name                /dev/vgora1
LV Permission          read/write
LV Status              available/syncd
Mirror copies          1
Consistency Recovery   MWC
Schedule               parallel
LV Size (Mbytes)       4
Current LE             1
Allocated PE           2
Stripes                0
Stripe Size (Kbytes)   0
Bad block              on
Allocation             PVG-strict
IO Timeout (Seconds)   default

--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c0t1d0    1         1
/dev/dsk/c4t9d0    1         1

--- Logical extents ---
LE     PV1              PE1    Status 1  PV2              PE2    Status 2
00000  /dev/dsk/c0t1d0  00000  current   /dev/dsk/c4t9d0  00000  current
root@hpeos003[]

Subsequent allocation will be on both halves of the mirror simultaneously. There is an outside possibility that LVM could give me additional extents on a different disk (if the volume group had some previously allocated extents), but because there are no other volumes in this volume group yet, I am confident that my volume will grow on c0t1d0 and c4t9d0. This is due to the existence of PVGs and the use of PVG-strict allocation. If I simply used strict allocation, I would need to be very careful how I created and extended the volume.
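The pitfall with plain strict allocation can be sketched with a toy model (an illustration of the behavior, not LVM's actual allocator): strict only requires that a mirror extent live on a different physical volume from the primary, so the allocator is free to pick the first PV with free extents, whatever interface it happens to sit on:

```shell
# Toy model of strict allocation (not LVM's real allocator): the mirror
# extent may come from the first PV in the VG other than the primary's PV,
# regardless of which interface card that PV is connected to.
pvs="c0t1d0 c0t2d0 c0t3d0 c4t9d0 c4t10d0 c4t11d0"
primary=c0t1d0
for pv in $pvs; do
  if [ "$pv" != "$primary" ]; then
    echo "mirror extent -> $pv"
    break
  fi
done
```

In this toy model the mirror extent lands on c0t2d0, a disk on the same interface card as c0t1d0, which is exactly the kind of allocation the strict example that follows demonstrates.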
Here is an example where I use simple strict allocation:

root@hpeos003[] lvcreate -l 1 -n strict /dev/vgora1
Logical volume "/dev/vgora1/strict" has been successfully created with character device "/dev/vgora1/rstrict".
Logical volume "/dev/vgora1/strict" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[] lvextend -m 1 /dev/vgora1/strict /dev/dsk/c4t9d0
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vgora1/strict" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] lvdisplay -v /dev/vgora1/strict
--- Logical volumes ---
LV Name                /dev/vgora1/strict
VG Name                /dev/vgora1
LV Permission          read/write
LV Status              available/syncd
Mirror copies          1
Consistency Recovery   MWC
Schedule               parallel
LV Size (Mbytes)       4
Current LE             1
Allocated PE           2
Stripes                0
Stripe Size (Kbytes)   0
Bad block              on
Allocation             strict
IO Timeout (Seconds)   default

--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c0t1d0    1         1
/dev/dsk/c4t9d0    1         1

--- Logical extents ---
LE     PV1              PE1    Status 1  PV2              PE2    Status 2
00000  /dev/dsk/c0t1d0  00250  current   /dev/dsk/c4t9d0  00250  current
root@hpeos003[]

Now if I simply extend the volume to 1000MB, it is interesting where the additional extents are obtained:

root@hpeos003[] lvextend -L 1000 /dev/vgora1/strict
Logical volume "/dev/vgora1/strict" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] lvdisplay -v /dev/vgora1/strict | more
--- Logical volumes ---
LV Name                /dev/vgora1/strict
VG Name                /dev/vgora1
LV Permission          read/write
LV Status              available/syncd
Mirror copies          1
Consistency Recovery   MWC
Schedule               parallel
LV Size (Mbytes)       1000
Current LE             250
Allocated PE           500
Stripes                0
Stripe Size (Kbytes)   0
Bad block              on
Allocation             strict
IO Timeout (Seconds)   default

--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c0t1d0    250       250
/dev/dsk/c0t2d0    249       249
/dev/dsk/c4t9d0    1         1

--- Logical extents ---
LE     PV1              PE1    Status 1  PV2              PE2    Status 2
00000  /dev/dsk/c0t1d0  00250  current   /dev/dsk/c4t9d0  00250  current
00001  /dev/dsk/c0t1d0  00251  current   /dev/dsk/c0t2d0  00000  current
00002  /dev/dsk/c0t1d0  00252  current   /dev/dsk/c0t2d0  00001  current
00003  /dev/dsk/c0t1d0  00253  current   /dev/dsk/c0t2d0  00002  current
...
00246  /dev/dsk/c0t1d0  00496  current   /dev/dsk/c0t2d0  00245  current
00247  /dev/dsk/c0t1d0  00497  current   /dev/dsk/c0t2d0  00246  current
00248  /dev/dsk/c0t1d0  00498  current   /dev/dsk/c0t2d0  00247  current
00249  /dev/dsk/c0t1d0  00499  current   /dev/dsk/c0t2d0  00248  current
root@hpeos003[]

You can see that the additional mirror extents are obtained from the first available physical volume in the volume group. In such a situation, you need to be precise about how you extend volumes. I will rectify this situation:

root@hpeos003[] lvreduce -l 1 /dev/vgora1/strict
When a logical volume is reduced useful data might get lost; do you really want the command to proceed (y/n) : y
Logical volume "/dev/vgora1/strict" has been successfully reduced.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[] lvextend -L 1000 /dev/vgora1/strict /dev/dsk/c0t1d0 /dev/dsk/c4t9d0
Logical volume "/dev/vgora1/strict" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] lvdisplay -v /dev/vgora1/strict | more
--- Logical volumes ---
LV Name                /dev/vgora1/strict
VG Name                /dev/vgora1
LV Permission          read/write
LV Status              available/syncd
Mirror copies          1
Consistency Recovery   MWC
Schedule               parallel
LV Size (Mbytes)       1000
Current LE             250
Allocated PE           500
Stripes                0
Stripe Size (Kbytes)   0
Bad block              on
Allocation             strict
IO Timeout (Seconds)   default

--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c0t1d0    250       250
/dev/dsk/c4t9d0    250       250

--- Logical extents ---
LE     PV1              PE1    Status 1  PV2              PE2    Status 2
00000  /dev/dsk/c0t1d0  00250  current   /dev/dsk/c4t9d0  00250  current
00001  /dev/dsk/c0t1d0  00251  current   /dev/dsk/c4t9d0  00251  current
00002  /dev/dsk/c0t1d0  00252  current   /dev/dsk/c4t9d0  00252  current
...
00247  /dev/dsk/c0t1d0  00497  current   /dev/dsk/c4t9d0  00497  current
00248  /dev/dsk/c0t1d0  00498  current   /dev/dsk/c4t9d0  00498  current
00249  /dev/dsk/c0t1d0  00499  current   /dev/dsk/c4t9d0  00499  current
root@hpeos003[]

We can see that this is now configured as we would expect. It is worth noting this behavior of LVM because it can lead to a less-than-optimal solution unless you are exceptionally careful. The other issue I wanted to point out is the way we create mirrored volumes. If you create the 1000MB volume first and then mirror it, LVM has lots of extents to mirror even though there isn't any data in there yet. In my examples above, I create a one-extent volume and then mirror only that one extent. When I come to extend this volume, LVM already knows it must assign additional extents on both sides of the mirror. This can save lots of setup time, especially if you have a large number of volumes to create. Obviously, this is not possible for existing volumes.
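The payoff of the one-extent-first approach can be put in numbers. The 4MB PE size comes from the vgdisplay output earlier; the comparison is simply how much data the initial mirror synchronization has to copy in each case:

```shell
# Initial mirror sync cost: mirror one extent first, vs. mirroring a
# fully sized 1000MB volume (PE Size of 4MB is from the vgdisplay output).
pe_mb=4
full_le=250   # a 1000MB volume at 4MB per extent = 250 logical extents
echo "one-extent-first syncs $(( 1 * pe_mb ))MB; mirror-after-sizing syncs $(( full_le * pe_mb ))MB"
```

Mirroring first and extending afterward synchronizes 4MB instead of 1000MB per volume, which is where the setup-time saving comes from.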
Let's extend the db volume to its correct size and view the results:

root@hpeos003[] lvextend -L 1000 /dev/vgora1/db
Logical volume "/dev/vgora1/db" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[] lvdisplay -v /dev/vgora1/db | more
--- Logical volumes ---
LV Name                /dev/vgora1/db
VG Name                /dev/vgora1
LV Permission          read/write
LV Status              available/syncd
Mirror copies          1
Consistency Recovery   MWC
Schedule               parallel
LV Size (Mbytes)       1000
Current LE             250
Allocated PE           500
Stripes                0
Stripe Size (Kbytes)   0
Bad block              on
Allocation             PVG-strict
IO Timeout (Seconds)   default

--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c0t1d0    250       250
/dev/dsk/c4t9d0    250       250

--- Logical extents ---
LE     PV1              PE1    Status 1  PV2              PE2    Status 2
00000  /dev/dsk/c0t1d0  00000  current   /dev/dsk/c4t9d0  00000  current
00001  /dev/dsk/c0t1d0  00001  current   /dev/dsk/c4t9d0  00001  current
00002  /dev/dsk/c0t1d0  00002  current   /dev/dsk/c4t9d0  00002  current
00003  /dev/dsk/c0t1d0  00003  current   /dev/dsk/c4t9d0  00003  current
...
00248  /dev/dsk/c0t1d0  00248  current   /dev/dsk/c4t9d0  00248  current
00249  /dev/dsk/c0t1d0  00249  current   /dev/dsk/c4t9d0  00249  current
root@hpeos003[]

This looks okay because my mirror is situated on a disk in the other PVG, which is made up of disks on another interface. Now, let's delete this volume and rework our /etc/lvmpvg file to look like the wrong configuration in Figure 6-1:

root@hpeos003[] lvremove /dev/vgora1/db
The logical volume "/dev/vgora1/db" is not empty; do you really want to delete the logical volume (y/n) : y
Logical volume "/dev/vgora1/db" has been successfully removed.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] vi /etc/lvmpvg
VG /dev/vgora1
PVG PVG0
/dev/dsk/c0t1d0
/dev/dsk/c4t9d0
PVG PVG1
/dev/dsk/c0t2d0
/dev/dsk/c4t10d0
PVG PVG2
/dev/dsk/c0t3d0
/dev/dsk/c4t11d0
root@hpeos003[]
root@hpeos003[] vgdisplay -v /dev/vgora1
--- Volume groups ---
VG Name                     /dev/vgora1
VG Write Access             read/write
VG Status                   available
Max LV                      255
Cur LV                      0
Open LV                     0
Max PV                      16
Cur PV                      6
Act PV                      6
Max PE per PV               17501
VGDA                        12
PE Size (Mbytes)            4
Total PE                    104994
Alloc PE                    0
Free PE                     104994
Total PVG                   3
Total Spare PVs             0
Total Spare PVs in use      0

--- Physical volumes ---
PV Name          /dev/dsk/c0t1d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c0t2d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c0t3d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c4t9d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c4t10d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

PV Name          /dev/dsk/c4t11d0
PV Status        available
Total PE         17499
Free PE          17499
Autoswitch       On

--- Physical volume groups ---
PVG Name         PVG0
PV Name          /dev/dsk/c0t1d0
PV Name          /dev/dsk/c4t9d0

PVG Name         PVG1
PV Name          /dev/dsk/c0t2d0
PV Name          /dev/dsk/c4t10d0

PVG Name         PVG2
PV Name          /dev/dsk/c0t3d0
PV Name          /dev/dsk/c4t11d0
root@hpeos003[]

Now let's recreate our volume again:

root@hpeos003[] lvcreate -s g -n db /dev/vgora1
Logical volume "/dev/vgora1/db" has been successfully created with character device "/dev/vgora1/rdb".
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[] lvextend -l 1 /dev/vgora1/db /dev/dsk/c0t1d0
Logical volume "/dev/vgora1/db" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[] lvextend -m 1 /dev/vgora1/db
The newly allocated mirrors are now being synchronized.
This operation will take some time. Please wait ....
Logical volume "/dev/vgora1/db" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] lvextend -L 1000 /dev/vgora1/db
Logical volume "/dev/vgora1/db" has been successfully extended.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] lvdisplay -v /dev/vgora1/db | more
--- Logical volumes ---
LV Name                /dev/vgora1/db
VG Name                /dev/vgora1
LV Permission          read/write
LV Status              available/syncd
Mirror copies          1
Consistency Recovery   MWC
Schedule               parallel
LV Size (Mbytes)       1000
Current LE             250
Allocated PE           500
Stripes                0
Stripe Size (Kbytes)   0
Bad block              on
Allocation             PVG-strict
IO Timeout (Seconds)   default

--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c0t1d0    250       250
/dev/dsk/c0t2d0    250       250

--- Logical extents ---
LE     PV1              PE1    Status 1  PV2              PE2    Status 2
00000  /dev/dsk/c0t1d0  00000  current   /dev/dsk/c0t2d0  00000  current
00001  /dev/dsk/c0t1d0  00001  current   /dev/dsk/c0t2d0  00001  current
00002  /dev/dsk/c0t1d0  00002  current   /dev/dsk/c0t2d0  00002  current
...
00247  /dev/dsk/c0t1d0  00247  current   /dev/dsk/c0t2d0  00247  current
00248  /dev/dsk/c0t1d0  00248  current   /dev/dsk/c0t2d0  00248  current
00249  /dev/dsk/c0t1d0  00249  current   /dev/dsk/c0t2d0  00249  current
root@hpeos003[]

As you can see, my mirror adheres to PVG-strict, but the disk the mirror is situated on is connected to the same interface as my original disk, which is not a good configuration! This shows that you need to fully understand device files and how they relate to the physical connections to your disks, especially when a technology such as a Fibre Channel SAN is involved, where decoding device files can be somewhat interesting.

6.2.2 Mirroring vg00

The first thing we need to remember is the physical layout of an LVM boot disk (Figure 6-2).

Figure 6-2. Layout of a bootable LVM disk.
With a non-bootable disk, the BDRA is missing, and the LIF header and LIF data would have to fit into the 8KB of space normally reserved just for the LIF header; consequently, it is of little use. When we set up a mirror of the logical volumes in vg00, we must ensure that the boot, root, and primary swap volumes are created in exactly the same order as they are on the original disk. The PDC/IODC makes certain assumptions regarding the order of volumes on the root/boot disk, especially in maintenance mode. Most administrators will duplicate the order of all volumes from the original disk on the disk(s) being used for the mirror(s). It is important to remember to make the other disk(s) bootable:

root@hpeos003[] pvcreate -B /dev/rdsk/c3t15d0
Physical volume "/dev/rdsk/c3t15d0" has been successfully created.
root@hpeos003[] vgextend /dev/vg00 /dev/dsk/c3t15d0
Volume group "/dev/vg00" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
root@hpeos003[]
root@hpeos003[] lifls /dev/rdsk/c3t15d0
lifls: Can't list /dev/rdsk/c3t15d0; not a LIF volume
root@hpeos003[] lifls -l /usr/lib/uxbootlf
volume ISL10 data size 19521 directory size 2
filename   type    start  size  implement  created
===============================================================
ISL        -12800  16     306   0          00/11/08 20:49:59
AUTO       -12289  328    1     0          00/11/08 20:49:59
HPUX       -12928  336    848   0          00/11/08 20:50:00
PAD        -12290  1184   1580  0          00/11/08 20:50:00
root@hpeos003[]

If you have installed the OnlineDiag product, your original root/boot disk will have a plethora of offline diagnostics tools that you may want included on the new root/boot disk mirror.

root@hpeos003[] lifls /dev/rdsk/c1t15d0
ODE        MAPFILE    SYSLIB     CONFIGDATA SLMOD2     SLDEV2
SLDRV2     SLSCSI2    MAPPER2    IOTEST2    PERFVER2   PVCU
SSINFO     ISL        AUTO       HPUX       LABEL
root@hpeos003[]

The default file for the mkboot command (/usr/lib/uxbootlf) does not contain the offline diagnostics tools.
root@hpeos003[] lifls /usr/lib/uxbootlf
ISL        AUTO       HPUX       PAD
root@hpeos003[]

One way of ensuring that the diagnostics tools are included on the new disk is to use your original disk as the source for the mkboot command:

root@hpeos003[] mkboot -b /dev/rdsk/c1t15d0 /dev/rdsk/c3t15d0
root@hpeos003[] lifls -l /dev/rdsk/c3t15d0
volume ISL10 data size 7984 directory size 8
filename   type    start  size  implement  created
===============================================================
ODE        -12960  584    848   0          02/07/08 12:33:46
MAPFILE    -12277  1432   128   0          02/07/08 12:33:46
SYSLIB     -12280  1560   353   0          02/07/08 12:33:46
CONFIGDATA -12278  1920   235   0          02/07/08 12:33:46
SLMOD2     -12276  2160   141   0          02/07/08 12:33:46
SLDEV2     -12276  2304   135   0          02/07/08 12:33:46
SLDRV2     -12276  2440   205   0          02/07/08 12:33:46
SLSCSI2    -12276  2648   131   0          02/07/08 12:33:46
MAPPER2    -12279  2784   142   0          02/07/08 12:33:46
IOTEST2    -12279  2928   411   0          02/07/08 12:33:46
PERFVER2   -12279  3344   124   0          02/07/08 12:33:46
PVCU       -12801  3472   64    0          02/07/08 12:33:46
SSINFO     -12286  3536   2     0          02/07/08 12:33:46
ISL        -12800  3544   306   0          00/11/08 20:49:59
AUTO       -12289  3856   1     0          00/11/08 20:49:59
HPUX       -12928  3864   848   0          00/11/08 20:50:00
LABEL      BIN     4712   8     0          03/10/01 15:59:56
root@hpeos003[]
root@hpeos003[] lifls -l /dev/rdsk/c1t15d0
volume ISL10 data size 7984 directory size 8
filename   type    start  size  implement  created
===============================================================
ODE        -12960  584    848   0          02/07/08 12:33:46
MAPFILE    -12277  1432   128   0          02/07/08 12:33:46
SYSLIB     -12280  1560   353   0          02/07/08 12:33:46
CONFIGDATA -12278  1920   235   0          02/07/08 12:33:46
SLMOD2     -12276  2160   141   0          02/07/08 12:33:46
SLDEV2     -12276  2304   135   0          02/07/08 12:33:46
SLDRV2     -12276  2440   205   0          02/07/08 12:33:46
SLSCSI2    -12276  2648   131   0          02/07/08 12:33:46
MAPPER2    -12279  2784   142   0          02/07/08 12:33:46
IOTEST2    -12279  2928   411   0          02/07/08 12:33:46
PERFVER2   -12279  3344   124   0          02/07/08 12:33:46
PVCU       -12801  3472   64    0          02/07/08 12:33:46
SSINFO     -12286  3536   2     0          02/07/08 12:33:46
ISL        -12800  3544   306   0          00/11/08 20:49:59
AUTO       -12289  3856   1     0          00/11/08 20:49:59
HPUX       -12928  3864   848   0          00/11/08 20:50:00
LABEL      BIN     4712   8     0          03/10/01 15:59:56
root@hpeos003[]

We just need to ensure that the LABEL and AUTO files are updated whenever we make changes to the root/boot configuration. Now we can begin mirroring the volumes in the correct order. The best way to know the correct order is to look at the minor numbers of the existing logical volumes:

root@hpeos003[] ll /dev/vg00/[!rg]*
brw-r-----   1 root   sys   64 0x000001 Oct 29 08:34 /dev/vg00/lvol1
brw-r-----   1 root   sys   64 0x000002 Oct  1 21:21 /dev/vg00/lvol2
brw-r-----   1 root   sys   64 0x000003 Oct  1 21:21 /dev/vg00/lvol3
brw-r-----   1 root   sys   64 0x000004 Oct  1 21:21 /dev/vg00/lvol4
brw-r-----   1 root   sys   64 0x000005 Oct  1 21:21 /dev/vg00/lvol5
brw-r-----   1 root   sys   64 0x000006 Oct  1 21:21 /dev/vg00/lvol6
brw-r-----   1 root   sys   64 0x000007 Oct  1 21:21 /dev/vg00/lvol7
brw-r-----   1 root   sys   64 0x000008 Oct  1 21:21 /dev/vg00/lvol8
brw-r-----   1 root   sys   64 0x000009 Oct 24 13:24 /dev/vg00/lvol9
root@hpeos003[]

Because I have only two disks in vg00, I don't need to specify the target Physical Volume on the lvextend command line. The mirroring will take a few minutes to complete. Time for coffee/tea/beer:

root@hpeos003[] for x in 1 2 3 4 5 6 7 8 9
> do
> lvextend -m 1 /dev/vg00/lvol${x}
> done
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol1" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol2" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol3" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol4" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol5" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol6" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol7" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol8" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
Logical volume "/dev/vg00/lvol9" has been successfully extended.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
root@hpeos003[]

Now I can ensure that we override quorum requirements in the event that we lose a disk and a reboot occurs (ensure that you update the AUTO file on both disks):

root@hpeos003[] mkboot -a "hpux -lq" /dev/rdsk/c1t15d0
root@hpeos003[] mkboot -a "hpux -lq" /dev/rdsk/c3t15d0
root@hpeos003[] lifcp /dev/rdsk/c1t15d0:AUTO -
hpux -lq
root@hpeos003[] lifcp /dev/rdsk/c3t15d0:AUTO -
hpux -lq
root@hpeos003[]

We should ensure that the BDRA (and the LABEL file) is updated (it should already have been done) with the complete list of volumes in vg00:

root@hpeos003[] lvlnboot -vR /dev/vg00
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
        /dev/dsk/c1t15d0 (0/0/1/1.15.0) -- Boot Disk
        /dev/dsk/c3t15d0 (0/0/2/1.15.0) -- Boot Disk
Boot: lvol1     on:     /dev/dsk/c1t15d0
                        /dev/dsk/c3t15d0
Root: lvol3     on:     /dev/dsk/c1t15d0
                        /dev/dsk/c3t15d0
Swap: lvol2     on:     /dev/dsk/c1t15d0
                        /dev/dsk/c3t15d0
Dump: lvol2     on:     /dev/dsk/c1t15d0, 0
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
root@hpeos003[]

Now we can set our alternate boot path to be the hardware path of the mirror disk:

root@hpeos003[] lssf /dev/dsk/c3t15d0
sdisk card instance 3 SCSI target 15 SCSI LUN 0 section 0 at address 0/0/2/1.15.0 /dev/dsk/c3t15d0
root@hpeos003[] setboot -a 0/0/2/1.15.0
root@hpeos003[] setboot
Primary bootpath : 0/0/1/1.15.0
Alternate bootpath : 0/0/2/1.15.0
Autoboot is ON (enabled)
Autosearch is ON (enabled)
root@hpeos003[]

You can see that Autoboot and Autosearch are both ON. This looks good. On newer, partitioned servers, you have access to an additional boot device known as the High Availability Alternate (HAA). This was designed as the second boot device. In such a situation, the HAA would be set (using parmodify) to be the address of the disk containing the mirrored volumes.
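The minor numbers in the ll /dev/vg00 listing above can be decoded with a little arithmetic. As the earlier mknod /dev/vgora1/group c 64 0x010000 line suggests, the high byte of the minor number is the volume group number and the low 16 bits are the volume number; the following is a sketch of that interpretation:

```shell
# Decode an LVM device minor number: high byte = volume group number,
# low 16 bits = logical volume number (cf. mknod ... c 64 0x010000 for
# the vgora1 group file: VG 1, LV 0).
minor=0x010000
vg=$(( (minor >> 16) & 0xff ))
lv=$(( minor & 0xffff ))
echo "vg=$vg lv=$lv"
```

Applied to the vg00 listing, 0x000003 decodes to VG 0, lvol3, which is why sorting by minor number gives the creation order of the volumes.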
You could set the Alternate bootpath to be a third mirror if you had configured one. At the moment, all appears well with the system, and our mirroring appears to be in place and working. One small issue is the use of the MWC for our primary swap area:

root@hpeos003[] lvdisplay /dev/vg00/lvol2
--- Logical volumes ---
LV Name                /dev/vg00/lvol2
VG Name                /dev/vg00
LV Permission          read/write
LV Status              available/syncd
Mirror copies          1
Consistency Recovery   MWC
Schedule               parallel
LV Size (Mbytes)       2048
Current LE             256
Allocated PE           512
Stripes                0
Stripe Size (Kbytes)   0
Bad block              off
Allocation             strict/contiguous
IO Timeout (Seconds)   default
root@hpeos003[]

Swap space is one of those situations where we don't want LVM to worry about tracking and recovering changes in the volume. After a reboot, the data in a swap area is gone; it is completely transient. The only problem with changing the Consistency Recovery setting for a volume is that the volume cannot be open while we change the configuration. The only way this can be achieved for Primary Swap is to boot the system in LVM Maintenance Mode and activate vg00 without starting the resynchronization process. It is a lot of work to go through, but it will mean a reduction in IO to the root/boot disk, and, on reboot, it will reduce the time the system takes to resynchronize all volumes in vg00 after a failure.

Processor is booting from first available device.

To discontinue, press any key within 10 seconds.

Boot terminated.
---- Main Menu --------------------------------------------------------------

Command                        Description
-------                        -----------
BOot [PRI|ALT|<path>]          Boot from specified path
PAth [PRI|ALT] [<path>]        Display or modify a path
SEArch [DIsplay|IPL] [<path>]  Search for boot devices
COnfiguration menu             Displays or sets boot values
INformation menu               Displays hardware information
SERvice menu                   Displays service commands
DIsplay                        Redisplay the current menu
HElp [<menu>|<command>]        Display help for menu or command
RESET                          Restart the system
----
Main Menu: Enter command or menu > bo pri
Interact with IPL (Y, N, or Cancel)?> y
Booting...
Boot IO Dependent Code (IODC) revision 1

HARD Booted.

ISL Revision A.00.43  Apr 12, 2000

ISL> hpux -lm

Boot
: disk(0/0/1/1.15.0.0.0.0.0;0)/stand/vmunix
10485760 + 1781760 + 1515760 start 0x1f8fe8

alloc_pdc_pages: Relocating PDC from 0xf0f0000000 to 0x3fb01000.
gate64: sysvec_vaddr = 0xc0002000 for 2 pages
NOTICE: nfs3_link(): File system was registered at index 4.
NOTICE: autofs_link(): File system was registered at index 5.
NOTICE: cachefs_link(): File system was registered at index 6.
td: claimed Tachyon XL2 Fibre Channel Mass Storage card at 0/4/0/0
td: claimed Tachyon XL2 Fibre Channel Mass Storage card at 0/6/2/0
asio0_init: unexpected SAS subsystem ID (1283)
System Console is on the Built-In Serial Interface
asio0_init: unexpected SAS subsystem ID (1283)
Logical volume 64, 0x3 configured as ROOT
Logical volume 64, 0x2 configured as SWAP
Logical volume 64, 0x2 configured as DUMP
the kernel tunable maxswapchunks of size 1048576 is too big. please decrease max swapchunks to size 16384 or less and re-configure your system
Swap device table: (start & size given in 512-byte blocks)
entry 0 - major is 31, minor is 0x1f003; ...
/sbin/ioinitrc:
fsck: /dev/vg00/lvol1: possible swap device (cannot determine)
fsck SUSPENDED BY USER.
/dev/vg00/lvol1: No such device or address
Unable to mount /stand - please check entries in /etc/fstab
Skipping KRS database initialization - /stand can't be mounted
INIT: Overriding default level with level 's'
INIT: SINGLE USER MODE
INIT: Running /sbin/sh
#
# vgchange -a y -s /dev/vg00
Activated volume group
Volume group "/dev/vg00" has been successfully changed.
# lvchange -M n -c n /dev/vg00/lvol2
Logical volume "/dev/vg00/lvol2" has been successfully changed.
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
#
# lvdisplay /dev/vg00/lvol2
--- Logical volumes ---
LV Name                     /dev/vg00/lvol2
VG Name                     /dev/vg00
LV Permission               read/write
LV Status                   available/syncd
Mirror copies               1
Consistency Recovery        NONE
Schedule                    parallel
LV Size (Mbytes)            2048
Current LE                  256
Allocated PE                512
Stripes                     0
Stripe Size (Kbytes)        0
Bad block                   off
Allocation                  strict/contiguous
IO Timeout (Seconds)        default
#
# reboot
Shutdown at 10:56 (in 0 minutes)
System shutdown time has arrived

One of the last parts of the configuration we need to test is the ability of the system to sustain a real disk failure. If we have external disks, we can simply turn off one of them. If not, we may have to be a bit crueler: one test I was involved in was to destroy the data on the disk acting as our current Primary bootpath (using the dd command to overwrite the beginning of the disk) and then reboot the system. We go through two different tests:
- Lose a disk online, but have it replaced while the system is still running.
- Lose a disk, and sustain a reboot before the disk can be replaced.

These tests will destroy the data on the current disks. Some administrators are reluctant to perform such tests. If you are one of those reluctant administrators, ask yourself this question: How do you know your recovery procedures really work? Here goes.

6.2.3 Lose a disk online, but have it replaced while the system is still running

In this scenario, a disk fails while the system is up and running. Because we have mirroring in place, we should see no interruption to service. In our case, it will be the disk from which the system booted. The steps to initiate the recovery are similar for a root/boot volume group and a data volume group, the main difference being the reinstatement of the LIF/boot data. I am using vg00 because it is a more dramatic test and I want to be sure that my system can sustain the loss of a root/boot disk. Although both disks are viewed as the same, I am going to lose the disk I booted from. I can establish which disk I booted from by looking at the kernel-maintained variable boot_string:

root@hpeos003[] echo "boot_string/S" | adb /stand/vmunix /dev/kmem
boot_string:
boot_string: disk(0/0/1/1.15.0.0.0.0.0;0)/stand/vmunix
root@hpeos003[]

We can see which disk (and which kernel) we booted from. I am going to physically remove this disk from the system. I am using self-terminating SCSI cables, which allow me to remove SCSI disks from the system without generating noise on the SCSI interface, which could cause a system panic. This should have a minimal impact on the system. The impact should be limited to the length of the PV Timeout (the default is the value returned by the device driver for the specific device, normally 30 or 60 seconds for most disks), during which any outstanding IOs to the primary disk fail and are re-routed to the mirror disk. (Any current commands may pause slightly while the PV Timeout takes effect.
This is just about enough time for a user to look quizzically at her screen, look up the number of the internal Help Desk, and just as she's calling the number, the PV Timeout expires and the command breathes back into life.) We should see a SCSI lbolt error in syslog:

root@hpeos003[] more /var/adm/syslog/syslog.log
...
Oct 29 17:47:23 hpeos003 vmunix: SCSI: Reset detected -- lbolt: 644442, bus: 1
Oct 29 17:47:23 hpeos003 vmunix: lbp->state: 4060
Oct 29 17:47:23 hpeos003 vmunix: lbp->offset: ffffffff
Oct 29 17:47:23 hpeos003 vmunix: lbp->uPhysScript: f8040000
Oct 29 17:47:23 hpeos003 vmunix: From most recent interrupt:
Oct 29 17:47:23 hpeos003 vmunix: ISTAT: 02, SIST0: 02, SIST1: 00, DSTAT: 80, DSPS: f8040028
Oct 29 17:47:23 hpeos003 vmunix: lsp: 0000000000000000
Oct 29 17:47:23 hpeos003 vmunix: lbp->owner: 0000000000000000
Oct 29 17:47:23 hpeos003 vmunix: scratch_lsp: 0000000000000000
Oct 29 17:47:23 hpeos003 vmunix: Pre-DSP script dump [fffffffff80400e0]:
Oct 29 17:47:23 hpeos003 vmunix: e0340004 00000000 e0100004 00000000
Oct 29 17:47:23 hpeos003 vmunix: 48000000 00000000 78350000 00000000
Oct 29 17:47:23 hpeos003 vmunix: Script dump [fffffffff8040100]:
Oct 29 17:47:23 hpeos003 vmunix: 50000000 f8040028 80000000 0000000b
Oct 29 17:47:23 hpeos003 vmunix: 0f000001 f80405c0 60000040 00000000
Oct 29 17:48:02 hpeos003 vmunix: DIAGNOSTIC SYSTEM WARNING:
Oct 29 17:48:02 hpeos003 vmunix: The diagnostic logging facility has started receiving excessive
Oct 29 17:48:02 hpeos003 vmunix: errors from the I/O subsystem. I/O error entries will be lost
Oct 29 17:48:02 hpeos003 vmunix: until the cause of the excessive I/O logging is corrected.
Oct 29 17:48:02 hpeos003 vmunix: If the diaglogd daemon is not active, use the Daemon Startup command
Oct 29 17:48:02 hpeos003 vmunix: in stm to start it.
Oct 29 17:48:02 hpeos003 vmunix: If the diaglogd daemon is active, use the logtool utility in stm
Oct 29 17:48:02 hpeos003 vmunix: to determine which I/O subsystem is logging excessive errors.
Oct 29 17:48:02 hpeos003 vmunix:
Oct 29 17:48:02 hpeos003 vmunix: SCSI: Async write error -- dev: b 31 0x01f000, errno: 126, resid: 2048,
Oct 29 17:48:02 hpeos003 vmunix: blkno: 6230152, sectno: 12460304, offset: 2084708352, bcount: 2048.
Oct 29 17:48:02 hpeos003 vmunix: blkno: 5750354, sectno: 11500708, offset: 1593395200, bcount: 2048.
Oct 29 17:48:02 hpeos003 vmunix: blkno: 5625944, sectno: 11251888, offset: 1465999360, bcount: 2048.
Oct 29 17:48:02 hpeos003 vmunix: blkno: 5045018, sectno: 10090036, offset: 871131136, bcount: 2048.
Oct 29 17:48:02 hpeos003 vmunix: blkno: 4848930, sectno: 9697860, offset: 670337024, bcount: 2048.
Oct 29 17:48:02 hpeos003 vmunix: blkno: 4848962, sectno: 9697924, offset: 670369792, bcount: 2048.
...
Oct 29 17:48:02 hpeos003 vmunix: LVM: vg[0]: pvnum=0 (dev_t=0x1f01f000) is POWERFAILED
Oct 29 17:48:02 hpeos003 vmunix: SCSI: Write error -- dev: b 31 0x01f000, errno: 126, resid: 10240,
Oct 29 17:48:02 hpeos003 vmunix: blkno: 2438, sectno: 4876, offset: 2496512, bcount: 10240.
...
root@hpeos003[]

I have italicized and underlined the points of interest. We can see from the LVM POWERFAILED message that the dev_t (the block device as known by LVM) is 0x1f01f000. This can be decoded as follows:

Hexadecimal    Major Number            Minor Number
0x             1f (= 31 in decimal)    01f000

When we look in /dev/dsk, we see the major number is 31 as expected. When we look for the relevant minor number, we see the disk in question: /dev/dsk/c1t15d0.
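The decoding is easy to script. The following is a minimal sketch of my own (not an HP-UX tool): the top byte of the dev_t is the major number and the low 24 bits are the minor number, and the field layout inside an sdisk minor number (card instance byte, target nibble, LUN nibble) is an inference from the device files listed below (e.g., c4t10d0 has minor 0x04a000), so treat that layout as an assumption:

```shell
#!/bin/sh
# Decode an LVM dev_t such as 0x1f01f000. The top byte is the major
# number and the low 24 bits the minor number; within the minor, the
# layout (instance byte, target nibble, LUN nibble) is inferred from
# the /dev/dsk listing, e.g., c4t10d0 = minor 0x04a000.
dev_t=0x1f01f000
major=$((  dev_t >> 24 ))           # 0x1f = 31, the sdisk block driver
minor=$((  dev_t & 0xffffff ))      # 0x01f000
card=$((   (minor >> 16) & 0xff ))  # controller instance 1
target=$(( (minor >> 12) & 0xf ))   # SCSI target 15
lun=$((    (minor >> 8)  & 0xf ))   # LUN 0
printf "major=%d minor=0x%06x -> c%dt%dd%d\n" "$major" "$minor" "$card" "$target" "$lun"
```

Running it prints major=31 minor=0x01f000 -> c1t15d0, which agrees with the listing of /dev/dsk.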
root@hpeos003[] ll /dev/dsk
total 0
brw-r-----   1 bin   sys   31 0x000000 Oct 25 12:05 c0t0d0
brw-r-----   1 bin   sys   31 0x001000 Oct 25 12:05 c0t1d0
brw-r-----   1 bin   sys   31 0x002000 Oct 25 12:05 c0t2d0
brw-r-----   1 bin   sys   31 0x003000 Oct 29 11:40 c0t3d0
brw-r-----   1 bin   sys   31 0x004000 Oct 29 11:40 c0t4d0
brw-r-----   1 bin   sys   31 0x005000 Oct 29 11:40 c0t5d0
brw-r-----   1 bin   sys   31 0x006000 Oct 29 11:40 c0t6d0
brw-r-----   1 bin   sys   31 0x01f000 Oct  1 17:29 c1t15d0
brw-r-----   1 bin   sys   31 0x03f000 Oct 28 11:30 c3t15d0
brw-r-----   1 bin   sys   31 0x04a000 Oct 29 11:40 c4t10d0
brw-r-----   1 bin   sys   31 0x04b000 Oct 29 11:40 c4t11d0
brw-r-----   1 bin   sys   31 0x04c000 Oct 29 11:40 c4t12d0
brw-r-----   1 bin   sys   31 0x04d000 Oct 29 11:40 c4t13d0
brw-r-----   1 bin   sys   31 0x04e000 Oct 29 11:40 c4t14d0
brw-r-----   1 bin   sys   31 0x048000 Oct 29 11:40 c4t8d0
brw-r-----   1 bin   sys   31 0x049000 Oct 29 11:40 c4t9d0
root@hpeos003[] lssf /dev/dsk/c1t15d0
sdisk card instance 1 SCSI target 15 SCSI LUN 0 section 0 at address 0/0/1/1.15.0 /dev/dsk/c1t15d0
root@hpeos003[]

This is the disk from which we booted. The system is still up and running, but we are starting to accumulate stale extents.
root@hpeos003[] lvdisplay -v /dev/vg00/lvol4
--- Logical volumes ---
LV Name                     /dev/vg00/lvol4
VG Name                     /dev/vg00
LV Permission               read/write
LV Status                   available/stale
Mirror copies               1
Consistency Recovery        MWC
Schedule                    parallel
LV Size (Mbytes)            64
Current LE                  8
Allocated PE                16
Stripes                     0
Stripe Size (Kbytes)        0
Bad block                   on
Allocation                  strict
IO Timeout (Seconds)        default

--- Distribution of logical volume ---
PV Name                 LE on PV   PE on PV
/dev/dsk/c1t15d0        8          8
/dev/dsk/c3t15d0        8          8

--- Logical extents ---
LE      PV1                 PE1    Status 1   PV2                 PE2    Status 2
00000   /dev/dsk/c1t15d0    00433  stale      /dev/dsk/c3t15d0    00433  current
00001   /dev/dsk/c1t15d0    00434  current    /dev/dsk/c3t15d0    00434  current
00002   /dev/dsk/c1t15d0    00435  current    /dev/dsk/c3t15d0    00435  current
00003   /dev/dsk/c1t15d0    00436  current    /dev/dsk/c3t15d0    00436  current
00004   /dev/dsk/c1t15d0    00437  current    /dev/dsk/c3t15d0    00437  current
00005   /dev/dsk/c1t15d0    00438  current    /dev/dsk/c3t15d0    00438  current
00006   /dev/dsk/c1t15d0    00439  current    /dev/dsk/c3t15d0    00439  current
00007   /dev/dsk/c1t15d0    00440  current    /dev/dsk/c3t15d0    00440  current
root@hpeos003[]
root@hpeos003[] lvdisplay -v /dev/vg00/lvol* | grep stale | wc -l
46
root@hpeos003[]

If this disk were simply having an intermittent problem, it could come back online and LVM would recognize it, because it would still contain a full complement of LVM headers. LVM would resynchronize the stale extents with no further input from the administrator. However, if the disk has really failed, it will need replacing. In my case, I need to replace the failed disk with a new one. In this instance, if I try to reactivate the volume group, it will fail to recognize the LVM headers on the disk; they're missing, because it's a new disk!
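As an aside, the stale counts above simply count "stale" flags in the per-extent listing. To illustrate the counting on its own (using a captured two-extent fragment in place of the live lvdisplay output, so it runs anywhere):

```shell
#!/bin/sh
# Count stale mirror copies, as "lvdisplay -v ... | grep stale | wc -l"
# does above. A captured fragment of the per-extent listing stands in
# for the live command output here.
sample='00000 /dev/dsk/c1t15d0 00433 stale   /dev/dsk/c3t15d0 00433 current
00001 /dev/dsk/c1t15d0 00434 current /dev/dsk/c3t15d0 00434 current'
echo "$sample" | grep stale | wc -l
```

This reports one stale extent for the fragment; against the live system, the same pipeline gave us 46.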
Here's the message from LVM found in syslog:

root@hpeos003[] tail /var/adm/syslog/syslog.log
Oct 29 18:14:31 hpeos003 vmunix:
Oct 29 18:14:36 hpeos003 above message repeats 3 times
Oct 29 18:14:31 hpeos003 vmunix: SCSI: Read error -- dev: b 31 0x01f000, errno: 126, resid: 2048,
Oct 29 18:14:36 hpeos003 above message repeats 3 times
Oct 29 18:14:31 hpeos003 vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Oct 29 18:14:37 hpeos003 above message repeats 3 times
Oct 29 18:14:41 hpeos003 vmunix:
Oct 29 18:14:41 hpeos003 vmunix: SCSI: Read error -- dev: b 31 0x01f000, errno: 126, resid: 2048,
Oct 29 18:14:41 hpeos003 vmunix: blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Oct 29 18:14:46 hpeos003 vmunix: LVM: Failed to restore PV 0 to VG 0! Identifier mismatch.
root@hpeos003[]
root@hpeos003[] vgchange -a y /dev/vg00
vgchange: Warning: Couldn't attach to the volume group physical volume "/dev/dsk/c1t15d0":
Cross-device link
Volume group "/dev/vg00" has been successfully changed.
You have mail in /var/mail/root
root@hpeos003[]

I need to restore the LVM headers onto the disk using vgcfgrestore and then try to activate the volume group to allow the kernel to recognize the new disk as being part of this volume group:

root@hpeos003[] vgcfgrestore -l -n /dev/vg00
Volume Group Configuration information in "/etc/lvmconf/vg00.conf"
VG Name /dev/vg00
---- Physical volumes : 2 ----
/dev/rdsk/c1t15d0 (Bootable)
/dev/rdsk/c3t15d0 (Bootable)
root@hpeos003[] vgcfgrestore -n /dev/vg00 /dev/rdsk/c1t15d0
Volume Group configuration has been restored to /dev/rdsk/c1t15d0
root@hpeos003[] vgchange -a y /dev/vg00
Volume group "/dev/vg00" has been successfully changed.
root@hpeos003[]

Because this is a new disk, LVM will have marked all extents as being stale. We will need to resynchronize the entire volume group:

root@hpeos003[] lvdisplay -v /dev/vg00/lvol* | grep stale | wc -l
1114
root@hpeos003[]
root@hpeos003[] vgsync /dev/vg00
Resynchronized logical volume "/dev/vg00/lvol1".
Resynchronized logical volume "/dev/vg00/lvol2".
Resynchronized logical volume "/dev/vg00/lvol3".
Resynchronized logical volume "/dev/vg00/lvol4".
Resynchronized logical volume "/dev/vg00/lvol5".
Resynchronized logical volume "/dev/vg00/lvol6".
Resynchronized logical volume "/dev/vg00/lvol7".
Resynchronized logical volume "/dev/vg00/lvol8".
Resynchronized logical volume "/dev/vg00/lvol9".
Resynchronized volume group "/dev/vg00".
root@hpeos003[]

Because this can take some time to complete, it might have been a good idea to run vgsync in the background. If this were a non-boot disk, we would be finished; however, this is vg00, so our job is far from over. We need to reinstate the boot and LIF data:

root@hpeos003[] lifls -l /dev/rdsk/c1t15d0
lifls: Can't list /dev/rdsk/c1t15d0; not a LIF volume
root@hpeos003[] mkboot -b /dev/rdsk/c3t15d0 /dev/rdsk/c1t15d0
root@hpeos003[] lifls -l /dev/rdsk/c1t15d0
volume ISL10 data size 7984 directory size 8
filename    type    start  size  implement  created
===============================================================
ODE         -12960  584    848   0          02/07/08 12:33:46
MAPFILE     -12277  1432   128   0          02/07/08 12:33:46
SYSLIB      -12280  1560   353   0          02/07/08 12:33:46
CONFIGDATA  -12278  1920   235   0          02/07/08 12:33:46
SLMOD2      -12276  2160   141   0          02/07/08 12:33:46
SLDEV2      -12276  2304   135   0          02/07/08 12:33:46
SLDRV2      -12276  2440   205   0          02/07/08 12:33:46
SLSCSI2     -12276  2648   131   0          02/07/08 12:33:46
MAPPER2     -12279  2784   142   0          02/07/08 12:33:46
IOTEST2     -12279  2928   411   0          02/07/08 12:33:46
PERFVER2    -12279  3344   124   0          02/07/08 12:33:46
PVCU        -12801  3472   64    0          02/07/08 12:33:46
SSINFO      -12286  3536   2     0          02/07/08 12:33:46
ISL         -12800  3544   306   0          00/11/08 20:49:59
AUTO        -12289  3856   1     0          00/11/08 20:49:59
HPUX        -12928  3864   848   0          00/11/08 20:50:00
LABEL       BIN     4712   8     0          03/10/01 15:59:56
root@hpeos003[]
root@hpeos003[] mkboot -a "hpux -lq" /dev/rdsk/c1t15d0
root@hpeos003[] lifcp /dev/rdsk/c1t15d0:AUTO -
hpux -lq
root@hpeos003[]
root@hpeos003[] lvlnboot -vR /dev/vg00
Boot Definitions for Volume Group
/dev/vg00:
Physical Volumes belonging in Root Volume Group:
        /dev/dsk/c1t15d0 (0/0/1/1.15.0) -- Boot Disk
        /dev/dsk/c3t15d0 (0/0/2/1.15.0) -- Boot Disk
Boot: lvol1     on:     /dev/dsk/c1t15d0
                        /dev/dsk/c3t15d0
Root: lvol3     on:     /dev/dsk/c1t15d0
                        /dev/dsk/c3t15d0
Swap: lvol2     on:     /dev/dsk/c1t15d0
                        /dev/dsk/c3t15d0
Dump: lvol2     on:     /dev/dsk/c1t15d0, 0
Volume Group configuration for /dev/vg00 has been saved in /etc/lvmconf/vg00.conf
root@hpeos003[]

This now looks okay.

6.2.4 Lose a disk, and sustain a reboot before the disk can be replaced

There is essentially no difference in this scenario except that if the volume group in question is vg00, you might not realize that your system has rebooted: it could happen at 02:00, and unless you have some form of automated monitoring, you may not realize it has happened. Consequently, you may be running with a system that is lacking a fundamental high-availability feature: your root/boot disk has no mirroring functionality. If the volume group in question is a data volume group, you may well be in a situation where the volume group has not been activated; if you do not have a quorum (more than 50 percent of the disks online), the volume group will not be activated at boot time. Overriding quorum for a non-vg00 volume group requires that you manually modify the startup script /sbin/lvmrc to use the -q n option to vgchange. I would be very careful in such an instance. If you lose three out of six disks, it may be okay to override the quorum, i.e., you lost one half of your mirror configuration, but beyond that I think it is more than a little suspicious if I don't have quorum for a volume group. In our example with vg00 earlier, we only had two disks in vg00, the original and a mirror, so overriding quorum was acceptable. In this simple example, I want to show you the system attempting to boot from a broken Primary bootpath but succeeding because we have stipulated the -lq option to hpux in order to override quorum.
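The quorum rule is a strict majority: activation requires more than half of the volume group's disks to be online. A quick sketch of the arithmetic for our two-disk vg00 with one disk lost shows why the override is needed:

```shell
#!/bin/sh
# Quorum needs STRICTLY more than half the disks online, i.e.,
# online/total > 1/2, or equivalently 2*online > total. With vg00's
# two disks and one lost, 2*1 = 2 is not greater than 2, so quorum
# fails and an override (hpux -lq, or vgchange -q n) is required.
total=2
online=1
if [ $(( online * 2 )) -gt "$total" ]; then
    echo "quorum met"
else
    echo "quorum NOT met - override required"
fi
```

Swap in total=6 and online=3 and the result is the same, which is exactly the three-out-of-six case discussed above.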
We will also see LVM failing to activate a data volume group at boot time, because we do not have quorum.

Processor is booting from first available device.
To discontinue, press any key within 10 seconds.

10 seconds expired.
Proceeding...

Trying Primary Boot Path
------------------------
Booting...
Boot IO Dependent Code (IODC) revision 1

IPL error: bad LIF magic.

.... FAILED.

Trying Alternate Boot Path
--------------------------
Boot IO Dependent Code (IODC) revision 1

.... SUCCEEDED!

HARD Booted.

ISL Revision A.00.43  Apr 12, 2000

ISL booting hpux -lq

Boot : disk(0/0/2/1.15.0.0.0.0.0;0)/stand/vmunix
10485760 + 1781760 + 1515760 start 0x1f8fe8
alloc_pdc_pages: Relocating PDC from 0xf0f0000000 to 0x3fb01000.
gate64: sysvec_vaddr = 0xc0002000 for 2 pages
NOTICE: nfs3_link(): File system was registered at index 4.
NOTICE: autofs_link(): File system was registered at index 5.

root@hpeos003[] vgdisplay /dev/vgora1
vgdisplay: Volume group not activated.
vgdisplay: Cannot display volume group "/dev/vgora1".
root@hpeos003[]
root@hpeos003[] vgchange -a y /dev/vgora1
vgchange: Warning: Couldn't attach to the volume group physical volume "/dev/dsk/c4t9d0":
Cross-device link
vgchange: Warning: Couldn't attach to the volume group physical volume "/dev/dsk/c4t10d0":
Cross-device link
vgchange: Warning: Couldn't attach to the volume group physical volume "/dev/dsk/c4t11d0":
Cross-device link
vgchange: Warning: couldn't query physical volume "/dev/dsk/c4t9d0":
The specified path does not correspond to physical volume attached to this volume group
vgchange: Warning: couldn't query physical volume "/dev/dsk/c4t10d0":
The specified path does not correspond to physical volume attached to this volume group
vgchange: Warning: couldn't query physical volume "/dev/dsk/c4t11d0":
The specified path does not correspond to physical volume attached to this volume group
vgchange: Warning: couldn't query physical volume "/dev/dsk/c4t11d0":
The specified path does not correspond to physical volume attached to this volume group
vgchange: Warning: couldn't query all of the physical volumes.
vgchange: Couldn't activate volume group "/dev/vgora1":
Quorum not present, or some physical volume(s) are missing.
root@hpeos003[]

At this point, I would need to have the faulty disk replaced and instigate a recovery of vg00 and vgora1. The recovery of vg00 follows the same tasks detailed in the previous example. I will document the recovery of vgora1.
root@hpeos003[] vgcfgrestore -l -n /dev/vgora1
Volume Group Configuration information in "/etc/lvmconf/vgora1.conf"
VG Name /dev/vgora1
---- Physical volumes : 6 ----
/dev/rdsk/c0t1d0 (Non-bootable)
/dev/rdsk/c0t2d0 (Non-bootable)
/dev/rdsk/c0t3d0 (Non-bootable)
/dev/rdsk/c4t9d0 (Non-bootable)
/dev/rdsk/c4t10d0 (Non-bootable)
/dev/rdsk/c4t11d0 (Non-bootable)
root@hpeos003[]
root@hpeos003[] vgcfgrestore -n /dev/vgora1 /dev/rdsk/c4t9d0
Volume Group configuration has been restored to /dev/rdsk/c4t9d0
root@hpeos003[] vgcfgrestore -n /dev/vgora1 /dev/rdsk/c4t10d0
Volume Group configuration has been restored to /dev/rdsk/c4t10d0
root@hpeos003[] vgcfgrestore -n /dev/vgora1 /dev/rdsk/c4t11d0
Volume Group configuration has been restored to /dev/rdsk/c4t11d0
root@hpeos003[]
root@hpeos003[] vgchange -a y /dev/vgora1
Activated volume group
Volume group "/dev/vgora1" has been successfully changed.
root@hpeos003[]
root@hpeos003[] vgsync /dev/vgora1
Resynchronized logical volume "/dev/vgora1/db".
Resynchronized volume group "/dev/vgora1".
root@hpeos003[]

As you can see, the process follows a very similar sequence to the one we used when recovering vg00 earlier.

6.2.5 Spare volumes

The idea behind a spare volume is the same as the concept of a hot-standby. The disk is up and running, part of the volume group, sitting there waiting for the failure of a live disk. The spare volume is used in conjunction with mirroring. In the event of a live disk failing, the spare volume will automatically be written to in order to re-establish the mirror configuration. This maintains our high-availability configuration with no administrative input. We can then schedule the replacement of the failed disk. A major drawback with spare volumes is that you lose all other use of the spare volume. It cannot be used for anything other than being a spare volume; you cannot create any logical volumes on it.
The main advantage is the possibility of automatically maintaining a high-availability configuration in the event of a disk failure. Most high-availability disk arrays offer this as an option; whenever you enable a hot-standby, it is expected that you are going to lose some capacity and performance. (I know some disk arrays, such as the HP Virtual Array, spread the hot-standby capacity across all spindles, allowing you to utilize all disks in the array. That's a specific, and rather clever, solution.) If we take our current volume group /dev/vgora1, we are not currently using disk c4t10d0. We could configure this as a spare volume. In the event that we were to lose c4t9d0, all the mirrored volumes configured on c4t9d0 would be re-established on c4t10d0:

root@hpeos003[] pvchange -z y /dev/dsk/c4t10d0
Physical volume "/dev/dsk/c4t10d0" has been successfully changed.
Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
root@hpeos003[]
root@hpeos003[] vgdisplay /dev/vgora1
--- Volume groups ---
VG Name                     /dev/vgora1
VG Write Access             read/write
VG Status                   available
Max LV                      255
Cur LV                      4
Open LV                     4
Max PV                      16
Cur PV                      6
Act PV                      6
Max PE per PV               17501
VGDA                        12
PE Size (Mbytes)            4
Total PE                    87495
Alloc PE                    1700
Free PE                     85795
Total PVG                   2
Total Spare PVs             1
Total Spare PVs in use      0
root@hpeos003[]

All we need to do is wait for c4t9d0 to fail.

6.2.6 Conclusions on mirroring

Mirroring is one of the most common high-availability tasks undertaken by LVM or any advanced disk management software. Without mirroring, we are extremely vulnerable to a loss of access to our data and, hence, our applications, and consequently the ability of our organizations to function as normal (read: make money). There are still other tasks I want to perform concerning mirroring, such as splitting and merging mirrors. I will wait until we talk about filesystems, and especially VxFS snapshots.
I know it might sound a bit weird to split up a discussion regarding mirroring with a discussion regarding filesystems, but this whole book is about getting the job done. I have tried to approach subjects on a non-theoretical basis, trying to convey my hands-on experiences to you from an on-the-job mindset. I hope that makes sense.