6.1 LVM Striping (RAID 0)

     

As discussed elsewhere, striping can give us performance benefits as we spread the overall IO burden across many spindles. It's like trying to dig a ditch with only one worker; he might be a very hard worker, but in the end your ditch will get dug quicker if you have more workers all working equally hard.

Disk striping comes at a cost: if we lose one disk in the stripe set, we lose the entire logical volume. This raises the question of combining striping with mirroring (RAID 0/1). Can LVM do it? The answer is yes and no. The reason for two opposing answers lies (mostly) in the original design of LVM with regard to mirroring and striping.

LVM tracks changes in mirrored volumes via a concept known as a Logical Track Group (LTG). An LTG is 256KB in size. LVM striping allows us to set up a stripe size as small as 4KB. This poses a problem for the mirroring software because it manages disk space in chunks of 256KB while striping manages disk space in chunks as small as 4KB. The result is that kilobyte-striping (as I affectionately call it) and mirroring are incompatible, and the options do not work together. However, when LVM was first launched, kilobyte-striping was not available. It wasn't long before some clever administrators realized that we could build a logical volume one extent at a time across multiple physical volumes, which effectively gave us an extent-based striped logical volume that looks like this:

 

 root@hpeos003[] lvcreate -n stripy /dev/vgora1
 Logical volume "/dev/vgora1/stripy" has been successfully created with
 character device "/dev/vgora1/rstripy".
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 root@hpeos003[]
 root@hpeos003[] PV1=/dev/dsk/c0t1d0
 root@hpeos003[] PV2=/dev/dsk/c0t2d0
 root@hpeos003[] PV3=/dev/dsk/c0t3d0
 root@hpeos003[] SIZE=300
 root@hpeos003[] COUNT=1
 root@hpeos003[] while [ $COUNT -le $SIZE ]
 > do
 > lvextend -l $COUNT /dev/vgora1/stripy $PV1
 > let COUNT=COUNT+1
 > lvextend -l $COUNT /dev/vgora1/stripy $PV2
 > let COUNT=COUNT+1
 > lvextend -l $COUNT /dev/vgora1/stripy $PV3
 > let COUNT=COUNT+1
 > done
 Logical volume "/dev/vgora1/stripy" has been successfully extended.
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 Logical volume "/dev/vgora1/stripy" has been successfully extended.
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 ...
 Logical volume "/dev/vgora1/stripy" has been successfully extended.
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 root@hpeos003[]
 root@hpeos003[] lvdisplay -v /dev/vgora1/stripy | more
 --- Logical volumes ---
 LV Name                     /dev/vgora1/stripy
 VG Name                     /dev/vgora1
 LV Permission               read/write
 LV Status                   available/syncd
 Mirror copies               0
 Consistency Recovery        MWC
 Schedule                    parallel
 LV Size (Mbytes)            1200
 Current LE                  300
 Allocated PE                300
 Stripes                     0
 Stripe Size (Kbytes)        0
 Bad block                   on
 Allocation                  strict
 IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100

    --- Logical extents ---
    LE    PV1                PE1   Status 1
    00000 /dev/dsk/c0t1d0    00250 current
    00001 /dev/dsk/c0t2d0    00000 current
    00002 /dev/dsk/c0t3d0    00000 current
    00003 /dev/dsk/c0t1d0    00251 current
    00004 /dev/dsk/c0t2d0    00001 current
    00005 /dev/dsk/c0t3d0    00001 current
    ...
    00297 /dev/dsk/c0t1d0    00349 current
    00298 /dev/dsk/c0t2d0    00099 current
    00299 /dev/dsk/c0t3d0    00099 current
 root@hpeos003[]
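
The loop above hard-codes three disks and relies on SIZE being a multiple of three. A minimal sketch of a more general form follows. It only prints the lvextend commands it would run (a dry run), and the function name plan_extent_stripe is hypothetical, not an LVM tool.

```shell
#!/bin/sh
# Dry-run sketch: print the lvextend commands that would build an
# extent-striped logical volume one extent at a time.
# plan_extent_stripe is a hypothetical helper, not part of LVM.
# Usage: plan_extent_stripe <lv_path> <total_extents> <pv> [<pv> ...]
plan_extent_stripe() {
    lv=$1
    total=$2
    shift 2
    npv=$#
    count=1
    while [ "$count" -le "$total" ]; do
        # Pick the physical volume for this extent in round-robin order.
        idx=$(( (count - 1) % npv + 1 ))
        eval "pv=\${$idx}"
        echo "lvextend -l $count $lv $pv"
        count=$(( count + 1 ))
    done
}

# Example: plan the first four extents across three disks.
plan_extent_stripe /dev/vgora1/stripy 4 \
    /dev/dsk/c0t1d0 /dev/dsk/c0t2d0 /dev/dsk/c0t3d0
```

Because the disk is chosen from the extent number rather than from a fixed sequence inside the loop body, the sketch works for any number of disks and any total size. Review the printed commands before running any of them for real.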

IMPORTANT

The first thing to note here is that when we implement striping, all disks in the stripe set should be on separate controllers in order to avoid a single controller being the bottleneck. In my case, all three disks are on the same interface. This is less than optimal. These examples are for demonstration purposes only!


This type of extent-based striping is now known as a Distributed Logical Volume (more on that in a moment). One aspect of Distributed volumes (and my stripy volume above) is that they can be mirrored. The reason is that an extent is at least 1MB in size and, thus, mirroring has no problem tracking changes in the mirror because each LTG lies within a single extent. In our hand-crafted semi-Distributed volume, we can now set up mirroring. The question is how many disks we need to hold a mirror copy of the data. The technical answer is simply one. As long as we can fit all the extents from the original volume on a single disk, there is no reason why LVM would stop us. Remember that the default strictness criteria dictate that the mirror extents are on a different physical volume than the original.
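
The size argument can be checked with simple arithmetic. The sketch below assumes the figures quoted in this section (a 256KB LTG, stripe sizes down to 4KB) and the 4MB extent size implied by the earlier listing (1200MB over 300 extents). It also assumes each LTG is aligned to the start of a unit. The helper name units_per_ltg is hypothetical.

```shell
#!/bin/sh
# Sketch: how many allocation units does one 256KB LTG span?
# units_per_ltg is a hypothetical helper for illustration only.
LTG_KB=256

# Usage: units_per_ltg <unit_size_kb>
# Prints the number of units of that size covered by one aligned LTG.
units_per_ltg() {
    unit_kb=$1
    if [ "$unit_kb" -ge "$LTG_KB" ]; then
        echo 1            # the whole LTG fits inside one unit
    else
        echo $(( LTG_KB / unit_kb ))
    fi
}

echo "4KB stripes:  $(units_per_ltg 4) stripe units per LTG"
echo "64KB stripes: $(units_per_ltg 64) stripe units per LTG"
echo "4MB extents:  $(units_per_ltg 4096) extent per LTG"
```

Under these assumptions, a single LTG is spread over 64 stripe units at a 4KB stripe size, and therefore over several disks, whereas with extent-sized units it stays within one extent on one disk.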

 

 root@hpeos003[] lvextend -m 1 /dev/vgora1/stripy /dev/dsk/c4t9d0
 The newly allocated mirrors are now being synchronized. This operation will
 take some time. Please wait ....
 Logical volume "/dev/vgora1/stripy" has been successfully extended.
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 root@hpeos003[]
 root@hpeos003[] lvdisplay -v /dev/vgora1/stripy | more
 --- Logical volumes ---
 LV Name                     /dev/vgora1/stripy
 VG Name                     /dev/vgora1
 LV Permission               read/write
 LV Status                   available/syncd
 Mirror copies               1
 Consistency Recovery        MWC
 Schedule                    parallel
 LV Size (Mbytes)            1200
 Current LE                  300
 Allocated PE                600
 Stripes                     0
 Stripe Size (Kbytes)        0
 Bad block                   on
 Allocation                  strict
 IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100
    /dev/dsk/c4t9d0    300       300

    --- Logical extents ---
    LE    PV1                PE1   Status 1 PV2                PE2   Status 2
    00000 /dev/dsk/c0t1d0    00250 current  /dev/dsk/c4t9d0    00250 current
    00001 /dev/dsk/c0t2d0    00000 current  /dev/dsk/c4t9d0    00251 current
    00002 /dev/dsk/c0t3d0    00000 current  /dev/dsk/c4t9d0    00252 current
    00003 /dev/dsk/c0t1d0    00251 current  /dev/dsk/c4t9d0    00253 current
    00004 /dev/dsk/c0t2d0    00001 current  /dev/dsk/c4t9d0    00254 current
    00005 /dev/dsk/c0t3d0    00001 current  /dev/dsk/c4t9d0    00255 current
    00297 /dev/dsk/c0t1d0    00349 current  /dev/dsk/c4t9d0    00547 current
    00298 /dev/dsk/c0t2d0    00099 current  /dev/dsk/c4t9d0    00548 current
    00299 /dev/dsk/c0t3d0    00099 current  /dev/dsk/c4t9d0    00549 current
 root@hpeos003[]

This is less than optimal from a high-availability perspective, but it shows you that just about anything is possible as long as a mirrored extent doesn't land on the same physical volume as its original. You could even be in a position where an original extent existed on c0t1d0 and its mirror was on c0t2d0. This is obviously something you wouldn't normally want to create, but it is possible.

In my mode of thinking, I would call our stripy volume a mirrored-striped volume, i.e., we have a striped volume that is mirrored, as opposed to a striped-mirror, which I would say is a mirrored volume where the mirror is striped. In such a scenario, the original volume does not need to be striped but usually is. This latter case is not an easy volume to create with LVM.

The new way to establish an extent-based striped volume is via a concept known as a Distributed volume. To create a Distributed volume in LVM, we use the -D y option to lvcreate. This requires that we have the original physical volumes in their own PVG, i.e., we enforce PVG-strict allocation (-s g). The idea here is that, should we want to mirror a Distributed volume, none of the extents in the mirror will land on any physical volume used by the original.

 

 root@hpeos003[] cat /etc/lvmpvg
 VG      /dev/vgora1
 PVG     PVG0
 /dev/dsk/c0t1d0
 /dev/dsk/c0t2d0
 /dev/dsk/c0t3d0
 PVG     PVG1
 /dev/dsk/c4t9d0
 /dev/dsk/c4t10d0
 /dev/dsk/c4t11d0
 root@hpeos003[]
 root@hpeos003[] lvcreate -D y -s g -L 1200 -n Dstripe /dev/vgora1
 Logical volume "/dev/vgora1/Dstripe" has been successfully created with
 character device "/dev/vgora1/rDstripe".
 Logical volume "/dev/vgora1/Dstripe" has been successfully extended.
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 root@hpeos003[]
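
The /etc/lvmpvg file is plain text: a VG line, then each PVG line followed by its member disks, one per line. As a sketch, the stanzas shown above could be generated as follows. The helper print_pvg_stanza is hypothetical, and it writes to standard output rather than to /etc/lvmpvg, so nothing on the system is changed.

```shell
#!/bin/sh
# Sketch: print one PVG stanza in /etc/lvmpvg layout.
# print_pvg_stanza is a hypothetical helper; review its output before
# merging anything into the real /etc/lvmpvg.
# Usage: print_pvg_stanza <pvg_name> <pv> [<pv> ...]
print_pvg_stanza() {
    echo "PVG     $1"
    shift
    for pv in "$@"; do
        echo "$pv"
    done
}

# Reproduce the two physical volume groups used in this section.
echo "VG      /dev/vgora1"
print_pvg_stanza PVG0 /dev/dsk/c0t1d0 /dev/dsk/c0t2d0 /dev/dsk/c0t3d0
print_pvg_stanza PVG1 /dev/dsk/c4t9d0 /dev/dsk/c4t10d0 /dev/dsk/c4t11d0
```

Keeping the disks on each interface in their own PVG is what lets PVG-strict allocation place the mirror on the other set of disks.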

When we mirror this volume, LVM will distribute the extents across all disks in the other PVG:

 

 root@hpeos003[] lvextend -m 1 /dev/vgora1/Dstripe
 The newly allocated mirrors are now being synchronized. This operation will
 take some time. Please wait ....
 Logical volume "/dev/vgora1/Dstripe" has been successfully extended.
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 root@hpeos003[]
 root@hpeos003[] lvdisplay -v /dev/vgora1/Dstripe | more
 --- Logical volumes ---
 LV Name                     /dev/vgora1/Dstripe
 VG Name                     /dev/vgora1
 LV Permission               read/write
 LV Status                   available/syncd
 Mirror copies               1
 Consistency Recovery        MWC
 Schedule                    parallel
 LV Size (Mbytes)            1200
 Current LE                  300
 Allocated PE                600
 Stripes                     0
 Stripe Size (Kbytes)        0
 Bad block                   on
 Allocation                  PVG-strict/distributed
 IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100
    /dev/dsk/c4t9d0    100       100
    /dev/dsk/c4t10d0   100       100
    /dev/dsk/c4t11d0   100       100

    --- Logical extents ---
    LE    PV1                PE1   Status 1 PV2                PE2   Status 2
    00000 /dev/dsk/c0t1d0    00350 current  /dev/dsk/c4t9d0    00550 current
    00001 /dev/dsk/c0t2d0    00100 current  /dev/dsk/c4t10d0   00000 current
    00002 /dev/dsk/c0t3d0    00100 current  /dev/dsk/c4t11d0   00000 current
    00003 /dev/dsk/c0t1d0    00351 current  /dev/dsk/c4t9d0    00551 current
    00004 /dev/dsk/c0t2d0    00101 current  /dev/dsk/c4t10d0   00001 current
    00005 /dev/dsk/c0t3d0    00101 current  /dev/dsk/c4t11d0   00001 current
    ...
    00297 /dev/dsk/c0t1d0    00449 current  /dev/dsk/c4t9d0    00649 current
    00298 /dev/dsk/c0t2d0    00199 current  /dev/dsk/c4t10d0   00099 current
    00299 /dev/dsk/c0t3d0    00199 current  /dev/dsk/c4t11d0   00099 current
 root@hpeos003[]

With this allocation policy, we are not allowed to have all of our Distributed mirror extents on the same physical volume. Distributed means exactly that: distributed original extents and distributed mirror extents.

The drawback of extent-based striping is that we are not matching the stripe size to the underlying application IO size, e.g., a filesystem block/extent or RDBMS IO size, and, hence, our IO is not being distributed over all disks in an optimal way. The only way we can attempt to match the stripe size to the underlying application IO size is to use LVM kilobyte-striping. This is relatively simple to set up. Here, I am using a three-disk stripe set (-i 3) with a stripe size of 64KB (-I 64). If I don't specify the disks to stripe across, LVM will choose the first three disks in the volume group that can accommodate the volume.

 

 root@hpeos003[] lvcreate -L 1200 -i 3 -I 64 -n Kstripe /dev/vgora1
 Logical volume "/dev/vgora1/Kstripe" has been successfully created with
 character device "/dev/vgora1/rKstripe".
 Logical volume "/dev/vgora1/Kstripe" has been successfully extended.
 Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
 root@hpeos003[]
 root@hpeos003[] lvdisplay -v /dev/vgora1/Kstripe | more
 --- Logical volumes ---
 LV Name                     /dev/vgora1/Kstripe
 VG Name                     /dev/vgora1
 LV Permission               read/write
 LV Status                   available/syncd
 Mirror copies               0
 Consistency Recovery        MWC
 Schedule                    striped
 LV Size (Mbytes)            1200
 Current LE                  300
 Allocated PE                300
 Stripes                     3
 Stripe Size (Kbytes)        64
 Bad block                   on
 Allocation                  strict
 IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100

    --- Logical extents ---
    LE    PV1                PE1   Status 1
    00000 /dev/dsk/c0t1d0    00450 current
    00001 /dev/dsk/c0t2d0    00200 current
    00002 /dev/dsk/c0t3d0    00200 current
    ...
    00297 /dev/dsk/c0t1d0    00549 current
    00298 /dev/dsk/c0t2d0    00299 current
    00299 /dev/dsk/c0t3d0    00299 current
 root@hpeos003[]
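
To see why the stripe size matters, consider which disk a given offset in the volume lands on. The sketch below models simple round-robin striping with integer arithmetic. It is an illustration of the idea, not LVM's actual mapping code, and disk_for_offset is a hypothetical name. The 4MB figure is the extent size implied by the listings in this section.

```shell
#!/bin/sh
# Sketch: model round-robin striping. Not LVM's real mapping code.
# disk_for_offset is a hypothetical helper.
# Usage: disk_for_offset <offset_kb> <stripe_unit_kb> <ndisks>
# Prints the 0-based index of the disk holding that offset.
disk_for_offset() {
    echo $(( ($1 / $2) % $3 ))
}

# A 256KB request starting at offset 0, examined in 64KB pieces.
for off in 0 64 128 192; do
    echo "offset ${off}KB: 64KB stripes -> disk $(disk_for_offset $off 64 3)," \
         "4MB extents -> disk $(disk_for_offset $off 4096 3)"
done
```

In this model, the 256KB request touches all three disks when the stripe unit is 64KB but only one disk when the stripe unit is a whole 4MB extent, which is the drawback of extent-based striping described above.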

If I try to mirror this volume, LVM will not allow it to happen; the options are incompatible:

 

 root@hpeos003[] lvextend -m 1 /dev/vgora1/Kstripe
 Striped mirrors are not supported. To enable mirroring options (-m, -M, -c), do not
 specify the striping options (-i, -I) when creating logical volumes.
 root@hpeos003[]

Although this can be viewed as better for performance, we are very vulnerable to a loss of data: a single disk failure in this configuration renders the whole Kstripe volume unusable. I know of few installations that run with this configuration in a commercial environment without having some form of hardware redundancy built in. Some people think that LVM's kilobyte-striping doesn't go far enough as far as performance criteria are concerned.

IMPORTANT

  • Striping and mirroring are supported together by LVM only with Distributed volumes.

  • Distributed volumes use stripes that are the size of one entire extent.

  • As such, Distributed volumes combined with mirroring are a less-than-optimal solution for RAID 0/1.


Another aspect of LVM relates to its ability to position data on a particular area of a disk. This technique can maximize or at least standardize IO performance from disks in a stripe set by ensuring that all data is positioned in the same area of each disk in the stripe set. Other disk management products such as VxVM allow you to place your data on a specific area of the disk. One way of ensuring that data is in the same area of each disk is to create the striped volumes as the first volumes in the volume group. In this way, we can ensure that the stripes are distributed evenly over all disks, at the beginning of each disk in the stripe set. The only other way (it's a kludge) to do this with LVM is to follow these steps:

  1. Choose your specific area of the disk, e.g., the middle of the disk.

  2. Create dummy volume(s) until you reach your specified area.

  3. Create your real volume(s).

  4. Remove the dummy volume(s).
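
The four steps above can be sketched as a dry run. The volume names, extent counts, and the plan_placement helper are all hypothetical, and the script only prints the commands rather than running them. It also assumes that LVM hands out the lowest free extents on the disk first; whether the real volume actually lands in the intended area should be checked with lvdisplay -v before the dummy volume is removed.

```shell
#!/bin/sh
# Dry-run sketch of the dummy-volume technique; prints commands only.
# plan_placement and all names/sizes below are hypothetical.
# Assumption: free extents are allocated from the start of the disk.
# Usage: plan_placement <vg> <pv> <skip_extents> <real_lv> <real_extents>
plan_placement() {
    vg=$1; pv=$2; skip=$3; real=$4; size=$5
    # Step 2: a dummy volume occupies the extents before the target area.
    echo "lvcreate -n dummy $vg"
    echo "lvextend -l $skip $vg/dummy $pv"
    # Step 3: the real volume takes the next free extents on the disk.
    echo "lvcreate -n $real $vg"
    echo "lvextend -l $size $vg/$real $pv"
    # Step 4: release the space held by the dummy volume.
    echo "lvremove -f $vg/dummy"
}

# Example: skip 500 extents, then place a 100-extent volume.
plan_placement /dev/vgora1 /dev/dsk/c0t1d0 500 middle 100
```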

The main problem with this type of configuration is that if you want to extend such a volume, you would need to have put in place some additional dummy volumes after the real volume to preserve space for future requirements (I suppose you could have turned off extent allocation for that physical volume: pvchange -x n <dsk>).



HP-UX CSE(c) Official Study Guide and Desk Reference
Year: 2006
Pages: 434
