As discussed elsewhere, striping can give us performance benefits as we spread the overall IO burden across many spindles. It's like trying to dig a ditch with only one worker; he might be a very hard worker, but in the end your ditch will get dug quicker if you have more workers all working equally hard. Disk striping comes at a cost; if we lose a disk in our stripe set, we lose the entire logical volume. This poses the problem of combining striping and mirroring: RAID 0/1. Can LVM do it? The answer is yes and no.

The reason we have two opposing answers is due (mostly) to the original design of LVM with regard to mirroring and striping. LVM tracks changes in mirrored volumes via a concept known as a Logical Track Group. An LTG is 256KB in size. LVM striping allows us to set up a stripe size as small as 4KB. This poses a problem for the mirroring software because it is managing disk space in chunks of 256KB while striping is managing disk space in chunks as small as 4KB. The result is that kilobyte-striping (as I affectionately call it) and mirroring are incompatible and the options do not work together.

However, when LVM was first launched, kilobyte-striping was not available. It wasn't long before some clever administrators realized that we could build a logical volume one extent at a time across multiple physical volumes, which effectively gave us an extent-based striped logical volume that looks like this:

    root@hpeos003[] lvcreate -n stripy /dev/vgora1
    Logical volume "/dev/vgora1/stripy" has been successfully created with character device "/dev/vgora1/rstripy".
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    root@hpeos003[]
    root@hpeos003[] PV1=/dev/dsk/c0t1d0
    root@hpeos003[] PV2=/dev/dsk/c0t2d0
    root@hpeos003[] PV3=/dev/dsk/c0t3d0
    root@hpeos003[] SIZE=300
    root@hpeos003[] COUNT=1
    root@hpeos003[] while [ $COUNT -le $SIZE ]
    > do
    > lvextend -l $COUNT /dev/vgora1/stripy $PV1
    > let COUNT=COUNT+1
    > lvextend -l $COUNT /dev/vgora1/stripy $PV2
    > let COUNT=COUNT+1
    > lvextend -l $COUNT /dev/vgora1/stripy $PV3
    > let COUNT=COUNT+1
    > done
    Logical volume "/dev/vgora1/stripy" has been successfully extended.
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    Logical volume "/dev/vgora1/stripy" has been successfully extended.
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    ...
    Logical volume "/dev/vgora1/stripy" has been successfully extended.
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    root@hpeos003[]
    root@hpeos003[] lvdisplay -v /dev/vgora1/stripy | more
    --- Logical volumes ---
    LV Name                     /dev/vgora1/stripy
    VG Name                     /dev/vgora1
    LV Permission               read/write
    LV Status                   available/syncd
    Mirror copies               0
    Consistency Recovery        MWC
    Schedule                    parallel
    LV Size (Mbytes)            1200
    Current LE                  300
    Allocated PE                300
    Stripes                     0
    Stripe Size (Kbytes)        0
    Bad block                   on
    Allocation                  strict
    IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100

    --- Logical extents ---
    LE    PV1                PE1   Status 1
    00000 /dev/dsk/c0t1d0    00250 current
    00001 /dev/dsk/c0t2d0    00000 current
    00002 /dev/dsk/c0t3d0    00000 current
    00003 /dev/dsk/c0t1d0    00251 current
    00004 /dev/dsk/c0t2d0    00001 current
    00005 /dev/dsk/c0t3d0    00001 current
    ...
    00297 /dev/dsk/c0t1d0    00349 current
    00298 /dev/dsk/c0t2d0    00099 current
    00299 /dev/dsk/c0t3d0    00099 current
    root@hpeos003[]

IMPORTANT: The first thing to note here is that when we implement striping, all disks in the stripe set should be on separate controllers in order to avoid a single controller being the bottleneck. In my case, all three disks are on the same interface. This is less than optimal. These examples are for demonstration purposes only!

This type of extent-based striping is now known as a Distributed Logical Volume (more on that in a moment). One aspect of Distributed volumes (and my stripy volume above) is that they can be mirrored. The reason is that an extent is at least 1MB in size and, thus, mirroring has no problem in tracking changes in the mirror because each 256KB LTG falls within a single extent.

In our hand-crafted semi-Distributed volume, we can now set up mirroring. The question is how many disks we need in order to set up a mirror copy of the data. The technical answer is simply one. As long as we can fit all the extents from the original volume inside a single disk, there is no reason why LVM would stop us. Remember that the default strictness criteria dictate that each mirror extent is on a different physical volume from its original.

    root@hpeos003[] lvextend -m 1 /dev/vgora1/stripy /dev/dsk/c4t9d0
    The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
    Logical volume "/dev/vgora1/stripy" has been successfully extended.
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    root@hpeos003[]
    root@hpeos003[] lvdisplay -v /dev/vgora1/stripy | more
    --- Logical volumes ---
    LV Name                     /dev/vgora1/stripy
    VG Name                     /dev/vgora1
    LV Permission               read/write
    LV Status                   available/syncd
    Mirror copies               1
    Consistency Recovery        MWC
    Schedule                    parallel
    LV Size (Mbytes)            1200
    Current LE                  300
    Allocated PE                600
    Stripes                     0
    Stripe Size (Kbytes)        0
    Bad block                   on
    Allocation                  strict
    IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100
    /dev/dsk/c4t9d0    300       300

    --- Logical extents ---
    LE    PV1                PE1   Status 1 PV2                PE2   Status 2
    00000 /dev/dsk/c0t1d0    00250 current  /dev/dsk/c4t9d0    00250 current
    00001 /dev/dsk/c0t2d0    00000 current  /dev/dsk/c4t9d0    00251 current
    00002 /dev/dsk/c0t3d0    00000 current  /dev/dsk/c4t9d0    00252 current
    00003 /dev/dsk/c0t1d0    00251 current  /dev/dsk/c4t9d0    00253 current
    00004 /dev/dsk/c0t2d0    00001 current  /dev/dsk/c4t9d0    00254 current
    00005 /dev/dsk/c0t3d0    00001 current  /dev/dsk/c4t9d0    00255 current
    ...
    00297 /dev/dsk/c0t1d0    00349 current  /dev/dsk/c4t9d0    00547 current
    00298 /dev/dsk/c0t2d0    00099 current  /dev/dsk/c4t9d0    00548 current
    00299 /dev/dsk/c0t3d0    00099 current  /dev/dsk/c4t9d0    00549 current
    root@hpeos003[]

This is less than optimal from a high-availability perspective, but it shows you that just about anything is possible. As long as a mirror extent doesn't land on the same physical volume as its original, you could even be in a position where an original extent existed on c0t1d0 and its mirror was on c0t2d0. This is obviously something you wouldn't normally want to create, but it is possible.

In my mode of thinking, I would call our stripy volume a mirrored-striped volume, i.e., we have a striped volume that is mirrored, as opposed to a striped-mirror, which I would say is a mirrored volume where the mirror is striped.
In such a scenario, the original volume does not need to be striped but usually is. This latter case is not an easy volume to create with LVM.

The new way to establish an extent-based striped volume is via a concept known as a Distributed volume. In LVM, to create a Distributed volume we use the -D y option to lvcreate. This requires that we have the original physical volumes in their own PVG (Physical Volume Group), i.e., we enforce PVG-strict allocation (-s g). The idea here is that, should we want to mirror a Distributed volume, none of the extents in the mirror will land on any physical volume used by the original.

    root@hpeos003[] cat /etc/lvmpvg
    VG      /dev/vgora1
    PVG     PVG0
    /dev/dsk/c0t1d0
    /dev/dsk/c0t2d0
    /dev/dsk/c0t3d0
    PVG     PVG1
    /dev/dsk/c4t9d0
    /dev/dsk/c4t10d0
    /dev/dsk/c4t11d0
    root@hpeos003[]
    root@hpeos003[] lvcreate -D y -s g -L 1200 -n Dstripe /dev/vgora1
    Logical volume "/dev/vgora1/Dstripe" has been successfully created with character device "/dev/vgora1/rDstripe".
    Logical volume "/dev/vgora1/Dstripe" has been successfully extended.
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    root@hpeos003[]

When we mirror this volume, LVM will distribute the extents across all disks in the other PVG:

    root@hpeos003[] lvextend -m 1 /dev/vgora1/Dstripe
    The newly allocated mirrors are now being synchronized. This operation will take some time. Please wait ....
    Logical volume "/dev/vgora1/Dstripe" has been successfully extended.
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    root@hpeos003[]
    root@hpeos003[] lvdisplay -v /dev/vgora1/Dstripe | more
    --- Logical volumes ---
    LV Name                     /dev/vgora1/Dstripe
    VG Name                     /dev/vgora1
    LV Permission               read/write
    LV Status                   available/syncd
    Mirror copies               1
    Consistency Recovery        MWC
    Schedule                    parallel
    LV Size (Mbytes)            1200
    Current LE                  300
    Allocated PE                600
    Stripes                     0
    Stripe Size (Kbytes)        0
    Bad block                   on
    Allocation                  PVG-strict/distributed
    IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100
    /dev/dsk/c4t9d0    100       100
    /dev/dsk/c4t10d0   100       100
    /dev/dsk/c4t11d0   100       100

    --- Logical extents ---
    LE    PV1                PE1   Status 1 PV2                PE2   Status 2
    00000 /dev/dsk/c0t1d0    00350 current  /dev/dsk/c4t9d0    00550 current
    00001 /dev/dsk/c0t2d0    00100 current  /dev/dsk/c4t10d0   00000 current
    00002 /dev/dsk/c0t3d0    00100 current  /dev/dsk/c4t11d0   00000 current
    00003 /dev/dsk/c0t1d0    00351 current  /dev/dsk/c4t9d0    00551 current
    00004 /dev/dsk/c0t2d0    00101 current  /dev/dsk/c4t10d0   00001 current
    00005 /dev/dsk/c0t3d0    00101 current  /dev/dsk/c4t11d0   00001 current
    ...
    00297 /dev/dsk/c0t1d0    00449 current  /dev/dsk/c4t9d0    00649 current
    00298 /dev/dsk/c0t2d0    00199 current  /dev/dsk/c4t10d0   00099 current
    00299 /dev/dsk/c0t3d0    00199 current  /dev/dsk/c4t11d0   00099 current
    root@hpeos003[]

With this allocation policy, we are not allowed to have all of our Distributed mirror extents on the same physical volume. Distributed means exactly that: distributed original extents and distributed mirror extents.

The drawback of extent-based striping is that we are not matching the underlying application IO size with the stripe size, e.g., a filesystem block/extent or RDBMS IO size and, hence, our IO is not being distributed over all disks in an optimal way. The only way we can attempt to match the stripe size to the underlying application IO size is to use LVM kilobyte-striping.
This is relatively simple to set up. Here, I am using a three-disk stripe set (-i 3) with a stripe size of 64KB (-I 64). If I don't specify the disks to stripe across, LVM will choose the first three disks in the volume group that can accommodate the volume.

    root@hpeos003[] lvcreate -L 1200 -i 3 -I 64 -n Kstripe /dev/vgora1
    Logical volume "/dev/vgora1/Kstripe" has been successfully created with character device "/dev/vgora1/rKstripe".
    Logical volume "/dev/vgora1/Kstripe" has been successfully extended.
    Volume Group configuration for /dev/vgora1 has been saved in /etc/lvmconf/vgora1.conf
    root@hpeos003[]
    root@hpeos003[] lvdisplay -v /dev/vgora1/Kstripe | more
    --- Logical volumes ---
    LV Name                     /dev/vgora1/Kstripe
    VG Name                     /dev/vgora1
    LV Permission               read/write
    LV Status                   available/syncd
    Mirror copies               0
    Consistency Recovery        MWC
    Schedule                    striped
    LV Size (Mbytes)            1200
    Current LE                  300
    Allocated PE                300
    Stripes                     3
    Stripe Size (Kbytes)        64
    Bad block                   on
    Allocation                  strict
    IO Timeout (Seconds)        default

    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c0t1d0    100       100
    /dev/dsk/c0t2d0    100       100
    /dev/dsk/c0t3d0    100       100

    --- Logical extents ---
    LE    PV1                PE1   Status 1
    00000 /dev/dsk/c0t1d0    00450 current
    00001 /dev/dsk/c0t2d0    00200 current
    00002 /dev/dsk/c0t3d0    00200 current
    ...
    00297 /dev/dsk/c0t1d0    00549 current
    00298 /dev/dsk/c0t2d0    00299 current
    00299 /dev/dsk/c0t3d0    00299 current
    root@hpeos003[]

If I try to mirror this volume, LVM will not allow it to happen; the options are incompatible:

    root@hpeos003[] lvextend -m 1 /dev/vgora1/Kstripe
    Striped mirrors are not supported.
    To enable mirroring options (-m, -M, -c), do not specify the striping options (-i, -I) when creating logical volumes.
    root@hpeos003[]

Although this can be viewed as better for performance, we are very vulnerable to a loss of data should a single disk in this configuration fail, rendering the whole Kstripe volume unusable.
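To put rough numbers on the difference between extent-sized and kilobyte-sized stripe units, here is a minimal sketch. It is not anything LVM provides: stripe_disk and disks_touched are hypothetical helper names, and both assume plain round-robin placement of stripe units across the disks in the stripe set. The 4MB (4096KB) extent size is taken from the listings in this section (1200MB spread over 300 extents).

```shell
#!/bin/sh
# Hypothetical helpers (not part of LVM). Assumption: stripe units are
# laid out round-robin across the disks in the stripe set.

# stripe_disk OFFSET_BYTES STRIPE_KB NDISKS
# Prints the 0-based index of the disk holding the byte at OFFSET_BYTES.
stripe_disk() {
    offset=$1; stripe_kb=$2; ndisks=$3
    unit=$(( offset / (stripe_kb * 1024) ))   # stripe unit containing the offset
    echo $(( unit % ndisks ))
}

# disks_touched OFFSET_BYTES IO_KB STRIPE_KB NDISKS
# Prints how many distinct disks a single IO of IO_KB kilobytes spans.
disks_touched() {
    offset=$1; io_kb=$2; stripe_kb=$3; ndisks=$4
    first=$(( offset / (stripe_kb * 1024) ))
    last=$(( (offset + io_kb * 1024 - 1) / (stripe_kb * 1024) ))
    span=$(( last - first + 1 ))
    # An IO cannot touch more disks than the stripe set contains.
    if [ "$span" -gt "$ndisks" ]; then span=$ndisks; fi
    echo "$span"
}
```

Under these assumptions, `disks_touched 0 192 64 3` prints 3, while `disks_touched 0 192 4096 3` prints 1: a 192KB IO is spread over all three disks with a 64KB stripe, but lands entirely on one disk when the stripe unit is a whole 4MB extent.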
I know of few installations that run with this configuration in a commercial environment without having some form of hardware redundancy built into their configuration. Some people think that LVM's kilobyte-striping doesn't go far enough as far as performance criteria are concerned.

IMPORTANT
- Striping and mirroring is only supported by LVM with Distributed volumes.
- Distributed volumes utilize stripes that are the size of one entire extent.
- As such, Distributed volumes and mirroring are a less-than-optimal solution for RAID 0/1.

Another aspect of LVM relates to its ability to position data on a particular area of a disk. This technique can maximize or at least standardize IO performance from disks in a stripe set by ensuring that all data is positioned in the same area of each disk in the stripe set. Other disk management products such as VxVM allow you to place your data on a specific area of the disk. One way of ensuring that data is in the same area of each disk is to create the striped volumes as the first volumes in the volume group. In this way, we can ensure that the stripes are distributed evenly over all disks, at the beginning of each disk in the stripe set. The only other way (it's a kludge) to do this with LVM is to follow these steps:

- Choose your specific area of the disk, e.g., the middle of the disk.
- Create dummy volume(s) until you reach your specified area.
- Create your real volume(s).
- Remove the dummy volume(s).

The main problem with this type of configuration is that if you want to extend such a volume, you would have had to put in place some additional dummy volumes after the real volume to preserve some space for future requirements (I suppose you could have turned off extent allocation for that physical volume: pvchange -x n <dsk>).
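The dummy-volume steps can be sketched as a dry run. This is an illustration only: plan_placement and the volume name dummy0 are hypothetical, the function prints commands rather than running them, and it assumes the physical volume is initially empty and that LVM hands out the lowest free extents on the named disk first. For a stripe set, the same sequence would be repeated for each disk.

```shell
#!/bin/sh
# Dry-run sketch: prints the LVM command sequence, executes nothing.
# plan_placement VG PV OFFSET_MB SIZE_MB NAME
#   VG        volume group, e.g. /dev/vgora1
#   PV        physical volume to place the data on
#   OFFSET_MB space (in MB) to skip before the real volume starts
#   SIZE_MB   size (in MB) of the real volume
#   NAME      name of the real volume
plan_placement() {
    vg=$1; pv=$2; offset_mb=$3; size_mb=$4; name=$5

    # A dummy volume fills the disk up to the chosen area.
    echo "lvcreate -n dummy0 $vg"
    echo "lvextend -L $offset_mb $vg/dummy0 $pv"

    # The real volume then takes the next free extents on that disk
    # (assumption: lowest free extents are allocated first).
    echo "lvcreate -n $name $vg"
    echo "lvextend -L $size_mb $vg/$name $pv"

    # Removing the dummy frees the space in front of the real volume.
    echo "lvremove -f $vg/dummy0"
}
```

For example, `plan_placement /dev/vgora1 /dev/dsk/c0t1d0 2000 1200 middle` prints five commands that would place a 1200MB volume roughly 2000MB into the disk. Review the printed commands before running any of them by hand.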