7.8 Using Spare Disks

     

Another way to guard against losing a disk is to keep spare space in the disk group. If a disk fails, the vxrelocd daemon (started at boot time) tries to relocate the subdisks of redundant volumes, using spare space in the disk group. If you have striped, mirrored, or RAID 5 volumes, this may not be possible if you want to preserve the layout policy you have adopted. In such situations, you simply have to accept the drop in redundancy until the disk is replaced. This is where Spare Disks come to the fore. They work on the same principle as a Spare PV in LVM: they are used only in the event of a disk failure (although in VxVM you can override this by explicitly naming the disk on a vxmake/vxassist command line). The process is not too complicated and can improve your chances of surviving multiple disk failures:

  1. Initialize a free disk.

     

     root@hpeos003[] vxdisk init c4t14d0 nlog=2 nconfig=2
     root@hpeos003[]

  2. Add the disk to the disk group.

     

     root@hpeos003[] vxdg -g ora1 adddisk ora_spare=c4t14d0
     root@hpeos003[] vxprint -ht -g ora1 ora_spare
     DM NAME         DEVICE       TYPE     PRIVLEN  PUBLEN   STATE
     dm ora_spare    c4t14d0      simple   1024     71682048 -
     root@hpeos003[]

  3. Mark the disk as a spare disk.

     

     root@hpeos003[] vxedit -g ora1 set spare=on ora_spare
     root@hpeos003[]
     root@hpeos003[] vxprint -g ora1
     TY NAME         ASSOC      KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
     dg ora1         ora1       -        -        -        -        -       -
     dm ora_disk1    c0t4d0     -        71682048 -        -        -       -
     dm ora_disk2    c0t5d0     -        71682048 -        -        -       -
     dm ora_disk3    c4t12d0    -        71682048 -        -        -       -
     dm ora_disk4    c4t13d0    -        71682048 -        -        -       -
     dm ora_spare    c4t14d0    -        71682048 -        SPARE    -       -
     v  archive      raid5      ENABLED  4194304  -        ACTIVE   -       -
     pl archive-01   archive    ENABLED  4194304  -        ACTIVE   -       -
     sd ora_disk3-06 archive-01 ENABLED  2097152  0        -        -       -
     sd ora_disk2-02 archive-01 ENABLED  2097152  0        -        -       -
     sd ora_disk4-04 archive-01 ENABLED  2097152  0        -        -       -
     pl archive-02   archive    ENABLED  1440     -        LOG      -       -
     sd ora_disk1-04 archive-02 ENABLED  1440     0        -        -       -
     ...
     root@hpeos003[]

  4. Wait for a failure.
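Steps 1 to 3 can be collected into a small dry-run helper. The sketch below only prints the commands rather than running them, so it is safe to try on a machine without VxVM. The function name `spare_setup_cmds` and its argument order are illustrative assumptions; the three commands themselves are the ones shown in the steps above.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for steps 1-3, do not execute them.
# spare_setup_cmds is a hypothetical helper name, not part of VxVM.
spare_setup_cmds() {
    dg=$1        # disk group, e.g. ora1
    device=$2    # device to initialize, e.g. c4t14d0
    dmname=$3    # disk media name for the spare, e.g. ora_spare

    echo "vxdisk init $device nlog=2 nconfig=2"    # step 1: initialize the disk
    echo "vxdg -g $dg adddisk $dmname=$device"     # step 2: add it to the disk group
    echo "vxedit -g $dg set spare=on $dmname"      # step 3: mark it as a spare
}

# Values taken from the example in this section.
spare_setup_cmds ora1 c4t14d0 ora_spare
```

Reviewing the printed commands before pasting them into a root shell is a deliberate choice here: initializing the wrong device is not something to automate blindly.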

In this instance, I have pulled ora_disk3 from its cabinet, which VxVM sees as a disk failure. I received this error in syslog:

 

    Nov 11 16:50:54 hpeos003 vmunix: NOTICE: vxvm:vxdmp: disabled path 31/0x4c000 belonging to the dmpnode 0/0xc
    Nov 11 16:50:54 hpeos003 vmunix:
    Nov 11 16:50:54 hpeos003 vmunix: NOTICE: vxvm:vxdmp: disabled dmpnode 0/0xc
    Nov 11 16:50:54 hpeos003 vmunix: WARNING: vxvm:vxio: Subdisk ora_disk3-01 block 0: Uncorrectable read error

I also received this email from the vxrelocd daemon:

 

    root@hpeos003[] mail
    From root@hpeos003 Tue Nov 11 16:51:14 GMT 2003
    Received: (from root@localhost)
            by hpeos003 (8.11.1 (Revision 1.5) /8.9.3) id hABGpEE13268
            for root; Tue, 11 Nov 2003 16:51:14 GMT
    Date: Tue, 11 Nov 2003 16:51:14 GMT
    From: root@hpeos003
    Message-Id: <200311111651.hABGpEE13268@hpeos003>
    To: root@hpeos003
    Subject: Attempting VxVM relocation on host hpeos003
    Mime-Version: 1.0
    Content-Type: text/plain; charset=X-roman8
    Content-Transfer-Encoding: 7bit

    Volume data2 Subdisk ora_disk3-02 relocated to ora_spare-02,
        but not yet recovered.

    ?

The length of time this process takes to complete will depend on the number of subdisks that need to be relocated. When complete, root will receive another email from vxrelocd in this form:

 

    root@hpeos003[] mail
    From root@hpeos003 Tue Nov 11 16:57:45 GMT 2003
    Received: (from root@localhost)
            by hpeos003 (8.11.1 (Revision 1.5) /8.9.3) id hABGviV13409
            for root; Tue, 11 Nov 2003 16:57:44 GMT
    Date: Tue, 11 Nov 2003 16:57:44 GMT
    From: root@hpeos003
    Message-Id: <200311111657.hABGviV13409@hpeos003>
    To: root@hpeos003
    Subject: Attempting VxVM relocation on host hpeos003
    Mime-Version: 1.0
    Content-Type: text/plain; charset=X-roman8
    Content-Transfer-Encoding: 7bit
    Status: RO

    Recovery complete for volume data2 in disk group ora1.

    ?
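If several volumes are affected, it can help to pull just the completion lines out of the saved mail text. The filter below is a minimal sketch that assumes the message wording is exactly as in the email shown in this section; `recovered_volumes` is a hypothetical name, and the sample input is copied from that email rather than read from a live mailbox.

```shell
#!/bin/sh
# Print "<diskgroup> <volume>" for each vxrelocd completion line on stdin.
# Assumption: the wording matches "Recovery complete for volume V in disk group G."
recovered_volumes() {
    sed -n 's/.*Recovery complete for volume \([^ ]*\) in disk group \([^ .]*\).*/\2 \1/p'
}

# Sample input: two lines taken from the emails in this section.
printf '%s\n' \
    'Subject: Attempting VxVM relocation on host hpeos003' \
    'Recovery complete for volume data2 in disk group ora1.' |
    recovered_volumes
```

Lines that do not match, such as the earlier "relocated ... but not yet recovered" notice, produce no output, so an empty result suggests recovery is still in progress.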

While the spare is in use, I can organize a replacement for the failed disk. As we can see from vxprint, the subdisks that have been relocated are now housed on the spare disk:

 

    root@hpeos003[] vxprint -g ora1 data2
    TY NAME         ASSOC      KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0
    v  data2        fsgen      ENABLED  4194304  -        ACTIVE   -       -
    pl data2-01     data2      ENABLED  4194304  -        ACTIVE   -       -
    sd ora_disk1-03 data2-01   ENABLED  2097152  0        -        -       -
    sd ora_spare-02 data2-01   ENABLED  2097152  0        -        -       -
    pl data2-02     data2      ENABLED  4194304  -        ACTIVE   -       -
    sd ora_disk2-03 data2-02   ENABLED  2097152  0        -        -       -
    sd ora_disk4-03 data2-02   ENABLED  2097152  0        -        -       -
    pl data2-03     data2      ENABLED  LOGONLY  -        ACTIVE   -       -
    sd ora_disk1-02 data2-03   ENABLED  66       LOG      -        -       -
    root@hpeos003[]
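With many volumes, picking the relocated subdisks out of a long vxprint listing by eye is error-prone. The following awk filter is a rough sketch that lists the subdisks living on a given disk together with their plex. It assumes subdisks carry the default `<diskname>-NN` names seen in this section; `subdisks_on_disk` is a hypothetical helper name, and the sample input is copied from the data2 listing so the example runs without VxVM.

```shell
#!/bin/sh
# List "<subdisk> <plex>" for subdisks on disk $1, reading vxprint output on stdin.
# Assumption: subdisk names follow the default <diskname>-NN convention.
subdisks_on_disk() {
    awk -v disk="$1" '
        # keep only sd records whose name starts with "<disk>-"
        $1 == "sd" && index($2, disk "-") == 1 { print $2, $3 }
    '
}

# Sample input copied from the data2 listing in this section.
subdisks_on_disk ora_spare <<'EOF'
v  data2        fsgen      ENABLED  4194304  -        ACTIVE   -       -
pl data2-01     data2      ENABLED  4194304  -        ACTIVE   -       -
sd ora_disk1-03 data2-01   ENABLED  2097152  0        -        -       -
sd ora_spare-02 data2-01   ENABLED  2097152  0        -        -       -
pl data2-02     data2      ENABLED  4194304  -        ACTIVE   -       -
sd ora_disk2-03 data2-02   ENABLED  2097152  0        -        -       -
EOF
```

On a live system the same function could be fed directly, for example `vxprint -g ora1 | subdisks_on_disk ora_spare`.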

Once we have replaced the disk, we can un-relocate the subdisks. This is entirely up to the administrator, but it is a good idea: if the design had specific High Availability and/or Performance attributes, un-relocating returns the configuration to its original state. The process is IO intensive because the subdisks are moved back to their original location, so we may decide to wait for a quiet, off-line period before running the vxunreloc command.

 

    root@hpeos003[] /etc/vx/bin/vxunreloc -g ora1 ora_disk3 &
    root@hpeos003[]

Finally, we should take care in choosing which disks to designate as spare disks. If they sit on an interface shared with other high-activity disks, overall IO performance may be significantly degraded during a hot relocation. However, where interfaces are scarce, most administrators will deem the benefits to High Availability worth that cost, given that we hope never to need the Spare Disk.



HP-UX CSE(c) Official Study Guide and Desk Reference
Year: 2006
Pages: 434
