An important task for any administrator is to be able to recover volumes in the event of losing a physical disk. If we have employed redundancy techniques for all our volumes, we can sustain the loss of a single disk. With LVM, we had to get involved with commands like vgcfgrestore. VxVM has an equivalent command, dgcfgrestore, and its sister command dgcfgbackup. We can run these commands at any time. They create a file in the directory /etc/vxvmconf. It is worth making sure that this directory exists, because the dgcfgbackup command fails if it doesn't:

root@hpeos003[] dgcfgbackup ora1
mv: /etc/vxvmconf/ora1.conf: rename: No such file or directory
root@hpeos003[] mkdir /etc/vxvmconf
root@hpeos003[] dgcfgbackup ora1
root@hpeos003[] ll /etc/vxvmconf
total 66
-rw-rw-rw-   1 root       sys          33086 Nov 11 00:33 ora1.conf
root@hpeos003[]
root@hpeos003[] more /etc/vxvmconf/ora1.conf
VxVM_DG_Config_Backup_File: ora1
vol chkpt1
        tutil0=""
        tutil1=""
        tutil2=""
        kstate=ENABLED
        r_all=GEN_DET_SPARSE
        r_some=GEN_DET_SPARSE
        w_all=GEN_DET_SPARSE
        w_some=GEN_DET_SPARSE
        lasterr=0
        use_type=fsgen
        fstype=""
        comment=""
        putil0=""
        putil1=""
        putil2=""
        state="ACTIVE"
        writeback=on
        writecopy=off
        specify_writecopy=off
        logging=off
        has_logs=off
root@hpeos003[]

We need to use the dgcfgrestore command when we have initialized disks without the ability to store the configuration database, or when we have a single-disk disk group. In most cases, we have disk groups of more than one disk. In such situations, if we lose a physical disk, we don't need the dgcfgrestore command: as soon as we add the repaired disk back into the disk group, the configuration information stored on every disk in the disk group is copied to the new disk. Here's an example where I have lost the disk ora_disk3 (= c4t12d0).
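Since dgcfgbackup fails when /etc/vxvmconf is missing, it is worth wrapping it in a small guard. A minimal sketch, with some loudly-labeled assumptions: the dgcfgbackup call is echoed rather than executed so the sketch runs on systems without VxVM, and CONF_DIR defaults to a scratch path instead of the real /etc/vxvmconf.

```shell
# Sketch: guard against the dgcfgbackup failure shown above by creating the
# config directory first. The dgcfgbackup call is echoed, not executed, so
# this is side-effect free; on HP-UX set CONF_DIR=/etc/vxvmconf and drop
# the echo.
CONF_DIR=${CONF_DIR:-/tmp/vxvmconf}          # real path: /etc/vxvmconf
[ -d "$CONF_DIR" ] || mkdir -p "$CONF_DIR"   # dgcfgbackup fails without it

for dg in ora1; do                           # on a live system: each disk group
    echo "dgcfgbackup $dg"                   # remove echo to take the backup
done
```

On a live system, the single-element `for` list would be replaced by an enumeration of the host's disk groups.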
The first time I try to perform I/O to the disk and the I/O times out, we will see errors appear in syslog of the following form:

Nov 11 01:20:21 hpeos003 vmunix: NOTICE: vxvm:vxdmp: disabled path 31/0x4c000 belonging to the dmpnode 0/0xc
Nov 11 01:20:21 hpeos003 vmunix: NOTICE: vxvm:vxdmp: disabled dmpnode 0/0xc
Nov 11 01:20:21 hpeos003 vmunix: WARNING: vxvm:vxio: Subdisk ora_disk3-01 block 0: Uncorrectable read error

If you look closely at the errors, you can deduce where the problem lies: 31/0x4c000 is the major/minor number of the path that failed, and we can see errors relating to the names of subdisks. A message is usually sent to the root user as well:

root@hpeos003[]
vxvm:vxconfigd: NOTICE: Offlining config copy 1 on disk c4t12d0:
        Reason: Disk write failure
vxvm:vxconfigd: NOTICE: Offlining config copy 2 on disk c4t12d0:
        Reason: Disk write failure
vxvm:vxconfigd: NOTICE: Detached disk ora_disk3
You have mail in /var/mail/root
root@hpeos003[]

VxVM also sends an email to the root user:

Relocation was not successful for subdisks on disk ora_disk3 in volume archive in disk group ora1. No replacement was made and the disk is still unusable.

The following volumes have storage on ora_disk3:

data2
archive

These volumes are still usable, but the redundancy of those volumes is reduced. Any RAID 5 volumes with storage on the failed disk may become unusable in the face of further failures.

The following volumes:

dbvol
logvol

have data on ora_disk3 but have no other usable mirrors on other disks. These volumes are now unusable and the data on them is unavailable. These volumes must have their data restored.

The disk will now be flagged as being offline and disabled. A FAILED disk is one on which VxVM cannot read either the private or the public region. A FAILING disk is one on which VxVM can still read the private region. Affected plexes are marked with a state of IOFAIL.
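The deduction above — pulling the failed path's major/minor number and the affected subdisk out of the vxdmp/vxio messages — can be scripted. A minimal sketch using awk, with the sample syslog lines from this section as canned input; the field positions ($10 and $9) are assumptions tied to that exact message layout, and on a live system you would read /var/adm/syslog/syslog.log instead of the here-document.

```shell
# Sketch: summarize vxdmp/vxio failure messages. Field numbers assume the
# message format shown in the text above.
msgs='Nov 11 01:20:21 hpeos003 vmunix: NOTICE: vxvm:vxdmp: disabled path 31/0x4c000 belonging to the dmpnode 0/0xc
Nov 11 01:20:21 hpeos003 vmunix: WARNING: vxvm:vxio: Subdisk ora_disk3-01 block 0: Uncorrectable read error'

summary=$(printf '%s\n' "$msgs" | awk '
/vxvm:vxdmp: disabled path/ { print "failed path (major/minor): " $10 }
/Uncorrectable read error/  { print "affected subdisk: " $9 }')

echo "$summary"
# prints:
#   failed path (major/minor): 31/0x4c000
#   affected subdisk: ora_disk3-01
```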
If possible, subdisks will be relocated to spare disks (more on that later):

root@hpeos003[] vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0       simple    -            -            LVM
c0t1d0       simple    -            -            LVM
c0t2d0       simple    -            -            LVM
c0t3d0       simple    -            -            LVM
c0t4d0       simple    ora_disk1    ora1         online
c0t5d0       simple    ora_disk2    ora1         online
c1t15d0      simple    -            -            LVM
c3t15d0      simple    disk01       rootdg       online
c4t8d0       simple    -            -            LVM
c4t9d0       simple    -            -            LVM
c4t10d0      simple    -            -            LVM
c4t11d0      simple    -            -            LVM
c4t12d0      simple    -            -            online
c4t13d0      simple    ora_disk4    ora1         online
c4t14d0      simple    -            -            online invalid
c5t0d0       simple    -            -            LVM
c5t1d0       simple    -            -            LVM
c5t2d0       simple    -            -            LVM
c5t3d0       simple    -            -            LVM
c5t4d0       simple    -            -            online
c5t5d0       simple    -            -            online
-            -         ora_disk3    ora1         failed was:c4t12d0
root@hpeos003[]

We can query the status of the disk as well as the state of volumes to see which volumes are still online and active:

root@hpeos003[] vxdisk list c4t12d0
vxvm:vxdisk: ERROR: Device c4t12d0: get_contents failed:
        Disk device is offline
Device:    c4t12d0
devicetag: c4t12d0
type:      simple
flags:     online error private autoconfig
pubpaths:  block=/dev/vx/dmp/c4t12d0 char=/dev/vx/rdmp/c4t12d0
Multipathing information:
numpaths:  1
c4t12d0    state=disabled
root@hpeos003[]
root@hpeos003[] vxprint -g ora1
TY NAME          ASSOC       KSTATE    LENGTH    PLOFFS  STATE     TUTIL0  PUTIL0
dg ora1          ora1        -         -         -       -         -       -

dm ora_disk1     c0t4d0      -         71682048  -       -         -       -
dm ora_disk2     c0t5d0      -         71682048  -       -         -       -
dm ora_disk3     -           -         -         -       NODEVICE  -       -
dm ora_disk4     c4t13d0     -         71682048  -       -         -       -

v  archive       raid5       ENABLED   4194304   -       ACTIVE    -       -
pl archive-01    archive     ENABLED   4194304   -       ACTIVE    -       -
sd ora_disk3-06  archive-01  DISABLED  2097152   0       NODEVICE  -       -
sd ora_disk2-02  archive-01  ENABLED   2097152   0       -         -       -
sd ora_disk4-04  archive-01  ENABLED   2097152   0       -         -       -
pl archive-02    archive     ENABLED   1440      -       LOG       -       -
sd ora_disk1-04  archive-02  ENABLED   1440      0       -         -       -

v  chkpt1        fsgen       ENABLED   5242880   -       ACTIVE    ATT1    -
pl chkpt1-01     chkpt1      ENABLED   5242880   -       ACTIVE    -       -
sd ora_disk4-01  chkpt1-01   ENABLED   5242880   0       -         -       -
pl chkpt1-02     chkpt1      ENABLED   5242880   -       STALE     ATT     -
sd ora_disk1-06  chkpt1-02   ENABLED   5242880   0       -         -       -

v  chkpt2        fsgen       ENABLED   102400    -       ACTIVE    -       -
pl chkpt2-01     chkpt2      ENABLED   102400    -       ACTIVE    -       -
sd ora_disk4-02  chkpt2-01   ENABLED   102400    0       -         -       -
pl chkpt2-02     chkpt2      DISABLED  102400    -       RECOVER   -       -
sd ora_disk1-07  chkpt2-02   ENABLED   102400    0       -         -       -

v  data2         fsgen       ENABLED   4194304   -       ACTIVE    -       -
pl data2-01      data2       DISABLED  4194304   -       NODEVICE  -       -
sd ora_disk1-03  data2-01    ENABLED   2097152   0       -         -       -
sd ora_disk3-02  data2-01    DISABLED  2097152   0       NODEVICE  -       -
pl data2-02      data2       ENABLED   4194304   -       ACTIVE    -       -
sd ora_disk2-03  data2-02    ENABLED   2097152   0       -         -       -
sd ora_disk4-03  data2-02    ENABLED   2097152   0       -         -       -
pl data2-03      data2       ENABLED   LOGONLY   -       ACTIVE    -       -
sd ora_disk1-02  data2-03    ENABLED   66        LOG     -         -       -

v  data3         fsgen       ENABLED   4194304   -       ACTIVE    -       -
pl data3-03      data3       ENABLED   4194304   -       ACTIVE    -       -
sv data3-S01     data3-03    DISABLED  2097152   0       -         -       -
sv data3-S02     data3-03    ENABLED   2097152   0       -         -       -

v  data3-L01     fsgen       DISABLED  2097152   -       ACTIVE    -       -
pl data3-P01     data3-L01   DISABLED  2097152   -       ACTIVE    -       -
sd ora_disk1-05  data3-P01   ENABLED   2097152   0       -         -       -
pl data3-P02     data3-L01   DISABLED  2097152   -       ACTIVE    -       -
sd ora_disk2-05  data3-P02   ENABLED   2097152   0       -         -       -

v  data3-L02     fsgen       ENABLED   2097152   -       ACTIVE    -       -
pl data3-P03     data3-L02   DISABLED  2097152   -       RECOVER   -       -
sd ora_disk1-08  data3-P03   ENABLED   2097152   0       -         -       -
pl data3-P04     data3-L02   ENABLED   2097152   -       ACTIVE    -       -
sd ora_disk4-05  data3-P04   ENABLED   2097152   0       -         -       -

v  dbvol         fsgen       DISABLED  10485760  -       ACTIVE    -       -
pl dbvol-01      dbvol       DISABLED  10485792  -       NODEVICE  -       -
sd ora_disk1-01  dbvol-01    ENABLED   3495264   0       -         -       -
sd ora_disk2-01  dbvol-01    ENABLED   3495264   0       -         -       -
sd ora_disk3-01  dbvol-01    DISABLED  3495264   0       NODEVICE  -       -

v  logvol        fsgen       DISABLED  31457280  -       ACTIVE    -       -
pl logvol-01     logvol      DISABLED  31457280  -       NODEVICE  -       -
sd oralog01      logvol-01   ENABLED   10485760  0       -         -       -
sd oralog02      logvol-01   DISABLED  10485760  0       NODEVICE  -       -
sd oralog03      logvol-01   ENABLED   10485760  0       -         -       -
root@hpeos003[]

Volumes that are still ENABLED are said to be
redundant, i.e., they have redundancy (mirroring, RAID 5) built into their configuration. Volumes that are DISABLED are said to be non-redundant. When we recover from this situation, any non-redundant volumes will have data missing from them, which we will have to recover from a previous backup. The recovery process we are about to go through is similar, in theory, to recovering LVM structures: we recover the structure of the disk group (the private region). Recovering the data is either the job of mirroring/RAID 5 or a job for our backup tapes.
- Replace the failed disk with a new one. The new disk need not be attached at the same hardware path but should be of the same size and specification as the original. We then initialize the disk:

root@hpeos003[] vxdisk init c4t12d0 nlog=2 nconfig=2
root@hpeos003[]
- Attach the new disk to the disk group using the original disk media name:

root@hpeos003[] vxdg -g ora1 -k adddisk ora_disk3=c4t12d0
root@hpeos003[]
root@hpeos003[] vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0       simple    -            -            LVM
c0t1d0       simple    -            -            LVM
c0t2d0       simple    -            -            LVM
c0t3d0       simple    -            -            LVM
c0t4d0       simple    ora_disk1    ora1         online
c0t5d0       simple    ora_disk2    ora1         online
c1t15d0      simple    -            -            LVM
c3t15d0      simple    disk01       rootdg       online
c4t8d0       simple    -            -            LVM
c4t9d0       simple    -            -            LVM
c4t10d0      simple    -            -            LVM
c4t11d0      simple    -            -            LVM
c4t12d0      simple    ora_disk3    ora1         online
c4t13d0      simple    ora_disk4    ora1         online
c4t14d0      simple    -            -            online invalid
c5t0d0       simple    -            -            LVM
c5t1d0       simple    -            -            LVM
c5t2d0       simple    -            -            LVM
c5t3d0       simple    -            -            LVM
c5t4d0       simple    -            -            online
c5t5d0       simple    -            -            online
root@hpeos003[]
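This step can be verified mechanically: once the adddisk succeeds, the disk media name should be bound to a device again and the `failed was:` row should be gone. A small sketch that checks this by parsing vxdisk-list-style output; the input is canned sample data from this section, so it runs anywhere (on a live system you would pipe `vxdisk list` into the awk instead).

```shell
# Sketch: confirm ora_disk3 is bound to a device again by parsing
# 'vxdisk list'-style output (sample rows from the session above).
status=$(awk '$3 == "ora_disk3" { print $1, $5 }' <<'EOF'
DEVICE   TYPE    DISK       GROUP  STATUS
c4t12d0  simple  ora_disk3  ora1   online
c4t13d0  simple  ora_disk4  ora1   online
EOF
)
echo "ora_disk3: $status"
# prints: ora_disk3: c4t12d0 online
```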
- Recover all redundant volumes:

root@hpeos003[] vxrecover -bs
root@hpeos003[]

This can take some time to complete, depending on the number of volumes that need recovering as well as the use of DRL for mirroring.
- Start non-redundant volumes. Non-redundant volumes will remain DISABLED:

root@hpeos003[] vxprint -g ora1 dbvol logvol
TY NAME          ASSOC      KSTATE    LENGTH    PLOFFS  STATE    TUTIL0  PUTIL0
v  dbvol         fsgen      DISABLED  10485760  -       ACTIVE   -       -
pl dbvol-01      dbvol      DISABLED  10485792  -       RECOVER  -       -
sd ora_disk1-01  dbvol-01   ENABLED   3495264   0       -        -       -
sd ora_disk2-01  dbvol-01   ENABLED   3495264   0       -        -       -
sd ora_disk3-01  dbvol-01   ENABLED   3495264   0       -        -       -

v  logvol        fsgen      DISABLED  31457280  -       ACTIVE   -       -
pl logvol-01     logvol     DISABLED  31457280  -       RECOVER  -       -
sd oralog01      logvol-01  ENABLED   10485760  0       -        -       -
sd oralog02      logvol-01  ENABLED   10485760  0       -        -       -
sd oralog03      logvol-01  ENABLED   10485760  0       -        -       -
root@hpeos003[]

The state of RECOVER means that VxVM knows the data in that plex needs recovering. Because we have no other plexes from which to recover this volume, we have no choice but to force-start the volume in order to start a process of recovering the data from some form of backup tape:

root@hpeos003[] vxvol -g ora1 -f start dbvol
root@hpeos003[] vxvol -g ora1 -f start logvol
root@hpeos003[] vxinfo -p -g ora1
vol  dbvol        fsgen     Started
plex dbvol-01     ACTIVE
vol  data3-L02    fsgen     Started
plex data3-P03    ACTIVE
plex data3-P04    ACTIVE
vol  data3-L01    fsgen     Started
plex data3-P01    ACTIVE
plex data3-P02    ACTIVE
vol  data3        fsgen     Started
plex data3-03     ACTIVE
vol  data2        fsgen     Started
plex data2-01     ACTIVE
plex data2-02     ACTIVE
plex data2-03     ACTIVE
vol  chkpt2       fsgen     Started
plex chkpt2-01    ACTIVE
plex chkpt2-02    ACTIVE
vol  chkpt1       fsgen     Started
plex chkpt1-01    ACTIVE
plex chkpt1-02    ACTIVE
vol  logvol       fsgen     Started
plex logvol-01    ACTIVE
vol  archive      raid5     Started
plex archive-01   ACTIVE
plex archive-02   LOG
root@hpeos003[]
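Finding the volumes that still need a forced start can be automated by filtering vxprint output for volume rows left DISABLED. A hedged sketch: the input is vxprint-style sample data from this section (not live output), and the generated vxvol commands are echoed rather than executed.

```shell
# Sketch: list volumes left DISABLED after 'vxrecover -bs' -- the
# non-redundant ones that need a forced start. Column positions assume
# vxprint's default layout (TY NAME ASSOC KSTATE ...).
to_start=$(awk '$1 == "v" && $4 == "DISABLED" { print $2 }' <<'EOF'
TY NAME          ASSOC      KSTATE    LENGTH    PLOFFS  STATE    TUTIL0  PUTIL0
v  dbvol         fsgen      DISABLED  10485760  -       ACTIVE   -       -
pl dbvol-01      dbvol      DISABLED  10485792  -       RECOVER  -       -
v  logvol        fsgen      DISABLED  31457280  -       ACTIVE   -       -
pl logvol-01     logvol     DISABLED  31457280  -       RECOVER  -       -
EOF
)
for vol in $to_start; do
    echo "vxvol -g ora1 -f start $vol"   # drop the echo on a live system
done
```

Note the `$1 == "v"` guard: plex (pl) rows are also DISABLED here, but only volume rows should be force-started.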
- Recover the data for non-redundant volumes. Because we have lost a large chunk of data from the volume, it is likely we will need to recover the entire volume from backup tapes. If the volume contained a filesystem, we will need to fix the filesystem (run the fsck command), mount it, and then recover the data from tape.

If this was a FAILING disk, the process of recovery is slightly different:
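The fix-mount-restore sequence for a non-redundant volume can be sketched as below. Everything here is an assumption to be adapted: the disk group and volume names come from this section, but the mount point and the frecover invocation are illustrative placeholders for whatever your site's restore procedure is, and every command is echoed so the sketch is side-effect free.

```shell
# Sketch of the restore sequence for a non-redundant volume after a forced
# start. DG/VOL are from the text; MNT and the tape-restore command are
# hypothetical. 'run' echoes each command instead of executing it.
DG=ora1; VOL=dbvol; MNT=/oradata             # MNT is a placeholder
run() { echo "$@"; }                         # change body to "$@" to execute

run vxvol -g "$DG" -f start "$VOL"           # force-start (no good plex left)
run fsck -F vxfs -y "/dev/vx/rdsk/$DG/$VOL"  # repair the filesystem
run mount -F vxfs "/dev/vx/dsk/$DG/$VOL" "$MNT"
run frecover -x -i "$MNT"                    # restore from tape (site-specific)
```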
- Establish that the disk is producing intermittent faults. This is tricky to diagnose. If you are seeing multiple SCSI lbolt errors, or if you see NO_HW listed in ioscan output, it may be that a cable/connector is malfunctioning. On a SAN, it may be that a switch port is malfunctioning. In this situation, hardware troubleshooting comes to the fore. This can be time consuming and costly if we need to replace components. If it is simply a loose cable, we can force HP-UX to rescan for devices, i.e., run ioscan -fnC disk.
- Force VxVM to reread the private region of all disks: vxdctl enable.
- Reattach the device to the disk media record: vxreattach.
- Recover the redundant volumes: vxrecover.
- Restart the non-redundant volumes: vxvol -g <disk group> -f start <volume>.
- Recover non-redundant volumes. This can involve fixing the filesystems (running the fsck command) and possibly recovering corrupt data files from backup tapes.

If this is happening on a regular (or mostly regular) basis, I would consider having a hardware engineer perform some diagnostic testing on the device and try to schedule some planned downtime in order to replace it.

Knowing and understanding Kernel and Volume/Plex states is an important part of failed/failing disk administration. Volumes and plexes will see these states change depending on the actions we take. Here are the Kernel states we see with vxprint (in Table 7-4):

Table 7-4. Kernel states

State    | Description
ENABLED  | The object is able to perform I/O to both the public and private regions.
DETACHED | Considered the maintenance mode, where plex operations and low-level instructions to the private region are possible. I/O to the public region is not possible.
DISABLED | No I/O is possible to the object. The object is effectively offline.

Associated with these kernel states, we have Volume and Plex states (see Table 7-5). Together, the Kernel and Volume/Plex states should give us some idea of what actions to take next.

Table 7-5. Volume/Plex states

State    | Object      | Description
CLEAN    | Volume/Plex | The object has a good copy of the data. This is the normal state for a stopped volume. A volume that has been stopped by an administrator will show a state of DISABLED/CLEAN. We can use the vxvol start command to enable I/O to the volume.
ACTIVE   | Volume/Plex | Indicates the object is or was started and able to perform I/O. For a fully functioning volume, we are aiming for all objects to be ENABLED/ACTIVE. The combination of kernel and volume/plex states determines the next action.
STALE    | Plex        | The data in the plex is not synchronized with the data in a CLEAN plex.
OFFLINE  | Plex        | Usually the result of the vxmend off command issued by an administrator. No I/O is performed to the plex, so it becomes outdated over time. When brought back online (vxmend on), the plex state changes to STALE.
NODEVICE | Volume/Plex | No plexes have an accessible disk below them, or the disk below the plex has failed.
IOFAIL   | Plex        | I/O to the public region failed. VxVM must still determine whether the disk has actually failed, because I/O to the private region is still possible. May indicate a FAILING disk.
RECOVER  | Plex        | Once a failed disk has been fixed and returned to the disk group, a previously ACTIVE plex will be marked RECOVER. If the volume is redundant, we can recover the plex from a CLEAN plex.
REMOVED  | Plex        | Same as NODEVICE, except the removal was performed manually by an administrator.
SYNC     | Volume      | Plexes are involved in resynchronization activities.
NEEDSYNC | Volume      | Same as SYNC, except that the read thread to perform the synchronization has not been started yet.
EMPTY    | Volume/Plex | Usually only seen when creating a volume using vxmake. A plex has not yet been defined as having good CLEAN data.
SNAPDONE | Plex        | Same as ACTIVE/CLEAN, but for a plex synchronized by the snapstart operation.
SNAPATT  | Plex        | A snapshot object that is currently being synchronized (STALE).
TEMP     | Plex/Volume | Usually only seen during other synchronization operations. Volumes/plexes in this state should not be used.

Simply knowing these states is not enough to perform credible recovery of a failed disk. We need to understand and be able to react to different combinations of kernel and volume/plex states (see Table 7-6). Here are some common combinations and an appropriate Next Step. These Next Steps should not be viewed in isolation; some are appropriate for redundant volumes (e.g., vxrecover), while others are appropriate for non-redundant volumes (e.g., vxvol -f start):

Table 7-6.
Kernel/Volume States and the Next Step

Kernel/Volume or Plex State        | Next Step
DISABLED/NODEVICE                  | For a FAILING disk: # vxdctl enable; # vxreattach; # vxrecover. For a FAILED disk: # vxdisk init; # vxdg -k adddisk; # vxrecover; # vxvol -f start
DISABLED/IOFAIL, DETACHED/IOFAIL   | # vxrecover
DISABLED/STALE                     | # vxrecover
DISABLED/ACTIVE                    | # vxrecover -s
DISABLED/OFFLINE                   | # vxmend on
DISABLED/REMOVED                   | # vxdg -k adddisk

The use of the vxmend command is discussed in the Veritas literature. The vxmend command can change the state of volumes and plexes as required, e.g., changing the state of a STALE plex to CLEAN via the vxmend fix clean command. This can be useful but also very dangerous. When synchronizing a volume, we want to synchronize from a CLEAN plex to all STALE plexes. Deciding which plex has the good data can be quite difficult; we would need some underlying application utility to analyze the data in the volume, which is not trivial. If such a situation is possible, then we could do the following:
- Set the state of all plexes to STALE.
- Set the state of the good plex to CLEAN.
- Recover the volume with vxrecover -s.
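The three steps above can be sketched as a single command sequence. Hedged assumptions: the disk group (dg1), volume (vol1), and plex names (vol1-01 as the known-good plex, vol1-02) are hypothetical placeholders, and the commands are collected and echoed rather than executed, precisely because running vxmend fix against the wrong plex destroys good data.

```shell
# Sketch of the vxmend procedure above, for a hypothetical stopped volume
# 'vol1' in disk group 'dg1' whose good data is on plex vol1-01.
# Commands are echoed only; review, then run them by hand on a live system.
cmds=$(
  for plex in vol1-01 vol1-02; do
    echo "vxmend -g dg1 fix stale $plex"   # 1. mark every plex STALE
  done
  echo "vxmend -g dg1 fix clean vol1-01"   # 2. mark the good plex CLEAN
  echo "vxrecover -g dg1 -s vol1"          # 3. start + sync from the CLEAN plex
)
echo "$cmds"
```

The order matters: only after every plex is STALE can the single good plex be promoted to CLEAN, so that the recovery in step 3 has an unambiguous source.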