I/O barriers are used in TruCluster Server version 5 to ensure data integrity in the cluster. A system is allowed to access storage only if it is a member of the cluster and the cluster has quorum. Barriers come into play under certain abnormal conditions, such as a member panic or a member hang, and under "normal" conditions such as issuing a halt command, pushing the halt button, or pulling a system's plug. In these situations, CNX requests DRD to impose barriers against writes to most disk units. Note that certain devices do not have barriers erected around them and thus are not suitable for critical data, since the integrity of that data cannot be ensured without the barrier(s). The devices excluded from the barrier list are swap devices, quorum disks, boot devices, and tape drives.
The TruCluster Server barrier mechanism can use the SCSI-3 Persistent Reserve command set for many disks (for example, those on HSZ80, HSG80, and HSV110 storage). In some isolated cases this can cause a problem because the reservations persist: when disks are moved from a cluster to a standalone system, when performing cluster recovery, or when reinstalling. For example, if an HSG80's disks were reallocated from a cluster to a standalone system, that standalone system might not be able to access the disks because of a persistent reservation on those devices.
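The two rules in this paragraph can be written down as a toy model. This is only an illustration of the prose, not how CNX or DRD are implemented; the names DeviceKind, may_access_storage, and suitable_for_critical_data are hypothetical.

```python
from enum import Enum, auto


class DeviceKind(Enum):
    """Hypothetical device categories, taken from the paragraph above."""
    DATA_DISK = auto()
    SWAP = auto()
    QUORUM_DISK = auto()
    BOOT = auto()
    TAPE = auto()


# Devices the text lists as excluded from the barrier list.
UNBARRIERED = frozenset({
    DeviceKind.SWAP,
    DeviceKind.QUORUM_DISK,
    DeviceKind.BOOT,
    DeviceKind.TAPE,
})


def may_access_storage(is_cluster_member, cluster_has_quorum):
    """Access requires both cluster membership and quorum."""
    return is_cluster_member and cluster_has_quorum


def suitable_for_critical_data(kind):
    """Only devices that get a barrier can have their integrity ensured."""
    return kind not in UNBARRIERED


print(may_access_storage(True, False))              # member, but no quorum
print(suitable_for_critical_data(DeviceKind.SWAP))  # swap has no barrier
```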
You can see the persistent reservations by using the "cleanPR show" command.
# cleanPR show
cleanPR Version: 1.5
...
Checking device 5 1 100
Key Entry 0: 0x30001
Key Entry 1: 0x30002
Key Entry 1: 0x30002
Key Entry 3: 0x30001
Key Entry 6: 0x30001
Key Entry 6: 0x30002
Key Entry 6: 0x30001
Key Entry 6: 0x30002
...
Total of 5 devices found w/Persistent Reservations
Total of 0 devices cleared of Persistent Reservations
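If you capture this output to a file, a few lines of Python can reduce it to the distinct keys per device. The sketch below assumes that each "Checking device" and "Key Entry" item sits on its own line and that the three numbers after "Checking device" are bus, target, and LUN; the function name keys_by_device is hypothetical, and the elided portions of the transcript are simply left out of the sample.

```python
import re
from collections import defaultdict

# Excerpt of the "cleanPR show" output above (elided parts omitted).
SAMPLE = """\
Checking device 5 1 100
Key Entry 0: 0x30001
Key Entry 1: 0x30002
Key Entry 1: 0x30002
Key Entry 3: 0x30001
Key Entry 6: 0x30001
Key Entry 6: 0x30002
Key Entry 6: 0x30001
Key Entry 6: 0x30002
"""


def keys_by_device(text):
    """Map each (bus, target, lun) to the set of keys listed under it."""
    found = defaultdict(set)
    current = None
    for line in text.splitlines():
        dev = re.match(r"\s*Checking device (\d+) (\d+) (\d+)", line)
        if dev:
            current = tuple(int(n) for n in dev.groups())
            continue
        key = re.match(r"\s*Key Entry \d+: (0x[0-9a-fA-F]+)", line)
        if key and current is not None:
            found[current].add(int(key.group(1), 16))
    return dict(found)


reservations = keys_by_device(SAMPLE)
for nexus, keys in reservations.items():
    print(nexus, sorted(hex(k) for k in keys))
```

For the sample, the eight key entries collapse to two distinct keys on the device at 5/1/100.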
Or, if you're more curious, you can check the unit on the storage controller itself (an HSG80 in this example) to see if there is a persistent reservation on it.
HSG80> show d100
    LUN                                      Uses             Used by
------------------------------------------------------------------------------
  D100                                       DISK30100
        LUN ID:      6000-1FE1-000B-1A40-0009-0361-3888-0062
        NOIDENTIFIER
        Switches:
          RUN                    NOWRITE_PROTECT        READ_CACHE
          READAHEAD_CACHE        WRITEBACK_CACHE
          MAX_READ_CACHED_TRANSFER_SIZE = 32
          MAX_WRITE_CACHED_TRANSFER_SIZE = 32
        Access:
                ALL
        State:
          ONLINE to this controller
          Persistent reserved
          NOPREFERRED_PATH
        Size:             17769177 blocks
        Geometry (C/H/S): ( 5258 / 20 / 169 )
Did you notice the "Persistent reserved" under "State"? That tells us that unit D100 has a PR. Let's continue to track this particular unit down and see who set the PR. We happen to know that unit D100 on the HSG80 is dsk108 on our cluster (if we didn't know this, we could track it down based on the WWID).
# hwmgr -show scsi -full |grep -E "dsk|6000-1fe1-000b-1a40-0009-0361-3888-0062"
  272:  22       skipper    disk      none    2      4    dsk105 [5/1/105]
  274:  18       skipper    disk      none    2      8    dsk106 [5/1/62]
  563:  11       skipper    disk      none    0      8    dsk108 [5/1/100]
      WWID:01000010:6000-1fe1-000b-1a40-0009-0361-3888-0062
   64:  0        skipper    disk      none    2      1    dsk9   [0/0/0]
   67:  3        skipper    disk      none    2      4    dsk1   [5/1/1]
   68:  4        skipper    disk      none    2      4    dsk2   [5/1/2]
   69:  5        skipper    disk      none    0      5    dsk3   [5/2/3]
   74:  8        skipper    disk      none    0      4    dsk15  [5/1/9]
   75:  9        skipper    disk      none    0      4    dsk8   [5/1/8]
  151:  23       skipper    disk      none    2      4    dsk20  [5/1/20]

# hwmgr -show scsi -full -id 563 | more

        SCSI                DEVICE    DEVICE  DRIVER NUM  DEVICE FIRST
 HWID:  DEVICEID HOSTNAME   TYPE      SUBTYPE OWNER  PATH FILE   VALID PATH
-------------------------------------------------------------------------
  563:  18       gilligan   disk      none    0      8    dsk108 [5/1/100]

      WWID:01000010:6000-1fe1-000b-1a40-0009-0361-3888-0062

      BUS   TARGET  LUN   PATH STATE
      ------------------------------
      5     1       100   valid
      5     2       100   valid
      5     3       100   valid
      5     4       100   valid
      6     1       100   valid
      6     4       100   valid
      6     3       100   valid
      6     2       100   valid
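The WWID lookup done by eye in the first hwmgr listing can also be scripted against saved output. The following is a minimal sketch that assumes each device row and each WWID line sits on its own line, with the WWID line directly following the device it belongs to; find_by_wwid and the dictionary keys are hypothetical names. The comparison is case-insensitive because the HSG80 prints the LUN ID in uppercase while hwmgr prints it in lowercase.

```python
import re

# A few rows of the filtered "hwmgr -show scsi -full" output above.
SAMPLE = """\
  272: 22 skipper disk none 2 4 dsk105 [5/1/105]
  274: 18 skipper disk none 2 8 dsk106 [5/1/62]
  563: 11 skipper disk none 0 8 dsk108 [5/1/100]
      WWID:01000010:6000-1fe1-000b-1a40-0009-0361-3888-0062
   64: 0 skipper disk none 2 1 dsk9 [0/0/0]
"""

# HWID, device special file name, and first valid path (bus/target/lun).
DEVICE_LINE = re.compile(r"\s*(\d+):\s.*\s(dsk\d+)\s+\[(\d+)/(\d+)/(\d+)\]")


def find_by_wwid(text, lun_id):
    """Return the device row that precedes the WWID line matching lun_id."""
    wanted = lun_id.lower()
    last = None
    for line in text.splitlines():
        m = DEVICE_LINE.match(line)
        if m:
            last = {
                "hwid": int(m.group(1)),
                "name": m.group(2),
                "first_valid_path": tuple(int(g) for g in m.group(3, 4, 5)),
            }
        elif "WWID:" in line and wanted in line.lower() and last is not None:
            return last
    return None


# LUN ID exactly as shown by "show d100" on the HSG80.
device = find_by_wwid(SAMPLE, "6000-1FE1-000B-1A40-0009-0361-3888-0062")
print(device)
```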
And by using scu(8), we can see which reservation keys are in use (just as cleanPR show indicated).
# scu
scu> set nexus bus 5 target 1 lun 100
Device: HSG80, Bus: 5, Target: 1, Lun: 100, Type: Direct Access
scu> show reservations

Persistent Reservation Header:

    Generation Value: 28
    Additional Length: 144

Reservation Descriptors:

    Reservation Key: 0x30001
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0

    Reservation Key: 0x30002
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0

    Reservation Key: 0x30002
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0

    Reservation Key: 0x30001
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0

    Reservation Key: 0x30001
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0

    Reservation Key: 0x30002
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0

    Reservation Key: 0x30001
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0

    Reservation Key: 0x30002
    Scope-Specific Address: 0
    Reservation Type: 0x5 (Write Exclusive Registrants Only)
    Reservation Scope: 0 (LU - full logical unit)
    Extent Length: 0
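Eight nearly identical descriptors are hard to scan, so it can help to tally them. The sketch below rebuilds the descriptor list from the key order shown above (to keep the sample short) and counts descriptors per key and the reservation types in use. It assumes one field per line in the saved scu output; summarize is a hypothetical name.

```python
import re
from collections import Counter

# One descriptor as printed by "show reservations"; only the key varies.
DESCRIPTOR = """\
Reservation Key: {key}
Scope-Specific Address: 0
Reservation Type: 0x5 (Write Exclusive Registrants Only)
Reservation Scope: 0 (LU - full logical unit)
Extent Length: 0
"""

# Key order of the eight descriptors in the transcript above.
KEY_ORDER = ["0x30001", "0x30002", "0x30002", "0x30001",
             "0x30001", "0x30002", "0x30001", "0x30002"]
SAMPLE = "".join(DESCRIPTOR.format(key=k) for k in KEY_ORDER)


def summarize(text):
    """Return (descriptor count per key, set of reservation type codes)."""
    keys = Counter(
        int(k, 16)
        for k in re.findall(r"Reservation Key:\s*(0x[0-9a-fA-F]+)", text)
    )
    types = {
        int(t, 16)
        for t in re.findall(r"Reservation Type:\s*(0x[0-9a-fA-F]+)", text)
    }
    return keys, types


key_counts, type_codes = summarize(SAMPLE)
for key, count in sorted(key_counts.items()):
    print(hex(key), count)
print(sorted(hex(t) for t in type_codes))
```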
The reservation keys are based on the CSID (cluster system id); in our case, 0x30001 and 0x30002. For more on the CSID, see section 17.2.3.2.
# clu_get_info -full | grep -E "Hostname|csid"
Hostname = skipper.dec.com
csid = 0x30001
Hostname = gilligan.dec.com
csid = 0x30002
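Putting the two listings together answers "who set the PR": each reservation key can be looked up against the members' CSIDs. This sketch assumes the key is numerically equal to the CSID, which is what this example shows but is not confirmed here as a general rule, and that each "Hostname" line is followed by that member's "csid" line; csid_to_host and owners are hypothetical names.

```python
# Filtered "clu_get_info -full" output from above.
CLU_INFO = """\
Hostname = skipper.dec.com
csid = 0x30001
Hostname = gilligan.dec.com
csid = 0x30002
"""


def csid_to_host(text):
    """Pair every csid value with the Hostname line that precedes it."""
    mapping = {}
    host = None
    for line in text.splitlines():
        name, _, value = line.partition("=")
        name, value = name.strip(), value.strip()
        if name == "Hostname":
            host = value
        elif name == "csid" and host is not None:
            mapping[int(value, 16)] = host
            host = None
    return mapping


def owners(keys, mapping):
    """Label each reservation key with the member whose CSID it equals."""
    return {hex(k): mapping.get(k, "unknown (no matching member)") for k in keys}


members = csid_to_host(CLU_INFO)
result = owners([0x30001, 0x30002, 0x30003], members)
for key, host in result.items():
    print(key, host)
```

The key 0x30003 is made up for the example; it shows how a key with no matching member (for instance, one left behind by a system no longer in the cluster) would stand out.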
If the storage in question is not part of a cluster but a standalone (base operating system) system still cannot access it, the cleanPR command, issued from the standalone system, may be used to clear the persistent reservations.
# cleanPR clean
cleanPR Version: 1.5

WARNING
This shell script will clear all Persistent Reservations from the HSX80 devices attached to this system.
WARNING

Do you wish to proceed ? <y/n> [n]: y

Removing Persistent Reservations from all HSX80 devices...
Checking HSG80 at /dev/rdisk/dsk108a (SCSI #5 (SCSI ID #2) (SCSI LUN #100))
Checking HSG80 at /dev/rdisk/dsk6a (SCSI #5 (SCSI ID #9) (SCSI LUN #15))

Total of 0 devices found w/Persistent Reservations
Total of 0 devices cleared of Persistent Reservations

Warning: Never use cleanPR clean in a live cluster! Doing so compromises the existing I/O barriers and could lead to data corruption.