As the Boy Scouts say, "Be Prepared." In planning for the worst, it's a good idea to have certain files on hand, in electronic form, hardcopy, or both. We'll discuss the files and command outputs that are invaluable when rebuilding a broken cluster, especially if you weren't the mastermind behind the original setup.
First, let's look at command output (hardcopy):
/etc/rc.config*
Lists the configuration variables (both cluster and non-cluster specific).
hwmgr -view hierarchy
Shows the entire hardware database list.
hwmgr -view devices
Shows all devices, especially disks.
clu_check_config
Displays configuration data.
clu_get_info -full
Lists important cluster configuration data including member numbers, versions, etc.
cluamgr -s all
Displays the cluster aliases, including their selection priorities, weights, and router priorities.
caa_stat -t -v
Displays the CAA services and their states and where they're running.
volprint -Aht
Lists all of the LSM volumes including their make-up (plexes, subdisks, disks).
File system layout drawing
At a minimum, record the cluster root, /usr, and /var file systems, the quorum disk, and the member boot disks.
Storage map (essential if you have a SAN; you need a picture)
This picture should show the host controllers, switches, and storage controllers, and which ports are used and for what.
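The command outputs above can be captured in one pass and filed away with each change to the cluster. The sketch below is written in portable POSIX sh (it runs under ksh as well); the destination directory is an assumption, and any command not present on the system is simply skipped, so the same script can be staged and tested anywhere.

```shell
# Capture the hardcopy command outputs listed above into a dated directory.
# DEST is a hypothetical location; adjust for your site.
DEST=${DEST:-/tmp/cluster-snapshot/$(date +%Y%m%d)}
mkdir -p "$DEST"

# Keep a copy of the rc.config* files alongside the command outputs.
cp /etc/rc.config* "$DEST"/ 2>/dev/null || :

# One command per line; each output lands in a file named after the command.
printf '%s\n' \
    'hwmgr -view hierarchy' \
    'hwmgr -view devices' \
    'clu_check_config' \
    'clu_get_info -full' \
    'cluamgr -s all' \
    'caa_stat -t -v' \
    'volprint -Aht' |
while read -r line; do
    set -- $line
    if command -v "$1" >/dev/null 2>&1; then
        fname=$(printf '%s' "$line" | tr ' /' '__')
        $line > "$DEST/$fname.txt" 2>&1
    else
        echo "Skipping: $line (command not found)" >&2
    fi
done
```

Print the resulting files, and keep the directory itself on media that survives the cluster.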
Now for the files either backed up onto tape or otherwise accessible when booted from the installation CD:
volsave
Can be used to help recreate the LSM configuration.
sys_check's escalate.tar
Contains a great deal of useful configuration information.
/etc/*
Contains lots of system configuration files and is relatively small.
/var/cluster/caa/
The CAA profiles and configuration.
/cluster/admin
Cluster log files and configuration information.
Configuration data from the boot disk CNX partition.
Save a copy of the clu_bdmgr.conf file, which is located in the /cluster/members/member<ID>/boot_partition/etc directory.
Inventory of site-specific CDSLs.
Save a copy of the /var/adm/cdsl_admin.inv file.
Store the makeup of your AdvFS domains.
# ls -lR /etc/fdmns > etc_fdmns_links
Store the all-important LSM volume makeup in a file.
# volprint -Aht > lsm_volumes
Save the license PAKs
In the Korn shell:
# for i in $( lmf list | awk '/^[A-Z0-9][A-Z0-9].*/ { print $1 }' )
> do
>     lmf issue /.root/PAKS/${i}.pak $i
>     lmf register - < /.root/PAKS/${i}.pak
> done
# lmf reset
In case you have to re-register the licenses.
Save the disk labels
In the Korn shell:
# for i in $( hwmgr view devices \
>     | awk '/dsk/ { print $2 }' \
>     | cut -d/ -f4 )
> do
>     i=${i%c}
>     disklabel -r $i > $i.lbl
> done
In case you have to replace disks.
/var/adm/patch/log
If patches are installed, this log directory tells you which patch kits are present.
/var/evm/adm
If site-defined EVM templates, channels, and filters were created.
/etc/dt
If CDE customization has been done.
/sbin/init.d
If there are site-defined startup scripts.
Initial disk used to create the cluster
Could be used in recovery.
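The small configuration areas in the list above can be swept into a single archive. This sketch uses portable POSIX sh and tar; the archive path is an assumption, and paths that don't exist on a given member are skipped so the script is safe to run anywhere.

```shell
# Archive the small configuration files and directories listed above.
# ARCHIVE is a hypothetical destination; point it at tape or remote storage.
ARCHIVE=${ARCHIVE:-/tmp/cluster-config-backup.tar}

# Paths from the text; skip any that don't exist on this system.
EXISTING=
for p in /etc /var/cluster/caa /cluster/admin \
         /var/adm/patch/log /var/evm/adm /sbin/init.d; do
    [ -e "$p" ] && EXISTING="$EXISTING $p"
done

# Unreadable files produce warnings only; the archive is still created.
tar -cf "$ARCHIVE" $EXISTING 2>/dev/null || :
```

Note that /etc/dt rides along inside /etc, so it needs no separate entry.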
Note that the sys_check(8) utility places some good data in /var/recover/sys_check. This data can also be useful in repairing a damaged system, so consider adding this directory to the list of files and directories to back up.
# ls
apply.ksh    devls       etcfiles     lmf          vfs.stanza
clufiles     devlsL      hsz          map
consolevars  disklabels  inet.stanza  proc.stanza
Finally, keep a complete set of backups of cluster root, /usr, /var, each member boot disk, and, of course, your data. These backups might be necessary if the problem is particularly severe, though in most cases they will not be required.
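A backup you cannot verify is a backup you cannot trust. One cheap precaution, sketched below in portable POSIX sh, is to record checksums of everything you saved so that a restore can be checked file by file; the SAVEDIR path is an assumption, and cksum(1) is used because it is part of every POSIX system.

```shell
# Record checksums of the saved configuration files so a later restore
# can be verified. SAVEDIR is hypothetical; point it at wherever the
# files and command outputs above were collected.
SAVEDIR=${SAVEDIR:-/tmp/cluster-snapshot}
mkdir -p "$SAVEDIR"

# One line per file: CRC, size in bytes, and path.
find "$SAVEDIR" -type f ! -name CHECKSUMS -exec cksum {} + \
    > "$SAVEDIR/CHECKSUMS"

# After a restore, re-run cksum on the restored files and diff the
# result against the saved CHECKSUMS file.
```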