Like file systems and databases, a directory's data is stored on disk drives. This data can become damaged for various reasons, including hardware failures, software bugs, and administrative mistakes.
You can back up your directory data in two ways: by using traditional backup techniques such as tape backups and disk mirroring, or by using directory replication. Each approach has advantages, and a comprehensive backup plan benefits from using both approaches simultaneously. We'll discuss the advantages of each approach throughout this chapter.

Backing Up and Restoring Directory Data Using Traditional Techniques

Just like user files stored on a file server, your directory data should be backed up periodically to some sort of medium such as magnetic tape. You can also back up to an alternate disk drive locally or over a network. The backups can be used to restore the data in the event of data damage or loss. However, backing up a directory is different from backing up a file system in several important ways, as we'll see in the sections that follow.
Your actual backup procedure depends on the particular directory server software you use. To back up Netscape Directory Server 6, you use the db2bak script, which is found in the server instance directory install-root/slapd-instance-name. This script, which can be run while the server is running, copies the database to a backup directory.
You can then copy the database files to backup media such as magnetic tape, or you can copy the backup files to a remote file system. You can also initiate the backup remotely using the db2bak.pl script. Examples that show how to use the db2bak script are included in Chapter 4, Overview of Netscape Directory Server. For more information, see the Netscape Directory Server 6 Administrator's Guide.

By default, the db2bak script creates a new directory under the bak directory each time it is run. You can also specify a destination directory by passing it as an argument to the db2bak script. It is not possible to back up a Netscape Directory Server 6 database directly to tape; you need to take a snapshot using the db2bak script and then archive the snapshot to tape. Therefore, to perform a backup you need enough free disk space to hold a copy of the database. After the backup has been archived to tape and verified, the snapshot can be deleted.

Note: If you have extra free disk space, it is beneficial to keep several days of snapshots online, even after they have been archived to tape. In the event that erroneous data is placed in your directory, the snapshots can be used to restore the directory, speeding the recovery process.

Restoring Directory Data from a Snapshot

To restore a Netscape Directory Server 6 database from a snapshot, you must shut down the server and use the bak2db script, or you can use the bak2db.pl script to initiate a restore operation remotely (the database being restored will be taken offline if you use the bak2db.pl script). Both scripts can be found in the server instance directory install-root/slapd-instance-name. You give the full pathname of the backup directory as the argument to the script (restore the snapshot from tape to disk first, if required). The bak2db script copies the snapshot files into the database directory, including any transaction logs required. After the bak2db script completes, you can start the directory server.
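Putting the snapshot-and-archive procedure together, a nightly backup job might look roughly like the following sketch. The instance path, backup location, and tape device are hypothetical placeholders, not values from this book; adjust them for your installation.

```shell
#!/bin/sh
# Hypothetical paths -- adjust for your installation.
INSTANCE_DIR=/usr/netscape/server6/slapd-phonebook
BACKUP_DIR=/var/backups/ldap/$(date +%Y%m%d)   # one dated snapshot per day
TAPE_DEV=/dev/rmt/0

# Take an online snapshot; db2bak may be run while the server is up.
if [ -x "$INSTANCE_DIR/db2bak" ]; then
    "$INSTANCE_DIR/db2bak" "$BACKUP_DIR"
else
    echo "db2bak not found under $INSTANCE_DIR; check your install path" >&2
fi

# Archive the snapshot to tape, keeping the on-disk copy for fast recovery.
if [ -d "$BACKUP_DIR" ]; then
    tar cf "$TAPE_DEV" "$BACKUP_DIR"
fi

# To restore later (the server must be stopped before running bak2db):
#   "$INSTANCE_DIR/stop-slapd"
#   "$INSTANCE_DIR/bak2db" "$BACKUP_DIR"
#   "$INSTANCE_DIR/start-slapd"
```

Keeping the dated snapshot on disk after the tape archive completes follows the tip above: it gives you a fast recovery path without reading the tape back.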
Examples that show how to use the bak2db script are included in Chapter 4, Overview of Netscape Directory Server.

Backing Up to LDIF Files

What if your directory server software does not support online backup? Another way to back up your directory data is to read all the entries using the Lightweight Directory Access Protocol (LDAP) and write them to disk or tape in LDAP Data Interchange Format (LDIF). For example, you can use the ldapsearch command-line utility to read all directory entries underneath dc=example,dc=com with the following command:

ldapsearch -h directory.example.com -p 389 -D "cn=Directory Manager" -w secret -s sub -b "dc=example,dc=com" "(objectclass=*)"

Note: The ldapsearch command should be typed on a single line; it doesn't appear on one line in this book because of page size constraints.

This sample command connects to the LDAP server running on port 389 on host directory.example.com, authenticates as cn=Directory Manager with the password secret, and performs a subtree search rooted at dc=example,dc=com. (These parameters should be tailored for your environment.) The filter, (objectclass=*), matches any entry. The server will respond by sending all entries within the subtree dc=example,dc=com, and the client will generate LDIF output that can be redirected to a file or tape.

There are some caveats with this approach. First, the ldapsearch command as given would not return any operational attributes such as modifiersName and modifyTimeStamp.
These attributes must be requested explicitly on the command line, as illustrated here:

ldapsearch -h directory.example.com -p 389 -D "cn=Directory Manager" -w secret -s sub -b "dc=example,dc=com" "(objectclass=*)" "*" modifiersName modifyTimeStamp creatorsName createTimeStamp

The additional arguments at the end of the command specify that all user attributes are to be returned (as called for by the asterisk), along with four operational attributes (modifiersName, modifyTimeStamp, creatorsName, createTimeStamp).

The second caveat is that the command-line utility may return entries that do not reside on the specified server. This can happen if your directory is distributed across multiple servers using chaining or referrals (as discussed in Chapter 10, Topology Design). Receiving these entries may be what you want, of course, if your aim is to copy all your directory entries across your whole distributed directory. If this is not what you want, you can use the -R command-line flag to instruct ldapsearch not to follow referrals. If your directory uses chaining, however, there is no way to ask for only the entries local to one server.

A third caveat is that if the directory is modified while the backup is under way, certain inconsistencies may result. For example, if a group references members, those members may not appear in the backup snapshot if they are added while the directory is being backed up.

The final caveat is that data backed up in this manner is likely to be incomplete. Quite a bit of additional information may be stored in the directory to support its operation, such as superior and subordinate knowledge. The ManageDSAIT control, available on LDAPv3-compliant servers, can be used to access this type of data (this control is discussed in Chapter 2, Introduction to LDAP). The bottom line is that using a client to read the data over LDAP may not produce a backup that can be used to recover from complete data loss.
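Combining the pieces above, a scripted LDIF dump (all user attributes plus the four operational attributes, without following referrals, redirected to a dated file) might look like this sketch. The connection parameters are placeholders from the example; replace them with your own before running.

```shell
#!/bin/sh
# Placeholder connection parameters from the example -- tailor to your environment.
HOST=directory.example.com
PORT=389
BINDDN="cn=Directory Manager"
PASSWD=secret
BASE="dc=example,dc=com"
OUTFILE="backup-$(date +%Y%m%d).ldif"

# Refuse to run until the placeholder password has been replaced.
if [ "$PASSWD" = secret ]; then
    echo "edit the placeholder parameters before running this script" >&2
elif command -v ldapsearch >/dev/null 2>&1; then
    # -R: do not follow referrals; "*" requests all user attributes,
    # followed by the operational attributes we want preserved.
    ldapsearch -h "$HOST" -p "$PORT" -D "$BINDDN" -w "$PASSWD" \
        -s sub -b "$BASE" -R "(objectclass=*)" \
        "*" modifiersName modifyTimeStamp creatorsName createTimeStamp \
        > "$OUTFILE"
else
    echo "ldapsearch not found in PATH" >&2
fi
```

Note that option spellings vary among ldapsearch implementations; the flags shown here are the ones used in this chapter's examples.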
If your directory server software supports direct backup of its database files, that method is preferred.

Restoring Data from LDIF Files

To restore data from an LDIF file, you simply shut down the directory server and import the LDIF file. For example, to restore a Netscape Directory Server 6 database from an LDIF file, stop the server using the stop-slapd script, import the LDIF file using the ldif2db script, and start the server using the start-slapd script. All of these scripts are documented in the Netscape Directory Server 6 Administrator's Guide.

Other Things to Back Up

Although backing up your critical directory data is important, additional information should probably also be backed up. For example, you should back up the configuration files for your directory server when you back up your directory data; re-creating them from scratch would be time-consuming. In addition, your directory schema configuration and access control information may reside in separate files or databases; make sure that these are backed up as well. Consult your directory server documentation to learn about other configuration files and data that you should back up.

Tip: When you're maintaining a complex set of configuration files like those in most directory server software, it's beneficial to keep a history of changes made to them. Having such a record allows you to revert to an older version of the configuration if an error is made. One way of creating this history is to use a revision control system such as the Source Code Control System (SCCS), the Revision Control System (RCS), or the Concurrent Versions System (CVS), all of which are available on Unix platforms. Revision control systems allow you to retrieve any previous version of a file's contents and determine who made a particular change. Such tools can also help protect against erroneous configuration changes.
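The stop, import, and restart sequence for an LDIF restore described earlier in this section can be sketched as follows. The instance directory and LDIF path are hypothetical, and the exact arguments ldif2db accepts vary by release, so check the Administrator's Guide before relying on this.

```shell
#!/bin/sh
# Hypothetical paths -- adjust for your installation.
INSTANCE_DIR=/usr/netscape/server6/slapd-phonebook
LDIF_FILE=/var/backups/ldap/backup.ldif

if [ -x "$INSTANCE_DIR/stop-slapd" ] && [ -f "$LDIF_FILE" ]; then
    "$INSTANCE_DIR/stop-slapd"                  # 1. Stop the server.
    "$INSTANCE_DIR/ldif2db" -i "$LDIF_FILE"     # 2. Import the LDIF (flags vary by release).
    "$INSTANCE_DIR/start-slapd"                 # 3. Restart the server.
else
    echo "instance scripts or LDIF file not found; check the paths above" >&2
fi
```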
Using Replication for Backup and Restore

Although traditional backup techniques can protect you against many types of problems, they have one major drawback: restoration of data is time-consuming. Copying data from the backup media to the server may take many minutes or even hours for a large directory. However, if you use replication as your primary means of providing redundancy and fault tolerance, you can avoid the costly downtime that would be required for the restoration phase. Because replicas are online copies of your critical directory data, no delay is incurred while data is being restored from backup media. If a server fails, you can simply remove it from the set of replicas and repair the failure. After the server is repaired, you can bring it back online and re-establish it as a replica. Users will generally be unaware of the problem, as long as there is sufficient capacity across the remaining replicas to handle the client load.

Using replication for backup and restoration has another advantage: directory data is usually more up-to-date on replicas than on backup tapes. Of course, most directories support loose replica consistency; that is, it's possible for changes to be held on a replica for some time before being propagated to other replicas. Thus there is no absolute guarantee that all replicas are completely in sync at any given time.

Directory server software that supports multimaster replication makes this backup procedure simple. Because any server can accept updates in such a configuration, there is no loss of functionality when a server is missing from the replica set. Figure 17.1 shows the process of repairing a failed server in a multimaster environment. In the situation depicted, Replica 3 experiences a failure of its disk drive. The server is removed from the set of replicas, and the disk drive is replaced. The replica is returned to the set of replicas as soon as its directory data is reinitialized from one of the other replicas.
Figure 17.1. Multimaster Replication Provides Protection against Server Failure
If your directory server software supports only single-master replication, the backup process can become more complicated. In single-master replication, only the master server can be updated; all other servers are read-only copies. If the master server fails, no updates can be processed by the directory until the master is repaired and brought back online (see Figure 17.2), or until a different server is designated as the master (see Figure 17.3).

Figure 17.2. Single-Master Replication: A Master Server Fails, Is Repaired, and Is Brought Back Online
Figure 17.3. Single-Master Replication: A Master Server Fails, and a Slave Server Is Converted into a Master
Converting a slave server to a master can be complicated. At the very least, the new master must be configured to send replication updates to all the consumer servers served by the failed master (excluding the new master itself). In addition, it may be necessary to update state information stored in each replica, such as the last update number received from the master, to reflect the update numbers on the new master. Because this process can be complicated, make sure you understand it thoroughly. For more information, consult your directory server software manual.

Using Replication and Traditional Backup Techniques Together

Although replication provides redundancy and high availability in the event of a single server failure, it cannot protect you from incorrect data being placed in your directory. For example, if an automatic data update process begins erroneously deleting entries from your directory, replication unfortunately ensures that these incorrect updates end up on all your replicas! To protect yourself from this type of problem, periodically back up directory data to media such as magnetic tape. Figure 17.4 shows a hybrid approach that uses both backup methods. One of the replicas is equipped with a tape backup unit, which is used to back up the directory contents periodically.

Figure 17.4. A Hybrid Approach to Backup, Incorporating Both Replication and Traditional Backup Techniques
Safeguarding Your Backups

It's important to keep your backup media in a safe place. This means protecting them from damaging environments such as extreme temperatures and magnetic fields. You should also keep copies of your backups off-site. In the event of a disaster such as a fire, the off-site backups can be used to restore data to replacement servers. You can outsource the transportation and storage of off-site backups to reduce costs. We'll discuss disasters and disaster recovery in detail shortly.

Another important aspect of safeguarding your backups is the physical security of the backup media storage facility. No matter how good your privacy and security design, it is all for naught if someone can simply stroll up to the cabinet holding your backup tapes and walk off with a copy of your sensitive directory data. Backups should be well secured; this might mean placing them in a locked, fireproof vault in an off-site facility and using a bonded courier service to transport them.

Finally, consider storing your off-site backups in a location unlikely to be affected by a natural disaster that damages or destroys your main location. For example, an off-site backup storage facility ten miles away may be adequate in the midwestern United States, where the primary natural disaster is severe weather. In California, however, where the primary concern is earthquakes, it may be prudent to store off-site backups at a more distant location, in a region that is less seismically active.

Verifying Your Backups

Backing up your critical directory data is important. However, the act of backing up is futile if the backups can't be restored. You need to take steps to check the integrity of the backups you produce. At the very least, check that your backup media are readable and free of media errors. How you verify your backups depends on the server software you use.
One option is to restore the backup onto a test server, start the server, and then verify that the content of the server is correct. This approach may not always be practical, however. For example, Novell eDirectory copies data from the entire set of distributed servers onto a single backup. Restoring the directory to a test server may be impractical because of the number of partitions in the directory; it may be difficult to fit them all onto a temporary server used for verification. If you run Novell eDirectory, consult your backup software documentation for instructions on verifying backups.

Verify backups immediately after they complete. If the verification fails, determine and fix the cause of the failure, and perform another backup immediately. You may also want to incorporate your backup process into your monitoring system so that you'll know if something goes wrong. Be especially careful to verify your backups fully when you initially develop and deploy your backup procedure or change it in any significant way. Like any other software, backup software can contain bugs. It's always better to discover this, and other flaws in your backup procedures, before you need to restore critical directory data from your backup.
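One lightweight, automatable integrity check, independent of any particular directory server, is to record a checksum manifest when the snapshot is taken and verify it after the archive is read back (or after a test restore). This is a sketch using the common md5sum utility and a hypothetical snapshot directory; it is a supplement to, not a substitute for, restoring to a test server and inspecting the content.

```shell
#!/bin/sh
# Hypothetical snapshot directory -- adjust for your installation.
BACKUP_DIR=/var/backups/ldap/snapshot

if command -v md5sum >/dev/null 2>&1 && [ -d "$BACKUP_DIR" ]; then
    # At backup time: record a checksum for every file in the snapshot.
    ( cd "$BACKUP_DIR" && find . -type f ! -name MANIFEST -exec md5sum {} + > MANIFEST )

    # At verification time (e.g., after reading the archive back from tape):
    if ( cd "$BACKUP_DIR" && md5sum -c MANIFEST >/dev/null ); then
        echo "backup checksums verified"
    else
        echo "backup verification FAILED" >&2
    fi
fi
```

A failed manifest check tells you the media or the copy is bad before you need the backup, which is exactly when you want to find out.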