The purpose of data maintenance is to ensure that the data in your directory service has the highest possible quality. Quality of data has several aspects, but we will focus primarily on the accuracy and timeliness of data. Naturally, you will want to check the quality of your data, both to monitor how well your data maintenance procedures are working and to get an idea of the kind of service you're providing to the users of your directory. Bad data can creep into your directory service from various directions, including the following:
Methods of Checking Quality

There are several methods for checking the quality of data in your directory. The following are three common methods:
Checking a source of truth and spot-checking are techniques that can be used to check the syntactic validity of information even when no source-of-truth database exists or is accessible. For example, you could read all (or a sampling) of the e-mail address attributes in the directory and determine whether they are syntactically valid.

Implications of Checking Quality

It's important to consider how the methods you use to check the quality of your data affect the operation of your directory service. Be sure to choose a method that does not significantly reduce directory performance. Depending on the method you select, you may have to make a trade-off between how often you check for quality and the accuracy of your checking methods. The main concerns in this area are methods that place an excessive load on the directory or cause the directory to be unavailable.

For example, consider a method that requires reading all the entries in your directory over LDAP. Your directory might have the capacity to respond to this kind of request without degrading performance for other users, but then again it might not. If you use a method like this, you can run the check at night or at another off-peak time when the directory has plenty of extra capacity to respond to the data-checking requests. However, such an arrangement may be difficult if your directory operates in a global environment in which there is no off-peak time. Another approach, then, is to create a dedicated directory replica that does nothing but process these data verification tasks.

Consider also a method that requires you to dump your directory's data to a file. Some directory server software allows you to perform this operation without taking the service down, but some does not. If you are planning to use this method, make sure that the software you choose supports online dumps or that your service can tolerate the downtime.
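As a rough illustration of a file-based syntax check, the following Python sketch reads a directory extract and reports e-mail addresses that look malformed, optionally on a random sample of entries only. It rests on several assumptions that are not part of the discussion above: the extract is simplified LDIF with blank-line-separated records, no folded lines, and no base64 values; the attribute is named `mail`; and the pattern `MAIL_RE` is a deliberately loose stand-in for whatever syntax rules your site applies. The function names are hypothetical.

```python
import random
import re

# Deliberately loose pattern: exactly one "@", no whitespace, and a dot in
# the domain part. An illustrative assumption, not a full address validator.
MAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def parse_ldif(text):
    """Parse simplified LDIF text into a list of (dn, {attr: [values]}) pairs.

    Assumes records are separated by blank lines and that the extract
    contains no folded lines and no base64 ("attr:: value") values.
    """
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        dn, attrs = None, {}
        for line in block.splitlines():
            # Skip comments and anything that is not "name: value".
            if not line or line.startswith("#") or ":" not in line:
                continue
            name, _, value = line.partition(":")
            name, value = name.strip(), value.strip()
            if name.lower() == "dn":
                dn = value
            else:
                attrs.setdefault(name.lower(), []).append(value)
        if dn is not None:
            entries.append((dn, attrs))
    return entries


def invalid_mail_entries(entries, sample_size=None, seed=None):
    """Return (dn, value) pairs whose mail attribute looks malformed.

    If sample_size is given and smaller than the number of entries, only a
    random sample of that size is checked (a spot check).
    """
    if sample_size is not None and sample_size < len(entries):
        entries = random.Random(seed).sample(entries, sample_size)
    return [
        (dn, value)
        for dn, attrs in entries
        for value in attrs.get("mail", [])
        if not MAIL_RE.match(value)
    ]
```

Because the check runs against a file rather than the live service, it adds no query load to the directory itself; the cost moves to producing the extract.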
Remember that you have replication to help with the availability problem, so consider taking down a replica to produce the extract instead of taking down the master server. Also consider producing your own extract over LDAP, but be careful that you don't degrade performance, as discussed earlier.

Correcting Bad Data

Whatever method you use to check the quality of your data, be sure to investigate the cause whenever you encounter an error. Identifying the cause will help you correct problems with the system that produced the bad data. Although this kind of investigation can be time-consuming and expensive, it's usually well worth it. You'll often find that many errors are caused by the same underlying problem. Fixing that problem can dramatically increase the quality of your data.

Bad data may be caused by many underlying problems, some of which were already discussed briefly. Systematic errors in programs or procedures should be treated as bugs and corrected. Bad data introduced through human error might be the result of inadequate training or documentation for either users or administrators; improving the quality and coverage of this training and documentation can produce corresponding improvements in the quality of your data. Human error can also be the result of poor software design. Spend time with the users and administrators responsible for updating the directory, and observe the steps they take when maintaining data. Observing others will help you spot flaws in the software and procedures they use.

Finally, even if you can't eliminate poor data coming into your directory, you can mitigate the damage by installing data validation filters. As mentioned earlier, these filters can be installed in the directory clients that users and administrators use to update the directory, or they can be installed in the directory service itself.
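A client-side validation filter of the kind just described might look something like the following Python sketch. Everything specific in it is an assumption made for illustration: the `RULES` table, the two attribute names it covers, the patterns themselves, and the `validate_update` function are all hypothetical, and a real deployment would derive its rules from its own schema and data policies.

```python
import re

# Hypothetical per-attribute rules, keyed by lowercased attribute name.
RULES = {
    "mail": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "telephonenumber": re.compile(r"^\+?[0-9][0-9 ()-]{5,}$"),
}


def validate_update(changes):
    """Return a list of problems found in a proposed directory update.

    `changes` maps attribute names to lists of new values. An empty result
    means the update passed every rule and can be sent to the directory.
    """
    problems = []
    for attr, values in changes.items():
        rule = RULES.get(attr.lower())
        if rule is None:
            continue  # no rule for this attribute, so let it through
        for value in values:
            if not rule.match(value):
                problems.append(f"{attr}: rejected value {value!r}")
    return problems
```

An update tool would call `validate_update` before issuing the modify request and refuse to proceed if the returned list is not empty. The same table of rules could in principle be enforced inside the directory service instead, which protects against clients that skip the check.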