As pointed out at the beginning of this chapter, there are a number of benefits to constraints and referential integrity. They are easy to configure, all client applications must abide by them, and they can improve the overall integrity of your data.
But the ease with which constraints can be added to a data dictionary can lead to their overuse. When overused, they can prevent valid data from being entered into your database. And in the case of referential integrity constraints, they can produce an unexpected increase in resource use on the server.
The following sections take a final look at each of the type of constraints provided by a data dictionary. In each case, the limits of each approach are discussed.
Field-level constraints permit you to improve the integrity of data in individual columns by defining meaningful limits to the range of values that a field can store. They are especially useful for fields designed to hold data that have a fixed range. For example, a field designed to hold the relative humidity of a meteorological measurement has a fixed range of between 0 and 100. Relative humidity cannot be lower than 0, nor can it be greater than 100 percent. Placing a minimum and maximum value on a relative humidity field of 0 and 100, respectively, makes some sense.
But some domains do not have clear limits. For example, imagine that you have a date of birth (DOB) field in your employee table. Should you put a minimum date on the field? Probably not. In this case, the use of a minimum value on a date of birth field doesn’t help your data much. On the one hand, if you set the minimum value on a date of birth field to 1/1/1920, you might actually prevent an employee’s valid birthday from being entered. On the other, permitting DOB values as early as 1/1/1880 does little to prevent inaccurate data.
The issue here is fixed limits versus artificial ones. Percentages, such as relative humidity, have well-defined limits. When a field represents a domain that does not have physical limits, it is probably best to avoid using minimum and maximum limits. Instead, you may want to employ client-side code to ask the user to confirm data values that lie in extreme, yet potentially valid, ranges.
You should also be careful when setting the Null Valid field to False, meaning that NULL values are not allowed for a particular field. In reality, most tables have only a few fields whose values absolutely must be entered before you can accept the record. Most of these are fields that uniquely identify a record, such as the fields associated with a table’s primary key.
What you want to avoid doing is setting Null Valid to False for fields that are not actually essential. For example, you may need the invoice date in your invoice table. But if the end user does not have the order date when they are entering the record, requiring them to supply a non-null value for this field either provokes them to enter fake data or prevents them from entering the record altogether.
Record-level constraints are more flexible than field-level constraints in that they can include some basic conditional logic. For example, a record-level constraint can be used to verify that a credit card field has been assigned a value, but only when the payment type is credit card.
The problem with record-level constraints is that it can be challenging to embody a complicated condition using a Boolean expression. When you must use a complicated Boolean expression with several conditional parts, it is very important to test the expression thoroughly to ensure that it does not reject records that are valid, and at the same time does not accept invalid records.
An alternative to using a complex record-level constraint is a trigger. Triggers can be written in a variety of high-level languages that can express complex conditional logic more easily than a Boolean expression. In addition, you can embed comments in your trigger code that identify what exactly your conditions are testing for. This is not something that you can do with a record-level constraint. Using triggers is described in Chapter 8.
Referential integrity definitions are valuable, but they tend to get overused. Most database applications contain many related tables. But just because two tables contain related data does not mean that there is a parent/child relationship that needs to be managed through referential integrity.
There are two primary problems caused by the overuse of referential integrity definitions. The first is that few related tables require the tight controls imposed by referential integrity. Referential integrity restricts your ability to delete and change parent table records, as well as prohibiting the insertion of child records for which there is no parent. When applied to the wrong tables, a user’s attempt to add, edit, or delete data may be rejected, even when that operation is legitimate.
The second problem is related to resources. Whenever a table involved in a referential integrity definition is written to, the associated tables and their auto-open indexes are opened. This happens even if you are making a change only to the one table you opened.
Extensive use of referential integrity may require you to configure ADS to use additional work areas to handle the increases in table usage, and can also make it more difficult to obtain an exclusive lock on a table. (A table opened as a result of a referential integrity definition cannot be opened exclusively by another user.)
The solution is to use referential integrity sparingly. Most applications have only one or two pairs of tables whose related data is so critical to the application that referential integrity definitions are required. For many other related but nonessential tables, the data integrity provided by referential integrity is not worth the restrictions and increased resource usage they produce.
In the next chapter you will learn how to create and use views.