DBMS-Enforced Data Integrity | Inside Microsoft SQL Server 7.0 (Mps)

A database is only as useful as the user 's confidence in it. That's why the server must enforce data integrity rules and business policies. SQL Server enforces data integrity within the database itself, guaranteeing that complex business policies are followed and that mandatory relationships between data elements are complied with.

Because SQL Server's client/server architecture allows you to use a variety of front-end applications to manipulate and present the same data from the server, it would be cumbersome to encode all the necessary integrity constraints, security permissions, and business rules into each application. If business policies were all coded in the front-end applications, every application would need to be modified every time a business policy changed. Even if you attempted to encode business rules into every client application, the danger of an application misbehaving still exists. Most applications can't be fully trusted. Only the server can act as the final arbiter, and the server must not provide a back door for a poorly written or malicious application to subvert its integrity.

SQL Server uses advanced data integrity features, such as stored procedures, declarative referential integrity (DRI), datatypes, constraints, rules, defaults, and triggers, to enforce data integrity. Each of these features has its own use within a database; combining these integrity features can make your database flexible and easy to manage, yet secure.

Declarative Data Integrity

Enforcing data integrity means SQL Server is making sure that all our data is valid. There are several aspects to this concept of data integrity. First is domain integrity. A central tenet of relational database theory is that every tuple of every relation (more colloquially, every row of every table ) can be uniquely identified. The attribute or combination of attributes (the column or combination of columns ) that ensures uniqueness is known as the primary key . A table can have only one primary key. SQL Server allows you, when defining a table, to designate the columns that make up the primary key. This is known as a PRIMARY KEY constraint . SQL Server uses this PRIMARY KEY constraint to guarantee that the uniqueness of the designated columns is never violated. Domain integrity is enforced for a table by making sure that it has a primary key.

Sometimes multiple columns of a table can uniquely identify a rowfor example, an employee table might have an employee ID ( emp_id ) and a Social Security number ( soc_sec_num ) column, and both are considered unique. Such columns are often referred to as alternate or candidate keys . These keys must also be unique. Although a table can have only one primary key, it can have multiple alternate keys. SQL Server supports the multiple alternate key concept via UNIQUE constraints . When a column or combination of columns is declared unique, SQL Server prevents any record from being added or updated that would violate this uniqueness.

Assigning an arbitrary unique number as the primary key when no natural or convenient key exists is often most efficient. For example, businesses commonly use customer numbers or account numbers as unique identifiers or primary keys. SQL Server makes it easy to efficiently generate unique numbers by allowing one column in a table to have the Identity property . You use the Identity property to make sure that each value in the column is unique and that the values will increment (or decrement) by the amount you specify from a starting point that you specify. (A column having the Identity property will typically also have a PRIMARY KEY or UNIQUE constraint, but this isn't required.)

The second type of data integrity is referential integrity. SQL Server enforces logical relationships between tables with FOREIGN KEY constraints . A foreign key in a table is a column or combination of columns that match the primary key (or possibly an alternate key) of another table. The logical relationship between those two tables is the basis of the relational model.

For instance, the simple SELECT example shown in the Transact -SQL section includes a titles table and a publishers table. The titles table column title_id (title ID) is its primary key. The publishers table column pub_id (publisher ID) is its primary key. The titles table also includes a pub_id column, which isn't the primary key because a publisher can publish multiple titles. Instead, pub_id is a foreign key, and it references the primary key of the publishers table. After this relationship is declared when the table is defined, SQL Server ensures that a title can't be entered unless a valid publisher for it is in the database, and that a publisher can't be deleted if any titles in the database reference that publisher. Enforcing referential integrity means that the relationship between primary keys and foreign keys is never violated.

The third type of data integrity is domain integrity, which means that individual data values must meet certain criteria. SQL Server makes sure that any data entered matches the type and range of the specified data type and, for example, allows a NULL value to be entered only if the column has been declared as allowing NULLs. SQL Server supports a wide range of datatypes, allowing for great flexibility with efficient storage. This ability to enforce data integrity based on the constraints you define when you create the database tables is known as declarative data integrity.

Datatypes

SQL Server datatypes provide the simplest form of data integrity by restricting the types of information (for example, characters , numbers, or dates) that can be stored in the columns of the database tables. You can also design your own datatypes ( user-defined datatypes) to supplement those supplied by the system. For example, you could define a state_code datatype as two characters (CHAR(2)); SQL Server would then accept only two-character state codes. A user-defined datatype can be used to define columns in any table. An advantage of user-defined datatypes is that rules and defaults, which are discussed in the next two sections, can be bound to them for use in multiple tables, eliminating the need to include these types of checks in the front-end application.

CHECK Constraints and Rules

CHECK constraints and rules are integrity constraints that go beyond those implied by a column's datatype. Whenever a user enters a value, SQL Server checks that value against any CHECK constraint or rule created for the specified column to ensure that only values that adhere to the definition of the constraint or rule are accepted.

Although CHECK constraints and rules are essentially equivalent in functionality, CHECK constraints are easier to use and provide more flexibility. A CHECK constraint can be conveniently defined when a column is defined, and constraints can be defined on multiple columns. Rules, however, must be defined and then bound to a column or user-defined datatype separately. While a column or user-defined datatype can have only one rule associated with it, a CHECK constraint can reference multiple columns in the same table, or it can reference one of the built-in functions that SQL Server provides.

Both CHECK constraints and rules can require that a value fall within a particular range, match a particular pattern, or match one of the entries in a specified list. An advantage of CHECK constraints is that they can depend on either the value of another field or fields in the row or on the value returned by one of the system-supplied functions. A rule can't reference other fields. As an example of applying a CHECK constraint or rule, a database containing information on senior citizens could have the CHECK constraint or rule "age field must contain a value between 65 and 120 years ." A birth certificate database could require that the date in the birth_date field be some date prior to the current date.

Defaults

Defaults allow you to specify a value that SQL Server inserts if no explicit value is entered in a particular field. For example, you could set the current date as the default value for an order_date field in a customer order record. Then, if a user or front-end application doesn't make an entry in the order_date field, SQL Server automatically inserts the current date. You can also use the keyword DEFAULT as a placeholder in an INSERT or UPDATE statement, instructing SQL Server to set the value to the declared default value.

Triggers

Triggers are a special type of stored procedure. Stored procedures can be executed only when explicitly called; triggers are automatically invoked, or triggered, by SQL Server, and this is their main advantage. Triggers are associated with particular pieces of data and are called automatically whenever an attempt to modify that data is made, no matter what causes the modification (a user's entry or an application action).

Conceptually, triggers are similar to a CHECK constraint or rule. SQL Server automatically activates triggers, constraints, and rules whenever an attempt is made to modify the data they protect. CHECK constraints and rules then perform fairly simple types of checks on the datafor example, "make sure the age field has a value between 0 and 120." Triggers, on the other hand, can perform extremely elaborate restrictions on the data, which helps to ensure that the rules by which your business operates can't be subverted. Because triggers are a form of stored procedure, they have the full power of the Transact-SQL language at their disposal, and they can invoke other stored and extended stored procedures. You can write a trigger that enforces complex business rules, such as the following:

Don't accept an order:

If the customer has any past due accounts with us

OR

If the customer has a bad credit rating by ACME Credit Service (with the trigger calling an extended procedure that automatically dials up ACME to get the credit rating)

OR

If the order is for more than $50,000 and the customer has had an account with us for less than six months

This integrity check is quite powerful. Yet the trigger to enforce it is simple to write. Triggers can also enforce referential integrity, ensuring that relationships between tables are maintained . For example, a trigger can prohibit a customer record from being deleted if open orders exist for the customer, or it can prohibit any new order for a customer for which no record exists. Triggers can go beyond simply insisting that relationships exist: they can perform referential actions . This means that triggers can cause changes to ripple through to other tables. For example, if you want to drop a delinquent customer from your system and delete all of that customer's active orders, a trigger on the customer table could automatically delete all entries in the orders table.

Triggers automatically execute whenever a specified change to a data object is attempted. A trigger executes once per statement, even if multiple rows are affected. It has access to the before and after images of the data. (These before and after images are reconstructed from the transaction log into pseudotables that can be accessed from within the trigger.) The trigger can then take further action, including rolling back the transaction. Although you can use triggers to enforce referential integrity, it is usually more convenient to establish these relationships when you create the tables by using declarative referential integrity.