3 4
Tables in a SQL Server database can include a number of different types of properties that ensure the integrity of the data. These properties include data types, NOT NULL definitions, DEFAULT definitions, IDENTITY properties, constraints, rules, triggers, and indexes. This lesson will introduce you to all of these methods for enforcing data integrity and to the types of data integrity supported by SQL Server. This lesson also discusses the various types of data integrity, including entity integrity, domain integrity, referential integrity, and user-defined integrity.
Enforcing data integrity ensures the quality of data in the database. For example, suppose that you create a Customers table in your database. The value in the Cust_ID column should uniquely identify each customer who is entered into the table. As a result, if a customer has a Cust_ID of 438, no other customer should have a Cust_ID value of 438. Next, suppose that you have a Cust_Rating column that is used to rate each customer with a rating from 1 through 8. In this case, the Cust_Rating column should not accept a value of 9 or any number other than 1 through 8. In both cases, you must use one of the methods supported by SQL Server to enforce the integrity of the data.
SQL Server supports a number of methods that you can use to enforce data integrity, including data types, NOT NULL definitions, DEFAULT definitions, IDENTITY properties, constraints, rules, triggers, and indexes. In Chapter 4, "Implementing SQL Server Databases and Tables," you already learned about several of these methods for enforcing data integrity. A brief description of these methods is included here to provide you with a cohesive overview of ways to enforce data integrity. Some of these table properties, such as NOT NULL and DEFAULT definitions, are sometimes considered a type of constraint. For the purposes of this training kit, however, they are being treated as separate from constraints.
A data type is an attribute that specifies the type of data (character, integer, binary, and so on) that can be stored in a column, parameter, or variable. SQL Server provides a set of system-supplied data types, although you can create user-defined data types that are based on the system-supplied data types. System-supplied data types define all of the types of data that you can use with SQL Server. Data types can be used to enforce data integrity because the data entered or changed must conform to the type specified for the object. For example, you cannot store someone's last name in a column defined with the datetime data type, because a datetime column accepts only valid dates. For more information about data types, refer to Chapter 4, "Implementing SQL Server Databases and Tables."
The nullability of a column determines whether the rows in the table can contain a null value for that column. A null value is not the same as zero, blank, or a zero-length character string such as " ". Null means that no entry has been made. The presence of null usually implies that the value is either unknown or undefined. You define the nullability of a column when you define the column, when you either create or modify a table. Whether you define columns that allow null values or columns that do not allow null values, you should always use the NULL or NOT NULL keywords in every column definition because of the complexities of how null is handled in SQL Server. The NULL keyword is used if null values are allowed in the column, and the NOT NULL keywords are used if null values are not allowed. For more information about data types, refer to Chapter 4, "Implementing SQL Server Databases and Tables."
Defaults specify what values are used in a column if you do not specify a value for the column when inserting a row. DEFAULT definitions can be created when the table is created (as part of the table definition) or can be added to an existing table. Each column in a table can contain a single DEFAULT definition. For more information about default values, refer to Chapter 4.
For each table, a single identifier column can be created that contains system-generated sequential values that uniquely identify each row within the table. Identifier columns usually contain values unique within the table for which they are defined. In other words, it is possible for tables containing identifier columns to contain the same identity values used by another table. This situation is usually not a problem, however, because the identifier values are typically used only within the context of a single table—and the identifier columns do not relate to other identifier columns in other tables. For more information about the IDENTITY property, refer to Chapter 4.
Constraints enable you to define the way that SQL Server automatically enforces the integrity of a database. Constraints define rules regarding the values allowed in columns and are the standard mechanism for enforcing integrity. Using constraints is preferred to using triggers, rules, or defaults. The query optimizer also uses constraint definitions to build high-performance query execution plans. Lesson 2 discusses constraints in more detail.
Rules are a backward-compatibility feature that performs some of the same functions as CHECK constraints. CHECK constraints are the preferred, standard way to restrict the values in a column. CHECK constraints are also more concise than rules; there can be only one rule applied to a column, but multiple CHECK constraints can be applied. CHECK constraints are specified as part of the CREATE TABLE statement, while rules are created as separate objects and are then bound to the column.
You should use the CREATE RULE statement to first create the rule, and then you should use the sp_bindrule system stored procedure to bind that rule to a column or to a user-defined data type. For more information about using CREATE RULE or sp_bindrule, refer to the Transact-SQL reference in SQL Server Books Online.
Triggers are a special class of stored procedures defined to execute automatically when an UPDATE, INSERT, or DELETE statement is issued against a table or a view. Triggers are powerful tools that sites can use to enforce their business rules automatically when data is modified. Triggers can extend the integrity checking logic of SQL Server constraints, defaults, and rules (although constraints and defaults should be used instead whenever they provide all of the needed functionality). Triggers are discussed in more detail in Chapter 9, "Implementing Triggers."
An index is a structure that orders the values of one or more columns in a database table. An index provides pointers to the data values stored in specified columns of the table and then orders those pointers according to the sort order you specify. The database uses the index much as you use an index in a book: it searches the index to find a particular value and then follows the pointer to the row containing that value. A unique index enforces a column's uniqueness. Indexes are discussed in more detail in Chapter 11, "Implementing Indexes."
SQL Server supports four types of data integrity: entity integrity, domain integrity, referential integrity, and user-defined integrity.
NOTE
Entity integrity defines a row as a unique instance of an entity for a particular table. Entity integrity enforces the integrity of the identifier column or the primary key of a table (through indexes, UNIQUE constraints, PRIMARY KEY constraints, or IDENTITY properties).
Domain integrity is the validity of entries for a given column. You can enforce domain integrity by restricting the type (through data types), the format (through CHECK constraints and rules), or the range of possible values (through FOREIGN KEY constraints, CHECK constraints, DEFAULT definitions, NOT NULL definitions, and rules).
Referential integrity preserves the defined relationships between tables when records are entered or deleted. In SQL Server, referential integrity is based on relationships between foreign keys and primary keys or between foreign keys and unique keys (through FOREIGN KEY and CHECK constraints). Referential integrity ensures that key values are consistent across tables. Such consistency requires that there be no references to non-existent values and that, if a key value changes, all references to it change consistently throughout the database.
When you enforce referential integrity, SQL Server prevents users from doing any of the following:
For example, with the Sales and Titles tables in the Pubs database, referential integrity is based on the relationship between the foreign key (title_id) in the Sales table and the primary key (title_id) in the Titles table, as shown in Figure 5.1.
Figure 5.1 Referential integrity between the Sales table and the Titles table.
User-defined integrity enables you to define specific business rules that do not fall into one of the other integrity categories. All of the integrity categories support user-defined integrity (all column-level and table-level constraints in the CREATE TABLE statement, stored procedures, and triggers).
In this exercise, you will view several tables in the Pubs database and identify properties used to ensure data integrity. To perform this exercise, you should be logged into your Windows 2000 Server computer as Administrator.
USE Pubs GO sp_help Employee
In this statement, you are identifying the database containing the table that you want to view and you are using the sp_help system stored procedure to view information about the Employee table in the Pubs database.
Information about the Employee table appears in the Grids tab of the Results pane.
What types of constraints have been defined for the Employee table?
Which columns in the Employee table allow null values?
sp_help Publishers
Information about the Publishers table appears in the Grids tab of the Results pane.
What types of constraints have been defined for the Publishers table?
Which column in the Publishers table is the identity column?
Which columns in the Publishers table have been defined with the char data type?
sp_help Titles
Information about the Titles table appears in the Grids tab of the Results pane.
Which column in the Publishers table has been defined with a user-defined data type?
How many columns in the Titles table allow null values?
Tables in a SQL Server database can include a number of different types of properties that ensure the integrity of the data. These properties include data types, NOT NULL definitions, DEFAULT definitions, IDENTITY properties, constraints, rules, triggers, and indexes. SQL Server supports four types of data integrity: entity integrity, domain integrity, referential integrity, and user-defined integrity. Entity integrity defines a row as a unique instance of an entity for a particular table. Domain integrity is the validity of entries for a given column. Referential integrity preserves the defined relationships between tables when records are entered or deleted. User-defined integrity enables you to define specific business rules that do not fall into one of the other integrity categories.