Lesson 1:Introduction to Data Integrity | MCSE Training Kit (Exam 70-229): Microsoft SQL Server(TM) 2000 Database Design and Implementation (Pro Certification)

3 4

Tables in a SQL Server database can include a number of different types of properties that ensure the integrity of the data. These properties include data types, NOT NULL definitions, DEFAULT definitions, IDENTITY properties, constraints, rules, triggers, and indexes. This lesson will introduce you to all of these methods for enforcing data integrity and to the types of data integrity supported by SQL Server. This lesson also discusses the various types of data integrity, including entity integrity, domain integrity, referential integrity, and user-defined integrity.

After this lesson, you will be able to:

Identify the various table properties that can be used to ensure the integrity of data in a SQL Server database.
Identify the various types of data integrity that SQL Server supports.

Estimated lesson time: 25 minutes

Enforcing Data Integrity

Enforcing data integrity ensures the quality of data in the database. For example, suppose that you create a Customers table in your database. The value in the Cust_ID column should uniquely identify each customer who is entered into the table. As a result, if a customer has a Cust_ID of 438, no other customer should have a Cust_ID value of 438. Next, suppose that you have a Cust_Rating column that is used to rate each customer with a rating from 1 through 8. In this case, the Cust_Rating column should not accept a value of 9 or any number other than 1 through 8. In both cases, you must use one of the methods supported by SQL Server to enforce the integrity of the data.

SQL Server supports a number of methods that you can use to enforce data integrity, including data types, NOT NULL definitions, DEFAULT definitions, IDENTITY properties, constraints, rules, triggers, and indexes. In Chapter 4, "Implementing SQL Server Databases and Tables," you already learned about several of these methods for enforcing data integrity. A brief description of these methods is included here to provide you with a cohesive overview of ways to enforce data integrity. Some of these table properties, such as NOT NULL and DEFAULT definitions, are sometimes considered a type of constraint. For the purposes of this training kit, however, they are being treated as separate from constraints.

Data Types

A data type is an attribute that specifies the type of data (character, integer, binary, and so on) that can be stored in a column, parameter, or variable. SQL Server provides a set of system-supplied data types, although you can create user-defined data types that are based on the system-supplied data types. System-supplied data types define all of the types of data that you can use with SQL Server. Data types can be used to enforce data integrity because the data entered or changed must conform to the type specified for the object. For example, you cannot store someone's last name in a column defined with the datetime data type, because a datetime column accepts only valid dates. For more information about data types, refer to Chapter 4, "Implementing SQL Server Databases and Tables."

NOT NULL Definitions

The nullability of a column determines whether the rows in the table can contain a null value for that column. A null value is not the same as zero, blank, or a zero-length character string such as " ". Null means that no entry has been made. The presence of null usually implies that the value is either unknown or undefined. You define the nullability of a column when you define the column, when you either create or modify a table. Whether you define columns that allow null values or columns that do not allow null values, you should always use the NULL or NOT NULL keywords in every column definition because of the complexities of how null is handled in SQL Server. The NULL keyword is used if null values are allowed in the column, and the NOT NULL keywords are used if null values are not allowed. For more information about data types, refer to Chapter 4, "Implementing SQL Server Databases and Tables."

DEFAULT Definitions

Defaults specify what values are used in a column if you do not specify a value for the column when inserting a row. DEFAULT definitions can be created when the table is created (as part of the table definition) or can be added to an existing table. Each column in a table can contain a single DEFAULT definition. For more information about default values, refer to Chapter 4.

IDENTITY Properties

For each table, a single identifier column can be created that contains system-generated sequential values that uniquely identify each row within the table. Identifier columns usually contain values unique within the table for which they are defined. In other words, it is possible for tables containing identifier columns to contain the same identity values used by another table. This situation is usually not a problem, however, because the identifier values are typically used only within the context of a single table—and the identifier columns do not relate to other identifier columns in other tables. For more information about the IDENTITY property, refer to Chapter 4.

Constraints

Constraints enable you to define the way that SQL Server automatically enforces the integrity of a database. Constraints define rules regarding the values allowed in columns and are the standard mechanism for enforcing integrity. Using constraints is preferred to using triggers, rules, or defaults. The query optimizer also uses constraint definitions to build high-performance query execution plans. Lesson 2 discusses constraints in more detail.

Rules

Rules are a backward-compatibility feature that performs some of the same functions as CHECK constraints. CHECK constraints are the preferred, standard way to restrict the values in a column. CHECK constraints are also more concise than rules; there can be only one rule applied to a column, but multiple CHECK constraints can be applied. CHECK constraints are specified as part of the CREATE TABLE statement, while rules are created as separate objects and are then bound to the column.

You should use the CREATE RULE statement to first create the rule, and then you should use the sp_bindrule system stored procedure to bind that rule to a column or to a user-defined data type. For more information about using CREATE RULE or sp_bindrule, refer to the Transact-SQL reference in SQL Server Books Online.

Triggers

Triggers are a special class of stored procedures defined to execute automatically when an UPDATE, INSERT, or DELETE statement is issued against a table or a view. Triggers are powerful tools that sites can use to enforce their business rules automatically when data is modified. Triggers can extend the integrity checking logic of SQL Server constraints, defaults, and rules (although constraints and defaults should be used instead whenever they provide all of the needed functionality). Triggers are discussed in more detail in Chapter 9, "Implementing Triggers."

Indexes

An index is a structure that orders the values of one or more columns in a database table. An index provides pointers to the data values stored in specified columns of the table and then orders those pointers according to the sort order you specify. The database uses the index much as you use an index in a book: it searches the index to find a particular value and then follows the pointer to the row containing that value. A unique index enforces a column's uniqueness. Indexes are discussed in more detail in Chapter 11, "Implementing Indexes."

Types of Data Integrity

SQL Server supports four types of data integrity: entity integrity, domain integrity, referential integrity, and user-defined integrity.

NOTE

You might find that some documentation defines more (or less) than four types of data integrity or defines types of data integrity different from the four included here. However, the four discussed here are generally considered the main types of data integrity.

Entity Integrity

Entity integrity defines a row as a unique instance of an entity for a particular table. Entity integrity enforces the integrity of the identifier column or the primary key of a table (through indexes, UNIQUE constraints, PRIMARY KEY constraints, or IDENTITY properties).

Domain Integrity

Domain integrity is the validity of entries for a given column. You can enforce domain integrity by restricting the type (through data types), the format (through CHECK constraints and rules), or the range of possible values (through FOREIGN KEY constraints, CHECK constraints, DEFAULT definitions, NOT NULL definitions, and rules).

Referential Integrity

Referential integrity preserves the defined relationships between tables when records are entered or deleted. In SQL Server, referential integrity is based on relationships between foreign keys and primary keys or between foreign keys and unique keys (through FOREIGN KEY and CHECK constraints). Referential integrity ensures that key values are consistent across tables. Such consistency requires that there be no references to non-existent values and that, if a key value changes, all references to it change consistently throughout the database.

When you enforce referential integrity, SQL Server prevents users from doing any of the following:

Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are related records in the foreign table

For example, with the Sales and Titles tables in the Pubs database, referential integrity is based on the relationship between the foreign key (title_id) in the Sales table and the primary key (title_id) in the Titles table, as shown in Figure 5.1.

figure 5.1-referential integrity between the sales table and the titles table.

Figure 5.1 Referential integrity between the Sales table and the Titles table.

User-Defined Integrity

User-defined integrity enables you to define specific business rules that do not fall into one of the other integrity categories. All of the integrity categories support user-defined integrity (all column-level and table-level constraints in the CREATE TABLE statement, stored procedures, and triggers).

Exercise 1: Identifying the Properties Used to Ensure Data Integrity

In this exercise, you will view several tables in the Pubs database and identify properties used to ensure data integrity. To perform this exercise, you should be logged into your Windows 2000 Server computer as Administrator.

To identify properties in the Employee table

Open SQL Query Analyzer and connect to your local server.
In the Editor pane of the Query window, enter the following Transact-SQL code:

 USE Pubs GO sp_help Employee

In this statement, you are identifying the database containing the table that you want to view and you are using the sp_help system stored procedure to view information about the Employee table in the Pubs database.

Execute the Transact-SQL statement.

Information about the Employee table appears in the Grids tab of the Results pane.

Scroll through the result on the Grids tab. Identify the data types, nullability, DEFAULT definitions, IDENTITY property, indexes, and constraints.

What types of constraints have been defined for the Employee table?

Which columns in the Employee table allow null values?

To identify properties in the Publishers table

In the Editor pane of the Query window, enter and execute the following Transact-SQL code:

 sp_help Publishers

Information about the Publishers table appears in the Grids tab of the Results pane.

Scroll through the result on the Grids tab. Identify the data types, nullability, DEFAULT definitions, IDENTITY property, indexes, and constraints.

What types of constraints have been defined for the Publishers table?

Which column in the Publishers table is the identity column?

Which columns in the Publishers table have been defined with the char data type?

To identify properties in the Titles table

In the Editor pane of the Query window, enter and execute the following Transact-SQL code:

 sp_help Titles

Information about the Titles table appears in the Grids tab of the Results pane.

Scroll through the result on the Grids tab. Identify the data types, nullability, DEFAULT definitions, IDENTITY property, indexes, and constraints.

Which column in the Publishers table has been defined with a user-defined data type?

How many columns in the Titles table allow null values?

Close SQL Query Analyzer.

Lesson Summary

Tables in a SQL Server database can include a number of different types of properties that ensure the integrity of the data. These properties include data types, NOT NULL definitions, DEFAULT definitions, IDENTITY properties, constraints, rules, triggers, and indexes. SQL Server supports four types of data integrity: entity integrity, domain integrity, referential integrity, and user-defined integrity. Entity integrity defines a row as a unique instance of an entity for a particular table. Domain integrity is the validity of entries for a given column. Referential integrity preserves the defined relationships between tables when records are entered or deleted. User-defined integrity enables you to define specific business rules that do not fall into one of the other integrity categories.