Planning Integrity | Microsoft SQL Server 2005: The Complete Reference: Full Coverage of all New and Improved Features

In Chapter 6 we tackled the CREATE TABLE statement, but the primary objective of the methods discussed and the code demonstrated was to create tables, not to demonstrate the installing of integrity mechanisms. In this chapter, we go a step further and code the formal or declarative integrity constraint definitions into the CREATE TABLE and ALTER TABLE statements.

SQL Server provides the following built-in integrity or constraint mechanisms:

The capability to provide default values and thereby avoid NULL
The capability to code check constraint expressions that evaluate the values in the SQL statement to determine if they are allowed to be saved to the column
The capability to code referential integrity constraints (cross-referencing foreign key columns)
The capability to declare primary and unique keys that ensure uniqueness in a column or through the combination of columns
The capability to use triggers as a form of constraint, which can also provide a trans- or intra-database integrity checking facility Triggers are covered in the next chapter.

When you first create a table, you are in fact laying down the first integrity constraints in your database because you create columns that have different data types for different data. For example, you would not store a noncharacter value in a character data type, and you would not try to store a character in an integer data type.

When you design a database, one of the basic rules to follow is to use common sense. If you are storing integer data, then store it in integer columns; date-to-date columns; character-to-character columns; and so forth. Naturally you will come to situations that will require you to decide between variations of a data type (small integers, integers, or big integers), or to make decisions that rely on the precision and scale of data, such as values of type real or float, currency, and so on. Other times, your choice of data type will be related to storage requirements, system resources, and so forth. An example would be deciding to switch to the bigint data type because of the need to store very large numbers. But you would not incur the storage overhead of a big integer (bigint) if you were storing nothing larger than 99.

The format of the data being stored is also a consideration, because modern database systems do not store only characters and numerals any more. They also store binary information, images, objects, bitmaps, large amounts of text, and so forth. And SQL Server 2005 also allows us to build our own user-defined data types, which are discussed in Chapter 15.

The Integrity Plan

The integrity plan is one of the most important sections of your overall database definition, model, or architecture. You can use the flow diagram in Figure 12–2 to build your integrity plan.

image from book
Figure 12–2: Creating an integrity plan

Gather Business Rules

This section of the integrity plan identifies business rules that will impact the database model and architecture. These rules, as discussed earlier, differ from the integrity issues that are built in to the relational model from the get-go, such as referential integrity.

Sit down with the people responsible for establishing the business rules, or provide the facilities in your model and code for later easy incorporation of constraints to cater to the business rules.

Identify Key Integrity Issues to Cover

These issues will be referenced in the architecture or will become patently part of it. For example, you might want to list attributes of the data that have legal implications if they render the data questionable. It is also important to plan for the prospect of warehousing the data, and constraining data in the operational databases with analysis in mind will make it easier to transform the data at a later stage.

List Each Integrity Requirement

This step is closely related to the preceding step, and they can be combined. Here you should list the precise integrity needs of the databases and tables right down to the values required.

You could list each table and link it to a list of integrity constraints required on its internals. For example, you will specify referential integrity requirements with linked tables, the primary key column, type constraints, and so on.

Document the Solution for Each Constraint

This section covers how you propose to deal with each integrity requirement. For example, do you plan to create unique constraints directly in T-SQL code or interactively in Management Studio (or possibly through the DMO object library)? Here you would also determine which integrity mechanism would be most suited for the task. For example, would a check constraint work or would you have to code a trigger?

Also document and establish procedures for maintaining and revising the constraints. If constraints are implemented in code, then you need to maintain source, version control, and so forth. You will also need to manage access, permissions to the constraint objects, and so on.

Do Cost Analysis

There is a cost attached to every constraint. The costs are both direct, in terms of their consumption of SQL Server resources and usefulness, and indirect, in terms of the costs of programming, maintenance, documentation, and so on. For example, using a stored procedure to check integrity is more expensive in all terms than using a trigger (besides, there are situations that only a trigger caters to). Using a trigger, on the other hand, is more expensive than using a built-in cascade. And, defaults or constraint objects are more expensive than check constraints and so on.

When performing cost analysis of constraints, you should also consider adherence to standards as a factor. For example, you should use built-in constraints because they are based on ANSI compliance and also conform to relational database rules.

Triggers, on the other hand, can be used to meet ANSI recommendations, but since triggers are entirely procedural, they cannot be considered ANSI-compliant. SQL Server can check code for errors, but it is not able to check trigger code for standards compliance. In fact, even if trigger code is syntactically correct, it might still nuke your data beyond recovery.

Determine Error Checking Requirements

When integrity is violated or a constraint traps problem data, SQL Server will report errors. You need to determine how best to check these errors and establish procedure for using the error logs to trap faults in design and development, data entry, or business-rule integration. In short, error checking and tracking can be used to reduce errors down the road.

Determine Effects and Requirements in the ClientTier

As discussed in Chapter 11, we don’t want to build a client tier that collapses every time we make changes in the data tier, which is why we move the data processing to the data tier and code just about everything in triggers, UDFs, stored procedures, and functions.

If you plan well and manage the database model and architecture properly, the constraints in the data tier should have little impact on the clients. However, you will have a lot to think about if you need to maintain legacy client code that still maintains a lot of client-side data processing and logic.

It is also possible to code constraint or integrity checking in the client tier without adversely affecting development costs and management. For example, it is more useful in terms of server resource conservation to check for format or type and domain integrity violations before the data gets transmitted to the server. A mask over the telephone number in a client application, even a Web browser, will obviate the need for SQL Server to bounce back with an integrity violation that incurs an additional round trip (see the section “Check Constraints” later in this chapter).