The Guru[ap]s Guide to SQL Server[tm] Stored Procedures, XML, and HTML
Authors: Henderson K.
Published year: 2005
Now that you've successfully modeled the relationships between the entities needed to service the property leasing process, you're ready to move on to relational data modeling. Relational (or logical) data modeling is one step closer to the physical implementation of your database than E-R diagramming. Here you'll define primary keys, referential integrity constraints, and the like. Again, a CASE tool is indispensable here. A good CASE tool will not only make the task of modeling much easier, it will also generate the SQL necessary to implement your model.
Before we get started, let me cover some basic relational database and logical data modeling jargon. Table 5-5 lists what I consider essential relational modeling terms. I'm sure you're familiar with most of these already, but a little refresher is still probably worthwhile. Knowing these will help you better understand the discussions that follow. I don't intend for this glossary to be exhaustive, but hopefully you'll find it complete enough to be useful.
Now that you have a few more terms under your belt, let's proceed with the discussion on database design. In this section, you'll take the E-R diagram you built earlier and translate it into a full relational model. The end result of your work will be a Transact-SQL script that you could actually use to create your database objects.
|Database||A collection of data organized into tables. A good metaphor for databases is the file cabinet. Databases contain tables that contain other data, just as file cabinets contain files that contain items of their own.|
|Table||The fundamental data container in a database. You can think of a relational table as a two-dimensional surface divided into rows and columns . To further the file cabinet metaphor, you can equate each table to a file in the filing cabinet. Tables house all the rows of a particular type. Just as a folder in a filing cabinet might contain all the items of a given type (for example, the entirety of a given vendor's invoices), so does a table in a database contain all the rows of a given type.|
|Row||Corresponds to a single real-world data object. It might be an invoice, a withdrawal from a savings account, or a listing in the phone book. Rows are the "meat and potatoes" of the database. You'll often hear rows referred to as records, although SQL purists prefer the use of row. This book uses the two terms interchangeably. In their simplest form, databases and tables are really just mechanisms for organizing rows. In our file cabinet analogy, a row is an item within a file folder in the file cabinet. If the file folder was titled Invoices, we would expect each element therein to be an invoice of some type. Likewise, each row in the Invoice table should contain an invoice of some type.|
|Column||An element within a row. A column represents a characteristic of the object represented by a table row. Columns are often referred to as fields, although SQL purists prefer column. This book uses both terms interchangeably. An example of a column would be the Address column in a Customer table. In and of itself, the address is ambiguous. In the context of its table, though, the column describes a customer's address.|
|Primary key||The column or set of columns in a table that uniquely identifies each row. An example of a primary key would be the InvoiceNo column in an Invoice table. The invoice number in each row of the table is unique to that one record; it is found in no other rows. If you save the value of the InvoiceNo column for a particular row and go elsewhere in the table, you can always return to the original row using only the invoice number as the key. The columns that make up a primary key are normally used to build an index on the table for fast access to its rows.|
|Foreign key||A column or set of columns that is inherited from another table. Usually, the inherited key is the primary key of the related table. A foreign key may or may not also be part of the primary key of its host table. Usually it isn't. For example, returning to the Invoice table mentioned earlier, the CustomerNo column in an Invoice table might be a foreign key that's inherited from the Customer table. The CustomerNo field couldn't be the Invoice table's primary key because it doesn't uniquely identify individual invoice rowsyou might have several invoices for a single customer. However, because customer numbers stored in the Invoice table must be valid, the values stored in the CustomerNo column must be checked against the Customer table. Thus, the CustomerNo column in the Invoice table is a foreign key that is linked relationally to the Customer table's primary key.|
|Candidate key||A column or set of columns within a table that uniquely identifies its rows. Candidate keys are also referred to as unique keys. A table's primary key is selected from its candidate keys.|
|Constraint||A mechanism for ensuring that invalid data is prevented from getting into the database. There are two general types of constraints: referential integrity constraints and domain integrity constraints. Referential integrity constraints ensure that relationships between tables are respected. Domain constraints ensure that values that have incompatible data types, are out of bounds, or are otherwise invalid are prevented from getting into the database.|
|View||A logical representation of a subset of a table's data. Views themselves contain no data. They are SQL queries that can themselves be queried as though they were tables. A view usually provides access to a subset of a table's columns, its rows, or both. A view can implement constraints on the types of data modifications that are permissible on its underlying tables.|
|Trigger||A special type of stored procedure that executes when a given SQL statement is executed against a table. Triggers can be used to ensure referential or domain integrity.|
To get started, load your E-R diagram into your relational data modeling tool (often, they're the same tool or part of the same suite). One of the first things you should specify for your new model is the target DBMS platform. Assuming your tool supports it, select SQL Server and the version of it that you'll be working withfor example, SQL Server 7.0 or SQL Server 2000 (some tools refer to SQL Server 2000 as version 8.0). Figure 5-21 illustrates what the model might look like once it's loaded.
The first thing you need to do with your new model is construct a data dictionary. Many relational modeling tools support data dictionary creation of some type through domains. A domain, you may recall, defines the type of data a column can contain. It's the rough equivalent of a user -defined data type in SQL Server. Regardless of whether you're using a CASE tool, the process of constructing a data dictionary (also known as an attribute or field repository ) is quite straightforward. To build a data dictionary, follow these steps:
Create a master list of all the columns contained in the tables in your model. This doesn't need to be done in any particular toolyou just need to derive a master column list before proceeding.
Create domain definitions for those columns contained in more than one table or likely to be (such as through a foreign key reference).
Define business rules for each domain in your data dictionary. Column-level business rules control the data values that columns will accept. You might, for example, specify that the Rent column must always be nonzero or that the EndDate column in the Lease table must always be greater than the BeginDate column.
Apply the domains you've devised to the columns in your tables. Rather than being based on the stock data types supported by SQL Server, some of your columns will be based on domains instead. These domains will bring with them certain special characteristics, such as limiting the values a column will allow. We'll use your modeling tool to take the domains you define and apply them to the columns in your model.
Table 5-6 lists the domains needed by your model as it now stands. Add each one to your tool's concept of a data dictionary or field/column repository.
If your tool supports the concept of optionality or nullability, make the TComments domain optional. As a rule, comments-type columns should not be required.
Notice that I've prefixed each domain name with a T. Here, T signifies type. It's a personal preference of mine. Use what works best for you. Of the various traditional programming language elements, domains most closely resemble typedefs (user-defined data type definitions in languages such as C and C++), so I treat domains and typedefs similarly. Figure 5-22 illustrates domain definition.
Now that you've finished setting up your domain repository, you're ready to apply it to the columns in your tables. You'll also associate those columns that aren't based on domains directly with base types.
Using your tool's facility for doing so, edit each column in the model so that its domain matches that listed in Table 5-7. Repeat this process for all the tables in your model.
As you define your columns, think about efficiency. Ask yourself, What's the smallest data type I can use for this column and still store the largest value it might ever have? Here are some general tips for efficiently sizing columns:
If a field will never need to store a value greater than 255, store it as a tinyint. If a column may need to store a value greater than 255 but less than 32,767, use smallint. In general, use the smallest integer type you can for storing integer data.
Use integers instead of floating-point types when you define numeric columns that will not need to store digits to the right of the decimal point.
Use variable character types instead of fixed character types when the length of a character column may vary from row to row.
Use the "small" versions of the datetime and money types (smalldatetime and smallmoney, respectively) when you can live with their loss of precision.
Use the bit type to define Boolean columns instead of using integer or single-character data types.
Describing your model elements via comments is a useful and worthwhile habit. In some shops , commenting in data models isn't even optional. The better modeling tools allow you to attach comments to any modeling element you wish. Oftentimes, these object comments are included in the SQL that the tool generates for you if your target platform supports them. Add comments to your model now to describe the objects it contains. Figure 5-23 illustrates.
Note that most tools allow you to add comments to individual columns, as well. Figure 5-24 illustrates.
Describing your model this way helps clarify your understanding of what you're building. In multiprogrammer projects, it also helps others understand what you were thinking when you designed a given object.
One of the final things you need to do is generate foreign key specifications for your model. Foreign keys are the physical implementation of the entity rela tionships constructed during the E-R modeling phase. Generating foreign keys causes the required columns to be propagated between tables. The way it usually works is that one table inherits the primary key of another. The inherited column or set of columns becomes a foreign key in the inheritor.
Using your tool's method of doing so, generate foreign keys for the tables in your model. You should then see a number of new columns added to your table objects. It's common for modeling tools to prefix these new columns with FK, signifying that they are foreign keys. Figure 5-25 illustrates what you might see.
Your model is now basically complete. You're just about ready to generate the Transact-SQL necessary to create the objects defined by the model.
Before you generate Transact-SQL to create your database objects, you should verify the integrity of your model. Most modeling tools include a facility of some type to assist with this. If your tool supports such a facility, invoke it now to check the integrity of the model.
Checking model integrity before generating T-SQL can save you from having to drop and recreate database objects because they were defined incorrectly. Good modeling tools check a number of aspects of your model and provide a fair measure of whether you're ready to proceed with DDL generation.
You've finished your model and now you're ready to generate the DDL necessary to physically implement it. Assuming your tool supports generating DDL scripts to create the objects contained in a model, use it to generate a T-SQL script that builds your database objects.
Listing 5-1 lists a SQL script that should resemble the one produced by your tool. Because you selected SQL Server as your target platform early on, the DDL generated consists of Transact-SQL.
USE master GO IF DB_ID('rentman') IS NOT NULL DROP DATABASE rentman GO CREATE DATABASE rentman GO USE rentman GO IF (DB_NAME()<>'rentman') BEGIN RAISERROR('Database create failed - aborting script', 20,1) WITH LOG RETURN END GO CREATE RULE RAddition AS @value IN ('Deerfield', 'Firewheel', 'Legacy Hills', 'Switzerland Estates', 'Sherwood', 'Rockknoll') GO CREATE DEFAULT DAddition AS 'Firewheel' GO EXEC sp_addtype TAddition , 'varchar(20) ', 'NOT NULL' EXEC sp_bindrule RAddition, TAddition EXEC sp_bindefault DAddition, TAddition GO EXEC sp_addtype TAddress , 'varchar(30)', 'NOT NULL' GO CREATE RULE RCity AS @value IN ('Oklahoma City', 'Norman', 'Edmond', 'Dallas', 'Garland', 'Plano') GO CREATE DEFAULT DCity AS 'Garland' GO EXEC sp_addtype TCity , 'varchar(30) ', 'NOT NULL' EXEC sp_bindrule RCity, TCity EXEC sp_bindefault DCity, TCity GO EXEC sp_addtype TComments , 'varchar(80)', 'NULL' EXEC sp_addtype TPhone , 'varchar(12)', 'NOT NULL' EXEC sp_addtype TPropertyNo , 'int', 'NOT NULL' GO CREATE DEFAULT DRent AS 750 GO EXEC sp_addtype TRent , 'smallmoney', 'NOT NULL' EXEC sp_bindefault DRent, TRent GO CREATE RULE RRooms AS @value IN (0, 1, 2, 3, 4, 5) GO EXEC sp_addtype TRooms , 'tinyint', 'NOT NULL' EXEC sp_bindrule RRooms, TRooms GO CREATE RULE RSchoolDistrict AS @value IN ('Putnam City', 'Oklahoma City', 'Richardson', 'Edmond', 'East Garland', 'Dallas', 'Plano') GO CREATE DEFAULT DSchoolDistrict AS 'East Garland' GO EXEC sp_addtype TSchoolDistrict , 'varchar(20) ', 'NOT NULL' EXEC sp_bindrule RSchoolDistrict, TSchoolDistrict EXEC sp_bindefault DSchoolDistrict, TSchoolDistrict GO CREATE RULE RState AS @value IN ('OK', 'TX') GO CREATE DEFAULT DState AS 'TX' GO EXEC sp_addtype TState , 'char(2)', 'NOT NULL' EXEC sp_bindrule RState, TState EXEC sp_bindefault DState, TState GO EXEC sp_addtype TTenantNo , 'int', 'NOT NULL' EXEC sp_addtype TYesNo , 'bit', 'NOT NULL' GO CREATE DEFAULT DZip AS '75080' GO EXEC sp_addtype TZip , 'varchar(10)', 'NOT NULL' EXEC sp_bindefault DZip, TZip GO CREATE TABLE PROPERTY ( Property_Number TPropertyNo NOT NULL, Address TAddress NOT NULL, City TCity NOT NULL, State TState NOT NULL, Zip TZip NOT NULL, Addition TAddition NOT NULL, SchoolDistrict TSchoolDistrict NOT NULL, Rent TRent NOT NULL, Deposit smallmoney NOT NULL, LivingAreas TRooms NOT NULL, BedRooms TRooms NOT NULL, BathRooms TRooms NOT NULL, GarageType TRooms NOT NULL, CentralAir TYesNo NOT NULL, CentralHeat TYesNo NOT NULL, GasHeat TYesNo NOT NULL, Refigerator TYesNo NOT NULL, Range TYesNo NOT NULL, DishWasher TYesNo NOT NULL, PrivacyFence TYesNo NOT NULL, LastLawnDate smalldatetime NOT NULL, LastSprayDate TYesNo NOT NULL, PRIMARY KEY (Property_Number) ) GO CREATE TABLE TENANT ( Tenant_Number TTenantNo NOT NULL, Name varchar(30) NOT NULL, Employer varchar(30) NOT NULL, EmployerAddress TAddress NOT NULL, EmployerCity TCity NOT NULL, EmployerState TState NOT NULL, EmployerZip TZip NOT NULL, HomePhone TPhone NOT NULL, WorkPhone TPhone NOT NULL, ICEPhone TPhone NOT NULL, Comments TComments NULL, PRIMARY KEY (Tenant_Number) ) GO CREATE TABLE CALL ( Call_Number int NOT NULL, Call_DateTime smalldatetime NOT NULL, Description varchar(30) NOT NULL, Property_Number TPropertyNo NULL, PRIMARY KEY (Call_Number), CONSTRAINT FK_PROPERTY1 FOREIGN KEY (Property_Number) REFERENCES PROPERTY ) GO CREATE TABLE LEASE ( Lease_Number int NOT NULL, BeginDate smalldatetime NOT NULL, EndDate smalldatetime NOT NULL, MovedInDate smalldatetime NOT NULL, MovedOutDate smalldatetime NOT NULL, Rent TRent NOT NULL, PetDeposit smallmoney NOT NULL, RentDueDay tinyint NOT NULL, LawnService TYesNo NOT NULL, Comments TComments NULL, Property_Number TPropertyNo NOT NULL, Tenant_Number TTenantNo NOT NULL, PRIMARY KEY (Lease_Number), CONSTRAINT FK_PROPERTY2 FOREIGN KEY (Property_Number) REFERENCES PROPERTY, CONSTRAINT FK_TENANT3 FOREIGN KEY (Tenant_Number) REFERENCES TENANT ) GO
Notice that the script begins by creating the logical domains you defined as SQL Server user-defined data types. It then uses these data types to define your database tables. This is a good technique because it encapsulates your business rules as much as possible in reusable data types. You can use these types to build new columns as necessary. For example, if you needed to add another column that permits only 1 or 0, you can reuse the TYesNo type. This is the advantage of embedding your business rules in data types rather that directly in columns.
The one flaw with this approach is that it uses somewhat antiquated constructs for ensuring domain integritynamely RULE and DEFAULT objects. The more modernand certainly more portable methodis to use CHECK and DEFAULT table/column constraints. Good modeling tools allow you to generate Transact-SQL DDL either way: either using RULE and DEFAULT objects or using standard ANSI constraints. The constraints approach is actually more flexible because you can set up CHECK constraints that involve multiple columns (for example, EndDate must be equal to or later than BeginDate). You can't do this with RULE objects. Listing 5-2 shows the script generated using constraints to ensure domain integrity:
USE master GO IF DB_ID('rentman') IS NOT NULL DROP DATABASE rentman GO CREATE DATABASE rentman GO USE rentman GO IF (DB_NAME()<>'rentman') BEGIN RAISERROR('Database create failed - aborting script', 20,1) WITH LOG RETURN END GO EXEC sp_addtype TAddition , 'varchar(20)', 'NOT NULL' EXEC sp_addtype TAddress , 'varchar(30)', 'NOT NULL' EXEC sp_addtype TCity , 'varchar(30)', 'NOT NULL' EXEC sp_addtype TComments , 'varchar(80)', 'NULL' EXEC sp_addtype TPhone , 'varchar(12)', 'NOT NULL' EXEC sp_addtype TPropertyNo , 'int', 'NOT NULL' EXEC sp_addtype TRent , 'smallmoney', 'NOT NULL' EXEC sp_addtype TRooms , 'tinyint', 'NOT NULL' EXEC sp_addtype TSchoolDistrict , 'varchar(20) ', 'NOT NULL' EXEC sp_addtype TState , 'char(2)', 'NOT NULL' EXEC sp_addtype TTenantNo , 'int', 'NOT NULL' EXEC sp_addtype TYesNo , 'bit', 'NOT NULL' EXEC sp_addtype TZip , 'varchar(10)', 'NOT NULL' GO CREATE TABLE PROPERTY ( Property_Number TPropertyNo NOT NULL, Address TAddress NOT NULL, City TCity NOT NULL, State TState NOT NULL DEFAULT 'TX' CHECK (State IN ('OK', 'TX')), Zip TZip NOT NULL DEFAULT '75080', Addition TAddition NOT NULL DEFAULT 'Firewheel' CHECK (Addition IN ('Deerfield', 'Firewheel', 'Legacy Hills', 'Switzerland Estates', 'Sherwood', 'Rockknoll')), SchoolDistrict TSchoolDistrict NOT NULL DEFAULT 'East Garland' CHECK (SchoolDistrict IN ('Putnam City', 'Oklahoma City', 'Richardson', 'Edmond', 'East Garland', 'Dallas', 'Plano')), Rent TRent NOT NULL DEFAULT 750, Deposit TRent NOT NULL DEFAULT 750, LivingAreas TRooms NOT NULL CHECK (LivingAreas BETWEEN 0 AND 5), BedRooms TRooms NOT NULL CHECK (BedRooms BETWEEN 0 AND 5), BathRooms TRooms NOT NULL CHECK (BathRooms BETWEEN 0 AND 5), GarageType TRooms NOT NULL CHECK (GarageType BETWEEN 0 AND 5), CentralAir TYesNo NOT NULL, CentralHeat TYesNo NOT NULL, GasHeat TYesNo NOT NULL, Refigerator TYesNo NOT NULL, Range TYesNo NOT NULL, DishWasher TYesNo NOT NULL, PrivacyFence TYesNo NOT NULL, LastLawnDate smalldatetime NOT NULL, LastSprayDate TYesNo NOT NULL, PRIMARY KEY (Property_Number) ) GO CREATE TABLE TENANT ( Tenant_Number TTenantNo NOT NULL, Name varchar(30) NOT NULL, Employer varchar(30) NOT NULL, EmployerAddress TAddress NOT NULL, EmployerCity TCity NOT NULL DEFAULT 'Garland', EmployerState TState NOT NULL DEFAULT 'TX', EmployerZip TZip NOT NULL DEFAULT '75080', HomePhone TPhone NOT NULL, WorkPhone TPhone NOT NULL, ICEPhone TPhone NOT NULL, Comments TComments NULL, PRIMARY KEY (Tenant_Number) ) GO CREATE TABLE CALL ( Call_Number int NOT NULL, Call_DateTime smalldatetime NOT NULL, Description varchar(30) NOT NULL, Property_Number TPropertyNo NULL, PRIMARY KEY (Call_Number), CONSTRAINT FK_PROPERTY1 FOREIGN KEY (Property_Number) REFERENCES PROPERTY ) GO CREATE TABLE LEASE ( Lease_Number int NOT NULL, BeginDate smalldatetime NOT NULL, EndDate smalldatetime NOT NULL, MovedInDate smalldatetime NOT NULL, MovedOutDate smalldatetime NOT NULL, Rent TRent NOT NULL DEFAULT 750, Deposit TRent NOT NULL DEFAULT 750, RentDueDay tinyint NOT NULL CHECK (RentDueDay BETWEEN 1 AND 15), LawnService TYesNo NOT NULL, Comments TComments NULL, Property_Number TPropertyNo NOT NULL, Tenant_Number TTenantNo NOT NULL, PRIMARY KEY (Lease_Number), CONSTRAINT FK_PROPERTY2 FOREIGN KEY (Property_Number) REFERENCES PROPERTY, CONSTRAINT FK_TENANT3 FOREIGN KEY (Tenant_Number) REFERENCES TENANT, CONSTRAINT CK_ENDDATE4 CHECK (EndDate >= BeginDate), CONSTRAINT CK_MOVEOUTDATE5 CHECK (MovedOutDate >= MovedInDate) ) GO
Your model is basically complete. If your tool supports it, enter a description for your database schema before saving it. Figure 5-26 illustrates.
Even though Enterprise Manager's Database Diagram facility isn't suitable as a general-purpose modeling tool, it does offer some basic physical database modeling facilities. Using the Create New Database Diagram wizard, you can import an existing physical model into a diagram and manipulate it using the Database Diagram facility. It's rudimentary, but it comes free in the box, so it makes sense to use it when you need to do physical modeling and the facility's very basic capabilities meet your needs. Figure 5-27 shows the RENTMAN database loaded into an Enterprise Manager Database Diagram.
The Guru[ap]s Guide to SQL Server[tm] Stored Procedures, XML, and HTML
Authors: Henderson K.
Published year: 2005