Tables | Microsoft SQL Server 2005: The Complete Reference: Full Coverage of all New and Improved Features

In the first part of this book, we discussed the architecture of SQL Server tables and how data is stored in them. Let’s look at the tables in this chapter from a higher level, as objects we need to create and configure for our applications. A database can contain many tables, and the tables can contain many columns designed to hold different data types.

There are several ways that you can go about creating your tables, as described in the following list:

Interactively in Management Studio, choosing the New Table option from the Tables node
Interactively from within a database diagram when you right-click the canvas
Through code; specifically, the CREATE TABLE and ALTER TABLE statements run in a query window
Through SSIS
Using the SQL-SMO object model

Each route represents a different mechanism for creating a table. In this section, we’ll look at the CREATE TABLE and ALTER TABLE statements, along with the interactive design tools in Management Studio. The SQL-SMO object model is covered in Chapter 11. Using SSIS as a database creation tool is beyond the scope of this book.

Before you decide on the most convenient method of getting from point A to point B in table creation and configuration, let’s first talk about database design. Modeling, architecture, design, and metadata creation are essential before we can build our databases and applications.

Granted, you will always be adding more tables, expanding existing ones, and changing the attributes of the tables (refactoring). Initially, however, you need to sketch out the basic design of the database and its tables before creating the objects in the database. You can use sophisticated modeling tools, such as ER-Win, Rational Rose, and Visio, or you can use the limited design tools you have in Management Studio (the more experience you gain in database design, the less tools you will need).

If you don’t have any design tools handy, then a large whiteboard or a single large sheet of paper, pencils, and erasers will do. You could go from the whiteboard or paper to the design tool of your choice, or if you are strong at T-SQL, you could create what you need using the CREATE TABLE statement.

You should design the entire database or application from the get-go, and all the tables you’ll need to get the job done at version one. Working tables piecemeal can be really taxing on the overall development project and is never recommended. You really need to see the entire logical model from a much higher level, rather than working on a table-by-table basis, and this is where modeling can be of use.

You will need to understand and formulate a plan around how the tables are going to be normalized, such as to what levels-3NF, 4NF, and so on; and you need to decide on the placing of primary and foreign keys, indexes, and data types. Constraints are not that important at this stage; I believe it’s more important to get the logical design in place before worrying about the integrity of the data that’s going into the tables. Integrity can come with the data integrity plan, which we discuss in Chapter 12.

So, on to table creation with the help of T-SQL.

CREATE TABLE

The CREATE TABLE statement may seem a little daunting if you have not seen or worked with it before. However, if working in source or script code is your thing, then you’ll find CREATE TABLE (and ALTER TABLE) much more powerful and possibly a lot easier than working with the dialog boxes in Management Studio.

 CREATE TABLE     [ database_name . [ schema_name ] . | schema_name . ] table_name         ( { <column_definition> | <computed_column_definition> }         [ <table_constraint> ] [ ,...n ] )     [ ON { partition_scheme_name ( partition_column_name ) | filegroup         | "default" } ]     [ { TEXTIMAGE_ON { filegroup | "default" } ] [ ; ] <column_definition> ::= column_name <data_type>     [ COLLATE collation_name ]     [ NULL | NOT NULL ]     [         [ CONSTRAINT constraint_name ] DEFAULT constant_expression ]       | [ IDENTITY [ ( seed ,increment ) ] [ NOT FOR REPLICATION ]     ]     [ ROWGUIDCOL ] [ <column_constraint> [ ...n] ] <data type> ::= [type_schema_name . ] type_name    [ (precision [ , scale] | max |       [ {CONTENT | DOCUMENT} ] xml_schema_collection)] <column_constraint> ::= [CONSTRAINT constraint_name] {     {PRIMARY KEY | UNIQUE}        [CLUSTERED | NONCLUSTERED]        [           WITH FILLFACTOR = fillfactor         | WITH ( < index_option > [ , ...n])          ]        [ON {partition_scheme_name (partition_column_name)           | filegroup | "default"}]    | [FOREIGN KEY]         REFERENCES [schema_name .] referenced_table_name [ (ref_column)         [ ON DELETE {NO ACTION | CASCADE | SET NULL | SET DEFAULT}]         [ ON UPDATE {NO ACTION | CASCADE | SET NULL | SET DEFAULT}]         [ NOT FOR REPLICATION] | CHECK [NOT FOR REPLICATION] ( logical_expression) } <computed_column_definition> ::= column_name AS computed_column_expression [ PERSISTED [NOT NULL ] ] [     [ CONSTRAINT constraint_name]     { PRIMARY KEY | UNIQUE}         [ CLUSTERED | NONCLUSTERED]         [             WITH FILLFACTOR=fillfactor           | WITH ( <index_option> [ , ...n])         ]    |   [ FOREIGN KEY]         REFERENCES referenced_table_name [ (ref_column)]         [ ON DELETE {NO ACTION | CASCADE}]         [ ON UPDATE {NO ACTION}]         [ NOT FOR REPLICATION]    |   CHECK [NOT FOR REPLICATION] (logical_expression)    |   ON {partition_scheme_name (partition_column_name)          | filegroup "default"}] ] < table_constraint > ::= [ CONSTRAINT constraint_name] {   { PRIMARY KEY | UNIQUE}       [ CLUSTERED | NONCLUSTERED]                (column [ASC | DESC] [ , ...n])       [           WITH FILLFACTOR = fillfactor          |WITH ( <index_option> [ , ...n])      ]       [ ON {partition_scheme_name (partition_column_name)           | filegroup | "default"}]   | FOREIGN KEY               (column [ ,...n])       REFERENCES referenced_table_name [ (ref_column [ ,...n])]       [ ON DELETE {NO ACTION | CASCADE SET NULL | SET DEFAULT}]       [ ON UPDATE {NO ACTION CASCADE SET NULL | SET DEFAULT}]       [ NOT FOR REPLICATION]   | CHECK [NOT FOR REPLICATION] (logical_expression) } <index_option> ::=      PAD__INDEX={ON | OFF}   | FILLFACTOR=fillfactor   | IGNORE_DUP_KEY={ON | OFF}   | STATISTICS_NORECOMPUTE={ON | OFF}   | ALLOW_ROW_LOCKS={ON | OFF}   | ALLOW_PAGE_LOCKS ={ON | OFF}}

So let’s go through the main arguments.

The database_name option

The database_name argument specifies the name of the database in which the table is created. If you omit the database_name in the script, SQL Server will install the table to the current table by default. You will obviously need a logon for the current connection and an associated user ID in the database specified here; and create table permissions are required.

The owner option

The owner is the name of the user ID that owns the new table; this must be an existing user ID in the database. If you do not specify an owner, SQL Server defaults to the user ID associated with the logon for the current connection in the database.

You can also transfer the ownership if you are a member of the sysadmin fixed server role or the db_owner or db_ddladmin fixed database roles. You can also create the table if all you have is create table rights, but you have to specify a user ID associated with the current logon. If you do not specify an owner but you are a member of the sysadmin fixed server role, the owner of the table will default to the dbo user.

The table_Name option

The table_name that you provide must conform to the rules for identifiers (see Chapter 16). The identifier must also be unique and fully qualified in dot notation. An example of a qualified unique identifier is comptroller.account_pay or dbo.customers.

The table_name can contain a maximum of 128 characters. Temporary table names must be prefixed with a single number sign (#), and they cannot exceed 116 characters. (See Chapter 16 for a discussion on the new table data type, which might suit your purpose better than temporary tables.)

The column_name option

The column_name represents the name of a column in the table. The column names must also conform to the rules for identifiers as discussed previously, and they must be unique in the table you are creating. The only time you do not need to specify a column name is when you create a column to hold timestamp data. SQL Server will just use “timestamp” for the column name if you leave it out.

The computed_column_expression option

The computed_column_expression refers to an expression defining the value of a computed column. A computed column is not physically created and stored in the table but is computed from an expression using other columns in the same table. You can create a computed column to hold a computing expression. For example, the expression “total AS subtotal * discount” returns a single value stored in this column.

The expression you use does not need to be a mathematical expression. It can also be another noncomputed column name, constant, function, variable, or any combination of these connected by one or more operators. The only exception is that the expression cannot be a subquery.

Computed columns can be used in select lists, WHERE clauses, ORDER BY clauses, or any other locations in which regular expressions can be used. The following exceptions, however, are important considerations:

You cannot define a DEFAULT or a FOREIGN KEY constraint or a NOT NULL constraint definition over this so-called column. But the computed column can be used as a key column in an index or as part of any PRIMARY KEY or UNIQUE constraint. The proviso is that the computed column value must be defined by a deterministic expression and the data type of the result must be compatible with index columns.

Using the nondeterministic function GETDATE(), for example, would not fly, because it changes with every invocation and thus cannot be indexed.
You cannot reference a computed column from an INSERT or UPDATE statement.

SQL Server may insert a NULL value into the column, and you can use the COLUMNPROPERTY function (AllowsNull property) to investigate the nullability of the computed columns. You can also call the ISNULL(check_expression, constant) function and if necessary change the value to a nonnull.

ON {filegroup | }

The ON {filegroup | DEFAULT} placeholder specifies a filegroup for the table (see the discussion on filegroups earlier in this chapter). If you do not specify ON, the table will be stored in the default filegroup.

The ON {filegroup | DEFAULT} can also be specified in a PRIMARY KEY or a UNIQUE constraint. If filegroup is specified, however, the index is stored in the named filegroup. If DEFAULT is specified, the index is stored in the default filegroup. If you omit the filegroup from a constraint definition, then the index will be stored in the same filegroup as the table.

When you specify DEFAULT here, you must delimit it, because it not a keyword. You can delimit it with double quotes (for example, ON “DEFAULT”) or with square brackets (for example, ON [DEFAULT]).

Text Images in Filegroups

The TEXTIMAGE_ON {filegroup | DEFAULT} specification indicates that the text, ntext, image, xml, varchar(max), nvarchar(max), varbinary(max), and CLR user-defined type columns are stored on the specified filegroup defined in the preceding argument. However, TEXTIMAGE_ON is not allowed if there are no text, ntext, or image columns in the table. If you do not specify TEXTTMAGE_ON, the text, ntext, and image columns are stored in the same filegroup as the table.

TEXTIMAGE_ON is not allowed if there are no large value columns in the table. TEXTIMAGE_ON cannot be specified if <partition_scheme> is specified. If “default” is specified, or if TEXTIMAGE_ON is not specified at all, the large value columns are stored in the default filegroup. The storage of any large value column data specified in CREATE TABLE cannot be subsequently altered.

The data_type option

The data_type argument specifies the column data type. You can use any system- or user-defined data types. To add the user-defined data type, you can call the system stored procedure sp_addtype in the script, as long as you create it before trying to use it in a table definition.

The NULL/NOT NULL assignment for a user-defined data type can be overridden during the CREATE TABLE statement, but you cannot specify a length for a user-defined data type in the CREATE TABLE statement.

Default

The DEFAULT argument specifies the value that SQL Server will automatically insert in your column if you do not provide a value when using INSERT.

The constant_expression option

The constant_expression placeholder is used to specify a constant, a NULL value, or a system function used as the default value for the column.

Identity

The IDENTITY argument specifies that the respective column is an identity column. These are commonly used in conjunction with PRIMARY KEY constraints to serve as the unique row identifier for the table. The actual IDENTITY value or property can be of the data type tinyint, smallint, int, bigint, or the decimal(p,0), or numeric(p,0) columns. You must also specify both the seed and increment, or you can leave them out altogether, in which case the default of (1,1) will be used (seed, increment).

Seed

The seed is the value SQL Server uses for the very first row loaded into the table. If the seed is “1,” the first row will be “1.”

Increment

The increment specifies the incremental value added to the identity value for the preceding row. For example, if the last row ID was “2785” and the increment is “5,” the next row’s value will be “2790.”

Not for Replication

The NOT FOR REPLICATION argument specifies that the IDENTITY property should not be enforced when a replication logon such as sqlrepl inserts data into the table. Replicated rows must retain the key values assigned in the publishing database, as demonstrated in the preceding chapter. The NOT FOR REPLICATION clause ensures that rows inserted by a replication process are not assigned new identity values. Rows inserted by other processes continue unaffected by this restriction. (You can also manage these values and values inserted by the publishers with CHECK constraints on the local table.)

Rowguidcol

This argument indicates that the new column is a row global uniqueidentifier column. There can be only one uniqueidentifier column per table and the ROWGUIDCOL property can be assigned only to a uniqueidentifier column. (Make sure not to install a ROWGUIDCOL on a database whose compatibility level is 65 or lower [version 6.5]. See Appendix or Books Online for system stored procedure sp_dbcmptlevel.)

It is important to understand that the ROWGUIDCOL property does not enforce uniqueness of the values stored in the column. It also does not behave like the identity column and automatically generate values for new rows inserted into the table. However, you can generate unique values for each column by either using the NEWID() function on INSERT statements or using the NEWID() function as the default for the column (NEWID() is discussed further in Chapter 16).

The collation_name option

The collation_name placeholder specifies the language collation for the column. (Collations are introduced in this chapter.) The collation name can be either a Windows collation name or a SQL collation name. Collations govern the management of text strings in the database, so this argument is applicable only for columns of the char, varchar, text, nchar, nvarchar, and ntext data types. If you do not specify the collation, the column will be assigned either the collation of the user-defined data type (if the column is of a user-defined data type) or the default colla tion of the database. The database collation is either specifically set or inherited from msdb.

The constraint_name option

The constraint_name argument is optional, but when you use it, you need to specify the constraint name, which must be unique to the database.

Null

The NULL | NOT NULL keywords are used to specify whether null values are allowed in the column. NULL is not really a constraint per se but can be specified in the same manner as NOT NULL.

Primary Key

The PRIMARY KEY argument is a constraint that enforces entity integrity for a given column or columns through a unique index.

Unique

The UNIQUE argument specifies the constraint that provides entity integrity for a given column or columns through a unique index. A table can have multiple UNIQUE constraints.

Index Clustering

Index clustering is specified through the CLUSTERED|NONCLUSTERED keywords used to indicate that a clustered or nonclustered index is created for the PRIMARY KEY or UNIQUE constraint. PRIMARY KEY constraints default to CLUSTERED, and UNIQUE constraints default to NONCLUSTERED.

You cannot specify CLUSTERED for more than one constraint in a CREATE TABLE statement, and you’ll see that enforced, even by Management Studio. If you specify CLUSTERED for a UNIQUE constraint and also specify a PRIMARY KEY constraint, the PRIMARY KEY defaults to NONCLUSTERED.

Fillfactor

The argument [WITH FILLFACTOR = fillfactor] specifies how full SQL Server should make each index page when it creates it to store the index data. You can set the fillfactor values from 1 through 100, with a default of 0. A lower fill factor creates the index with more space available for new index entries without having to allocate new space.

Referential Integrity

Referential integrity is specified through the FOREIGN KEY…REFERENCES arguments. The FOREIGN KEY constraints can reference only columns that are PRIMARY KEY or UNIQUE constraints in the referenced table or columns referenced in a UNIQUE INDEX on the referenced table. This is done through the following arguments:

The ref_table argument, which specifies the name of the table referenced by the FOREIGN KEY constraint.
The (ref_column[,…n]) argument, which specifies a column or list of columns from the table referenced by the FOREIGN KEY constraint.

Cascading Referential Integrity

The ON DELETE {CASCADE|NO ACTION} and ON UPDATE {CASCADE|NO ACTION} arguments specify what action is taken on a row in the table created if a referential relationship with a row in a foreign table is established. The default is NO ACTION, which may not be wise (see Chapter 17). However, if CASCADE is specified and a row is deleted or updated in a parent table, then the row it references in the new table is also deleted or updated. If NO ACTION is specified, SQL Server raises an error and the delete or update actions on the rows in the parent table are rolled back. This is further discussed in Chapter 12.

Check

The CHECK argument refers to a constraint that enforces domain integrity by limiting the possible values that can be entered into a column or columns. The NOT FOR REPLICATION keywords are used to prevent the CHECK constraint from being enforced during the distribution process used by replication.

The logical_expression placeholder is a logical expression that returns TRUE or FALSE. The column placeholder is a column or list of columns, in parentheses, used in table constraints to indicate the columns used in the constraint definition.

The [ASC|DESC] placeholder specifies the order in which the column or columns participating in table constraints are sorted. The default is ASC. Finally, the n placeholder indicates that the preceding item can be repeated n number of times.

ALTER TABLE is almost identical to CREATE TABLE, so we will not go through the syntax for it, and besides, you can reference the entire syntax in Books Online.

Creating a table in Enterprise Manager is obviously a lot easier for a novice than going the T-SQL script route. As I mentioned earlier, however, by building a T-SQL script you afford yourself a thorough education on what it takes to build a database and put tables in it. In any event, most CREATE TABLE code only needs to be a few lines.

Creating a Table Using Management Studio

To create a table in Management Studio, you need to perform the following actions:

Drill down to the database into which you need to add the new table, expand the database to reveal the Tables node. Right-click the Tables node and select New Table from the context menu. A table designing workspace will appear on the first tab of the workspace pane.
Create and configure the columns as needed using the Column Properties pane. Note that the Table Designer menu becomes visible when you select New Table. If you chose to show the associated toolbar, it too will appear under the menu bar. The Table Designer lets you create and manage constraints, indexes, and so on. The integrity constraints, checks, and so on are fully discussed in Chapter 12.

The Table Designer essentially does what the T-SQL scripts do.

Working with Tables in Database Diagrams

Database diagrams are not something new. You can use the Database Diagrams workspace to add all the tables you need to relate for a complete record. The difference between the Database Diagrams tool and something like Rational Rose or Visio is that the tables are either added or created on the fly before they are related, while external tools are used to generate code or scripts or routines to run against SQL Server.

Nevertheless, you can start with the database diagram, as is described in the following two sections.

Database Diagrams

To create a diagram with or without tables, do as follows:

Expand the database node for the database you are working with. Right-click the Database Diagrams node and select New Database Diagram from the context menu. The Add Table dialog box opens, from which you can “add” existing tables to the diagram work space. This dialog box is shown in Figure 6–5.

Figure 6–5: Adding tables to a Database Diagram workspace
You can also right-click anywhere in the New Diagram pane and select New Table from the context menu. The Choose Name dialog box loads. Enter the name and click OK. The new table is inserted into the database diagram, as illustrated in Figure 6–6. The features you have to work with here are identical to what is available from the Table Designer menus.

Figure 6–6: Working with the Database Diagram tool

You can now work with the tables and add or delete from the database diagram as you need. The Database Diagram menus include a number of tools to use with diagrams.

For example, if you have too many tables to work with you can zoom into or out of the workspace by selecting the Zoom menu item.

Note

There are several options using Management Studio to generate SQL scripts to create databases and tables. As when using its predecessors, you can right-click any database, table, view, or object and choose to script the target to code as a CREATE or ALTER statement. There is also a button labeled Script on the New Database dialog box (see Figure 6–1) that lets you create scripts. You can then save the script to execute at any time.