Altering a Table | Inside Microsoft SQL Server 2005: The Storage Engine (Solid Quality Learning)

SQL Server 2005 allows existing tables to be modified in several ways. Using the ALTER TABLE command, you can make the following types of changes to an existing table:

Change the data type or NULL property of a single column
Add one or more new columns, with or without defining constraints for those columns
Add one or more constraints
Drop one or more constraints
Drop one or more columns
Enable or disable one or more constraints (applies only to CHECK and FOREIGN KEY constraints)
Enable or disable one or more triggers

Changing a Data Type

By using the ALTER COLUMN clause of ALTER TABLE, you can modify the data type or NULL property of an existing column. But be aware of the following restrictions:

The modified column can't be a text, image, ntext, or rowversion (timestamp) column.
If the modified column is the ROWGUIDCOL for the table, only DROP ROWGUIDCOL is allowed; no data type changes are allowed.
The modified column can't be a computed or replicated column.
The modified column can't have a PRIMARY KEY or FOREIGN KEY constraint defined on it.
The modified column can't be referenced in a computed column.
The modified column can't have the type changed to timestamp.
If the modified column participates in an index, the only type changes that are allowed are increasing the length of a variable-length type (for example, VARCHAR(10) to VARCHAR(20)), changing nullability of the column, or both.
If the modified column has a UNIQUE OR CHECK constraint defined on it, the only change allowed is altering the length of a variable-length column. For a UNIQUE constraint, the new length must be greater than the old length.
If the modified column has a default defined on it, the only changes that are allowed are increasing or decreasing the length of a variable-length type, changing nullability, or changing the precision or scale.
The old type of the column should have an allowed implicit conversion to the new type.
The new type always has ANSI_PADDING semantics if applicable, regardless of the current setting.
If conversion of an old type to a new type causes an overflow (arithmetic or size), the ALTER TABLE statement is aborted.

Here's the syntax and an example of using the ALTER COLUMN clause of the ALTER TABLE statement:

SYNTAX ALTER TABLE table-name ALTER COLUMN column-name         { type_name [(  prec  [, scale] ) ] [COLLATE <collation name> ]           [ NULL | NOT NULL ]           | {ADD | DROP} {ROWGUIDCOL | PERSISTED} } EXAMPLE /* Change the length of the emp_lname column in the employee    table from varchar(15) to varchar(30) */ ALTER TABLE employee ALTER COLUMN emp_name varchar(30)

Adding a New Column

You can add a new column, with or without specifying column-level constraints. You can add only one column for each ALTER TABLE statement. If the new column doesn't allow NULLs and isn't an identity column, the new column must have a default constraint defined. SQL Server populates the new column in every row with a NULL, the appropriate identity value, or the specified default. If the newly added column is nullable and has a default constraint, the existing rows of the table are not filled with the default value, but rather with NULL values. You can override this restriction by using the WITH VALUES clause so that the existing rows of the table are filled with the specified default value.

Adding, Dropping, Disabling, or Enabling a Constraint

You can use ALTER TABLE to add, drop, enable, or disable a constraint. The trickiest part of using ALTER TABLE to manipulate constraints is that the word CHECK can be used in three different ways:

To specify a CHECK constraint.
To defer checking of a newly added constraint. In the following example, we're adding a constraint to validate that cust_id in orders matches a cust_id in customer, but we don't want the constraint applied to existing data:
```
ALTER TABLE orders WITH NOCHECK ADD FOREIGN KEY (cust_id) REFERENCES customer (cust_id)
```
Note
Instead of using WITH NOCHECK, I could use WITH CHECK to force the constraint to be applied to existing data, but that's unnecessary because it's the default behavior.
To enable or disable a constraint. In this example, we enable all the constraints on the employee table:

     ALTER TABLE employee      CHECK CONSTRAINT ALL

The only types of constraints that can be disabled are CHECK constraints and FOREIGN KEY constraints, and disabling tells SQL Server not to validate new data as it is added or updated. You should use caution when disabling and re-enabling constraints. If a constraint was part of the table when the table was created or was added to the table using the WITH CHECK option, SQL Server knows the data conforms to the data integrity requirements of the constraint. The SQL Server optimizer can then take advantage of this knowledge in some cases. For example, if you have a constraint that requires col1 to be greater than 0, and then an application submits a query looking for all rows where col1 < 0, if the constraint has always been in effect, the optimizer will know that no rows can satisfy this query and the plan is a very simple plan. However, if the constraint has been disabled and re-enabled without using the WITH CHECK option, there is no guarantee that some of the data in the table won't meet the integrity requirements. You might not have any data less than or equal to 0, but the SQL Server optimizer will not know that when it is devising the plan.

The catalog views sys.check_constraints and sys.foreign_keys each have a column called is_not_trusted. If you re-enable a constraint and don't use the WITH CHECK option to tell SQL Server to revalidate all existing data, the is_not_trusted column will be set to 1.

Although you cannot use ALTER TABLE to disable or enable a PRIMARY KEY or UNIQUE constraint, you can use the ALTER INDEX command to disable the associated index. I'll discuss ALTER INDEX in Chapter 7. You can use ALTER TABLE to drop PRIMARY KEY and UNIQUE constraints, but you need to be aware that dropping one of these constraints automatically drops the associated index. In fact, the only way to drop those indexes is by altering the table to remove the constraint.

Note

You can't use ALTER TABLE to modify a constraint definition. You must use ALTER TABLE to drop the constraint and then use ALTER TABLE to add a new constraint with the new definition.

Dropping a Column

You can use ALTER TABLE to remove one or more columns from a table. However, you can't drop the following columns:

A replicated column
A column used in an index
A column used in a CHECK, FOREIGN KEY, UNIQUE, or PRIMARY KEY constraint
A column associated with a default defined using the DEFAULT keyword or bound to a default object
A column to which a rule is bound

You can drop a column using the following syntax:

ALTER TABLE table-name     DROP COLUMN column-name [, next-column-name]...

Note

Notice the syntax difference between dropping a column and adding a new column: the word COLUMN is required when dropping a column, but not when you add a new column to a table.

Enabling or Disabling a Trigger

You can enable or disable one or more (or all) triggers on a table using the ALTER TABLE command. Triggers are discussed in Inside Microsoft SQL Server 2005: T-SQL Programming.

Internals of Altering Tables

Note that not all the ALTER TABLE variations require SQL Server to change every row when the ALTER TABLE is issued. SQL Server can carry out an ALTER TABLE command in three basic ways:

It might need to change only metadata.
It might need to examine all the existing data to make sure it is compatible with the change but only need to make changes to metadata.
It might need to physically change every row.

In many cases, SQL Server can just change the metadata (primarily the data seen through sys.columns) to reflect the new structure. In particular, the data isn't touched when a column is dropped, when a new column is added and NULL is assumed as the new value for all rows, when the length of a variable-length column is increased, or when a non-nullable column is changed to allow NULLs. The fact that data isn't touched when a column is dropped means that the disk space of the column is not reclaimed. You might have to reclaim the disk space of a dropped column when the row size of a table approaches or has exceeded its limit. You can reclaim space by creating a clustered index on the table or rebuilding an existing clustered index by using ALTER INDEX, as we'll see in Chapter 7.

Some changes to a table's structure require that the data be examined but not modified. For example, when you change the nullability property to disallow NULLs, SQL Server must first make sure there are no NULLs in the existing rows. A variable-length column can be shortened when all the existing data is within the new limit, so the existing data must be checked. If any rows have data longer than the new limit specified in the ALTER TABLE, the command will fail. So you do need to be aware that for a huge table, this can take some time. Changing a fixed-length column to a shorter type, such as changing an int column to smallint or changing a char(10) to char(8), also requires examining all the data to verify that all the existing values can be stored in the new type. However, even though the new data type takes up fewer bytes, the rows on the physical pages are not modified. If you have created a table with an int column, which needs 4 bytes in each row, all rows will use the full 4 bytes. After altering the table to change the int to smallint, we are restricted in the range of data values we can insert, but the rows continue to use 4 bytes for each value, instead of the 2 bytes that smallint requires. You can verify this by using the DBCC PAGE command. Changing a char(10) to char(8) displays similar behavior, and the rows continue to use 10 bytes, but only 8 are allowed to be inserted until the table is rebuilt by creating or re-creating a clustered index.

Other changes to a table's structure require SQL Server to physically change every row, and as it makes the changes, it has to write the appropriate records to the transaction log, so these changes can be extremely resource intensive for a large table. Another negative side effect in most cases is that when a column is altered to increase its length, the old column is not actually replaced. A new column is added to the table, and DBCC PAGE shows you that the old data is still there. I'll let you explore the page dumps for this situation on your own, but we can see some of this unexpected behavior by just looking at the column offsets using the column detail query that I showed you earlier in this chapter.

First create a table with all fixed-length columns, including a smallint in the first position.

CREATE TABLE change (col1 smallint, col2 char(10), col3 char(5)) Now look at the column offsets: SELECT c.name AS column_name, column_id, max_inrow_length, pc.system_type_id, leaf_offset  FROM sys.system_internals_partition_columns pc     JOIN sys.partitions p       ON p.partition_id = pc.partition_id     JOIN sys.columns c          ON column_id = partition_column_id             AND c.object_id = p.object_id WHERE p.object_id=object_id('fixed') RESULTS: column_name column_id   max_inrow_length system_type_id leaf_offset ----------- ----------- ---------------- -------------- ----------- col1        1           2                52             4 col2        2           10               175            6 col3        3           5                175            16

Now change the smallint to int:

ALTER TABLE change    ALTER COLUMN col1 int

And finally, run the column detail query again to see that col1 now starts much later in the row and that no column starts at offset 4 immediately after the row header information. This new column creation due to an ALTER TABLE takes place even before any data has been placed in the table.

column_name column_id   max_inrow_length system_type_id leaf_offset ----------- ----------- ---------------- -------------- ----------- col1        1           4                56             21 col2        2           10               175            6 col3        3           5                175            16

Another drawback to SQL Server's behavior in not actually dropping the old column is that we are now more severely limited in the size of the row. The row size now includes the old column, which is no longer usable or visible (unless you use DBCC PAGE). For example, if I create a table with a couple of large fixed-length character columns, as shown here, I can then ALTER the char(2000) column to be char(3000).

CREATE TABLE bigchange (col1 smallint, col2 char(2000), col3 char(1000)) ALTER TABLE bigchange    ALTER COLUMN col2 char(3000)

At this point, the length of the rows should be just over 4,000 bytes because there is a 3,000-byte column, a 1,000-byte column, and a smallint. However, if I try to add another 3000-byte column, it will fail.

ALTER TABLE bigchange    ADD col4 char(3000) Msg 1701, Level 16, State 1, Line 1 Creating or altering table 'bigchange' failed because the minimum row size would be 9009, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.

However, if I just create a table with two 3,000-byte columns and a 1,000-byte column, there will be no problem.

CREATE TABLE nochange (col1 smallint, col2 char(3000), col3 char(1000), col4 char(3000))

Note that there is no way to ALTER a table to rearrange the logical column order or to add a new column in a particular position in the table. A newly added column always gets the next highest column_id value. When you execute SELECT * on a table or look at the metadata with sp_help, the columns are always returned in column_id order. If you need a different order, you have several options:

Don't use SELECT *; always SELECT a list of columns in the order you want to have them returned.
Create a view on the table that SELECTs the columns in the order you want them, and then you can SELECT * from the view or run sp_help on the view.
Create a new table, copy the data from the old table, drop the old table, and rename the new table to the old name. Don't forget to re-create all constraints, indexes, and triggers.

You might think that SQL Server Management Studio can add a new column in a particular position or rearrange the column order, but this is not true. Behind the scenes, the tool is actually using the preceding third option and creating a completely new table with all new indexes, constraints, and triggers. If you wonder why simply adding a new column to an existing (large) table is taking a long time, this is probably the reason.