3 4
In this section, we'll review some simple but important concepts about tables. We include a sample database table to introduce you to the basic elements of a table, and you'll learn about system data types and how to create and delete user-defined data types.
A table is an object in a database that stores data in a collection of rows and columns. A table is defined by the columns it contains. Data can thus be organized in a spreadsheet-like format, as illustrated in Table 10-1, which shows a sample database table named Product_Info. (In our examples, we will create this table in the MyDB database, which we created in Chapter 9.)
The Product_Info table is used to store information about each product that is for sale in a store. When a product becomes available for sale, its data is added as a new row in this table. The table contains five columns of information: Product_ID, Product_Name, Description, Price, and Brand_ID. Table 10-1 shows a sampling of three rows of data from the Product_Info table. (The T-SQL command used to create this table, without any data, is shown in the section "Selecting the Correct Data Type" later in this chapter. See Chapter 20 for details on using the INSERT command to insert data into a table.)
Table 10-1. Product_Info database
| Product_ID | Product_Name | Description | Price | Brand_ID | 
|---|---|---|---|---|
| 1 | Five-foot tent | For one or two persons | 80.00 | 12 | 
| 2 | Mini-stove | Kerosene-fueled | 20.00 | 33 | 
| 3 | Backpack | Steel-framed | 60.00 | 15 | 
We'll return to this sample database table throughout this chapter to illustrate more complex aspects of table creation. But first we'll continue to explain the basics you'll need to know to create a table.
To define a table, you must decide what columns to define and what type of data, such as character or numerical data, each column is allowed to hold. You should also decide on an allowable range for that data—for example, you could decide to allow up to 30 characters or 4-byte numbers. You specify these attributes by assigning each column a data type, which is a set of attributes that determine what type and range of data that column can hold. SQL Server provides a number of system data types you can use, or you can create your own by building on the system types. (You can't change a system data type, but you can create a completely new type.)
As mentioned, you specify a data type for each column of a table. Assigning a data type to a column sets the following attributes:
Data types can also affect columns for views, parameters in stored procedures, variables, and T-SQL functions that return one or more data values. The built-in data types provided by SQL Server are defined in Table 10-2. SQL Server 2000 introduces three data types—bigint, sql_variant, and table. (With a few exceptions, noted in this table, the same data types are used for all of the mentioned objects.)
Table 10-2. System data types in SQL Server 2000
| Data type | Description | Storage size | 
|---|---|---|
| bigint | An 8-byte integer (whole number). | 8 bytes | 
| binary[(n)] | Fixed-length binary data of n bytes, where n is a value from 1 through 8000. Use binary when data entries in a column are expected to be close to the same size. | n + 4 bytes | 
| bit | Integer data type that can be a value of 1, 0, or NULL. Bit columns cannot have indexes on them | 1 byte for a table with up to 8-bit columns, 2 bytes for a table with 9-bit through 16-bit columns, and so on | 
| char[(n)] | Fixed-length non-Unicode character data with length of n characters, where n is a value from 1 through 8000 | n bytes | 
| cursor | A reference to a cursor. Can be used only for variables and stored procedure parameters | Not applicable | 
| datetime | Date and time data from January 1, 1753 through December 31, 9999, with accuracy to 3.33 milliseconds | 8 bytes | 
| decimal[(p,[s])] or numeric[(p,[s])] | Fixed-precision and fixed-scale numbers. (The data type numeric is a synonym for decimal.)Precision (p) specifies the total number of digits that can be stored, both to the left and to the right of the decimal point. Scale (s) specifies the maximum number of digits that can be stored to the right of the decimal point. Scale must be less than or equal to precision. The minimum precision is 1, and the maximum precision is 28 unless SQL Server is started with the -p parameter, in which case, precision can be up to 38. | 5 through 17 bytes, depending on precision | 
| float[(n)] | Floating-precision numerical data that can range from -1.79E +308 through 1.79E +308. The value n is the number of bits used to store the mantissa of the float number and can range from 1 to 53 | 4 through 8 bytes, depending on precision | 
| image | Used for variable-length binary data longer than 8000 bytes, with a maximum of 2^31 - 1 bytes. An image column entry is a pointer to the location of the image data value. The data is stored separately from the table data | 16 bytes for the pointer | 
| integer or int | Integer (whole number) data from -2^31 (-2,147,483,648) through 2^31 - 1 (2,147,483,647) | 4 bytes | 
| money | Monetary data values from -2^63 (-922,337,203,685,477.5808) through 2^63 - 1 (922,337,203,685,477.5807), with accuracy to one ten-thousandth of a monetary unit | 8 bytes | 
| nchar[(n)] | Fixed-length Unicode character data of n characters, where n is a value from 1 through 4000.Unicode characters use 2 bytes per character and can support all international characters- | 2 bytes * the number of characters entered | 
| ntext | Variable-length Unicode data with a maximum length of 2^30 - 1 (1,073,741,823) characters.The column entry for ntext is a pointer to the location of the data. The data is stored separately from the table data | 16 bytes for the pointer and 2 bytes * the number of characters entered for the data | 
| nvarchar | Variable-length Unicode data of n characters, where n is a value from 1 through 4000. Recall that Unicode characters use 2 bytes per character and can support all international characters. | 2 bytes * the number of characters entered | 
| real | Floating-precision numerical data that can range from ?3.40E+38 through 3.40E+38. The synonym for real is float(24) | 4 bytes | 
| smalldatetime | Date and time data from January 1, 1900 through June 6, 2079, with accuracy to the minute (less precise than the datetime data type) | 4 bytes | 
| smallint | Integer data from -2^15 (-32,768) through 2^15 - 1 (32,767) | 2 bytes | 
| smallmoney | Monetary data values from -214,748.3648 through 214,748.3647, with accuracy to one ten-thousandth of a monetary unit | 4 bytes | 
| sql_variant | Allows values of different data types. The data value and data describing that value its base data type, scale, precision, maximum size, and collation are stored in this column | Size varies | 
| sysname | A special, system-supplied, SQL Server user-defined data type. The sysname data type is defined by SQL Server as nvarchar(128), which means that it can contain 128 Unicode characters (or 256 bytes). Use sysname to refer to columns that store object names | 256 bytes | 
| table | Similar to using a temporary table the declaration includes a column list and data types. Can be used to define a local variable or for the return value of a user-defined function. | Varies with table definition | 
| text | Used for variable-length non-Unicode character data longer than 8000 bytes. A text column entry can hold up to 2^31 - 1 characters. It is a pointer to the location of the data value. The data is stored separately from the table data | 16 bytes for the pointer | 
| timestamp | A timestamp column is automatically updated every time a row is inserted or updated. Each table can have only one timestamp column. | 8 bytes | 
| tinyint | Integer data from 0 through 255. | 1 byte | 
| unique-identifier | Stores a 16-byte binary value that is a globally unique identifier (GUID) | 16 bytes | 
| varbinary | Variable-length binary data of n bytes, where n is a value from 1 through 8000. Use varbinary when data entries in a column are expected to vary considerably in size. | Actual length of data entered + 4 bytes | 
| varchar[(n)] | Variable-length non-Unicode character data with a length of n characters, where n is a value from 1 through 8000. | Actual length of data entered | 
Selecting the correct data type for each column is important when creating tables. You want the range of values that you need to store in the column to be permitted, and you want to reject values that should not be permitted (such as a character value in a column that requires a number). Data types will take care of this for you. Two questions can help guide you in choosing data types:
Selecting the appropriate data type is a fairly straightforward process. The data that will be entered in a column must conform to the data type specified for that column. Therefore, you should select the data type that best covers the range of values your column might hold over the lifetime of your application, while limiting the amount of wasted space. Wasted space is space that is allocated for a column entry but not used. For example, suppose you have a column that will need to hold only integer values that range from 1 through 100. The integer data type would certainly allow those values, but each integer takes up 4 bytes of space. The tinyint data type allows values from 0 through 255, and it takes up only 1 byte of space. In this case, tinyint would be the best choice because it would save disk space when storing that column's data.
Next you must determine whether to use fixed-length or variable-length data types. If all of the values in a column will be near the same size, a fixed-length data type is more efficient because of the overhead involved with variable-length types. Generally, variable-length data types are appropriate when you expect a wide variance in the length of the data stored in the column and when the column data will not change often. Variable-length data types include varchar, nvarchar, varbinary, text, ntext, and image. Using variable-length data types can result in significant storage space savings. For example, if you define a fixed-length data type that is big enough to store the largest possible value in a column, all smaller values in that column will take up the same big amount of storage space as the largest value does. This results in a huge waste of space if only a small percentage of rows hold the largest value. If the majority of rows will hold a smaller value, the extra space used to hold those smaller values is wasted space. On the other hand, if you define a variable-length data type, the shorter values will take up only the space they need. But again, variable-length data types will require more processing overhead. So if you don't need the variable-length type, use fixed-length. If it makes sense to use a variable-length type because of the space savings, by all means, use it.
So how do you make choices about data type and length for your tables? Follow the preceding guidelines, and make the choices that best fit your application needs. In general, try not to be wasteful with space when you set your column lengths, and don't forget to consider future requirements.
Before we continue, let's take a look at the T-SQL CREATE TABLE command, which can be used to create the Product_Info table shown in Table 10-1 earlier in this chapter. For this example, we will use only system data types and fixed-length columns.
When you issue T-SQL commands to create a table, the table will be created in whichever database you are currently using. To use a particular database, run the USE database_name command, as shown in the following code. In this example, our database is named MyDB. The keyword GO indicates that any previous commands should now be executed. (See Chapter 13 for more details about using T-SQL.)
USE MyDB GO CREATE TABLE Product_Info ( Product_ID smallint, Product_Name char(20), Description char(30), Price smallmoney, Brand_ID smallint ) GO
Let's look at what happens in the preceding code. After the CREATE TABLE command is given, the table name Product_Info is specified. Between the parentheses, each column is then defined by listing the column name followed by its data type. The lengths for the two char data types are set at 20 and 30 because most names will be 20 characters or fewer and most descriptions will be 30 characters or fewer. Product_ID and Brand_ID are both set to the data type smallint instead of tinyint or int because we anticipate more than 255 products and brand types (the maximum value for tinyint) yet fewer than 32,767 (the maximum value for smallint ). Because we will not need values above 32,767, we would be wasting space if we used the int type.
User-defined data types, or alias data types, are system data types that have been customized. Customizing, or defining, a data type is useful when you have several tables that must store the same type of data in a column and you want to ensure that the corresponding columns in each table have exactly the same type, length, and nullability. You can define a data type with a descriptive name for ease of programming and consistency. Then you can use the data type when you create the various tables.
For example, suppose we have another table—this one named Brands—in the same database as the one the Product_Info table is in. The Brands table will have a Brand_ID column that corresponds to the Brand_ID column in the Product_Info table, and the Brands table will provide the brand name and other pertinent brand information. To ensure that the Brand_ID columns in the two tables have the same data type and do not allow null values, you could create a user-defined data type and assign it to both columns. Now imagine that you have several tables with columns that need to have the same attributes. You might not remember whether you used smallint or tinyint for that column in one table or whether you allowed null values. If you had defined a data type with a descriptive name, you would not have to worry what its attributes were each time you used it, and you would be assured that your data types would be consistent among tables.
NOTE
When you define a data type in a particular database, it can be used in only that database. However, if you define the data type in the model database (the template for all other databases), it will exist in all new databases.
You might want to create data types for data such as phone numbers, postal codes, social security numbers, and any other data that you can define clearly and that you will use in more than one database table. When you define a data type, you must supply the following pieces of information:
Once you have decided on these details, you are ready to create the data type. To create a user-defined data type using Enterprise Manager, follow these steps:
   
  
  
 
Figure 10-2. The User-Defined Data Type Properties window.
If you have created a user-defined data type and are no longer using it (or if you made a mistake when creating it and you want to create it again), you can delete it. To delete a user-defined data type, follow these steps:
   
  
  
 
Figure 10-4. The Drop Objects dialog box.
   
  
The list box on the left side of the Dependencies dialog box displays the database objects that depend on your user-defined data type, and the list box on the right displays the objects that your data type depends on. If your data type is being used by any tables or objects (as the data type in our case is), you will not be allowed to delete it—when you try to do so (by following step 5), an error message will appear, as shown in Figure 10-6.
  
 
Figure 10-6. The error message that appears if you attempt to delete a data type that is in use.
The system stored procedure sp_addtype is a T-SQL command used to add a user-defined data type. Running the command while you are using the model database will enable all new user-defined databases to use the new data type because they are created with the same attributes as the model database. Running the command while you are using a user-defined database will enable the new data type to be used in only that database. (Remember, to use a particular database, you must run the USE database_name command.) The following T-SQL command creates the user-defined data type brand_type in the model database:
USE model GO sp_addtype brand_type, 'smallint', 'NOT NULL' GO
The three parameters for sp_addtype are the name of the user-defined data type, the system data type the new data type is based on, and the nullability of the new data type. This new data type, brand_type, will appear in all new user-defined databases. If you created the user-defined data type in a user database and you want to see the new type in Enterprise Manager, choose Refresh from the Enterprise Manager Action menu.
To drop a user-defined data type that is not in use, run the sp_droptype command in the database in which the data type is defined. The following T-SQL command drops the user-defined data type brand_type from the model database. (Again, if you create this data type in a user database and want Enterprise Manager to show that the type was dropped, choose Refresh from the Enterprise Manager Action menu.)
USE model GO sp_droptype brand_type GO
Let's return to our database table example. We will re-create the Product_Info table, using the new user-defined data type, brand_type, and then we will create the Brands table. It will have a Brand_ID column just as the Product_Info table has, and we will use the same user-defined type, brand_type. First we must drop the old Product_Info table so we can re-create it. The code to perform all these operations appears below.
USE MyDB GO DROP TABLE Product_Info GO CREATE TABLE Product_Info ( Product_ID smallint, Product_Name char(20), Description char(30), Price smallmoney, Brand_ID brand_type ) GO CREATE TABLE Brands ( Brand_ID brand_type, Brand_Name char(30), Supplier_ID smallint ) GO
By assigning the brand_type data type to the Brand_ID columns of both tables, we are guaranteed that the columns will have the same attributes. We don't have to remember the specifics of the underlying data type, but now we can use this data type for all Brand_ID columns.
Who needs to keep track of these user-defined types? The database administrator (DBA) will need to know what user-defined types were used in order to understand whether they are correct and were used properly. (The DBA was probably the person who defined them to begin with, though.) And the application programmer will need to know how these data types are defined when he or she is writing application code. But end users will not need to know about user-defined data types because they will not know the difference.
