|
|
||
|
|
||
|
|
||
After the conceptual model is approved and obtains the
Modeling terminology changes when it
|
Conceptual Model |
Logical Model |
|---|---|
|
Entity |
Table |
|
Attribute |
Column |
|
Relationship |
Foreign Key |
To create a logical model, complete the following steps:
Find the appropriate type for each column.
Find the primary key of each table.
Normalize the database.
Add additional validation rules.
|
More
|
Normalization
is the process in a relational database that both reduces redundancy and
|
Each time you define a column to capture an object attribute, you must define a datatype. The
datatype
is an attribute of each column that specifies what kind of data the column will store. For example, you may want the
SQL Server 2005 supports twenty-eight system datatypes and also allows programmers to create their own. The datatypes provided by the system are shown in Table 5-2.
|
bigint |
binary |
bit |
char |
cursor |
|
datetime |
decimal |
float |
image |
int |
|
money |
nchar |
ntext |
numeric |
nvarchar |
|
real |
smalldatetime |
smallint |
smallmoney |
sql_variant |
|
table |
text |
timestamp |
tinyint |
varbinary |
|
varchar |
uniqueidentifier |
xml |
Defining the appropriate datatype is critical in maintaining the integrity of the database. For example, if you use the char datatype to define the Price column, users may store
To define a datatype graphically, you may use your modeling tool or a T-SQL statement. To define a datatype using Microsoft Visio 2003, perform the following steps.
Select the table you want to model.
If the Database Properties window is not displayed, right-click the table and choose Database Properties from the shortcut menu.
Select Columns from the Categories list in the Database Properties window.
Make sure that the Physical DataType (Microsoft SQL Server) option is selected.
Select the column you want to model from the Column list.
Enter the datatype in the Data Type column, or select the datatype from the Combo box.
To define a datatype using T-SQL language, use the CREATE TABLE statement. This statement defines the table and its columns with the
CREATE TABLE <NameofTable> (<ColumnName> <DataType> , <ColumnName> <DataType) , ...)
For example, the following statement creates a department table.
CREATE TABLE Departments(DepartmentName NVARCHAR(50) , DivisionName NVARCHAR(50) , ModifiedDate DATETIME)
SQL Server 2005 offers wide support for numeric attributes, which include integer datatypes, precise decimal numbers, and approximate
Integers and
|
Datatype |
Storage Space |
Min Value to Max Value |
|---|---|---|
|
bit |
1/8 of byte |
0 to 1 |
|
tinyint |
1 byte |
0 to 255 |
|
smallint |
2 bytes |
32,768 to 32,767 |
|
int |
4 bytes |
2,147,483,648 to 2,147,483,647 |
|
bigint |
8 bytes |
9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
Refer to the following guidelines to help you determine which integer datatype to use in different scenarios.
Use the bit datatype for True/False, Flag, or Status attributes. Examples include Sex (Male/Female), Reorder (True/False), Enable (True/False), and InProcess (0/1). Keep in mind that the bit datatype also
Use the tinyint datatype for very small numbers. Examples include Number of Children, Line Number, Floor, and Age. The tinyint datatype stores only positive numbers (or zero) and does not allow negative numbers.
Use the smallint datatype for small numbers. Examples include Room Number, Years, and Seat Number.
Use the int datatype for almost every number you find in business applications. Examples include Quantity Ordered,
Use the bigint datatype only when very large numbers are required. These numbers are rarely required in business applications, but you may encounter them in scientific situations.
Precise numbers and accounting data
Precise numbers
in T-SQL language allow the storage of numbers with decimal parts. This means that a number may contain fractions as long as it is expressed as a division of a power of ten. Financial and accounting information, including prices and amounts, are
SQL Server 2005 provides four precise datatypes: decimal, money, numeric, and smallmoney. The difference between each datatype is the range of numbers they support and the storage space that each column of that datatype occupies. Other important factors include the datatypes precision and scale.
Precision is the maximum number of digits you want a datatype to store. For example, the number 12,345.12 contains seven digits. Precision includes all of the digits before and after the decimal point. The scale of the datatype is the number of decimal digits; therefore, the scale of 12,345.12 is two.
Table 5-4 provides specific information about precision datatypes that may help you make decisions when choosing these datatypes.
|
Datatype |
Space |
Precision |
Scale |
Min Value to Max Value |
|---|---|---|---|---|
|
decimal |
5 bytes |
9 |
05 |
999,999,999 to 999,999,999 |
|
numeric |
9 bytes |
19 |
09 |
10 19 +1 to 10 19 1 |
|
dec |
13 bytes |
28 |
013 |
10 28 +1 to 10 28 1 |
|
17 bytes |
38 |
017 |
10 38 +1 to 10 38 1 |
|
|
money |
8 bytes |
Approx. 18 |
4 |
922,337,203,685,477.5808 to 922,337,203,685,477.5807 |
|
smallmoney |
4 bytes |
Approx. 9 |
4 |
214,748.3648 to 214,748.3647 |
| Note |
The dec datatype is not included as a system datatype, but can be used to define columns in a table. The dec datatype is part of the SQL-92 ANSI standard. |
The numeric and decimal datatypes are equivalent and can be used with different precisions and
Table 5-5 displays the minimum datatype required to store each number.
|
Datatype |
Example |
Precision |
Scale |
|---|---|---|---|
|
decimal |
54,143.9481 |
9 |
4 |
|
numeric |
98,145,875,298.12 |
13 |
2 |
|
dec |
0.12679 |
5 |
5 |
|
money |
103,785,675.9573 |
Fixed |
Fixed |
|
smallmoney |
45,985.4503 |
Fixed |
Fixed |
Scientific and engineering data
Scientific and engineering applications frequently use very large or very small numbers. For example, astronomy applications may store the mass and
For these purposes, SQL Server 2005 offers approximate number datatypes. These datatypes do not store an exact number, but instead use approximate floating-point data. SQL Server 2005 provides three approximate datatypes: float, real, and double. The real and double datatypes are float datatypes with predefined mantissa bits. Table 5-6 provides specific information about approximate datatypes that may help you make decisions about your choice of datatype.
|
Datatype |
Storage Space |
Mantissa Bits |
|---|---|---|
|
float |
4 |
124 |
|
8 |
2553 |
|
|
real |
4 |
24 |
|
double |
8 |
53 |
| Important |
Do not use approximate number datatypes (float, real, double) to store financial information because aggregations will result in round-off errors. |
Because approximate number datatypes use natural logarithms to store values, the critical factor is the precision of the datatype. The real datatype uses twenty-four bits to store the mantissa, and the double datatype uses fifty-three bits, thereby controlling the precision and size of the datatype.
| Note |
A
mantissa
is the positive
|
Users also need to store string attributes in the database. These attributes save information such as Name, Address, Title, and E-mail. These datatypes are
|
Stringc Types |
Character |
Unicode |
|---|---|---|
|
Fixed |
char |
nchar |
|
Variable |
varchar |
nvarchar |
Variable datatypes use the var prefix, and unicode datatypes use the n prefix.
Unicode and character datatypes
Unicode and character datatypes use different
Unicode datatypes
are an industry standard designed to allow text and symbols from all languages to be consistently represented and manipulated by computers. Therefore, information stored in unicode datatypes can be consistently communicated between computers, thereby always resulting in the decoding of the original character. In contrast,
character datatypes
use collations to control the physical storage of character strings by using a specific code page. If the code page
The main advantage of character datatypes is efficiency. In general, character datatypes are more efficient than unicode datatypes because most collations use one byte per character, while their unicode
Fixed and variable-length datatypes
Another option to consider when using string datatypes is whether to use fixed or variable-length datatypes.
Fixed datatypes
(char, nchar) use a permanent amount of space regardless of the value of the column or variable. For example, if you declare a column as type char(10) and it has the value of Test, it will
In contrast, variable-length datatypes adjust their storage to accommodate the value of the column, but require two additional bytes to control the length of the value. Therefore, if you declare a column as type varchar(10) and a row has the value of Test, it will employ six bytes to store the value, two bytes for length control, and four bytes to store the actual value. The same column using nvarchar(10) will use ten bytes (two for length, eight for the actual value), and SQL Server 2005 will not add additional spaces.
String syntax To define string attributes in T-SQL, use the following syntax.
CREATE TABLE <NameofTable> (<ColumnName> CHAR(<Size>)
, <ColumnName> VARCHAR(<MaxSize>)
<ColumnName> NCHAR(<Size>)
, <ColumnName> NVARCHAR(<MaxSize>)
, ...)
For example, the following code will create a Department table to store the attributes Code, Name, and AlternateName.
CREATE TABLE Departments(DepartmentCode CHAR(6) , DepartmentName VARCHAR(50) , AlternateName NVARCHAR(50))
Character datatypes (char, varchar) support up to 8,000 characters per column, and unicode datatypes (nchar, nvarchar) can hold up to 4,000 characters.
Long text datatypes When the attribute you are capturing exceeds the maximum capacity of string datatypes, you may use the varchar(max) or nvarchar(max) datatype. Use these datatypes when the size of column values varies significantly and when the size might exceed 8,000 bytes. These datatypes support up to 2,147,483,647 characters (two gigabytes). The varchar(max) datatype uses collation to code characters, and nvarchar(max) uses unicode. Remember that unicode datatypes occupy two bytes per character; therefore, nvarchar(max) holds only 1,073,741,823 characters.
| Note |
SQL Server 2005 also supports the text and ntext datatypes. Avoid using these datatypes because they are
|
Choosing the appropriate datatype Refer to the following guidelines to help you determine which datatype to use in each situation.
Appropriately size the column to collect the values of the attribute, but do not add additional unnecessary space. Using an adequate length will aid in data validation and provide better performance.
Use fixed-length datatypes (char, nchar) when the values of the attribute do not vary considerably in length. For example, to store a Social Security number, use a char datatype instead of varchar. This will result in better storage usage and performance.
Use variable-length datatypes (varchar, nvarchar) when the values of the attribute vary considerably in length. For example, to store an address attribute, use a varchar datatype instead of char. This will result in better storage usage and performance.
Use character datatypes when all users of the database utilize the same language or when you can set a general collation to represent all characters in their languages. Using character datatypes will result in more efficient databases.
Use unicode datatypes when users will store strings from different writing systems (e.g., Latin, Kanji, Arabic, Greek) and you cannot set a general collation to represent all characters in their languages. Unicode will support a wider range of characters and avoid those errors produced when using different code pages.
In business applications, the need to store date and time attributes occurs frequently. For example, you may want to store the date and time that an order is received or shipped, the date an employee is
|
Type |
Size |
Accuracy |
Min to Max Value |
|---|---|---|---|
|
datetime |
8 bytes |
3.33
|
January 1, 1753, to December 31, 9999 |
|
smalldatetime |
4 bytes |
1 minute |
January 1, 1900, to June 6, 2079 |
| Best Practices |
Do not use string datatypes (char, nchar, varchar, nvarchar) to store dates. Storing dates as strings makes data validation more difficult, exposes the application to internalization errors, and is frequently detrimental to performance. |
Some applications need to store images (such as JPG, GIF, and BMP files) or documents (such as Microsoft Excel workbooks or Microsoft Word documents). To support those needs, SQL Server 2005 supports binary datatypes. Binary datatypes are raw sequences of bytes that SQL Server 2005 does not try to encode or decode as it does with string datatypes. Binary information can be stored in three datatypes: binary(n), varbinary(n), and varbinary(max), as detailed below in Table 5-9.
|
Datatype |
Description |
Max Width |
|---|---|---|
|
binary(n) |
Fixed-length binary data |
8000 bytes |
|
varbinary(n) |
Variable-length binary data |
8000 bytes |
|
varbinary(max) |
Long binary data |
2,147,483,647 bytes |
| Best Practices |
SQL Server 2005 also supports the image datatype. Avoid using the image datatype because it is marked as obsolete, and future versions will remove it. |
The following is an example of how to declare binary datatypes in SQL server 2005.
CREATE TABLE Applicants(, ApplicantID INT , Name VARCHAR(25) , LastName VARCHAR(25) , Curriculum VARBINARY(MAX) , PictureVARBINARY(300000))
SQL Server 2005 also offers additional datatypes, which are listed below.
Uniqueidentifier
A datatype used to stored globally unique identifiers (GUIDs). A GUID is a numbering mechanism that
XML
The xml datatype is one of the most interesting and complex datatypes
T-SQL
CREATE TYPE PhoneNumber FROM VARCHAR(14);
Common language runtime-based UDTs If you want to create your own datatypes without being limited by system datatypes, you may create your own common language runtime (CLR) UDTs using .NET languages such as Visual Basic.NET, C#.NET, and C++.NET. To create a CLR UDT, you must create a structure and use the Microsoft.SqlServer.Server.SqlUserDefinedTypeAttribute . After the structure is created, it must be compiled as an assembly and deployed in the database. Finally, the server must be enabled to execute CLR code.
Determining how your database will implement integrity is an important part of the database designing process. To do so, you must identify which values are valid for each column in your database. After the values have been identified, you then create constraints that help keep bad data from entering your database. SQL Server 2005 provides the following constraints: NOT NULL, DEFAULT, PRIMARY KEY, UNIQUE, CHECK, and FOREIGN KEY.
The NOT NULL constraint is equivalent to declaring an attribute as being required instead of optional. As a result, it will not allow rows that do not have a value in that column. For example, in the following table, the Number, Name, Social Security Number, and Hire Date columns are required and the Termination Date is optional.
CREATE TABLE Employee (EmployeeNumber INT NOT NULL
, EmployeeName VARCHAR(50) NOT NULL
, EmployeeSSN VARCHAR(15) NOT NULL
, HireDate SMALLDATETIME NOT NULL
, TerminationDate SMALLDATETIME NULL)
Try to design all columns with the NOT NULL constraint, and allow NULL values only in rare cases.
Another useful constraint is DEFAULT , which allows you to declare a column value if one is not specified when the row is inserted. For example, if you have an E-mail column in the Employee table, recently hired employees may not have an e-mail account. If you do not want the column to support NULL values, you may create a DEFAULT constraint. The following code creates the table.
CREATE TABLE Employee (EmployeeNumberINTNOT NULL
, EmployeeNameVARCHAR(50)NOT NULL
, EmployeeSSNVARCHAR(15)NOT NULL
, EmailAccountVARCHAR(50)NOT NULL
DEFAULT('unknown@mycompany.com')
, HireDateSMALLDATETIMENOT NULL
, TerminationDateSMALLDATETIMENULL)
Because a table is a set of rows, a row cannot be identified unless you define a primary key. The primary key is a column or combination of columns that identifies each row in the table. For example, the EmployeeNumber column identifies each row in the Employee table because it is a value that is both required (NOT NULL) and distinct in each row (employees do not share employee numbers). The following code creates the PRIMARY KEY constraint of the employee table.
CREATE TABLE Employee (EmployeeNumber INT NOT NULL
PRIMARY KEY
, EmployeeName VARCHAR(50) NOT NULL
, EmployeeSSN VARCHAR(15) NOT NULL
, EmailAccount VARCHAR(50) NOT NULL
DEFAULT('unknown@mycompany.com')
, HireDate SMALLDATETIME NOT NULL
, TerminationDate SMALLDATETIME NULL)
When the primary key involves multiple columns, the PRIMARY KEY constraint is defined at the table level with the following syntax.
CREATE TABLE StateProvince (CountryCode CHAR(3) NOT NULL
, StateCode CHAR(3) NOT NULL
, StateProvinceName VARCHAR(50) NOT NULL
PRIMARY KEY (CountryCode, StateCode))
When you want to enforce the stipulation that a value cannot be repeated in a table, you may define a UNIQUE constraint . For example, if you want the database to control that no employees use the same Social Security number, use a UNIQUE constraint. The following code will check that a Social Security number is used in only one row and that two rows do not have the same value.
CREATE TABLE Employee (EmployeeNumber INT NOT NULL
PRIMARY KEY
, EmployeeName VARCHAR(50) NOT NULL
, EmployeeSSN VARCHAR(15) NOT NULL
UNIQUE)
UNIQUE constraints may also limit a combination of multiple columns. For example, if you want to limit repeated products in a purchase order, the following UNIQUE constraint may help you.
CREATE TABLE OrderDetails (OrderId CHAR(6) NOT NULL
, OrderLine INT NOT NULL
, ProductNumber CHAR(10) NOT NULL
, Quantity DECIMAL(10,2) NOT NULL
, Price DECIMAL(10,2) NOT NULL
, PRIMARY KEY(OrderId, OrderLine)
, UNIQUE(OrderId, ProductNumber))
Certain columns should be limited to a range of values, a pattern, or a certain condition. For example, the Sex column should be limited to Male or Female, the Amount column to positive values, and Social Security Numbers to a string pattern such as ###-##-####. To define such constraints, use CHECK expressions, such as those shown in the code below.
CREATE TABLE Employee (EmployeeNumberINT NOT NULL
CHECK(EmployeeNumber>0)
, EmployeeSSN VARCHAR(15) NOT NULL
CHECK(EmployeeSSN LIKE '[0-9][0-9][0-9][0-9]-[0-9][0-9]-
[0-9][0-9]-[0-9][0-9][0-9][0-9]')
, HireDate SMALLDATETIME NOT NULL
CHECK(HireDate>'19950101')
, TerminationDateSMALLDATETIMENULL
CHECK(TerminationDate IS NULL OR TerminationDate>HireDate))
The FOREIGN KEY constraint limits the values of a column to those that can be found in another table. For example, if you create a DepartmentCode column in the Employee table, you want the database to guarantee that this department exists in the Department table. The FOREIGN KEY constraint links the values of the column with the PRIMARY KEY or UNIQUE constraint in the referenced table, as shown in the following example.
CREATE TABLE StatesProvinces (CountryCode CHAR(3) NOT NULL
FOREIGN KEY REFERENCES Countries(CountryCode)
, StateCode CHAR(3) NOT NULL
, StateProvinceName VARCHAR(50) NOT NULL
, PRIMARY KEY (CountryCode, StateCode)
, UNIQUE (CountryCode, StateProvinceName))
|
|
||
|
|
||
|
|
||