3 4
SQL Server 2000 data is stored in databases. Physically, a database consists of two or more files on one or more disks. This physical implementation is visible only to database administrators, and is transparent to users. The physical optimization of the database is primarily the responsibility of the database administrator. This topic is discussed in Chapter 6.
Logically, a database is structured into components that are visible to users, such as tables, views, and stored procedures. The logical optimization of the database (such as the design of tables and indexes) is primarily the responsibility of the database designer. The topic of logical optimization is beyond the scope of this book but is covered in Designing Relational Database Systems by Rebecca M. Riordan (Microsoft Press, 1999, ISBN 0-7356-0634-X).
Each instance of SQL Server 2000 has four system databases. Table 1.6 lists each of these system databases and briefly describes their function.
In addition, each instance of SQL Server 2000 has one or more user databases. The pubs and Northwind user databases are sample databases that ship with SQL Server 2000. Given sufficient system resources, each instance of SQL Server 2000 can handle thousands of users working in multiple databases simultaneously. See Figure 1.5.
Table 1.6 System Databases in SQL Server 2000
System Database | Description |
master | Records all of the system-level information for a SQL Server 2000 system, including all other databases, login accounts, and system configuration settings. |
tempdb | Stores all temporary tables and stored procedures created by users, as well as temporary worktables used by the relational database engine itself. |
model | Serves as the template that is used whenever a new database is created. |
msdb | SQL Server Agent uses this system database for scheduling alerts and jobs, and recording operators. |
Figure 1.5
SQL Server working with multiple user databases.
Each database consists of at least one data file and one transaction log file. These files are not shared with any other database. To optimize performance and to provide fault tolerance, data and log files are typically spread across multiple drives and frequently use a redundant array of independent disks (RAID).
SQL Server 2000 allocates space from a data file for tables and indexes in 64-KB blocks called extents. Each extent consists of eight contiguous pages of 8 KB each. There are two types of extents: uniform extents that are owned by a single object, and mixed extents that are shared by up to eight objects.
A page is the fundamental unit of data storage in SQL Server 2000, with the page size being 8 KB. In general, data pages store data in rows on each data page. The maximum amount of data contained in a single row is 8060 bytes. Data rows are either organized in some kind of order based on a key in a clustered index (such as zip code), or stored in no particular order if no clustered index exists. The beginning of each page contains a 96-byte header that is used to store system information, such as the amount of free space available on the page.
The transaction log file resides in one or more separate physical files from the data files and contains a series of log records, rather than pages allocated from extents. To optimize performance and aid in redundancy, transaction log files are typically placed on separate disks from data files, and are frequently mirrored using RAID.
Data in SQL Server 2000 is organized into database objects that are visible to users when they connect to a database. Table 1.7 lists these objects and briefly describes their function.
Table 1.7 Database Objects in SQL Server 2000
Database Object | Description |
Tables | A table generally consists of columns and rows of data in a format similar to that of a spreadsheet. Each row in the table represents a unique record, and each column represents a field within the record. A data type specifies what type of data can be stored in a column. |
Views | Views can restrict the rows or the columns of a table that are visible, or can combine data from multiple tables to appear like a single table. A view can also aggregate columns. |
Indexes | An index is a structure associated with a table or view that speeds retrieval of rows from the table or view. Table indexes are either clustered or nonclustered. Clustering means the data is physically ordered based on the index key. |
Keys | A key is a column or group of columns that uniquely identifies a row (PRIMARY KEY), defines the relationship between two tables (FOREIGN KEY), or is used to build an index. |
User-defined data types | A user-defined data type is a custom data type, based on a predefined SQL Server 2000 data type. It is used to make a table structure more meaningful to programmers and help ensure that columns holding similar classes of data have the same base data type. |
Stored procedures | A stored procedure is a group of Transact-SQL statements compiled into a single execution plan. The procedure is used for performance optimization and to control access. |
Constraints | Constraints define rules regarding the values allowed in columns and are the standard mechanism for enforcing data integrity. |
Defaults | A default specifies what values are used in a column in the event that you do not specify a value for the column when you are inserting a row. |
Triggers | A trigger is a special class of stored procedure defined to execute automatically when an UPDATE, INSERT, or DELETE statement is issued against a table or view. |
User-defined functions | A user-defined function is a subroutine made up of one or more Transact-SQL statements used to encapsulate code for reuse. A function can have a maximum of 1024 input parameters. User-defined functions can be used in place of views and stored procedures. |
The logical design of the database, including the tables and the relationships between them, is the core of an optimized relational database. Optimizing logical database design begins with a normalized database design. Normalization is the process of removing redundancies from the data. Normalization often involves breaking data from a single file into two or more logical tables in a relational database. For example, you can place customer detail information in one table and place order information in another table, and then relate the two tables based on a customer number. Transact-SQL queries can then recombine the table data by using relational join operations, when needed. By avoiding the need to update the same data in multiple places, normalization improves the efficiency of a client application and reduces the opportunities for introducing errors due to inconsistent data. Some of the benefits of normalization include
However, there are tradeoffs to normalization. A database that is used primarily for decision support (as opposed to update-intensive transaction processing) might not have redundant updates and might be more understandable and efficient for queries if the design is not fully normalized. Nevertheless, data that is not normalized is a more common design problem in database applications than overnormalized data.
Database administrators can detect problems caused by poor database design and must work with database designers, and sometimes client and server application designers, to optimize the overall performance of a database on an instance of SQL Server 2000. Performance monitoring is discussed in Chapter 14.
SQL Server consists of system and user databases that are organized physically into data and log files for each database that are placed on one or more disks. SQL Server is organized logically into objects, such as tables, indexes, and views. Database administrators optimize the physical design and database designers optimize the logical design of databases.