Databases Under the Hood

A database consists of user-defined space for the permanent storage of user objects such as tables and indexes. This space is allocated in one or more operating system files.

Databases are divided into logical pages (of 8 KB each), and within each file the pages are numbered contiguously from 0 to x, with the upper value x being defined by the size of the file. You can refer to any page by specifying a database ID, a file ID, and a page number. When you use the ALTER DATABASE command to enlarge a file, the new space is added to the end of the file. That is, the first page of the newly allocated space is page x + 1 on the file you're enlarging. When you shrink a database by using either the DBCC SHRINKDATABASE or DBCC SHRINKFILE command, pages are removed starting at the highest page in the database (at the end) and moving toward lower-numbered pages. This ensures that page numbers within a file are always contiguous.

The master database contains 50 system tables: the same 19 tables found in user databases, plus 31 tables that are found only in master. Many of these tables—including syslockinfo, sysperfinfo, syscurconfigs, and sysprocesses—don't physically exist in the master database; rather, they're built dynamically each time a user queries them. Twelve of these tables contain information about remote servers; they don't contain any data until you define other servers with which you'll be communicating. The master database also contains 26 system views. Twenty of these are INFORMATION_SCHEMA views; I'll discuss them in Chapter 6. Except for the INFORMATION_SCHEMA views, all these system objects, in both the master and user databases, have names that begin with sys. To see a list of the system tables and views that are found in only the master database, you can execute the following query:

 SELECT type, name FROM master..sysobjects WHERE type IN ('s', 'v') AND name NOT IN (SELECT name FROM model..sysobjects) GO 

The master database also contains nine tables that can be referred to as pseudo-system tables. These table names begin with spt_ and are used as lookup tables by various system procedures, but they aren't true system tables. You should never modify them directly because that might break some of the system procedures. However, deleting them doesn't invalidate the basic integrity of the database, which is what happens if true system tables are altered.

When you create a new database using the CREATE DATABASE statement, it is given a unique database ID, or dbid, and a new row is inserted in the table master..sysdatabases for that database. Only the master database contains a sysdatabases table. Figure 5-4 below depicts a sample sysdatabases table, and Table 5-2 shows its columns.

The rows in sysdatabases are updated when a database's ownership or name is changed or when database options are changed. (I'll discuss this in more detail later in the chapter.)

Table 5-2. Columns of the sysdatabases table.

Column Information
name Name of the database.
dbid Unique database ID; can be reused when the database is dropped.
sid System ID of the database creator.
mode Locking mode; used internally while a database is being created.
status Bit mask that shows whether a database is read-only, off line, designated for use by a single user only, and so on. Some of the bits can be set by a database owner using the ALTER DATABASE command; others are set internally. (The SQL Server documentation shows most of the possible bit-mask values.)
status2 Another bit mask-like status, with bits indicating additional database options.
crdate For user databases, the date when the database was created. For tempdb, this is the date and time that SQL Server was last started. For other system databases, this date is not really useful. Depending on decisions made during installation, it could be the date Microsoft originally created the database prior to shipping the code, or it could be the date that you installed SQL Server.
reserved Reserved for future use.
category Another bit mask-like status. Contains information about whether the database is involved with replication.
cmptlevel Compatibility level for the database. (I'll discuss this concept briefly at the end of the chapter.)
filename Operating system path and name of the primary file.
version Internal version of SQL Server that was used to create the database.

click to view at full size.

Figure 5-4. A partial listing of a sysdatabases table.

Space Allocation

The space in a database is used for storing tables and indexes. The space is managed in units called extents. An extent is made up of eight logically contiguous pages (or 64 KB of space). To make space allocation more efficient, SQL Server 2000 doesn't allocate entire extents to tables with small amounts of data. SQL Server 2000 has two types of extents:

  • Uniform extents These are owned by a single object; all eight pages in the extent can be used only by the owning object.
  • Mixed extents These are shared by up to eight objects.

SQL Server allocates pages for a new table or index from mixed extents. When the table or index grows to eight pages, all future allocations use uniform extents.

When a table or index needs more space, SQL Server needs to find space that's available to be allocated. If the table or index is still less than eight pages total, SQL Server must find a mixed extent with space available. If the table or index is eight pages or larger, SQL Server must find a free uniform extent.

SQL Server uses two special types of pages to record which extents have been allocated and which type of use (mixed or uniform) the extent is available for:

  • Global Allocation Map (GAM) pages These pages record which extents have been allocated for any type of use. A GAM has a bit for each extent in the interval it covers. If the bit is 0, the corresponding extent is in use; if the bit is 1, the extent is free. Since there are almost 8000 bytes, or 64,000 bits, available on the page after the header and other overhead are accounted for, each GAM can cover about 64,000 extents, or almost 4 GB of data. This means that one GAM page exists in a file for every 4 GB of size.
  • Shared Global Allocation Map (SGAM) pages These pages record which extents are currently used as mixed extents and have at least one unused page. Just like a GAM, each SGAM covers about 64,000 extents, or almost 4 GB of data. The SGAM has a bit for each extent in the interval it covers. If the bit is 1, the extent is being used as a mixed extent and has free pages; if the bit is 0, the extent isn't being used as a mixed extent, or it's a mixed extent whose pages are all in use.

Table 5-3 shows the bit patterns that each extent has set in the GAM and SGAM based on its current use.

Table 5-3. Bit settings in GAM and SGAM pages.

Current Use of Extent GAM Bit Setting SGAM Bit Setting
Free, not in use 1 0
Uniform extent, or full mixed extent 0 0
Mixed extent with free pages 0 1

If SQL Server needs to find a new, completely unused extent, it can use any extent with a corresponding bit value of 1 in the GAM page. If it needs to find a mixed extent with available space (one or more free pages), it finds an extent with a value in the GAM of 0 and a value in the SGAM of 1.

SQL Server can quickly locate the GAMs in a file because a GAM is always the third page in any database file (page 2). An SGAM is the fourth page (page 3). Another GAM appears every 511,230 pages after the first GAM on page 2, and another SGAM every 511,230 pages after the first SGAM on page 3. Page 0 in any file is the File Header page, and only one exists per file. Page 1 is a Page Free Space (PFS) page (which I'll discuss shortly). In Chapter 6, I'll say more about how individual pages within a table look. For now, because I'm talking about space allocation, I'll examine how to keep track of which pages belong to which tables.

Index Allocation Map (IAM) pages map the extents in a database file used by a heap or index. Recall from Chapter 3 that a heap is a table without a clustered index. Each heap or index has one or more IAM pages that record all the extents allocated to the object. A heap or index has at least one IAM for each file on which it has extents. A heap or index can have more than one IAM on a file if the range of the extents exceeds the range that an IAM can record.

An IAM contains a small header, eight page-pointer slots, and a set of bits that map a range of extents onto a file. The header has the address of the first extent in the range mapped by the IAM. The eight page-pointer slots might contain pointers to pages belonging to the relevant objects that are contained in mixed extents; only the first IAM for an object has values in these pointers. Once an object takes up more than eight pages, all its extents are uniform extents—which means that an object will never need more than eight pointers to pages in mixed extents. If rows have been deleted from a table, the table can actually use fewer than eight of these pointers. Each bit of the bitmap represents an extent in the range, regardless of whether the extent is allocated to the object owning the IAM. If a bit is on, the relative extent in the range is allocated to the object owning the IAM; if a bit is off, the relative extent isn't allocated to the object owning the IAM.

For example, if the bit pattern in the first byte of the IAM is 1100 0000, the first and second extents in the range covered by the IAM are allocated to the object owning the IAM and extents 3 through 8 aren't allocated to the object owning the IAM.

IAM pages are allocated as needed for each object and are located randomly in the database file. Each IAM covers a possible range of about 512,000 pages. Sysindexes.FirstIAM points to the first IAM page for an object. All the IAM pages for that object are linked in a chain.

NOTE


In a heap, the data pages and the rows within them aren't in any specific order and aren't linked together. The only logical connection between data pages is recorded in the IAM pages.

Once extents have been allocated to an object, SQL Server can use pages in those extents to insert new data. If the data is to be inserted into a B-tree, the location of the new data is based on the ordering within the B-tree. If the data is to be inserted into a heap, the data can be inserted into any available space. PFS pages within a file record whether an individual page has been allocated and the amount of space free on each page. Each PFS page covers 8088 contiguous pages (almost 64 MB). For each page, the PFS has 1 byte recording whether the page is empty, 1-50 percent full, 51-80 percent full, 81-95 percent full, or 96-100 percent full. SQL Server uses this information when it needs to find a page with free space available to hold a newly inserted row. The second page (page 1) of a file is a PFS page, as is every 8088th page thereafter.

There are also two other kinds of special pages within a data file. The seventh page (page 6) is called a DCM (Differential Changed Map) page and keeps track of which extents in a file have been modified since the last full database backup. The eighth page (page 7) of a file is called a BCM (Bulk Changed Map) page and is used when an extent in the file is used in a minimally or bulk-logged operation. I'll tell you more about these two kinds of pages in the "Backing Up and Restoring a Database" section later in this chapter. Like GAM and SGAM pages, DCM and BCM pages have one bit for each extent in the section of the file they represent. They occur at regular intervals, every 511,230 pages.



Inside Microsoft SQL Server 2000
Inside Microsoft SQL Server 2000
ISBN: 0735609985
EAN: 2147483647
Year: 2005
Pages: 179
Authors: Kalen Delaney

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net