Lesson 2: Understanding the Transaction Log Architecture | MCSE Microsoft SQL Server 2000 System Administration(c) Training Kit (c) Exam 70-228

3 4

Each database contains at least one transaction log file. The transaction log records changes made to a database and stores sufficient information to allow changes to be undone (rolled back) or redone (rolled forward). This lesson covers the architecture of the transaction log, including the organization of transaction log files. This lesson also covers how the transaction log works and how the various recovery models affect the transaction log.

After this lesson, you will be able to

Understand the organization of transaction log files
Understand how the transaction log functions
Understand SQL Server 2000 recovery models
View the properties of transaction log files

Estimated lesson time: 30 minutes

Introducing Transaction Log Files

Each SQL Server 2000 database has at least one transaction log file and can have multiple transaction log files spread across a number of disks. Each transaction log file is a separate operating system file and is used by only one database. Each transaction log file generally has the .LDF filename extension (this extension is not required).

Each transaction log has a logical filename that is used in Transact-SQL statements and a physical filename that is used by the Windows operating system. Additional file properties include the file ID number, initial file size, file growth increment (if any), and maximum file size. Unlike data files that contain pages, transaction log files contain a series of transaction log records. A sequential log sequence number (LSN) identifies each transaction log record. Regardless of the number of physical log files, SQL Server 2000 treats the transaction log as one continuous log.

SQL Server 2000 logically divides each physical transaction log file into a number of virtual log files (VLFs). The number and size of virtual log files are determined dynamically based on the size of each transaction log file. Each transaction log file has at least two VLFs. Each growth increment (if any) is treated as a separate physical file with its own VLFs. The number or size of VLFs cannot be configured or set directly by the database administrator. SQL Server 2000 tries to maintain a small number of virtual file logs because it operates most efficiently with a small number of VLFs.

Note

You should create a transaction log large enough to prevent the need for frequent growth. If automatic growth is required, you should set a reasonable growth increment to prevent many small growth increments because small growth increments will result in many small virtual log files, which can slow down recovery.

When a database is created, the logical transaction log begins at the start of the first physical log file, which is also the beginning of the first virtual log file. The logical transaction log is made up of the portion of the transaction log that is required for recovery and backup. The portion of the transaction log required for recovery and backup will vary with the recovery model chosen. Backup and restore strategies are covered in Chapter 8.

How the Transaction Log Works

SQL Server 2000 uses a buffer cache, which is an in-memory structure, into which it retrieves data pages from disk for use by applications and users. Each modification to a data page is made to the copy of the page in the buffer cache. A modified buffer page in the cache that has not yet been written to disk is called a dirty page. The modification is recorded in the transaction log before it is written to disk. For this reason, the SQL Server 2000 transaction log is called a write-ahead transaction log. SQL Server 2000 has internal logic to ensure that a modification is recorded in the transaction log before the associated dirty page is written to disk.When SQL Server writes the dirty page in the cache to the disk, it is called flushing the page.

A transaction log record contains sufficient information to roll any database modification back or forward if necessary, including any extent allocations or index modifications. This ensures that any modification written to disk (such as a change to a data page or the creation of a new database object) can be rolled back in case the transaction that caused the modification fails to complete for any reason (such as a server failure or a rollback command), or rolled forward in case a completed transaction is not completely written to disk for any reason (such as a server failure).

Note

Because of this rollback capacity, a backup of the transaction log allows a database to be rebuilt when a drive containing a data file fails. The rollback capacity is also the reason that the transaction log file for a database should be on a different drive than the data file.

SQL Server 2000 periodically writes dirty pages to disk from the buffer cache. These writes occur either when a database checkpoint process occurs or when an operating system thread (either an individual worker thread or a lazywriter thread) scans for dirty pages, writes the dirty pages to disk, and then clears space in the buffer cache to hold new data pages. Operating system threads may write dirty pages to disk before SQL Server 2000 knows whether the transaction is complete. However, if a transaction rolls back or never completes, the transaction log ensures that modifications made to disk by transactions that did not complete will be rolled back either via a rollback command or when the server restarts in the case of a server failure.

Checkpoint Process

The checkpoint process is designed to minimize the recovery time if the server fails, by minimizing the number of pages in the buffer cache that have not been written to disk. Checkpoints occur whenever

A CHECKPOINT statement is issued.
The ALTER DATABASE statement is used.
An instance of SQL Server 2000 is stopped normally.
An automatic checkpoint is issued. Automatic checkpoints are generated periodically based on the number of records in the active portion of the transaction log, not on the amount of time that has elapsed since the last checkpoint.

The checkpoint process records the lowest LSN that must be present for a successful rollback of an incomplete transaction. This number is called the minimum LSN (MinLSN). The MinLSN is based on the lowest LSN of the oldest active transaction, the beginning of the oldest replication transaction that has not been replicated yet to all subscribers, or the start of the checkpoint. The portion of the transaction log from the MinLSN to the most recent transaction log record is the active portion of the transaction log and must be present to ensure a successful rollback, if necessary.Whenever SQL Server 2000 starts (either normally or after a failure), a recovery process occurs on each database. The recovery process checks the transaction log for completed transactions that were not written to disk and rolls them forward. It also checks the transaction log for incomplete transactions and makes sure they were not written to disk. If they were written to disk, they are removed from the disk. The MinLSN from the most recent checkpoint identifies the earliest LSN that SQL Server 2000 must look at during this recovery process.

All transaction log records lower than the MinLSN are no longer active (the checkpoint ensures that records older than the MinLSN have been written to disk). To reuse this space, the transaction log records must be truncated (deleted) from the transaction log file. The smallest unit of truncation is an individual VLF file. If any part of a VLF is part of the active log, that VLF cannot be truncated. If the Simple Recovery model is used, the checkpoint process simply truncates each VLF within the inactive portion of the transaction log (allowing these VLFs to be reused). If the Full Recovery or Bulk-Logged Recovery models are used, you must back up the transaction log to truncate the inactive portion of the transaction log. Chapters 8 and 9 cover transaction log backups.

Note

Log truncations must be performed from the parts of the log before the MinLSN and can never be performed on any part of the active log.

Figure 5.3 illustrates the transaction log after a checkpoint has occurred and the inactive portion of the transaction log has been truncated. Notice that the MinLSN is earlier than the LSN of the checkpoint.

The checkpoint process frees space from the physical transaction log file so that the logical log file can reuse space when it reaches the end of the last physical transaction log file. When the end of the logical transaction log reaches the end of the last physical transaction log file, the logical transaction log wraps to the beginning of the first physical file (provided that the first VLF has been truncated). If the first VLF has not been truncated and the transaction log is not set to autogrow (or the drive is out of disk space), SQL Server 2000 cannot continue to function. Figure 5.4 illustrates this wrapping of the logical log to the beginning of the first physical transaction log file.

figure 5.3 - the transaction log after truncation of the inactive portion.

Figure 5.3

The transaction log after truncation of the inactive portion.

figure 5.4 - wrapping of the logical log.

Figure 5.4

Wrapping of the logical log.

Operating System Threads

SQL Server 2000 uses individual worker threads and a lazywriter thread to periodically scan the memory buffer, schedule dirty pages for asynchronous writes to disk, and free inactive buffer pages for reuse. A thread is an operating system component that allows multiple processes to execute using separate asynchronous execution paths. The write of each dirty page is recorded in the transaction log before the write to disk can occur. Individual worker threads are threads from other SQL Server 2000 processes, such as an asynchronous read request by a user. These individual worker threads scan the buffer cache while waiting for their primary task to complete. A SQL Server 2000 lazywriter thread also periodically scans the free buffer list. If the free buffer list is below a certain point (dependent on the size of the cache), the lazywriter thread scans the buffer cache to free buffer space. The term "lazywriter" refers to the fact that this lazywriter thread sleeps for an interval of time, awakes and scans the memory buffer, and then goes back to sleep.

Note

The individual worker threads and the lazywriter thread write most of the dirty pages to disk between checkpoints, and the checkpoint process generally finds few dirty pages to write to disk. The difference between the threads and the checkpoint process is that checkpoints do not place the buffer pages back on the free list. These three processes work together to smooth out the writing of data to disk in order to minimize disk bottlenecks and optimize performance.

Introducing Recovery Models

SQL Server 2000 provides a choice of three recovery models: Simple, Full, and Bulk-Logged. Each database has a recovery model. The model chosen affects the size of the transaction log and the backup and recovery choices. Backup and recovery strategies are covered in Chapter 8.

Full Recovery Model

The Full Recovery model gives you the ability to recover a database to the point of failure or to a specific point in time. When a database uses the Full Recovery model, all operations are fully logged. This includes full logging of all large-scale operations (such as index creation and bulk loading of data using SELECT INTO, Bcp, or BULK INSERT). These large-scale operations frequently use a substantial amount of transaction log space. If you use this recovery model, you must make sure that the transaction log does not run out of space (particularly during a large-scale operation). Regular and frequent backups of the transaction log are required to ensure that the transaction log is regularly truncated to free up space for more records.

Bulk-Logged Recovery Model

When a database uses the Bulk-Logged Recovery model, all operations except certain large-scale operations are fully logged. Index creation and bulk load operations are minimally logged. The transaction log does not record sufficient detail of these large-scale operations to recover in case of a media failure after one of these operations. This helps reduce the amount of transaction log space used, but increases exposure to data loss after these large-scale operations. A full database backup after a large-scale operation is not required for recoverability. With the Bulk-Logged Recovery model, regular backups of the transaction log are still required to truncate the transaction log to free up space for more records, but these backups need to occur less frequently than with the full recovery model.

Note

Point-in-time recovery is not supported in the Bulk-Logged Recovery model.

Simple Recovery Model

When a database uses the Simple Recovery model, all operations are fully logged (including all large-scale operations). However, because this recovery model does not rely on transaction log backups for database recovery, each checkpoint process truncates the inactive portion of the transaction log. This prevents the transaction log from running out of space in most circumstances. However, long-running transactions and unreplicated transactions marked for replication can still cause the transaction log to fill up. This recovery model is rarely used in production databases because the risk of loss of recently written changes is simply too great. When you use the Simple Recovery model, the database can be recovered only to the point of the last backup.

Note

The tempdb system database always uses the Simple Recovery model. The sample user databases, Northwind and pubs, use the Simple Recovery model by default, but this can be changed.

Practice: Viewing the Properties of a Transaction Log and a Database

In this practice you use SQL Server Enterprise Manager to view some of the properties of a transaction log file and a database.

To view the properties of a transaction log file and a database

Ensure that you are logged on to the SelfPacedSQL.MSFT domain controller as Administrator.
Click Start, point to Programs, point to Microsoft SQL Server, and then click Enterprise Manager.
SQL Server Enterprise Manager appears displaying the Microsoft SQL Servers and the Event Viewer (Local) console trees in the console root.
In the console tree, expand the Microsoft SQL Servers container, expand the SQL Server Group container, expand the default instance, and then expand Databases.
In the console tree, right-click Northwind and then click Properties.
The Northwind Properties dialog box appears, with the General tab selected, displaying various properties of the Northwind database.
Click the Transaction Log tab.
The File Name column displays the logical filename of the Northwind database.
Expand the size of the Location column by sliding the column bar to the right.
The name and complete file path of the operating system file are displayed.
Reduce the size of the Location column so you can view the Space Allocated (MB) column.
The current space allocated is displayed. Most systems will display 1 MB. This is the initial size of the Northwind transaction log file (which is rounded to the nearest whole number).
Click the Options tab.
Notice that the Northwind database is using the Simple Recovery model.
Click Cancel to close the Northwind Properties dialog box.
Close SQL Server Enterprise Manager.

Lesson Summary

SQL Server 2000 uses one or more transaction log files to record modifications made by transactions ahead of writing these data modifications to disk. This ensures that a transaction can be rolled forward or rolled back if needed, such as in the case of a server failure. SQL Server 2000 uses the checkpoint process and operating system threads to ensure that data modifications are written to disk. The checkpoint process also identifies the portion of the transaction log that is no longer active. The inactive portion of the transaction cannot be reused until it is truncated. The recovery model in use determines whether the checkpoint process truncates the inactive portion of the transaction log, or a transaction log backup is required to truncate the transaction log. Finally, the recovery model chosen determines the extent of logging for bulk operations, which dictates the frequency of transaction log backups.