Changes in Log Size | Inside Microsoft SQL Server 2005: The Storage Engine (Solid Quality Learning)

No matter how many physical files have been defined for the transaction log, SQL Server always treats the log as one contiguous stream. For example, when the DBCC SHRINKDATABASE command (discussed in Chapter 4) determines how much the log can be shrunk, it does not consider the log files separately but instead determines the shrinkable size based on the entire log.

Virtual Log Files

The transaction log for any database is managed as a set of virtual log files (VLFs) whose size is determined internally by SQL Server based on the total size of all the log files and the growth increment used when enlarging the log. When a log file is first created, it always has between 4 and 16 VLFs. A log always grows in units of entire VLFs and can be shrunk only to a VLF boundary. (Figure 5-3 illustrates a physical log file along with several VLFs.)

Figure 5-3. Multiple VLFs that make up a physical log file

A VLF can be in one of four states:

Active The active portion of the log begins at the minimum LSN representing an active (uncommitted) transaction. The active portion of the log ends at the last LSN written. Any VLFs that contain any part of the active log are considered active VLFs. (Unused space in the physical log is not part of any VLF.) Figure 5-3 contains 04 active VLFs.
Recoverable The portion of the log preceding the oldest active transaction is needed only to maintain a sequence of log backups for restoring the database to a former state.
Reusable If transaction log backups are not being maintained or if you have already backed up the log, VLFs before the oldest active transaction are not needed and can be reused. Truncating or backing up the transaction log will change recoverable VLFs into reusable VLFs.
Unused One or more VLFs at the physical end of the log files might not have been used yet if not enough logged activity has taken place or if earlier VLFs have been marked as reusable and then reused.

Observing Virtual Log Files

You can observe the same key properties of virtual log files by executing the undocumented command DBCC LOGINFO. This command takes no parameters, so it must be run in the database for which you want information. It returns one row for each VLF. When I run this command in a newly created pubs database, I get the following three rows returned:

FileId	FileSize	StartOffset	FSeqNo	Status
2	253952	8192	18	0
2	253952	262144	19	0
2	270336	516096	20	2

The number of rows tells me how many VLFs are in my database. The results actually include two additional columns, but I have omitted them here because I will not be discussing them. The FileID column indicates which of the log's physical files contains the VLF; for my pubs database, there is only one physical log file. FileSize and StartOffset are indicated in bytes, so you can see that the first VLF starts after 8192 bytes, which is the number of bytes in a page. The first physical page of a log file contains header information and not log records, so the VLF is considered to start on the second page. The FileSize column is actually redundant for most rows because the size value can be computed by subtracting the StartOffset values for two successive VLFs. The rows are listed in physical order, but that is not always the order in which the VLFs have been used. The use order (logical order) is reflected in the column called FSeqNo (which stands for File Sequence Number). For example, if I run the DBCC LOGINFO command in my AdventureWorks database, the following 10 rows are returned:

FileId	FileSize	StartOffset	FSeqNo	Status
2	458752	8192	4994	0
2	458752	466944	4995	0
2	458752	925696	4996	0
2	712704	1384448	4997	0
2	4194304	2097152	4998	2
2	4194304	6291456	4993	0
2	4194304	10485760	4989	0
2	4194304	14680064	4990	0
2	4194304	18874368	4991	0
2	4194304	23068672	4992	0

Again, you can see that the rows are listed in physical order according to the StartOffset, but the logical order does not match. The FSeqNo values indicate that the seventh VLF is actually the first one in use (logical) order; the last one in use order is the fifth VLF in physical order. The status column indicates whether the VLF is reusable. A status of 2 means that it is either active or recoverable; a status of 0 indicates that it is reusable or completely unused. As I mentioned earlier, truncating or backing up the transaction log changes recoverable VLFs into reusable VLFs, so a status of 2 will change to a status of 0 for all VLFs that don't include active log records. In fact, that's one way to tell which VLFs are active: the VLFs that still have a status of 2 after a log backup or truncation must contain records from active transactions. VLFs with a status of 0 can be reused for new log records, and the log will not need to grow to keep track of the activity in the database. On the other hand, if all the VLFs in the log have a status of 2, SQL Server will need to add new VLFs to the log to record new transaction activity.

Multiple Log Files

I mentioned earlier that SQL Server treats multiple physical log files as if they were one sequential stream. This means that all the VLFs in one physical file are used before any VLFs in the second file are used. If you have a well-managed log that is regularly backed up or truncated, you might never use any log files other than the first one. If one of the VLFs in multiple physical log files is available for reuse when a new VLF is needed, SQL Server will add new VLFs to each physical log file in a round-robin fashion. There is really no reason to need multiple physical log files if you have done thorough testing and have determined the optimal size of your database's transaction log. However, if you find that the log needs to grow more than expected (if the volume containing the log does not have sufficient free space to allow the log to grow enough), you might need to create a second log file on another volume.

Automatic Truncation of Virtual Log Files

SQL Server will assume you're not maintaining a sequence of log backups if any of the following is true:

You have truncated the log using BACKUP LOG WITH NO_LOG or BACKUP LOG WITH TRUNCATE_ONLY.
You have set the database to truncate the log on a regular basis by setting the recovery model to SIMPLE.
You have never taken a full database backup.

Under any of these circumstances, SQL Server will truncate the database's transaction log every time it gets "full enough." (I'll explain this in a moment.) The database is considered to be in autotruncate mode.

Remember that truncation means that all log records prior to the oldest active transaction are invalidated, and all VLFs not containing any part of the active log are marked as reusable. It does not imply shrinking of the physical log file. In addition, if your database is a publisher in a replication scenario, the oldest open transaction could be a transaction marked for replication that has not yet been replicated.

"Full enough" means that there are more log records than can be redone during system startup in a reasonable amount of timethe recovery interval. You can manually change the recovery interval by using the sp_configure stored procedure or by using SQL Server Management Studio, as discussed in Chapter 3. However, it is best to let SQL Server autotune this value. In most cases, this recovery interval value is set to 1 minute. By default, sp_configure shows zero minutes by default, meaning SQL Server will autotune the value. SQL Server bases its recovery interval on the estimate that 10 MB worth of transactions can be recovered in 1 minute.

The actual log truncation is invoked by the checkpoint process, which is usually sleeping and is woken up only on demand. Each time a user thread calls the log manager, the log manager checks the size of the log. If the size exceeds the amount of work that can be recovered during the recovery interval, the checkpoint thread is woken up. The checkpoint thread checkpoints the database and then truncates the inactive portion.

In addition, if the log ever gets to 70 percent full, the log manager wakes up the checkpoint thread to force a checkpoint. Growing the log is much more expensive than truncating it, so SQL Server truncates the log whenever it can.

Note

If the log manager is never needed, the checkpoint process won't be invoked and the truncation will never happen. If you have a database in autotruncate mode, for which the transaction log has VLFs with a status of 2, you will not see the status change to 0 until some logging activity is required in the database.

If the log is regularly truncated, SQL Server can reuse space in the physical file by cycling back to an earlier VLF when it reaches the end of the physical log file. In effect, SQL Server recycles the space in the log file that is no longer needed for recovery or backup purposes. My AdventureWorks database is in this state because I have never taken a full database backup.

Maintaining a Recoverable Log

If a log backup sequence is being maintained, the part of the log before the minimum LSN cannot be overwritten until those log records have actually been backed up. The VLF status will stay at 2 until the log backup occurs. After the log backup, the status will change to 0 and SQL Server can cycle back to the beginning of the file. Figure 5-4 depicts this cycle in a simplified fashion. As you can see from the FSeqNo values in the output from the earlier AdventureWorks database, SQL Server does not always reuse the log files in order of their physical sequence.

Figure 5-4. The active portion of the log circling back to the beginning of the physical log file

Note

If a database is not in autotruncate mode and you are not performing regular log backups, your transaction log will never be truncated. If you are doing only full database backups, you must manually truncate the log to keep it at a manageable size.

The easiest way to tell whether a database is in autotruncate mode is by using the catalog view called sys.database_recovery_status and looking in the column called last_log_backup_lsn. If that column has a value of 0, the database is in autotruncate mode.

You can actually observe the difference between a database in autotruncate mode and a database that isn't by running a simple script in the pubs database. This script will work as long as you have never made a full backup of the pubs database. If you have never made any modifications to pubs, the size of its transaction log file will be just about 0.75 MB, which is the size at creation. The following script creates a new table in the pubs database, inserts three records, and then updates those records 1000 times. Each update is an individual transaction, and each one is written to the transaction log. However, you should note that the log does not grow at all, and the number of VLFs does not increase even after 3000 update records are written. (If you've already taken a backup of pubs, you might want to re-create the database before trying this example. You can do that by running the script instpubs.sql from the companion CD, which I mentioned in Chapter 4.) However, even though the number of VLFs does not change, you will see that the FSeqNo values changes. Log records are being generated, and as each VLF is reused, it gets a new FSeqNo value.

USE pubs -- First look at the VLFs for the pubs database DBCC LOGINFO -- Now verify that pubs is in auto truncate mode SELECT last_log_backup_lsn FROM master.sys.database_recovery_status WHERE database_id = db_id('pubs') GO CREATE TABLE newtable (a int) GO INSERT INTO newtable VALUES (10) INSERT INTO newtable VALUES (20) INSERT INTO newtable VALUES (30) GO DECLARE @counter int SET @counter = 1 WHILE @counter < 1000 BEGIN     UPDATE newtable SET a = a + 1     SET @counter = @counter + 1 END

Now make a backup of the pubs database after making sure that the database is not in the SIMPLE recovery mode. I'll discuss recovery models later in this chapter, but for now you can just make sure that pubs is in the appropriate recovery mode by executing the following command:

ALTER DATABASE pubs SET RECOVERY FULL

You can use the following statement to make the backup, substituting the path shown with the path to your SQL Server installation:

BACKUP DATABASE pubs to disk =     'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\backup\pubs.bak'

As soon as you take the full backup, you can verify that the database is not in autotruncate mode, again by looking at the database_recovery_status view:

SELECT last_log_backup_lsn FROM master.sys.database_recovery_status WHERE database_id = db_id('pubs')

This time, you should get a non-null value for last_log_backup_lsn to indicate that log backups are expected. Next, run the update script again, starting with the DECLARE statement. You should see that the physical log file has grown to accommodate the log records added and that there are more VLFs. The initial space in the log could not be reused because SQL Server assumed that you were saving that information for transaction log backups.

Now you can try to shrink the log back down again. If you issue the following command or if you issue the DBCC SHRINKFILE command for the log file, SQL Server will mark a shrinkpoint in the log, but no actual shrinking will take place until log records are freed by either backing up or truncating the log.

DBCC SHRINKDATABASE (pubs)

You can truncate the log in one of two ways. One way is to change the database's recovery mode to SIMPLE, as we'll see in the next section. Alternatively, you can run this command:

BACKUP LOG pubs WITH TRUNCATE_ONLY

At this point, you should notice that the physical size of the log file has been reduced. If a log is truncated without any prior shrink command issued, SQL Server marks the space used by the truncated records as available for reuse but does not change the size of the physical file.

In SQL Server 7.0, where this log architecture was first introduced, running the preceding commands exactly as specified did not always shrink the physical log file. When the log file did not shrink, it was because the active part of the log was located at the end of the physical file. Physical shrinking can take place only from the end of the log, and the active portion is never shrinkable. To remedy this situation, you had to enter some dummy transactions after truncating the log to force the active part of the log to move around to the beginning of the file. In SQL Server 2000 and SQL Server 2005, this process is unnecessary. If a shrink command has already been issued, truncating the log internally generates a series of NO-OP (or dummy) log records that force the active log to move from the physical end of the file. Shrinking happens as soon as the log is no longer needed.

Automatic Shrinking of the Log

Remember, truncating is not shrinking. A database should be truncated so that it is most shrinkable, and if the log is in autotruncate mode and the autoshrink option is set, the log will be physically shrunk at regular intervals.

If a database has the autoshrink option on, an autoshrink process kicks in every 30 minutes (as discussed in Chapter 4) and determines the size to which the log should be shrunk. The log manager accumulates statistics on the maximum amount of log space used in the 30-minute interval between shrinks. The autoshrink process marks the shrinkpoint of the log as 125 percent of the maximum log space used or the minimum size of the log, whichever is larger. (Minimum size is the creation size of the log or the size to which it has been manually increased or decreased.) The log then shrinks to that size whenever it gets the chance, which is when it gets truncated or backed up. It's possible to have autoshrink without having the database in autotruncate mode, although there's no way to guarantee that the log will actually shrink. For example, if the log is never backed up, none of the VLFs will be marked as reusable, so no shrinking can take place.

As a final note, you need to be aware that just because a database is in autotruncate mode, there is no guarantee that it won't grow. (It is the converse that you can be sure ofthat a database not in autotruncate mode will grow.) Autotruncate means only that VLFs that are considered recoverable will be marked as reusable at regular intervals. But VLFs in an active state will not be affected. If you have a long-running transaction (which might be a transaction that someone forgot to commit), all the VLFs that contain any log records since that long-running transaction started are considered active and can never be reused. One uncommitted transaction can mean the difference between a very manageable transaction log size and a log that uses more disk space than the database itself and continues to grow.

Log File Size

You can see the current size of the log file for all databases, as well as the percentage of the log file space that has been used, by running the DBCC command DBCC SQLPERF('logspace'). However, because it is a DBCC command, it's hard to filter the rows to get just the rows for a single database. In addition, if you want to create a stored procedure to return the log size, it can be tricky to manage the permissions mechanisms to allow someone who normally doesn't have permission to run the DBCC command. Instead, you can use the catalog view sys.master_files, which returns rows for all files in all databases or from one particular database, or you can use the view sys.database_files, which returns the rows from sys.master_files for the current database. If you filter on type = 1, you'll get just the rows for the log files.

The following code, supplied by Cliff Dibble from Microsoft, returns the same information as DBCC SQLPERF('logspace') in a tabular format that can be further filtered or easily embedded into a stored procedure, table-valued function, or view:

  SELECT rtrim(pc1.instance_name) AS [Database Name]    ,      pc1.cntr_value/1024.0 AS [Log Size (MB)]    ,      cast(pc2.cntr_value*100.0/pc1.cntr_value as dec(5,2))            as [Log Space Used (%)]    FROM    sys.dm_os_performance_counters as pc1    JOIN    sys.dm_os_performance_counters as pc2    ON      pc1.instance_name = pc2.instance_name    WHERE   pc1.object_name LIKE '%Databases%'    AND     pc2.object_name LIKE '%Databases%'    AND     pc1.counter_name = 'Log File(s) Size (KB)'    AND     pc2.counter_name = 'Log File(s) Used Size (KB)'    AND     pc1.instance_name not in ('_Total', 'mssqlsystemresource')    AND     pc1.cntr_value > 0 go

The final condition is needed to filter out databases that have no log file size reported. This includes any database that is unavailable because it has not been recovered or is in a suspect state, as well as any database snapshots, which have no transaction log.