Overview of Backup and Restore | Professional SQL Server 2005 Administration (Wrox Professional Guides)

Before you can effectively formulate a backup and restore plan, you'll need to know how backup and recovery works on a mechanical level. SQL Server 2005 has several different backup and recovery processes that you can use, depending on the needs of your organization. In this section, we examine how backup and recovery work and help you choose the best plan for your needs.

How Backup Works

Database backup is a procedure that safeguards your organization's investment to reduce the amount of data loss. A database backup is the process of making a point-in-time copy of the data and transaction log into an image on either disks or tapes. SQL Server 2005 implements versatile backup processes that can be used separately or together to produce the optimal backup strategy required by an organization. Moreover, SQL Server 2005 can perform the database backup while it is online and available to users. Additionally, it supports up to 64 concurrent backup devices. The following types of backup are available:

A full backup is an image copy of all data in the database including the transaction log. Using this backup type, you can restore the database to the point in time when the backup was taken. It is the most basic of the backups, where all of the database files are combined into the image. Therefore, in a restore, all of the database files are restored without any other dependencies, and the database is available.
A partial backup is an image copy of the primary and read/write filegroups. Read-only filegroups can optionally be included in the backup copy. It allows more flexibility to speed backups by providing better manageability for a larger database where large read-only filegroups can be backed up once after been set up as read-only. For example, a large database may have archival data that does not change, so there is no need to back it up every time, reducing the amount of data to back up.
A file backup is an image copy of files or filegroups of a database. This method is typically used for very large databases where it is not feasible to do a full database backup. A transaction log backup is needed with this backup type if the backup includes read/write files or filegroups. The challenge is maintaining the files, filegroups, and transaction-log backups, because larger databases have many files and filegroups. Additionally, it requires more steps to restore the database.
A differential backup is an image copy of all the data that has changed since the last full backup. The SQL Server 2005 backup process identifies each changed extant and backs it up. Restoring from a differential backup requires the full base backup. Each differential is always based on the full backup, not on a previous differential. If the database is very volatile, the differential backup may approach the size of the full backup and may be as slow as SQL Server needs to identify each changed extant.
A full differential backup is a copy of all extants modified for the complete database since the last full backup. Restoring requires the full database backup. Typically, this type of backup is used in combination with the full database backup, where a full backup may be taken every weekend and full differential backups every weekday.
A partial differential backup is a copy of all extents modified since the last partial backup. To restore requires the partial backup.
A file differential backup is a copy of the file or filegroup of all extants modified since the last file or filegroup backup. A transaction-log backup is required after this backup for read/write files or filegroups. Moreover, after the restore, you need to restore the transaction log as well. Choosing to use the file backup and file differential backup methods will increase the complexity of the restore procedures. Furthermore, it may take longer to restore the complete database.

The Transaction Log on SQL Server 2005 is a main component for a transactional relational database system that maintains the ACID properties for transactions, which are: atomicity, consistency, isolation, and durability. SQL Server 2005 and earlier versions implement the write ahead logging (WAL) protocol, which means that the transaction-log records are written to a stable media prior to the data is written to disk and before SQL Server 2005 sends an acknowledgment that the data has been permanently committed. A stable media is usually a physical disk drive but can be any device that guarantees that on restart the data will not been lost. On a Storage Area Network (SAN), which may have a built-in cache, instead of writing the transaction log directly to physical disk drives, in a power failure, the SAN must certify that it implements a battery backup to provide ample time to write all cached IO to physical disk drives before it powers down. Additionally, SAN vendors often mirror their built-in cache for redundancy. The transactional relational database system expects that the data is available on restart, and, if not, it will identify that the database as corrupted because it cannot determine the data consistency of the database.

In addition, when a data modification occurs, SQL Server 2005 generates a new log sequence number (LSN) used on restart to identify the consistency of the data while performing database recovery. Additionally, the LSN is used when restoring the transaction log; SQL Server 2005 uses it to determine the sequences of each transaction log restored. If, for example, a transaction-log backup is not available, that is known as a broken log chain and will prevent a transaction-log recovery past that point. Backing up the transaction log to point-in-time recovery is a critical part of a backup strategy. There are three transaction-log backup types that a DBA can perform:

A pure transaction-log backup is when there have not been any bulk-logged operations performed on the database. That is, every data modification performed is represented in the transaction log. The database recovery model can be in Full or in Bulk-Logged mode, provided that no bulk-logged operation has been performed. This is the most common transaction-log backup type, as it best protects the data and provides the capability to recover to a point in time.
A bulk transaction-log backup is when Bulk-Logged operations have been performed in the database, and therefore point in time recovery is not allowed. To improve performance on bulk operations, that is, to reduce transaction logging, the database can be set in the Bulk-Logged recovery model where only the allocation pages are logged, not the actual data modifications in the transaction log. During a transaction-log backup, SQL Server will extract and include the bulk-logged data inside the transaction-log backup to allow recoverability.
A tail transaction-log backup is a transaction backup when the database has been damaged. For example, if the data files are not accessible but the transaction log files are, you may back up the transaction log to capture the last database modifications. During restore, the tail transaction log can be used to restore the database up to the point of database failure. This cannot be performed if the database is in the Bulk-Logged recovery model and bulk operations have been performed, because the transaction-log backup would need to retrieve the data modifications for the bulk operations from the data files that are not accessible.

Another available backup option is to detach the database or shut down SQL Server and use the OS to copy the database files to a backup device. To backup the database files, you would detach the database like this:

 EXEC MASTER.dbo.sp_detach_db @dbname = N'AdventureWorks', @keepfulltextindexfile=N'TRUE'

To restore, you would attach the database files like this:

 EXEC MASTER.dbo.sp_attach_db @dbname = N'AdventureWorks',  @filename1 = N'F:\MSSQL\Data\AdventureWorks_Data.mdf',   @filename2 = N'F:\MSSQL\Data\AdventureWorks_log.ldf' ;

How Restore Works

Restore is the ability to recover the database in case of a failure and is a major function of a transactional relational database system. To recover the database is to produce a consistent, stable, and accurate copy of the database to a certain point in time. When a DBA restores a database, three restore phases must happen.

In the copy phase, the database image is created and initialized on disk, and then the full backup is copied. That can be followed by any differential and transaction-log backups.

After all the full backup has been applied, and any differential and transaction logs have been restored, the DBA performs a redo phase where all committed transaction records that were in the transaction log but not in the data are applied. The WAL protocol guarantees that the transaction records that were committed have been written to the transaction-log stable media. Then, during the redo, SQL Server evaluates the transaction-log records and applies the data modifications to the data of the database. The duration of the redo phase depends on how many data modifications SQL Server 2005 had performed, which depends on what SQL Server was doing at the time of the failure, and also the recovery interval setting. For example, if SQL Server 2005 just finished updating 10-million rows from a table and committed the transaction but was unexpectedly shut down right after, during recovery it would have to redo those data modifications to the data. The SQL Server 2005 recovery interval setting influences the recovery time by how many dirty pages will be kept in memory before the checkpoint process must write them to stable media. By default, the recovery interval is set to 0, which means that SQL Server will keep less than a minute of work. With that setting, during recovery, there is minimal redo work before the database becomes available for users. The higher the recovery interval value, the longer the recovery may take.

After the redo phase is the undo phase, where any transaction-log records that did not complete are rolled back. Depending on the amount of work and the length of the transactions at the time before shutdown, this phase can take some time. For example, if the DBA was in the middle of deleting 10-million rows, SQL Server 2005 is required to roll back all those rows during recovery. SQL Server 2005 does make the database available to users while in the undo phase, but users should expect some performance impact while in the redo phase.

Recovery Models

Understanding the recovery models is essential to developing an effective backup strategy. The recovery model determines how the transaction log is managed by SQL Server 2005 which in turn determines the recovery options, such as the backup types that can be performed on the database and the data loss exposure.

In the full recovery model, the transaction log records all data modifications, makes available all database recovery options, and implements the highest data protection but uses the most transaction-log space. This recovery model can be used with all database backup operations, has the capability of point-in-time recovery, allows backing up the tail transaction log, and the transaction log is accessible. If the full database backup occurred at 1:00 p.m., a transaction-log backup occurs at 1:30 p.m., and the physical drives containing the data files fail at 2:00 p.m. You will be able to recover up to 2:00 p.m. by performing a tail transaction-log backup. As a result, no data will have been lost. Most OLTP production systems and mission-critical applications that require minimal data loss should be using this recovery model.

The Bulk-Logged recovery model performs minimal logging for certain database operations such as bulk import operations like BCP, Bulk Insert, SELECT INTO, CREATE INDEX, ALTER INDEX REBUILD, and DBCC DBREINDEX. Instead of logging every modification for these database operations, it logs the extant allocations. As a result, these operations will execute faster, as they are minimally logged, but it presents possible data-loss risks for recovery. A transaction-log backup is allowed, and if no bulk-logged operations have been performed, the transaction log contains all data modifications exactly like the Full recovery model. If bulk-logged operations have been performed, the transaction-log backup will contain the data that was not logged. Additionally, after a bulk-logged operation, point-in-time recovery using transaction-log backup is disallowed.

Consider the same case scenario. A bulk-logged database operation has been performed. The full database backup occurred at 1:00 p.m., a transaction log backup occurs at 1:30 p.m. and then the physical drives containing the data files fail at 2:00 p.m. You would not be able to recover up 2:00 p.m., because the transaction-log backup would need to access the data files to retrieve the data modifications performed during the bulk-logged operations. As a result, data will be lost and you can only recover up to 1:30 p.m.

The data loss in this scenario can be minimized with some bulk-logged database operations by implementing shorter transactions and performing transaction-log backups during the bulk-logged operations.

Oftentimes, this recovery model is used when the DBA is performing bulk-logged operations and then switches back to full after the bulk-logged operation completes to improve the performance for bulk operations. Additionally, this model is commonly used in an OLAP or Report database where there are nightly bulk data loads. A backup is taken, and afterward no data is modified during the day, where if the data is lost because of a failure, it can be restored from backup.

The simple recovery model implements minimal logging, just like the bulk-logged recovery model, except that it keeps the transaction-log records until the next checkpoint process that writes the dirty data pages into physical disks. Then the checkpoint process truncates the transaction log. Transaction-log backups are not allowed; therefore, point-in-time recovery is not available. Typically, this recovery model is used for development or test servers, where data loss is acceptable and data can be reloaded. Moreover, this model may be used by an OLAP and Reporting database where there may be only a nightly data load and then a full or differential backup is performed. With this model, if the database were to fail during the data load, you would have to start from the beginning, unless a full or differential backup was taken during the process. Furthermore, if a DBA switches from one of the other recovery models to this one, the transaction-log continuity will be broken, as it will truncate the transaction log. A full or differential backup should be taken. In addition, during the time that the database is in this recovery model, the database is more exposed to potential data loss.

Note

Transactional replication, log shipping, or data mirroring is not allowed in the simple recovery model, as there is no transaction log.

Choosing a Model

The recovery model you choose depends on the amount of acceptable data loss, the database's read and write daily activities, and how critical that database is to the daily business of your organization.

Choose the full recovery model for a mission-critical database to keep data loss to a minimum, because it is fully logged, and in case of damaged data files, the tail transaction log can be backed up and used to restore the database to a point in time. Therefore, OLTP production systems usually use the full recovery model, except when the database is modified nightly, as is sometimes the case with OLAP or Reporting databases.

You can use the bulk-logged recovery model to increase bulk operations' performance because it does minimal logging. For example, you could do a nightly bulk operation and then switch back to full recovery. The bulk-logged model will fully log, as is the case with the full recovery model, except for the bulk operations. Therefore, you could use bulk-logged recovery as a permanent recovery model, except it poses a data risk. As long as there are no bulk-data operations, the DBA can back up the transaction log, but oftentimes unknown to the DBA, the tail transaction-log backup recovery may no longer be available if a bulk operation has been performed. To protect from someone doing bulk operations without a database backup and to reduce that data risk, you should switch to bulk-logged only when a bulk operation needs to be performed. Bulk-logged can be a permanent recovery model in an OLAP or Report database where there is no daily modification activity, as there is a result limited data loss risk if the databases are backed up right after any nightly data load. There is no chance of data loss throughout the day, as nothing would have changed. Also, some data loss may be acceptable, as the OLAP and Reporting databases can be reloaded from the OLTP data source whenever needed.

Simple recovery model acts like the bulk-logged model, except that it does not save the transaction log; instead, the checkpoint process truncates it. Therefore, no one has to maintain the transaction log. This recovery model is commonly used for development, read-only, and test systems where transaction-log backups are not required. If there is data loss, a new copy of the data can be reloaded from the OLTP data source. If the DBA switches to this recovery model from one of the others, the transaction-log continuity is broken, because there is no way to back up the transaction log. In this recovery model, there is no point-in-time recovery, because the DBA cannot back up the transaction log. Therefore, any restore would be from the previous full and any differential backups.

Switching Recovery Models

SQL Server 2005 allows complete flexibility to switch among the recovery models. However, be aware of the limitations when switching among them, as switching can result in data loss during recovery. The following list outlines the limitations of switching recovery models.

Switching from full to bulk-logged: Because bulk-logged database operations may be performed, a transaction-log backup is recommended at a minimum, so that the DBA can recover to this last transaction log if the tail transaction log is not available. To change to this recovery model, use this command:
```
 ALTER DATABASE < db_name> SET RECOVERY MODEL BULK_LOGGED 
```
Switching from full to simple: Because the transaction-log continuity will be broken by this recovery model, a transaction-log backup is recommended, at minimum, before the switch. After the recovery model switch, transaction-log backups and point-in-time recovery are disallowed. To change to this recovery model, use this command:
```
 ALTER DATABASE < db_name> SET RECOVERY MODEL SIMPLE 
```
Switching from bulk-logged to full: Because bulk-logged database operations may have been performed, and to minimize potential data loss if the tail transaction log is not accessible, a transaction-log backup is recommended after the switch. To change to this recovery model, use this command:
```
 ALTER DATABASE < db_name> SET RECOVERY MODEL FULL 
```
Switching from bulk-logged to simple: Because in this recovery model there is a greater chance of data loss in case of a database failure, at a minimum a transaction-log backup is highly recommended before the switch. To change to this recovery model, use this command:
```
 ALTER DATABASE < db_name> SET RECOVERY MODEL SIMPLE 
```
Switching from simple to full: To allow the full recovery model to start to apply transaction-log backups, a full, differential, file, filegroup backup is required after the switch. To change to this recovery model, use this command:
```
 ALTER DATABASE < db_name> SET RECOVERY MODEL FULL 
```
Switching from simple to bulk-logged: Supported. To allow the bulk-logged recovery model to start to apply transaction-log backups, a full, differential, file, or filegroup backup is required after the switch. To change to this recovery model, use this command:
```
 ALTER DATABASE < db_name> SET RECOVERY MODEL BULK_LOGGED 
```

The recovery model is configured for each database. You can also switch the recovery model from the SQL Server Management Studio by opening the Database Properties and choosing Options, as shown in Figure 18-1.

image from book
Figure 18-1

Verifying the Backup Images

With any backup solution, a critical operation is to verify the backup images that they will restore. Oftentimes, a DBA may be meticulously doing backups, but along the way, the database becomes corrupted and every backup from that point on is not useable. Plan on doing periodic restores to verify recoverability. Additionally, perform database consistency checks to validate the database structures. Use the RESTORE VERIFYONLY T-SQL command to perform validation checks on the backup image. It does not restore the backup, but will perform validation checks, including:

Backup set is readable
Page ID
If the backup was created WITH CHECKSUMS, will validate it
Check destination devices for sufficient space

However, the RESTORE VERIFYONLY command will not completely guarantee that the backup is restorable. That is why you need a policy to randomly restore a backup to a test server. RESTORE VERIFYONLY simply provides another level of validation. Here's the syntax:

 RESTORE VERIFYONLY FROM <backup_device_name>

An example resulting message is the following:

 The backup set on file 1 is valid.

For higher reliability to protect from a malfunctioning backup device that may make the whole backup unrecoverable, use mirroring of backup sets for redundancy. They can be either disk or tape and have the following restrictions:

Backup devices must be identical.
To create a new or extend a backup, the mirror backup set must be intact. If one is not present, the media backup set cannot be used.
To restore from a media backup set, only one of the mirror devices must be present.
If one mirror of the media backup set is damaged, no additional mirroring can performed on that media backup set.

For example, to use backup device mirroring on the AdventureWorks database:

 BACKUP DATABASE AdventureWorks TO TAPE = '\\.\tape0', TAPE = '\\.\tape1' MIRROR TO TAPE = '\\.\tape2', TAPE = '\\.\tape3' WITH FORMAT, MEDIANAME = 'AdventureWorksSet1'

Backup History Tables

SQL Server 2005 maintains the backup history for the server in the MSDB database in a group of tables from which it can identify the backup available for a database. In the Restore dialog box, SQL Server will present the restores available for the database. The tables are the follows:

Backupfile: A row for each data or log file backed up
Backupfilegroup: A row for each filegroup in a backup set
Backupmediafamily: A row for each media family
Backupmediaset: A row for each backup media set
Backupset: A row for each backup set

Note

A media set is an ordered collection of all tapes or disks from all devices that took part in the backup. A media family is a collection of all backup media on a single device that took part in the backup. A media backup is a tape or disk device used for backup.

The following three backup information statements return information from the history backup tables:

RESTORE FILISTONLY: Returns a list of database and log files in a backup set from the backup file table:
```
 RESTORE FILELISTONLY FROM AdventureWorks_Backup 
```
RESTORE HEADERONLY: Returns all the backup header information for all the backup sets in a device from the backupset table:
```
 RESTORE HEADERONLY FROM AdventureWorks_Backup 
```
RESTORE LABELONLY: Returns information about the backup media of a backup device from the backupmediaset table:
```
 RESTORE LABELONLY FROM AdventureWorks_Backup 
```

Permissions required for backup and restore

SQL Server provides granular permission for both backup and restoring a database. A Windows or SQL Server authenticated user or group can be given permission to perform the backup and restore operations. To have permission to back up a database, a user must have at minimum the following permissions:

Server role: none
DB role: db_backupoperator

To restore a database, a user must have at minimum the following permissions:

Server role: dbcreater
DB role: db_owner

Backup System Databases

SQL Server 2005 system databases are critical to the operation of each SQL Server instance. These databases are not often modified, but they contain important information that needs to be backed up. After creating a new SQL Server 2005 instance, develop a backup plan to perform a full backup of the system databases, except for tempdb. SQL Server 2005 recreates tempdb every time it is restarted, as it does not contain any data to recover.

Master

The master database contains the login information, metadata about each database for the SQL instance. Moreover, it contains SQL Server configuration information. For example, the database is altered every time you:

Add, remove, or modify a database level setting
Add or delete a user database
Add or remove a file or filegroup in a user database
Add, remove, or modify a login's security
Modify a SQL Server 2005 server-wide configuration
Add or remove a logical backup device
Configure distributed queries or remote procedure calls (RPC).
Add, modify, or remove a linked server or remote login

Although these modifications occur infrequently, when they do, consider doing a full database backup. If a backup is not performed, you stand to lose the modifications if a previous backup of the master is restored. Moreover, as a precautionary measure, before and after any service pack or hotfix, perform a new backup of the master database.

MSDB

The msdb database contains SQL jobs, backup jobs, schedule, operators, backup, and restore history and can contain Integration Services packages and others. If you create a new job or add a new Integration Services package and msdb were to fail, the previous backup would not contain these new jobs and would need to be recreated.

Tempdb

Tempdb cannot be backed up. As it is recreated every time SQL Server 2005 is restarted, there is no data in it that needs to be recovered.

Model

Typically, the model database changes even less frequently than the other system databases. Model is the template database used when a new database is created. If you want a certain database object to be present in every new database, like a stored procedure or table, place it in Model. In these cases, it should be backed up or any Model modifications will be lost and need to be recreated, if not backed up.

Full-text Backup

Full-text search performs fast querying of unstructured data using keywords based on the linguistic verse in a particular language. It is primarily used to search char, varchar, and nvarchar fields. Prior to querying, the full-text index must be created by a population or crawl process, during which full-text search performs a breakdown of the keywords and stores them in the full-text index. Each full-text index is then stored in a full-text catalog. Then a catalog is stored in a filegroup. The full-text indexing and querying is performed by the Microsoft full-text engine SQL Server (MSFTESQL). Unlike previous versions of SQL Server, in SQL Server 2005 a full backup will include the full-text files. Additionally during the backup, the full-text catalog can be queried, but all modification operations are suspended until the backup completes, where they are held by the notification log. In addition to the full database backup and database differential backup, SQL 2005 provides more granular backup functionality where a full-text file or filegroup can be backed up.

For an example of a full-text file backup:

 BACKUP DATABASE Adventureworks FILE = 'FulltextFI' TO Adventureworks_FT

For an example of a full-text filegroup backup:

 BACKUP DATABASE Adventureworks FILEGROUP = 'FulltextFG' TO Adventureworks_FT