Disaster Recovery with a Replicated Environment | Microsoft SQL Server 2000 High Availability

In the event of a problem with a SQL Server solution using replication, you need to know how to recover. If this is a temporary outage , as when you are applying an update (like a hotfix ) to each one, then there really is no problem. You need to assess why the servers are unavailable before you reinitialize snapshots or restore servers from backups .

Tip	No matter what model of replication you employ or what architecture you decide on, always script out your replication installations once they are configured and working properly.

To script your replication on an instance of SQL Server, select the Replication folder in Enterprise Manager, and right-click it. Select the Generate SQL Script option. You then see the dialog box shown in Figure 8-4. This should be done on each participating server. Place these scripts in a safe place that is documented in your run book for use in any disaster recovery scenario or ease of recreating in a test environment.

Figure 8-4: Generating SQL Script For Replication dialog box.

Backing Up Replication Databases

Your backup and restore strategy is crucial when you deploy replication. To recover from a disaster, you must back up the databases used in replication. The strategy changes depending on your replication model. If you are using transactional replication with log shipping, it impacts everything. See the later section Log Shipping and Replication for more information.

A comprehensive backup and restore plan for replication must include the Publisher, Distributor, and Subscriber, as well as SQL Server system databases. Good backups are crucial for disaster recovery. You have to test and stress every aspect of the replication environment; otherwise you have no test, meaning you really do not have a replicated environment you can feel confident about. Whatever your eventual plan, its complexity will match that of your replication topology.

Note	This topic is not covered in Chapters 9 or 10, as it is specific to replication and its dependencies. Remember to take this into account when reading Chapters 9 and 10, as this obviously affects an overall backup and restore plan.

Publisher

The Publisher is arguably the most important part of your replication, as it is the source of your data for Subscribers. You should be doing full backups on the publication database, as well as backing up msdb and master databases, because they are integral to the success of replication. Besides full backups, you should also perform differential or transaction log backups and periodically generate SQL scripts from the Publisher database. Here are some common things that influence your Publisher backups; each bullet point represents a change to the Publisher:

Creating new publications
Altering any publication property including filtering
Adding articles to an existing publication
Performing a publication-wide reinitialization of subscriptions
Altering any published table using a replication schema change
Performing on-demand script replication
Cleaning up merge meta data (running sp_mergecleanupmetadata )
Changing any article property, including changing the selected article resolver
Dropping any publications
Dropping any articles
Disabling replication

For snapshot replication, you only need to back up the publication database when changes are made to publications (adding, deleting, or modifying them). With merge replication, because the data can potentially reside anywhere , your backup scheme might or might not be more flexible. You have other ways of synchronizing your data, because Global Subscribers can be used to catch a restored Publisher and include previously merged changes that were not part of the backup.

Distributor

The Distributor is equally as important as the Publisher, as it pushes the data out to the Subscribers. With merge replication, the Distributor does not push data, but it does store history and information. A backup plan for a distribution database must include backing up the distribution database along with master and msdb. Realistically, a recovery strategy would include full as well as transaction log or differential backups because the distribution database can grow large, depending on your model of replication. Here are some common things that influence your Distributor backups; each changes the distributor database:

Creating or modifying replication agent profiles
Modifying replication agent profile parameters
Changing the replication agent properties (including schedules) for any push subscriptions

For snapshot replication, you need to synchronize the backup of this database with the Publisher. That means you need at least two SQL Server Agent jobs executed in parallel. You might also want to run the Distribution Cleanup Task to shorten your backup time and remove unused data from the distribution database prior to backing it up. While you are backing up these databases, do not add new publications or subscriptions.

Using the Sync with Backup Option with Transactional Replication Publisher and Distributor

The sync with backup option affects how often transactional replication sends transactions to the Distributor. This option should always be used when configuring transactional replication, because it is the only way to ensure that the distribution and publication databases can be recovered to the same point in time. The only way you can get to a point in time without using sync with backup is to stop all updates, perform a backup of both the Publisher and Distributor while no activity is occurring, and then allow activity again. Sync with backup is not an option for any other model of replication, nor is it available if your Publisher is another version of SQL Server (such as Microsoft SQL Server 7.0). To set this option, use the following syntax:

 sp_replicationdboption 'publication_db', 'sync with backup', 'true'

This restricts the Log Reader Agent from sending any transactions at the Publisher to the Distributor s distribution database until they have been backed up. Although this creates a bit of latency that will be as frequent as your transaction log backups are made, it ensures that the Distributor will never get ahead of any Subscriber and both will be perfectly in sync. The last backup for the Publisher and Distributor can then be restored with identical transactions. On the distribution database, sync with backup prevents transactions that have not been backed up at the Distributor from being removed from the transaction log of the Publisher until the Distributor has been backed up. Your latency for transactional replication is dependent on the volume of transactions, the time it takes to complete the transaction log backup, and the frequency of the backup.

If you use this option, you need to back up the publication database and distribution databases (usually you would back up the transaction log or make differential backups) frequently because the frequency of backups determines the latency with which replication delivers changes to Subscribers.

If you try to restore the Publisher without setting sync with backup, your Publisher and Subscriber might be out of sync.

Tip	If you want to see what transactions might not have yet been propagated to the distribution database, run DBCC OPENTRAN, and look for Oldest Distributed and Oldest Non-Distributed transactions. If these parameters exist, you are more than likely out of sync.

Subscriber

For a Subscriber, you should back up the subscription database. If you are using pull subscriptions, you should also back up msdb and master (if you are not doing so already) for replication purposes.

When using merge replication or transactional replication, restoration of a qualified Subscriber backup allows replication to continue normally. A nonqualified backup requires a reinitialization of the subscription after the backup is restored. For a merge backup to be qualified, it must be a backup from within the meta data cleanup interval. For a transactional replication backup to be a qualified backup, the publication must have the immediate_sync property set to true and the backup must be taken within the transaction retention period of the Distributor.

Note	When setting the transaction retention period or meta data retention periods, you must consider the space and performance trade-offs.

To set the transaction retention period, follow these steps:

Select the Replication folder in SQL Server Enterprise Manager or a database participating in replication, and right-click it. Select the Configure Publishing, Subscribers, And Distribution option.
When the Publisher And Distributor Properties dialog box appears (shown in Figure 8-5), make sure the Distributor tab is selected.

Figure 8-5: Publisher And Distributor Properties dialog box.
Select the correct distribution database, and click Properties.
Modify the transaction retention setting to the period for which you want to have a transaction stored in the distribution database. Remember that this has an impact on the size of your database, but it also ensures recoverability.

Note
If you want to modify the retention of the history of your replication, you can also do that here.
Click OK when you are done. Click Apply, and finally click OK to close the Publisher And Distributor Properties dialog box.
Modify any scheduled SQL Agent backup jobs configured on the database that are used to back it up so that you can recover from this scenario.

If you configure this, it guarantees that when you restore a Subscriber s database, all the transactions necessary for the Subscriber to catch up will still be available in the distribution database. After you restore the Subscriber, the Distribution Agent delivers any transactions the Subscriber is missing.

To set the merge meta data retention period, execute this stored procedure on the publication database:

 sp_changepublication @publication= 'publication_name', @property= 'retention', @value=  n

where n is number of hours, value for the retention property.

System Databases

Msdb contains all of the job definitions for the replication agents . With transactional replication, if you are using pull subscriptions, you have to back up this database on the Distributor and Subscriber if you add or drop a subscription or make any changes to the replication agent. If you are using Data Transformation Services (DTS) packages as part of transactional replication, you also need to back up msdb. Otherwise, you have to ensure that the DTS packages are saved to the file system and easily restorable. Here are some common actions that influence your msdb backups in relation to replication:

Enabling or disabling replication
Adding or dropping a distribution database (at the Distributor)
Enabling or disabling a database for publishing (at the Publisher)
Creating or modifying replication agent profiles (at the Distributor)
Modifying any replication agent profile parameters (at the Distributor)
Changing the replication agent properties (including schedules) for any push subscriptions (at the Distributor)
Changing the replication agent properties (including schedules) for any pull subscriptions (at the Subscriber)

Each time you add a new Subscriber, the sysservers table is updated in master, so you have to back up master on your servers after you add a new Publisher or Subscriber. Here are some common actions that influence your master backups in relation to replication:

Enabling or disabling replication
Adding or dropping a distribution database (at the Distributor)
Enabling or disabling a database for publishing (at the Publisher)
Adding the first or dropping the last publication in any database (at the Publisher)
Adding the first or dropping the last subscription in any database (at the Subscriber)
Enabling or disabling a Publisher at a Distribution Publisher (at the Publisher and Distributor)
Enabling or disabling a Subscriber at a Distribution Publisher (at the Subscriber and Distributor)

Important
Remember that you cannot restore system databases onto another SQL Server instance if it has another name .

Disaster Recovery Restore Scenarios

Table 8-4 lists scenarios that you must consider if you ever need to restore your replication solution.

Table 8-4: Replication Disaster Recovery Scenarios
Scenario	Publisher	Distributor	Subscriber
1	Up	Up	Down
2	Up	Down	Up
3	Up	Down	Down
4	Down	Up	Up
5	Down	Up	Down
6	Down	Down	Up
7	Down	Down	Down

Scenario 1

Scenario 1 is the least problematic in terms of overall replication availability. Although users accessing the data from the Subscriber are affected, the processes driving replication are still intact. This does become a problem, however, is if this database also publishes data elsewhere. This underscores the point that the more complex your replication configuration, the more it affects any disaster recovery planning that involves replication.

When you restore the Subscriber database to the last good backup, assuming the database backup is made after replication was configured and contains the replication tables, you might not have to reinitialize the subscription. If possible, the subscription resynchronizes the data without requiring a reinitialization; however, you need to reinitialize the subscription if the data required to bring the Subscriber up to date has been cleaned up.

Note

This scenario obviously works with merge and transactional replication. Snapshot replication is a complete snapshot of your data (there is no way to do a differential snapshot like a differential backup), so modifying your replication settings if you are sending out all of your data each time makes no sense.

Scenarios 2 and 4

If you are using transactional replication and specify sync with backup on both the Publisher and Distributor, you can restore the Publisher or Distributor to the same point. If you are using merge replication and want to restore a Publisher, you have a few options. You could reinitialize all subscriptions with the restored publication database, but you might lose data at a Subscriber that was updated. You could also synchronize the Publisher with another database so it is fully in sync. If you choose the latter option, you must synchronize with a Subscriber that has a global subscription. If you use a Subscriber with anonymous subscriptions, it will not have enough meta data to apply the changes to the production database.

Scenario 3

If you lose both the Distributor and the Subscriber databases at the same time, the procedures outlined for Scenario 1 and Scenario 2 apply in most cases. Merge replication and queued are exceptions.

Scenario 5

If you lose both the publication and the subscription database at the same time, the procedures outlined for Scenarios 1 and 3 apply.

Scenario 6

This can happen if the Distributor is local (that is, the Publisher and Distributor are running on the same server). In this case, the Publisher and Distributor databases can be restored as outlined in Scenarios 2 and 4.

Scenario 7

This is the absolute worst-case scenario, in which every participating server is down for something other than a temporary reason. However, if the servers are genuinely damaged (for example, the datacenter burned down), you will be relying on the restore steps outlined for Scenarios 1, 2, and 4 to get your replication environment up and running again.