Designing a Replication Topology | Microsoft SQL Server 2005: The Complete Reference: Full Coverage of all New and Improved Features

When your databases are dispersed over the Internet or across an extensive corporate WAN that supports far-flung data centers, and you must replicate data, you have to define a replication topology. The topology supports the interconnection of the servers and the replication and synchronized updating of the data that resides on them. Not only must the topology take into consideration how the servers communicate, but it must also cater to the synchronization that has to occur between copies so that data remains consistent and durable across the enterprise.

Designing a replication topology will help you, among other things, determine how long it takes for changes to get from a publisher to a subscriber, how updates are propagated, and the order in which updated information arrives at a subscriber. There are several steps you must take when designing a replication topology:

You will need to select the physical replication model. This can be any one of the following models: a central publisher, a central publisher with remote distributor, a publishing subscriber, or a central subscriber.
You will have to determine where to locate the snapshot files (which are used to create the first loads to the receiving databases). You will also need to determine how the publishers and subscribers will initially synchronize their data.
You will need to decide whether the distributor will be local or remote, and whether the distribution database will be shared. More than one publisher can share a distributor, with each one using its own distribution database on the distributor. They can also share a single distribution database, so you have a lot to consider.
There are many different types of replication options to use. You have had a tiny taste of a few in this chapter, and you will need to decide what options are going to be best for your solution.
You will also need to determine whether the replication will kick off at the publisher, in what is typically called a push subscription, or at the subscriber through (you guessed it) a pull subscription.
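The design steps above can be captured as a simple planning record before any server is configured. The following is a minimal Python sketch, not SQL Server code: the class names, field names, and the snapshot path are all hypothetical, and the checks only flag obvious contradictions between the decisions listed above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PhysicalModel(Enum):
    # The four physical replication models named in the text.
    CENTRAL_PUBLISHER = "central publisher"
    CENTRAL_PUBLISHER_REMOTE_DISTRIBUTOR = "central publisher with remote distributor"
    PUBLISHING_SUBSCRIBER = "publishing subscriber"
    CENTRAL_SUBSCRIBER = "central subscriber"


class SubscriptionType(Enum):
    PUSH = "push"  # replication kicks off at the publisher
    PULL = "pull"  # replication kicks off at the subscriber


@dataclass
class TopologyPlan:
    """Hypothetical planning record; one field per design decision."""
    model: PhysicalModel
    snapshot_location: str          # where the snapshot files will live
    remote_distributor: bool        # False means the distributor is local
    shared_distribution_db: bool    # publishers share one distribution database
    subscription: SubscriptionType
    replication_options: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of decisions that are missing or contradictory."""
        issues = []
        if (self.model is PhysicalModel.CENTRAL_PUBLISHER_REMOTE_DISTRIBUTOR
                and not self.remote_distributor):
            issues.append("model calls for a remote distributor, but the plan says local")
        if not self.snapshot_location.strip():
            issues.append("snapshot file location has not been decided")
        return issues


# Example plan with a made-up snapshot share.
plan = TopologyPlan(
    model=PhysicalModel.CENTRAL_PUBLISHER_REMOTE_DISTRIBUTOR,
    snapshot_location=r"\\example-server\snapshots",
    remote_distributor=True,
    shared_distribution_db=False,
    subscription=SubscriptionType.PULL,
)
print(plan.problems())
```

Writing the decisions down in one place like this makes it easier to review them together, because several of them (the physical model and the distributor location, for example) constrain one another.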

On a WAN, managing many subscribers and publishers can be complex and requires DBAs dedicated to the replication process. Many data paths might exist between the servers, and your job will be to ensure that the data remains synchronized and that subscribers obtain the correct versions of the data. Fortunately, replication technology has come a long way from the day data was updated in the morning and then overwritten again with yesterday’s information in the afternoon.
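To get a feel for how data paths and propagation delay interact, it can help to sketch the topology as a small graph. The Python example below is an illustration only: the server names and per-hop delays are invented, and the function simply finds the fastest route from a publisher to every server it can reach.

```python
from typing import Dict, List, Tuple

# Hypothetical topology: each server maps to the servers it forwards
# changes to, with an assumed per-hop delay in seconds.
LINKS: Dict[str, List[Tuple[str, float]]] = {
    "HQ-Publisher": [("HQ-Distributor", 1.0)],
    "HQ-Distributor": [("Regional-Republisher", 5.0), ("Branch-A", 8.0)],
    "Regional-Republisher": [("Branch-B", 3.0), ("Branch-C", 4.0)],
}


def delivery_paths(
    source: str, links: Dict[str, List[Tuple[str, float]]]
) -> Dict[str, Tuple[List[str], float]]:
    """Return {server: (fastest path, total delay)} for servers reachable from source."""
    results: Dict[str, Tuple[List[str], float]] = {}
    stack = [(source, [source], 0.0)]
    while stack:
        server, path, delay = stack.pop()
        for nxt, hop in links.get(server, []):
            if nxt in path:
                continue  # ignore loops back to a server already on this path
            new_path = path + [nxt]
            new_delay = delay + hop
            # Keep only the fastest known route, and re-explore from it.
            if nxt not in results or new_delay < results[nxt][1]:
                results[nxt] = (new_path, new_delay)
                stack.append((nxt, new_path, new_delay))
    return results


paths = delivery_paths("HQ-Publisher", LINKS)
for server, (path, delay) in sorted(paths.items()):
    print(f"{server}: {' -> '.join(path)} ({delay:.1f}s)")
```

In this made-up layout a change reaches Branch-B through the republishing server, so its delay is the sum of three hops. Real latency depends on agent schedules and network conditions, which a sketch like this does not model.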

Understanding the Physical Replication Models

The physical replication model is your blueprint for how you will allow data to be distributed across the enterprise or the Internet. Understanding the physical model means understanding how to configure the servers for replication services. If you are new to replication, and many people are, the following sections provide a point of departure, a proverbial leg up.

The advice that you cannot be too careful when planning a replication deployment might seem like an understatement, but when you have a highly complex replication model, I cannot stress enough how important it is to plan the whole effort properly. Remember, you have to plan to maximize data consistency, minimize demands on network resources, and implement sound technical services that will prevent a disaster down the road. Many Internet and Web applications today have demanding replication needs, and if there is one business killer, it is finding out after the fact that there is a flaw in the design of your replication topology, the models used, and so on. Note the following considerations before you make any purchases or decisions that may prove expensive to undo later:

Decide how and where replicated data needs to be updated, and by whom.
Decide how your data distribution needs will be affected by issues of consistency, autonomy, and latency.
Draw up a blueprint or architecture map illustrating your replication environment. Include your business users, your technical infrastructure, your network and security, and the characteristics of your data.
Evaluate the types of replication you can use and the replication options that will work for your solution.
Evaluate the replication topology options and the effect they will have on the type or types of replication you may be considering.

Let’s now turn to standby servers.