Replication Configurations | MCSE Training Guide (70-229). Designing and Implementing Databases with MicrosoftR SQL ServerT 2000 Enterprise Edition

Replication can be set up in a number of ways. The different scenarios in which replication is set up each provide specific benefits and qualities. SQL Server can serve as Publisher, Subscriber, or Distributor. The individual roles can all be set up on a single machine, although in most implementations there is at least a separation between a Publisher/Distributor and the subscribing machine(s). In some other scenarios, the subscriber of the data from one machine may re-publish the data to still other subscribers. The possibilities are endless. You can choose to implement any of the following common scenarios, which cover the basics of a physical replication model. In real-world scenarios, however, the actual physical model used could have any combination of these elements, based on a business's individual needs.

Central Publisher, multiple Subscribers
Multiple Publishers, multiple Subscribers
Multiple Publishers, single Subscriber
Single Publisher, remote Distributor

Central Publisher, Multiple Subscribers

In this form of replication, the data originates at the publishing server, and that original data is sent to multiple Subscribers via the Distributor. Depending on the form of replication used, changes to the data at the destination servers can enable updates to be propagated back to the Publisher and other Subscribers or can be treated as read-only data, where updates occur only at the Publisher. Later, this chapter looks at the specifics of the different forms of replication, because the physical model chosen does not affect the capability of the server to accept updates on the data received.

This type of scenario is typically used when a company has a master catalog at the headquarters and has many different Subscribers located elsewhere. This configuration is shown in Figure 11.2.

Figure 11.2. The Central Publisher, Multiple Subscribers configuration.

graphics/11fig02.gif

For example, a chemical manufacturing company named Fronted Chemicals has headquarters located in Geneva, Switzerland, that keeps a list of the chemicals they produce in a table named Chemicals . This table holds specific information about each chemical such as DangerLevel , UnitPrice , TotalQuantity , and so forth. What they want to do is send this list of chemicals to 25 international sales offices located in North America and Europe. They can use replication based on a multiple Subscribers, with the central Publisher in Geneva sending the chemical list to all offices.

An advantage of this configuration is that multiple copies of the data from a central server are available for user processing on multiple machines. Data can be distributed to the locations where it is needed. With this form of replication, data can be brought closer to the user , and the load on a single server can be reduced. Expensive bandwidth can also be utilized in an improved manner.

The location of the subscribing servers is not really relevant to the model, although variations on the model may be necessary to minimize costs and decrease the time needed to synchronize data at all locations. The Subscribers receiving the data can be within the same building or city where there are a large number of users within a very close proximity who need access to the data. It might be necessary to locate Subscribers throughout separate regions of the country so that the data can be brought closer to the end user, thereby minimizing communications over a WAN link. A different implementation may have a global corporation with Subscribers located around the world.

Although it is a common solution for copying data from one server to many others servers, the single, central Publisher with multiple Subscribers is only one of many basic configurations. The next few sections examine a number of other configurations that are used in some different situations.

Multiple Publishers, Multiple Subscribers

In the Multiple Publishers, Multiple Subscribers scenario, each server involved in the replication process acts as both a Publisher and a Subscriber. In this replication, every server publishes a particular set of rows that relate to it and subscribes to the publications that all the other servers are publishing, so that each of them receive data and send data.

Multiple Publishers, Multiple Subscribers is typically used in a distributed warehouse environment with inventory spread out among different locations, or any other situation in which the data being held at each location-specific server needs to be delivered to the other servers so that each location has a complete set of data.

You might have this type of replication set up in chained public libraries. When a reader comes to request a particular book, the librarian checks the book inventory table and finds it is not available in this local branch. The replicated data she gets from all other Publishers' book inventories is combined with her results to show which library currently holds the desired book. This is not only the case with her library, but all the other libraries in the chain. (This configuration is shown in Figure 11.3.) If data were replicated to only a single library, it would have been a Multiple Publisher, Single Subscriber model, which is what the next section covers.

Figure 11.3. The Multiple Publisher, Multiple Subscribers configuration.

graphics/11fig03.gif

In the figure, the servers are acting as both Publishers and Subscribers of data, although not every server receives data from every other server. In some forms of this type of replication, all servers publish and subscribe to all other servers. For this form of replication, the correct database design is crucial to having each server publish and subscribe to the correct information. The table structure for the data involved in the publication is usually implemented with a compound Primary Key or unique index, although it is possible to use an Identity or other algorithm that enables each location to be uniquely identified within the entire table.

One portion of the key is an identifier for the location, whereas the second element is the data identifier. In the library example, the key would likely be made up of a library location identifier combined with an ISBN number or other distinguishing value that defines the individual book. Each library would publish a subset of the table based on its location identifier while subscribing to each other library's subset.

Another way to involve a compound key structure is to have multiple locations send information to a single subscribing machine. The only location that has a complete copy of all the data is the single subscriber. This is the next form of replication configuration to be discussed.

Multiple Publishers, Single Subscriber

In the Multiple Publishers, Single Subscriber scenario, a server subscribes to publications on some or all of a number of other publishing servers. This is needed when overall data is required at only one site, possibly the headquarters. Consider the following scenario where data need not be replicated to more than one site.

A car distribution company holds a diverse number of cars in stock, and they record the number of cars sold per week into the Sold Table Inventory. This data must be given back to the HQ so that they can make business decisions and send more cars when a certain type goes short. This data does not need to be replicated to other branches because it would be useless to them.

This configuration is shown in Figure 11.4.

Figure 11.4. The Multiple Publisher, Single Subscriber configuration.

graphics/11fig04.gif

Single Publisher, Remote Distributor

Replication does not need to include a Distributor residing on the same server or even within close proximity to the Publisher. Instead, the machine handling the distribution can be implemented as a totally separate segment. This is practical when you need to free the publishing server from having to perform the distribution task and minimize costs that can be incurred over long distance or overseas network connections. Data can also be replicated faster and delivered to many Subscribers at a much lower cost while minimizing the load on the Publisher. This configuration is shown in Figure 11.5.

Figure 11.5. The Single Publisher, Remote Distributor configuration.

graphics/11fig05.gif

EXAM TIP

Slow Connections You may find this type of scenario on the exam: If the network connection is over a slow link or the links between the Publisher and Subscriber are over high-cost connections, then a remote Distributor should be used to lower the cost and increase data transfers rates.

The individual roles of each server are implemented in all types of replication; the physical configuration does not dictate the type of replication used. The next section examines the types of replication and their implementation.

REVIEW BREAK: Publisher/Subscriber Metaphor and Replication Configurations

So far we have learned about the Publisher/Subscriber metaphor and the different replication configurations available. The Publisher/Subscriber metaphor is the basis of all replication processes and defines the data originator (Publisher), data sender (Distributor), and the data receiver (Subscriber). Replication configurations enable the Publisher, Distributor, and Subscriber to be arranged in a diverse number of ways, enough to meet all real-world situations. In any given replication scenario, the publishing server has the copy of the original data.