Tuning for Transactional Replication | Microsoft SQL Server 2000 Administrators Companion

3 4

In this section, you will learn how to configure and tune a transactional replication system for optimal performance. This section begins with a review of the attributes of transactional replication, and then it presents configuration, monitoring, and tuning guidelines.

Attributes of Transactional Replication

Transactional replication starts with the copying of a snapshot to the distributor and then to the subscriber. Once the snapshot has been copied, the Log Reader Agent, which runs on the distributor, reads the publisher's transaction log either on a continual basis or according to a regular schedule. How often the transaction log is read is determined by how you configure the Log Reader Agent. (The overhead incurred on the publisher when the transaction log is being read is the only replication-related overhead the publisher requires.)

The transactions that the Log Reader Agent reads from the transaction log on the publisher are put into the distribution database. These transactions are then eventually sent to the subscribers. Factors that can limit the performance of transactional replication include the following:

I/O performance on the publisher's transaction log The transaction log on the publisher is read in order to determine what changes have been made. Because the transaction log is read as well as written to when replication is used, the sequential access to the transaction log can be disrupted. This can cause a bottleneck. To prevent this, the log must be carefully configured.
Performance of the distributor Depending on how many replication operations are occurring and how many publishers are using the distributor, the distributor might experience performance problems. Earlier in this chapter, you learned how to configure and tune the distributor.
Performance of the subscriber The subscriber can experience performance problems, depending on what activity is occurring on it. To prevent these problems, perform standard tuning operations on the subscriber's SQL Server database.

As you can see, several factors can limit performance. By properly sizing and configuring the systems involved, you can reduce the effects of these factors and ensure efficient performance.

Configuring Transactional Replication

Configuring a transactional replication system involves several tasks. As mentioned in the previous section, you must properly configure the transaction log on the publisher because it experiences additional overhead when replication is used. In this section, we'll look at several other guidelines that you should keep in mind when you configure your transactional replication system. These guidelines are summarized here:

Configure sufficient I/O capacity on all the replication systems, following general guidelines for configuring I/O capacity. (You might need to configure more I/O capacity for the transaction log on the publisher than is normally required.)
Increase the commit batch size on the distributor.
Tune the Log Reader Agent.

Configure Sufficient I/O Capacity

By configuring sufficient I/O capacity, you can enhance the performance of the entire replication process. As on any SQL Server system, the transaction log on a system participating in replication should be located on its own RAID 1 volume for data protection. The data files should be located on one or more RAID 10 or RAID 5 volumes. Unlike snapshot replication, transactional replication requires only minor adjustments to standard I/O configurations. Those requirements are described in this section.

Configuring the I/O Subsystem on the Publisher In general, you should follow the standard SQL Server configuration guidelines described throughout this book when you are configuring the publisher's I/O subsystem. However, you might need to configure more I/O capacity for the publisher's transaction log than is usually required. Normally, the transaction log should be configured on a RAID 1 volume. If necessary (depending on how busy your system is), you should configure it on a RAID 10 volume, using more disk drives. RAID 5 is not appropriate for the transaction log.

Configuring the I/O Subsystem on the Distributor The distributor should be configured such that the distribution database has its transaction log on a dedicated RAID 1 disk volume. This will allow the distribution database's log to achieve maximum performance, thus improving the performance of the distributor.

Configuring the I/O Subsystem on the Subscriber Transactional replication does not require any special I/O configuration on the subscriber. Simply follow the general sizing and configuration guidelines described throughout this book.

Configure the Commit Batch Size on the Distributor

The commit batch size on the distributor determines how many replication transactions are committed on the distributor in a single batch. If you increase the batch size, more rows will be committed at a time, thus increasing the time that the distribution tables are unavailable to other processes. (Locks are held on the tables while the distribution database is being updated.) If you decrease the batch size, fewer rows will be committed at a time, thus giving other processes a chance to access the distribution database.

If the distribution database experiences a large amount of activity that is generated by several sources (that is, by the publisher and several subscribers), try reducing the batch size. If the Log Reader Agent is running on a periodic schedule and has many transactions to insert into the distribution database at once, you might benefit from configuring a larger batch size. You might not need to change the batch size, but if you do, compare the differences between increasing and decreasing the batch size in order to determine which is better.

You can configure the commit batch size in Enterprise Manager by accessing the properties of the Distribution Agent. For more information, see the section "Monitoring and Managing Replication Agents" in this chapter.

Tune the Log Reader Agent

As mentioned, reads to the publisher's transaction log performed by the Log Reader Agent can randomize the otherwise sequential I/O operations of the log. By configuring the Log Reader Agent, you might be able to reduce its effect on the log. There are several ways to make the Log Reader Agent's operations more efficient. One way is to use a caching controller for the log drive volume. Because the Log Reader Agent reads from the log drive, a cache on a controller will allow the read to take place from the cache, rather than cause a random I/O operation to occur.

Another way to tune the Log Reader Agent is to configure it to run less frequently. The Log Reader Agent can run on a continuous basis or periodically. If your system is not experiencing a large number of updates, you might be able to run the Log Reader Agent continuously without disrupting the transaction log. If your system's transaction log is busy, you can improve performance of the publisher by configuring the Log Reader Agent to run less frequently. That way, the Log Reader Agent will not be reading from the transaction log as often, thus allowing the transaction log I/O operations to remain sequential.

Yet another way to make the Log Reader Agent more efficient in heavily used systems is to increase the read batch size. This specifies how many transactions are read from the transaction log and copied to the distributor at a time. In heavily used systems, increasing this parameter can improve performance. In addition, when you increase the polling interval, you might find it useful to increase the batch size. If you increase the read batch size on the Log Reader Agent, you should increase the commit batch size on the distributor to correspond to the new read batch size.

You can configure the Log Reader Agent by accessing its properties in Enterprise Manager. See the section "Monitoring and Managing Replication Agents" for more information.

Monitoring the Transactional Replication System

You monitor transactional replication activities, like you monitor other types of replication activities, via perfmon. Within perfmon are a number of objects that are added when SQL Server replication is used. These objects are as follows:

SQLServer:Replication Agents Provides the number of each type of agent that is running.
SQLServer:Replication Dist. Provides information about distribution latency. Long latencies can be a sign that the distributor is overloaded.
SQLServer:Replication Logreader Provides data about Log Reader Agent activity and latency. Look for long latencies. This can be an indication that a problem exists relating to the Log Reader Agent's reading of the transaction log on the publisher. Also, watch the number of delivered transactions per second. If this number is high, you might need to add more I/O capacity to the transaction log disk volumes.

By using Performance Monitor to monitor these values, you can sometimes determine if the Log Reader Agent or the distributor is experiencing a performance problem. This perfmon data provides a lot of valuable information, but it does not always identify problems.

Tuning the Transactional Replication System

The main steps in tuning the transactional replication system are to properly configure and monitor the system, as described in the preceding sections. In addition, after the system is in production and you can monitor it, you might need to modify the polling interval. The default value of 10 seconds is usually adequate. If you increase the polling frequency (by making the polling interval shorter), transactions will be replicated faster, but overhead on the transaction log will be increased. By decreasing the polling frequency (making the polling interval longer), you will reduce the overhead on the transaction log, but transactions will take longer to replicate.

In addition, if your system experiences frequent updates, you might need to increase the read batch size. As mentioned, this will let the Log Reader Agent read more transactions out of the transaction log at a time. If you increase this value and leave the polling interval at 10 seconds, more transactions will be replicated and less additional overhead will be required.

You also need to monitor the capacity of the network and increase it if necessary, just as you do if you use snapshot replication. If your system does not appear to be performing adequately—for example, if CPUs and I/O subsystems are reaching their capacity and if the replication process seems to be taking too long—you might have a network problem. Unfortunately, network problems cannot be diagnosed via perfmon. A network-monitoring product such as Microsoft Systems Management Server (SMS) should be used. Monitor the network card to see if it's reaching its capacity.

Finally, remember that the publisher, distributor, and subscribers are SQL Server systems. You should therefor tune these systems just as you would tune any other SQL Server system. Tuning guidelines for SQL Server systems are given throughout this book.