Introduction to Transactional Replication | Microsoft SQL Server 2000 Administrators Companion

3 4

As indicated by its name, transactional replication is used to replicate transactions. When this replication method is used, the initial subscriber database is created by means of a snapshot, and then subsequent changes to the publisher database are replicated to the subscriber on a per-transaction basis. All database updates, insertions, and deletions are propagated in this manner.

Transactions are read from the transaction log on the publisher by the Log Reader Agent, which runs on the distributor and connects to the publisher. Those transactions are processed and put into the distribution database on the distributor. At a later time, the Distribution Agents read this information out of the distribution database and apply the transactions to the subscriber's database.

Each database that uses transactional replication has its own Log Reader Agent that runs on the distributor and monitors that database's transaction log on the publisher. Only one Log Reader Agent per database is used, regardless of the number of publications that are defined on a database.

The Log Reader Agent runs on the distributor on a schedule that is determined at the time that the publication is created. The Log Reader Agent will run either on a regularly scheduled basis or continuously, whichever you choose. When the Log Reader Agent runs, the sequence of events is as follows:

The Log Reader Agent reads the publication system's transaction log and creates a list of any INSERT, UPDATE, or DELETE statements, and other modifications to the data that have been marked for replication, including table drops, column definition changes, and so on.
The Log Reader Agent processes the data from the transaction log and performs any filtering that has been defined on the articles. This includes all horizontal and vertical filtering.
These modifications are batched and sent to the distribution database on the distributor. Within the distribution database there are multiple tables that keep track of replication changes and tasks. The modifications to the publisher that must be propagated to the subscribers are kept in a table called MSRepl_commands. This table holds the actual replication commands in a compressed format. The MSRepl_commands table contains one row for each insert, update, and delete for each article that has been defined. If a modification is made to a table on the publishing database that is contained in multiple articles, that change will be duplicated in the distribution database. For example, if table A is contained in three articles, an update to A will cause three rows to be created in the distribution database.
After each batch has been successfully sent to the distribution database, the transactions in the batch are committed. If the commit fails, an error message will be written to the agent error log.
Following the successful commit of the changes to the distribution database, the Log Reader Agent marks the last change included in the most recent replication operation so that changes are not repeated.
Once the transaction has been read from the transaction log and committed into the distribution database, the Log Reader Agent marks those rows in the transaction log as eligible to be truncated.

Every modification in the publication database will create at least one entry in the distribution database. In some cases, a modification in the publication database will cause multiple entries to be created in the distribution database. Those cases are described here:

An insert into a table will create an insert into the distribution database for every article that the table is a member of. If a table exists in two different publications, it will be defined as two separate articles. Both articles will have a row in the distribution database for each insert, update, and delete from the publication database.
An update or delete that affects multiple rows will create a row in the distribution database for each row affected. The SQL statement that performs an update to or a delete of multiple rows will cause the Log Reader Agent to create an individual command in the distribution database for each row that is affected. The WHERE clause in the SQL statement is converted to a WHERE clause that specifies a row in the database based on the primary key value. For example, an update that affects all the rows in a 10-row table will create 10 entries in the distribution database, each specifying the primary key values in the WHERE clause.