Creating a Replication Set | Replicating PostgreSQL Data with Slony

At this point, you've created a replication cluster (a collection of nodes), defined the paths between the nodes, and you have a daemon (slon) servicing each node in the cluster. The replication daemons are exchanging SYNC messages (and configuration messages), but you're not actually replicating any data yet. To replicate a table (or a collection of tables), you must first define a set. A set is a unit of replicationa set can contain one or more tables and one or more sequences. Every set has an origin (a node in the replication cluster).

You create a set by executing a create set command in a slonik script. You add tables to the set by executing a series of set add table commands (likewise, you add a sequence to the set with the set add sequence command). The syntax for these statements is shown here:

create set ( id = integer, origin = node-id, comment = 'description' );

set add table ( set id = set-id, origin = node-id, id = integer,
 fully qualified name = 'schema-name.table-name',
 comment = 'description'
 [, key = {'index-name' | SERIAL}] );

set add sequence( set id = set-id, origin = node-id, id = integer,
 fully qualified name = 'schema-name.sequence-name',
 comment = 'description' );

You must keep a few restrictions in mind when you create a replication set.

First, you cannot add a table (or sequence) to a set once another node has subscribed to that setinstead, you have to create a second set, merge the two sets together, and then re-subscribe all nodes that subscribed to the original. Second, every table that you want to replicate must have a unique identifier. There are three ways that you can convince Slony that a table contains such an identifier. If Slony sees a PRIMARY KEY constraint in the table definition, it's happy. If not, you can include a key=index-name option in the set add table command to tell slonik how to uniquely identify each row in the table. Finally, you can ask slonik to add a unique identifier (a BIGINT column whose default value is defined by the _cluster.sl_row_id_seq sequence). Ideally, you should define a PRIMARY KEY for each table or at least a UNIQUE index defined over a set of non-NULL columns. The Slony developers frown upon those who use slonik's table add key command.

Since we want to replicate the customers and tapes tables, you'll have to ensure that each table contains an acceptable unique identifier. Here's the current definition of the customers table:

movies=# d customers
 Table "springfield.customers" 
---------------+-----------------------+---------
 customer_id | integer | not null
 customer_name | character varying(50) | not null
 phone | character(8) |
 birth_date | date |
 balance | numeric(7,2) |
Indexes:
 "customers_customer_id_key" UNIQUE, btree (customer_id)

You can see that customers already contains an acceptable identifier (the customers_customer_id_key index is a UNIQUE index and it covers a NOT NULL column). You can add the customers table to the replication set without any modifications, but you'll have to tell slonik to use the customers_customer_id_key. The command to do this is

set add table ( set id = 1, origin = 1, id = 1, 
 fully qualified name = 'springfield.customers',
 comment = 'Springfield customers',
 key = 'customers_customer_id_key' );

The tapes table does not contain a unique identifier that slonik would find acceptable, but that's easy to fix. You could create a UNIQUE index that covers the tape_id column (since, in the real world, each tape is uniquely identified by its tape_id), but you may as well create a PRIMARY KEY constraint instead since slonik prefers a PRIMARY KEY. To add a PRIMARY KEY to the tapes table, you would execute the following command:

movies=# ALTER TABLE tapes ADD PRIMARY KEY ( tape_id );
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create
 implicit index "tapes_pkey" for table "tapes"
ALTER TABLE

Now you're ready to complete the replication set. The slonik script shown in Listing 24.7 creates and populates the set.

Listing 24.7. buildSet.sk

#!/usr/local/bin/slonik

# File: buildSet.sk
include ;

create set ( id = 1, origin = @SPRINGFIELD, comment = 'Springfield movies' );

set add table ( set id = 1, origin = @SPRINGFIELD, id = 1,
 fully qualified name = 'springfield.customers',
 comment = 'Springfield customers',
 key = 'customers_customer_id_key' );

set add table ( set id = 1, origin = @SPRINGFIELD, id = 2,
 fully qualified name = 'springfield.tapes',
 comment = 'Springfield tapes' );

Notice that the second set add table command does not specify a keyslonik knows that it can use the PRIMARY KEY column(s) to uniquely identify each row in the tapes table.

Part I: General PostgreSQL Use

Introduction to PostgreSQL and SQL

Working with Data in PostgreSQL

PostgreSQL SQL Syntax and Use

Performance

Part II: Programming with PostgreSQL

Introduction to PostgreSQL Programming

Extending PostgreSQL

PL/pgSQL

The PostgreSQL C APIlibpq

A Simpler C APIlibpgeasy

The New PostgreSQL C++ APIlibpqxx

Embedding SQL Commands in C Programsecpg

Using PostgreSQL from an ODBC Client Application

Using PostgreSQL from a Java Client Application

Using PostgreSQL with Perl

Using PostgreSQL with PHP

Using PostgreSQL with Tcl and Tcl/Tk

Using PostgreSQL with Python

Npgsql: The .NET Data Provider

Other Useful Programming Tools

Part III: PostgreSQL Administration

Introduction to PostgreSQL Administration

PostgreSQL Administration

Internationalization and Localization

Security

Replicating PostgreSQL Data with Slony

Contributed Modules

Index