8.6 Transactions and Locking

In contrast to some other open source databases, PostgreSQL provides a highly developed system for handling transactions. This is one of the features that makes PostgreSQL such a good choice for almost all purposes.

In this section you will learn about PostgreSQL's transaction-handling capabilities, and you will see how transactions can be used efficiently to build even more sophisticated applications and to protect your data.

8.6.1 The Main Idea of Transactions

Imagine a simple example: Your database contains a table with 200 million records in it. One user starts a query that goes through all records and computes the average of a column. Because the table is very large, this will take some time. While the query of this user is running, a second user starts to delete records at the beginning of the table. What will happen to the first user's query? Logically you don't want the second user's changes to affect the first user's query. One way to solve the problem would be to make the second user wait until the first user is ready. This would be a problem because if the query to compute the average takes very long, your system would be down for the time needed to execute the query. This is obviously not the best solution, although some other databases have solved the problem this way.

However, in the case of PostgreSQL, every query sees a consistent snapshot of the data no matter what other users do meanwhile. In PostgreSQL the changes of User 2 will not affect User 1 until the query has been processed successfully. This way the result of the query will always be consistent because it is built on a consistent snapshot of data. Internally things are very complicated because PostgreSQL has to make sure that every user sees the correct set of data.

Another important point when talking about transactions is that various queries can be combined into one block. The advantage of blocking queries is that if one query in the list fails, the changes made by the queries in the list won't be performed. Just think of a program used to import data into the database. If the entire import is performed using just one transaction, PostgreSQL adds the data to the table at the end of the transaction if the import stops after half the records, these records won't be in the database. The advantage of this behavior is that changes are made only if the entire block is successful. This is comfortable when importing data because you don't have to delete those records from the table that have been imported successfully. Things like that can save you a lot of work and help you to build more secure applications because your data is protected by PostgreSQL's behavior.

8.6.2 Starting and Committing Transactions; Rollback

Normally every SQL statement processed by the server is executed as one transaction. To block various SQL statements, transactions must be started and committed manually.

Let's create a table first:

 phpbook=# CREATE TABLE city (name text, country text); CREATE

To start a transaction, use the BEGIN command. In the next listing you can see the syntax overview of the command:

 phpbook=# \h BEGIN Command:     BEGIN Description: start a transaction block Syntax: BEGIN [ WORK | TRANSACTION ]

As you can see, the syntax of the command is easy. Now that you have seen how the COMMIT command works, you can start to insert some data into the table:

 phpbook=# BEGIN; BEGIN phpbook=# INSERT INTO city VALUES ('New York', 'USA'); INSERT 25560 1 phpbook=# INSERT INTO city VALUES ('Paris', 'France'); INSERT 25561 1

To end a transaction, use COMMIT:

 phpbook=# \h COMMIT Command:     COMMIT Description: commit the current transaction Syntax: COMMIT [ WORK | TRANSACTION ]

Let's commit the transaction:

 phpbook=# COMMIT; COMMIT

If no error occurred, the transaction has been terminated successfully. Two records can be found in table city:

 phpbook=# SELECT * FROM city;    name   | country ----------+---------  New York | USA  Paris    | France (2 rows)

In the next step you can write a SQL file containing some data, including the commands for transaction control:

 BEGIN; INSERT INTO city VALUES ('Vienna', 'Austria'); INSERT INTO city VALUES ('Geneve', Switzerland); COMMIT;

If the data is sent to PostgreSQL using the front end, an error will occur:

 [postgres@athlon postgres]$ psql phpbook < data.sql BEGIN INSERT 25563 1 ERROR:  Attribute 'switzerland' not found COMMIT

You have forgotten to add single quotes around Switzerland, so PostgreSQL is not able to parse the INSERT command. If you look at the content of city, you will see that no record has been added to the table:

 phpbook=# SELECT * FROM city;    name   | country ----------+---------  New York | USA  Paris    | France (2 rows)

This is efficient because this way, nothing has happened to the data in the database. If the explicit transaction handling is omitted, things will be treated differently:

 INSERT INTO city VALUES ('Vienna', 'Austria'); INSERT INTO city VALUES ('Geneve', Switzerland);

If the data is added to the table, the same problem will still occur:

 [postgres@athlon postgres]$ psql phpbook < data.sql INSERT 25564 1 ERROR:  Attribute 'switzerland' not found

However, this time one record has been added to the table, so it has to be removed before importing the data again:

 phpbook=# SELECT * FROM city;    name   | country ----------+---------  New York | USA  Paris    | France  Vienna   | Austria (3 rows)

When you are importing data, make sure that the entire data is always imported in just one transaction. This way failures will be corrected by the database automatically and you don't have to do these corrections yourself.

Another important command when dealing with transactions is ROLLBACK. In order to undo a transaction, ROLLBACK is essential. Take a look at an example:

 phpbook=# BEGIN; BEGIN

After starting a transaction, you can add a record to the table:

 phpbook=# INSERT INTO city VALUES ('London', 'UK'); INSERT 25565 1

The record can be found in the table:

 phpbook=# SELECT * FROM city;    name   | country ----------+---------  New York | USA  Paris    | France  Vienna   | Austria  London   | UK (4 rows)

In the next step you want to abort the transaction and cancel all operations. This can be done by using ROLLBACK:

 phpbook=# ROLLBACK; ROLLBACK

If you look at the table now, London has been removed from the table:

 phpbook=# SELECT * FROM city;    name   | country ----------+---------  New York | USA  Paris    | France  Vienna   | Austria (3 rows)

The transaction has been interrupted by ROLLBACK, so all operations that have been performed were canceled as well because all commands in the transaction can be seen as one big command.

If you try to commit the transaction, you will find out that the connection has already been terminated by ROLLBACK:

 phpbook=# COMMIT; NOTICE:  COMMIT: no transaction in progress COMMIT

PostgreSQL does not support nested transactions, which means that it is not possible to run a transaction inside a transaction:

 phpbook=# BEGIN; BEGIN phpbook=# BEGIN; NOTICE:  BEGIN: already a transaction in progress BEGIN

Maybe we will see this feature in future versions of PostgreSQL.

8.6.3 Transaction Isolation

Transaction isolation is important for concurrent transactions. Data can never be seen by a transaction before it has been committed. Therefore some types of reads must not happen:

Dirty reads Data from an uncommitted concurrent transaction must not be read by a transaction.
Non-repeatable reads If a transaction has to read the same piece of data more than once (this can happen in case of complex queries), the data read by the query must not have changed.
Phantom read The changes made by a concurrent connection that is committed must not affect other transactions that are already running.

According to the ANSI SQL specification, four transaction isolation levels have been defined: read uncommitted, read committed, repeatable read, and serializable.

Two of these transaction isolation levels are supported by PostgreSQL: read committed and serializable. The default transaction isolation level is read committed.

Table 8.3 shows the difference between these various levels.

Table 8.3. Transaction Isolation Levels
	Dirty Read	Non-Repeatable	Phantom Read
Read uncommitted	Possible	Possible	Possible
Read committed	Not possible	Possible	Possible
Repeatable read	Not possible	Not possible	Possible
Serializable	Not possible	Not possible	Not possible

As you can see, various transaction isolation levels provide certain levels of security. Usually the default transaction isolation level will satisfy your demands.

Let's see how the transaction isolation level can be changed.

To see which level is currently used, use the SHOW command:

 phpbook=# SHOW TRANSACTION ISOLATION LEVEL; NOTICE:  TRANSACTION ISOLATION LEVEL is READ COMMITTED SHOW VARIABLE

To use SERIALIZABLE as the current transaction isolation level, use the SET command:

 phpbook=# SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; SET VARIABLE

To change the transaction isolation to the default value, use SET as well.

8.6.4 PostgreSQL's Locking System

Locking is an essential part of the internals of every database. Depending on the quality of your database, the way locking is performed varies significantly. In general two basic ways of locking can be distinguished:

Table locking Entire tables are locked in order not to conflict with other operations.
Row-level locking Individual rows are locked instead of tables.

In the case of more than just one concurrent operation, row-level locking has significant advantages over table locking. Because PostgreSQL is a highly developed and high-quality database, the server rarely uses table locks. Especially when many users are performing concurrent operations including long and complex queries, this is much better and will lead to fewer conflicts.

To lock an object, use a command called LOCK. Take a look at the syntax overview of the command:

 phpbook=# \h LOCK Command:     LOCK Description: explicitly lock a table Syntax: LOCK [ TABLE ] name [, ...] LOCK [ TABLE ] name [, ...] IN lockmode MODE where lockmode is one of:         ACCESS SHARE | ROW SHARE | ROW EXCLUSIVE | SHARE UPDATE EXCLUSIVE |         SHARE | SHARE ROW EXCLUSIVE | EXCLUSIVE | ACCESS EXCLUSIVE

In this book the way locking is performed by the database internally will not be covered because this topic is beyond the scope of this book. For further information, we recommend that you read the PostgreSQL Developer's Handbook.

8.6.5 Examples Using Transactions

This section presents some practical examples to show how transactions affect your everyday life with the database. It is necessary to understand what happens in case of concurrent transactions. To reduce the data involved in the process, you can delete the records stored in city:

 phpbook=# DELETE FROM city; DELETE 3

Take a look at some more examples.

Table . Example One
User 1	User 2	Description
BEGIN; INSERT INTO city VALUES ('London', 'UK');		User 1 starts a transaction. User 1 inserts a record.
	SELECT * FROM city;	User 2 sees an empty table.
SELECT * FROM city; COMMIT;		User 1 can see his record.
	SELECT * FROM city;	User 2 can see one record.

First User 1 starts a transaction and inserts one record into the table. While the transaction of User 1 is still in progress, User 2 queries the table. No records will be retrieved because User 2 cannot see uncommitted data. User 1 can see the data he has inserted because he is inside his personal transaction. After the transaction has been committed, User 2 can see the data as well.

Take a look at a slightly more complex example. Remember, one record is already in the table:

Table . Example Two
User 1	User 2	Description
`BEGIN;`		User 1 starts a transaction.
	`BEGIN;`	User 2 starts a transaction.
INSERT INTO city VALUES ('Salzburg', 'Austria');		User 1 inserts a record into the table.
	`DELETE FROM city;`	User 2 deletes all records in the table.
`COMMIT;`		User 1 commits his transaction.
	`COMMIT;`	User 2 commits his transaction.
`SELECT * FROM city;`		User 1 retrieves all records.
	`SELECT * FROM city;`	User 2 retrieves all records.

At the beginning the two users start two transactions and User 1 inserts one record into the table. In the next step User 2 deletes all records in the table. Because User 2 can only see the data that is already in the table, only one record will be deleted (the one containing London). Finally User 1 and User 2 commit the transactions. The question now is: Will Salzburg still be in the database? The first transaction has been committed before the second one, so the record is already in the table when the second transaction is committed. However, Salzburg can be found in the table:

 phpbook=# SELECT * FROM city;    name   | country ----------+---------  Salzburg | Austria (1 row)

As you can see, the way transactions are processed is not always obvious, but it is necessary to understand the basics of how transactions are processed and how your work is affected by them.

A crucial point when dealing with transactions is the way PostgreSQL deals with sequences. Take a look at an example:

Table . Example Three
User 1	User 2	Description
`BEGIN;`		User 1 starts a transaction.
	`BEGIN;`	User 2 starts a transaction.
CREATE SEQUENCE seq_test INCREMENT 1 START 1; SELECT nextval('seq_test');		User 1 creates a sequence. The sequence returns 1.
	`SELECT nextval('seq_test');`	User 2 cannot see the relation.
`COMMIT;`
	`COMMIT;`

Two transactions are started. In the next step User 1 creates a sequence and retrieves a value from the sequence. User 2 cannot see the sequence yet. So far you haven't seen anything unusual, but let's get to the next example:

Table . Example Four
User 1	User 2	Description
`BEGIN;`		User 1 starts a transaction.
	`BEGIN;`	User 2 starts a transaction.
`SELECT nextval('seq_test');`		The sequence returns 2.
	`SELECT nextval('seq_test');`	The sequence returns 3.
	`ROLLBACK;`	User 2 performs a
`ROLLBACK`.
	`SELECT nextval('seq_test');`	The sequence returns 4.
`COMMIT;`		User 1 commits the transaction.
	`COMMIT;`	User 2 commits the transaction.

First, two transactions are started and both users select a value from the sequence. As you can see, the value is incremented. Although both users access the same consistent snapshot of data, the sequence increases. Even in the case of ROLLBACK, the sequence is not set to the old value. This behavior of PostgreSQL is necessary because otherwise it would not be possible to retrieve unique values from the sequence in the case of concurrent transactions. Another important point is that there is no need to wonder if a list of numbers generated by a sequence is not complete things like that can easily happen if you make heavy use of PostgreSQL's transaction code. To point it out more clearly PostgreSQL must behave the way we have shown in order to generate unique values.

8.6.6 Limitations of PostgreSQL's Transaction System

Although PostgreSQL's way of handling transactions is a good one, several restrictions must be taken into consideration when using transactions. In this section we will provide a brief overview of these restrictions.

PostgreSQL currently supports two of four transaction isolation levels defined by the ANSI standard. In most cases, this will be enough.

Oracle users might miss being able to add savepoints to a transaction. Currently savepoints are not yet supported. In addition, PostgreSQL does not supported nested transactions, which means that you cannot start a transaction within a transaction.

8.6.1 The Main Idea of Transactions

8.6.2 Starting and Committing Transactions; Rollback

8.6.3 Transaction Isolation

Table 8.3. Transaction Isolation Levels

8.6.4 PostgreSQL's Locking System

8.6.5 Examples Using Transactions

Table . Example One

Table . Example Two

Table . Example Three

Table . Example Four

8.6.6 Limitations of PostgreSQL's Transaction System