I l @ ve RuBoard |
PostgreSQL makes use of Multiversion Concurrency Control (MVCC) to maintain data consistency. Understanding, at least in concept, how MVCC works can be beneficial to a PostgreSQL administrator or developer. Most popular RDBMSs make use of table or row locks to maintain database consistency. Typically, these locks occur at the physical level of the file. These locks are used to prohibit two or more instances from writing to the same row (or table) concurrently. PostgreSQL uses a more advanced method for ensuring database integrity. In MVCC, each transaction sees a version of the database as it existed at some near point in the past. It is import to distinguish that the transaction is not seeing the actual data, just a previous version of that data. This prevents the current transaction from having to deal with the database arriving at an inconsistent state due to other concurrent database transactions. In essence, once a transaction is started, that transaction is an island unto itself. The underlying data structures are isolated from other transactions' manipulations. Once that transaction has ended, the changes it made to its specific version of the database are merged back into the actual data structures. There are three types of concurrency problems that any RDBMS has to deal with:
PostgreSQL provides for READ COMMITTED and SERIALIZABLE isolation levels of protection that offer the following:
READ COMMITTED LevelThis is the default method of isolation protection in PostgreSQL. READ COMMITTED level prevents queries from seeing data changes after the transaction has been started. However, the transaction will see previous changes it made to the table while in process. The crucial point in understanding READ COMMITTED isolation is what happens during an UPDATE , DELETE , or SELECT FOR UPDATE command being run by another transaction. Instances such as these can cause only partial isolation support. The following steps outline how such a scenario can occur:
The important point to notice is what happens in step 4. At this point, Transaction A has a new version of the database it is using. This occurred when Transaction A re-executed its query. At that point, it was using a new version of the database as its baseline. Therefore, subsequent statements in Transaction A will operate on the changes made by Transaction B. So, in this way, the transaction isolation is only partial. It is possible for transactions to "seep" into each other in specific cases like these. SERIALIZABLE LevelThis isolation level differs from READ COMMITTED in that each transaction must occur in a scheduled serial manner. No transactions can occur that would result in one transaction acting on another transaction's modifications. This is enforced by strict transaction scheduling. This scheduling mandates that if one transaction completes successfully and thereby contaminates another transaction's read buffer, then the second transaction issues a ROLLBACK automatically. The practical effect of this is that the database system must be constructed in such a way as to expect transaction failures and retry them afterward. On a heavily used system, this could mean that a significant percentage of transactions would fail due to this strict scheduling. Such a burden could make the system much slower than would occur under a straight READ COMMITTED database. In the majority of cases, the READ COMMITTED isolation level is appropriate. However, there might be some queries that mandate such a rigorous approach to ensure data validity. |
I l @ ve RuBoard |