Section 4.2. Lock Management

Transaction Management > Lock Management

4.2. Lock Management

SQLite, to produce serializable execution of transactions, uses a locking mechanism to regulate database access requests from transactions. It follows strict two-phase locking protocol, i.e., it releases locks only on transaction completion. SQLite does database level locking, but neither row nor page nor table level locking: it sets locks on the entire database file, and not on particular data items in the file.

NOTE

A subtransaction acquires locks through the container user transaction. All locks are held by the user transaction until it commits or aborts irrespective of the outcome of the subtransaction.

4.2.1. Lock types and their compatibilities

From the viewpoint of a single transaction, a database file can be in one of the following five locking states:

NOLOCK: The transaction does not hold a lock on the database file. It can neither read nor write the database file. Other transactions can read or write the database as their own locking states permit. Nolock is the default state when a transaction is initiated on a database file.
SHARED: This lock only permits reading from the database file. Any number of transactions can hold shared locks on the file at the same time, and hence there can be many simultaneous read-transactions. No transaction is allowed to write the database file while one or more transactions hold shared locks on the file.
RESERVED: This lock permits reading from the database file. A reserved lock means that the transaction is planning on writing the database file at some point in the future, but that it is currently just reading the file. There can be at most one reserved lock on the file, but the lock can coexist with any number of shared locks. Other transactions may obtain new shared locks on the file, but not other locks.
PENDING: This lock permits reading from the database file. A pending lock means that the transaction wants to write to the database as soon as possible. It is just waiting on all current shared locks to clear prior to its acquiring an exclusive lock. There can be at most one pending lock on the file, but the lock can coexist with any number of shared locks with one difference: other transactions may hold on to existing shared locks, but may not obtain new (shared or other) ones.
EXCLUSIVE: This is the only lock that permits writing (and reading too) the database file. Only one exclusive lock is allowed on the file, and no other lock of any kind is allowed to coexist with an exclusive lock.

The lock compatibility matrix is given in Table 4-1. The rows are the existing lock mode on the database held by a transaction, and the columns are a new request from another transaction. Each matrix cell identifies the compatibility between the two locks: Y indicates the new request can be granted; N indicates that it cannot.

Table 4-1. Lock compatibility matrix
Requested lock	SHARED	RESERVED	PENDING	EXCLUSIVE
Existing lock	SHARED	RESERVED	PENDING	EXCLUSIVE
SHARED	Y	Y	Y	N
RESERVED	Y	N	N	N
PENDING	N	N	N	N
EXCLUSIVE	N	N	N	N

NOTE

A database system like SQLite needs at least the exclusive lock; the other lock modes just increase transactional concurrency. With only exclusive lock, the system will execute all (read and write) transactions serially. With shared and exclusive locks, the system can execute many read-transactions concurrently. In practice, a transaction reads a data item from a database file under a shared lock, modifies the item, and then requests for an exclusive lock to write the (modified) item back into the file. If two transactions do so simultaneously, there is a possibility of deadlock formation (see the "Deadlock and Starvation" sidebar later in this section), in which the transactions cannot make progress in their executions. The reserved and pending locks are designed to minimize the formation of these kinds of deadlocks. These two locks also help to improve concurrency and to reduce the well-known writer starvation problem (in which reads perpetually overtake writes).

4.2.2. Lock acquisition protocol

Prior to reading the very first (any) page from a database file, a transaction acquires a shared lock that indicates its intention to read pages from the file. Prior to making any changes to a database (at any page), a transaction acquires a reserved lock that indicates its intention to write in the near future. With a reserved lock on a database, a transaction can make changes to "in-cache" pages. Before writing to the database, it needs to obtain an exclusive lock on the database. The locking state transition diagram is given in Figure 4-1.

Figure 4-1. Locking state transition diagram

Normal lock transition is from nolock to shared lock to reserved lock to pending. lock to exclusive lock (as shown by bold lines). The direct shared lock to pending lock transition can only happen if there is a journal that needs rolling back. But, if that is the case, then no other transaction could have made the transition from shared lock to reserved lock. (I revisit this in the "Recovery from Failures" section.)

NOTE

The pending lock is an intermediate lock, and it is not visible outside the lock management subsystem: in particular, the pager cannot ask the lock manager to get a pending lock. It always requests an exclusive lock, but the lock manager first acquires a pending lock en route to getting the exclusive lock. Thus, the pending lock is always just a temporary stepping stone to the path to an exclusive lock. In case the exclusive lock request fails after acquiring a pending lock, the lock will be upgraded to an exclusive lock by a subsequent request for an exclusive lock (from the pager).

NOTE

Although locking solves the concurrency control problem, it introduces another problem. Suppose two transactions hold shared locks on a database file. They both request reserved locks on the file. One of them gets the reserved lock, and the other one waits. After a while, the transaction with the reserved lock requests an exclusive lock, and waits for the other transaction to clear the shared lock. But the shared lock will never be cleared because the transaction with the shared lock is waiting. This kind of situation is called a deadlock.

Deadlock is an annoying problem. There are two ways to handle the deadlock problem: (1) prevention, and (2) detection and break. SQLite prevents deadlock formation. It always acquires file locks in nonblocking mode. If it fails to obtain some lock on behalf of a transaction, it will retry only for a finite number of times. (The retry number can be preset by the application at runtime; the default is once.) If all retries fail, SQLite returns SQLITE_BUSY error code to the application. The application can back off and retry later or abort the transaction. Consequently, there is no possibility of deadlock formation in the system. However, there is a possibility, at least theoretically, of starvation where a transaction perpetually tries to obtain some lock without a success. As SQLite is not targeted for enterprise-level highly concurrent applications, starvation is not a big concern.

4.2.3. Lock implementation

SQLite implements its own locking system by using the file locking functions supported by the native operating system. (Lock implementation is platform-dependent. I use Linux in this book to show how SQLite locks are implemented.) Linux implements only two lock modes, namely read and write, to lock contiguous regions on files. To avoid confusion in terminologies, I use read lock and write lock for native shared lock and exclusive lock, respectively. Linux allocates locks to processes (in single thread applications) and to threads (in multithreaded applications). To avoid confusion between process and thread, I consistently refer to thread in this subsection. Many (sibling and peer) threads may hold read locks on a file region, but only one thread may hold a write lock on the region. A write-lock excludes all other locks, both read and write. Read and write locks can coexist on the same file, but at different regions. A single thread can hold only one type of lock on a region. If it applies a new lock to an already-locked region, then the existing lock is converted to the new lock mode.

SQLite implements its own four lock modes using the two native operating system lock modes on different file-regions. (It sets and releases native locks on regions by making fcntl system calls.)

A SHARED lock on a database file is obtained by setting a read lock on a specific range of bytes on the file.
An EXCLUSIVE lock is obtained by setting a write lock on all bytes in the specified range.
A RESERVED lock is obtained by setting a write lock on a single byte of the file (that lies outside the shared range of bytes); it is designated as the reserved lock byte.
A PENDING lock is obtained by setting a write lock on a designated byte different from the reserve lock byte, outside the shared range.

Figure 4-2 displays this arrangement.

Figure 4-2. File offsets that are used to set POSIX locks

SQLite reserves 510 bytes as the shared range of bytes. (The range value is defined as a macro SHARED_SIZE in a source header file.) The range begins at SHARED_FIRST offset. The PENDING_BYTE macro (0x40000000, the first byte past the 1 GB boundary defines the beginning of the lock bytes) is the byte used for setting PENDING locks. The RESERVED_BYTE macro is set to the next byte after the PENDING_BYTE, and the SHARED_FIRST is set to the second byte after the PENDING_BYTE. All lock bytes will fit into a single database page, even with the minimum page size of 512 bytes.

NOTE

Locking in Windows is mandatory; that is, locks are enforced for all processes, even if they are not cooperating processes. The locking space is reserved by the operating system. For this reason, SQLite cannot store actual data in the locking bytes. Therefore, the pager never allocates the page involved in locking. That page is also not used in other platforms to have uniformity and cross-platform use of databases. The pending byte is set high so that SQLite doesn't have to allocate an unused page except for very large databases.

To obtain a SHARED lock on a database file, a thread first obtains a native read-lock on the PENDING_BYTE to make sure that no other process/thread has a PENDING lock on the file. (You may recall from Table 4-1 that the existing PENDING lock and a new SHARED lock are incompatible.) If this is successful, the SHARED_SIZE range starting at the SHARED_FIRST byte is read-locked, and, finally, the read-lock on the PENDING_BYTE is released.

NOTE

Some versions of Windows support only write-lock. There, to obtain a SHARED lock on a file, a single byte out of the specific range of bytes is write-locked. The byte is chosen at random from the range so that two separate readers can probably access the file at the same time, unless they are unlucky and choose the same byte to set write-locks. In such systems, read concurrency is limited by the size of the shared range of bytes.

A thread may only obtain a RESERVED lock on a database file after it has obtained a SHARED lock on the file. To obtain a RESERVED lock, a write-lock is obtained on the RESERVED lock byte. Note that the thread does not release its SHARED lock on the file. (This ensures that another thread cannot obtain an EXCLUSIVE lock on the file.)

A thread may only obtain a PENDING lock on a database file after it has obtained a SHARED lock on the file. To obtain a PENDING lock, a write-lock is obtained on the PENDING_BYTE. (This ensures that no new SHARED locks can be obtained on the file, but existing SHARED locks are allowed to coexist.) Note that the thread does not release its SHARED lock on the file. (This ensures that another thread cannot obtain an EXCLUSIVE lock on the file.)

NOTE

A thread does not have to obtain a RESERVED lock on the way to obtaining a PENDING lock. This property is used by SQLite to roll back a journal file after a system crash. If the thread takes the route from shared to reserved to pending, it does not release the other two locks upon setting the pending lock.

A thread may only obtain an EXCLUSIVE lock on a database file after it has obtained a PENDING lock on the file. To obtain an EXCLUSIVE lock, a write-lock is obtained on the entire "shared byte range." Because all other SQLite locks require a read-lock on (at least one of) the bytes within the range, it ensures that no other locks are held on the file when the thread has obtained the exclusive lock.

To have a clear picture on SQLite's way of acquiring native locks on a file, the native locking state transition is drawn in Figure 4-3. The figure also reveals the relationship between SQLite locks and native locks. The representations of PENDING lock and EXCLUSIVE lock are a little awkward; they may or may not have a write-lock on the RESERVED_BYTE depending on which route SQLite has taken to set those locks.