8.5. File Locking Locks may be placed on any arbitrary range of bytes within a file. These semantics are supported in FreeBSD by a list of locks, each of which describes a lock of a specified byte range. An example of a file containing several range locks is shown in Figure 8.13 (on page 320). The list of currently held or active locks appears across the top of the figure, headed by the i_lockf field in the inode, and linked together through the lf_next field of the lock structures. Each lock structure identifies the type of the lock (exclusive or shared), the byte range over which the lock applies, and the identity of the lock holder. A lock may be identified either by a pointer to a process entry or by a pointer to a file entry. A process pointer is used for POSIX-style range locks; a file-entry pointer is used for BSD-style whole file locks. The examples in this section show the identity as a pointer to a process entry. In this example, there are three active locks: an exclusive lock held by process 1 on bytes 1 to 3, a shared lock held by process 2 on bytes 7 to 12, and a shared lock held by process 3 on bytes 7 to 14. Figure 8.13. A set of range locks on a file. In addition to the active locks, there are other processes that are sleeping waiting to get a lock applied. Pending locks are headed by the lf_block field of the active lock that prevents them from being applied. If there are multiple pending locks, they are linked through their lf_block fields. New lock requests are placed at the end of the list; thus, processes tend to be granted locks in the order that they requested the locks. Each pending lock uses its lf_next field to identify the active lock that currently blocks it. In the example in Figure 8.13, the first active lock has two other locks pending. There is also a pending request for the range 9 to 12 that is currently linked onto the second active entry. It could equally well have been linked onto the third active entry, since the third entry also blocks it. When an active lock is released, all pending entries for that lock are awakened, so they can retry their request. If the second active lock were released, the result would be that its currently pending request would move over to the blocked list for the last active entry. A problem that must be handled by the locking implementation is the detection of potential deadlocks. To see how deadlock is detected, consider the addition of the lock request by process 2 outlined in the dashed box in Figure 8.13. Since the request is blocked by an active lock, process 2 must sleep, waiting for the active lock on range 1 to 3 to clear. We follow the lf_next pointer from the requesting lock (the one in the dashed box), to identify the active lock for the 1-to-3 range as being held by process 1. The wait channel for process 1 shows that process 2 is sleeping, waiting for a lock to clear, and identifies the pending lock structure as the pending lock (range 9 to 12) hanging off the lf_block field of the second active lock (range 7 to 12). We follow the lf_next field of this pending lock structure (range 9 to 12) to the second active lock (range 7 to 12) that is held by the lock requester, process 2. Thus, the lock request is denied, as it would lead to a deadlock between processes 1 and 2. This algorithm works on cycles of locks and processes of arbitrary size. Performance is reasonable provided there are fewer than 50 processes contending for locks within the same range of a file. As we note, the pending request for the range 9 to 12 could equally well have been hung off the third active lock for the range 7 to 14. Had it been, the request for adding the lock in the dashed box would have succeeded, since the third active lock is held by process 3 rather than by process 2. If the next lock request on this file were to release the third active lock, then deadlock detection would occur when process 1's pending lock got shifted to the second active lock (range 7 to 12). The difference is that process 1, instead of process 2, would get the deadlock error. When a new lock request is made, it must first be checked to see whether it is blocked by existing locks held by other processes. If it is not blocked by other processes, it must then be checked to see whether it overlaps any existing locks already held by the process making the request. There are five possible overlap cases that must be considered; these possibilities are shown in Figure 8.14. The assumption in the figure is that the new request is of a type different from that of the existing lock (i.e., an exclusive request against a shared lock, or vice versa). If the existing lock and the request are of the same type, the analysis is a bit simpler. The five cases are as follows: The new request exactly overlaps the existing lock. The new request replaces the existing lock. If the new request downgrades from exclusive to shared, all requests pending on the old lock are awakened. The new request is a subset of the existing lock. The existing lock is broken into three pieces (two if the new lock begins at the beginning or ends at the end of the existing lock). If the type of the new request differs from that of the existing lock, all requests pending on the old lock are awakened, so they can be reassigned to the correct new piece, blocked on a lock held by some other process, or granted. The new request is a superset of an existing lock. The new request replaces the existing lock. If the new request downgrades from exclusive to shared, all requests pending on the old lock are awakened. The new request extends past the end of an existing lock. The existing lock is shortened, and its overlapped piece is replaced by the new request. All requests pending on the existing lock are awakened, so they can be reassigned to the correct new piece, blocked on a lock held by some other process, or granted. The new request extends into the beginning of an existing lock. The existing lock is shortened, and its overlapped piece is replaced by the new request. All requests pending on the existing lock are awakened, so they can be reassigned to the correct new piece, blocked on a lock held by some other process, or granted. Figure 8.14. Five types of overlap considered by the kernel when a range lock is added. In addition to the five basic types of overlap outlined, a request may span several existing locks. Specifically, a new request may be composed of zero or one of type 4, zero or more of type 3, and zero or one of type 5. To understand how the overlap is handled, we can consider the example shown in Figure 8.15. This figure shows a file that has all its active range locks held by process 1, plus a pending lock for process 2. Figure 8.15. Locks before addition of exclusive-lock request by process 1 on range 3..13. Now consider a request by process 1 for an exclusive lock on the range 3 to 13. This request does not conflict with any active locks (because all the active locks are already held by process 1). The request does overlap all three active locks, so the three active locks represent a type 4, type 3, and type 5 overlap, respectively. The result of processing the lock request is shown in Figure 8.16. The first and third active locks are trimmed back to the edge of the new request, and the second lock is replaced entirely. The request that had been held pending on the first lock is awakened. It is no longer blocked by the first lock but is blocked by the newly installed lock. So it now hangs off the blocked list for the second lock. The first and second locks could have been merged because they are of the same type and are held by the same process. However, the current implementation makes no effort to do such merges because range locks are normally released over the same range that they were created. If the merge were done, it would probably have to be split again when the release was requested. Figure 8.16. Locks after addition of exclusive-lock request by process 1 on range 3.13. Lock-removal requests are simpler than addition requests; they need only to consider existing locks held by the requesting process. Figure 8.17 shows the five possible ways that a removal request can overlap the locks of the requesting process. The unlock request exactly overlaps an existing lock. The existing lock is deleted, and any lock requests that were pending on that lock are awakened. The unlock request is a subset of an existing lock. The existing lock is broken into two pieces (one if the unlock request begins at the beginning or ends at the end of the existing lock). Any locks that were pending on that lock are awakened so that they can be reassigned to the correct new piece, blocked on a lock held by some other process, or granted. The unlock request is a superset of an existing lock. The existing lock is deleted, and any locks that were pending on that lock are awakened. The unlock request extends past the end of an existing lock. The end of the existing lock is shortened. Any locks that were pending on that lock are awakened so that they can be reassigned to the shorter lock, blocked on a lock held by some other process, or granted. The unlock request extends into the beginning of an existing lock. The beginning of the existing lock is shortened. Any locks that were pending on that lock are awakened so that they can be reassigned to the shorter lock, blocked on a lock held by some other process, or granted. Figure 8.17. Five types of overlap considered by the kernel when a range lock is deleted. In addition to the five basic types of overlap outlined, an unlock request may span several existing locks. Specifically, a new request may be composed of zero or one of type 4, zero or more of type 3, and zero or one of type 5. |