15.6. Locking in UFS

UFS uses two basic types of locks: kmutex_t and krwlock_t. The workings of these synchronization primitives are covered in Chapter 17. UFS locks can be divided into eight categories:

  • Inode locks

  • Queue locks

  • ACL locks

  • VNODE locks

  • VFS locks

  • VOP_RWLOCK

  • ufs_iuniqtime_lock

  • Logging locks

15.6.1. UFS Lock Descriptions

Tables 15.2 through 15.9 describe the UFS locks in more detail.

Table 15.2. Inode Locks

Name: i_rwlock
Type: krwlock_t
Description:

  • Serializes write requests. Allows reads to proceed in parallel. Serializes directory reads and updates.

  • Does not protect inode fields.

  • Indirectly protects block lists, since it serializes allocations and deallocations in UFS.

  • Must be taken before starting UFS logging transactions when operating on a file; otherwise, taken after starting the logging transaction.

Name: i_contents
Type: krwlock_t
Description:

  • Protects most fields in the inode.

  • When held as a writer, protects all the fields protected by the i_tlock.

Name: i_tlock
Type: kmutex_t
Description:

  • When held with the i_contents reader lock, protects the following inode fields: i_utime, i_ctime, i_mtime, i_flag, i_delayoff, i_delaylen, i_nextrio, i_writes, i_writer, i_mapcnt.

  • Also used as the mutex for write throttling in UFS.

  • Holding i_contents (as reader) and i_tlock together allows parallelism in updates.

Name: i_hlock
Type: kmutex_t
Description:

  • Inode hash lock.


Table 15.3. Inode Queue Locks

Name: ufs_scan_lock
Type: kmutex_t
Description:

  • Synchronizes ufs_scan_inodes threads (ufs_update(), ufs_sync(), ufs_scan_inodes()).

  • Needed because of the global inode list.

Name: ufs_q->uq_mutex
Type: krwlock_t
Description:

  • Protects the two inode idle queues ufs_junk_iq and ufs_useful_iq.

Name: ufs_hlock
Type: kmutex_t
Description:

  • Used by the hlock thread. For more information, see the hard lock section of the lockfs(1M) man page.

Name: ih_lock
Type: kmutex_t
Description:

  • Protects the inode hash. The inode hash is global (per system), not per file system.


Table 15.4. Quota Queue Locks

Name: dq_cachelock
Type: kmutex_t
Description:

  • Protects the quota cache list. Must be held before taking the dquot.dq_lock.

Name: dq_freelock
Type: kmutex_t
Description:

  • Protects the free quota list.

Name: dq_rwlock
Type: krwlock_t
Description:

  • Protects the entire quota subsystem.

  • Taken as writer when the quota subsystem is initialized. Taken as reader when we do not want the entire quota subsystem to be quiesced.

  • As writer, allows updates to quota-related fields in the ufsvfs structure. Also protects the dquot file as writer to allow quota updates.

  • As reader, allows reads from the quota-related fields in the ufsvfs structure.

Name: dquot.dq_lock
Type: kmutex_t
Description:

  • Gives exclusive access to the dquot struct.


Table 15.5. VNODE Locks

Name: v_lock
Type: kmutex_t
Description:

  • Protects the vnode fields. Also used by VN_HOLD/VN_RELE.


Table 15.6. ACL Locks

Name: s_lock
Type: krwlock_t
Description:

  • Protects the in-core shadow inode structure.


Table 15.7. VFS Locks

Name: vfs_lock
Type: kmutex_t
Description:

  • Locks contents of file system and cylinder groups. Also protects fields of the vfs_dio.

Name: vfs_dqrwlock
Type: krwlock_t
Description:

  • Manages quota subsystem quiescence.

  • If held as writer, the UFS quota subsystem may be undergoing changes: quotas being enabled or disabled, or new quota limits being set.

  • Protects d_quot structure. This structure keeps track of all the enabled quotas per file system.

  • Important note: UFS shadow inodes that are used to hold ACL data and extended attribute directories are not counted against user quotas. Thus, this lock is not held for updates to these.

  • Holding this lock as reader indicates to the quota subsystem that major changes should not occur during that time.

  • Held when the i_contents writer lock is held, as described above, signifying that changes are occurring that affect user quotas.

  • Since UFS quotas can be enabled or disabled on the fly, this lock must be taken in all appropriate situations. It is not sufficient to check whether the UFS quota subsystem is enabled before taking the lock.

Name: ufsvfs_mutex
Type: kmutex_t
Description:

  • Protects access to the list that links all UFS file system instances.

  • Held while the list is updated as part of the mount operation.

  • Allows synchronization of all UFS file systems.


Table 15.8. VOP_RWLOCK or ufs_rwlock

Name: ufs_rwlock()
Type: function
Description:

  • Prevents concurrent reads and writes to a file.

  • Used by NFS when calling VOP_READDIR, to prevent directory contents from changing.

  • NFS uses this lock to get attributes before and after a read or write, to prevent another operation from modifying the file.


Table 15.9. Logging Locks

Name: mtm_lock
Type: kmutex_t
Description:

  • Protects the mtm_taskq_sync_count field in mt_map_t, which keeps track of the number of pending top_issue_sync requests.

Name: mtm_mutex
Type: kmutex_t
Description:

  • Protects all the fields in the mt_map_t structure except mtm_mapext and mtm_refcnt.

Name: mtm_rwlock
Type: krwlock_t
Description:

  • Protects agenext_mapentry field.

Name: un_log_mutex
Type: kmutex_t
Description:

  • Allows one write to the log at a time. Part of the ml_unit_t structure (the in-core log data structure).

Name: un_state_mutex
Type: kmutex_t
Description:

  • Allows one log state update at a time.


15.6.2. Inode Lock Ordering

Now that we are familiar with the different types of locks in UFS, let us put them in the order in which they are taken when working on an inode. Lock ordering is critical: a mistake will most likely deadlock the system and may even panic it.

Figure 15.16 gives a quick overview of lock ordering specific to the inode.

Figure 15.16. Inode Lock Ordering Precedence


15.6.3. UFS Lockfs Protocol

Along with basic inode locking, UFS also provides a mechanism to quiesce a file system for file system locking and for the forced unmounting of a file system. All VOPs (vnode operations) in UFS are required to follow the UFS lock protocol by calling ufs_lockfs_begin() and ufs_lockfs_end(), although the following functions purposely do not adhere to the protocol:

  • ufs_close

  • ufs_putpage

  • ufs_inactive

  • ufs_addmap

  • ufs_delmap

  • ufs_rwlock

  • ufs_rwunlock

  • ufs_poll

The basic principle is that UFS supports various file system lock states (listed below). Each vnode operation must initiate the protocol by calling ufs_lockfs_begin() with an appropriate lock mask (a lock that the operation might grab while it is being processed) and must end the protocol by calling ufs_lockfs_end() before it returns. This way, UFS knows exactly how many vnode operations are in progress for a given file system, because it increments and decrements the ul_vnops_cnt variable in the file-system-dependent ulockfs structure. If the file system is hard-locked, the thread gets an EIO error. If the file system is error-locked, the thread is blocked.
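
The counting behavior described above can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: struct sim_ulockfs, sim_lockfs_begin(), sim_lockfs_end(), sim_vop(), and SIM_WOULD_SLEEP are hypothetical names, the lock mask argument is omitted, and the real ufs_lockfs_begin() signature and internals are not reproduced. Because the model is single-threaded, the error-locked case returns a marker value where the real thread would block.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, simplified lock states; not the Solaris definitions. */
enum sim_lock_state { SIM_UNLOCKED, SIM_HARD_LOCKED, SIM_ERROR_LOCKED };

/* Marker for "the real thread would block here" (assumption). */
#define SIM_WOULD_SLEEP (-1)

/* Stand-in for the file-system-dependent ulockfs structure. */
struct sim_ulockfs {
    enum sim_lock_state state;
    long ul_vnops_cnt;          /* vnode operations in progress */
};

/* Enter the protocol: fail on hard lock, otherwise count the operation. */
int sim_lockfs_begin(struct sim_ulockfs *ulp)
{
    if (ulp->state == SIM_HARD_LOCKED)
        return EIO;
    if (ulp->state == SIM_ERROR_LOCKED)
        return SIM_WOULD_SLEEP;
    ulp->ul_vnops_cnt++;
    return 0;
}

/* Leave the protocol: the operation is no longer in progress. */
void sim_lockfs_end(struct sim_ulockfs *ulp)
{
    assert(ulp->ul_vnops_cnt > 0);
    ulp->ul_vnops_cnt--;
}

/* Shape of a vnode operation that follows the protocol. */
int sim_vop(struct sim_ulockfs *ulp)
{
    int error = sim_lockfs_begin(ulp);
    if (error != 0)
        return error;       /* never entered, so nothing to undo */

    /* ... the actual work of the operation would go here ... */

    sim_lockfs_end(ulp);
    return 0;
}
```

Note that an operation that fails in the begin call must not call the end function, since it was never counted.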

Here are the file system locks and their actions.

  • Write lock. Suspends writes that would modify the file system. Access times are not kept while a file system is write-locked.

  • Name lock. Suspends accesses that could change or remove existing directory entries.

  • Delete lock. Suspends accesses that could remove directory entries.

  • Hard lock. Returns an error upon every access to the locked file system and cannot be unlocked. Hard-locked file systems can be unmounted. Hard lock supports forcible unmount.

  • Error lock. Blocks all local access to the file system and returns EWOULDBLOCK on all remote access. File systems are error-locked by UFS upon detection of internal inconsistency. They can only be unlocked after successful repair by fsck, which is usually done automatically. Error-locked file systems can be unmounted. Once the file system becomes clean, it can be upgraded to a hard lock.

  • Soft lock. Quiesces a file system.

  • Unlock. Awakens suspended accesses, releases existing locks, and flushes the file system.

While a vnode operation is being executed in UFS, a call can be made to another vnode function on the same UFS or a different UFS. This is called a recursive VOP. The per-file-system vnode operation counter is not incremented or decremented during recursive calls.
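
A small model may help show why recursive calls leave the counter alone. The text does not say how UFS detects recursion, so the per-thread flag used here is an assumption made for illustration, and struct sim_fs, struct sim_thread, and the sim_* functions are hypothetical names.

```c
/* Hypothetical per-file-system state. */
struct sim_fs {
    long vnops_cnt;     /* vnode operations in progress */
    long peak;          /* highest value vnops_cnt has reached */
};

/* Hypothetical per-thread state (assumed recursion marker). */
struct sim_thread {
    int in_vop;         /* nonzero while the thread is inside a VOP */
};

/* Returns 1 if the entry was counted, 0 if it was a recursive entry. */
int sim_vop_enter(struct sim_fs *fs, struct sim_thread *t)
{
    if (t->in_vop)
        return 0;               /* recursive VOP: counter untouched */
    t->in_vop = 1;
    fs->vnops_cnt++;
    if (fs->vnops_cnt > fs->peak)
        fs->peak = fs->vnops_cnt;
    return 1;
}

/* Undo the count only if the matching enter call made one. */
void sim_vop_exit(struct sim_fs *fs, struct sim_thread *t, int counted)
{
    if (!counted)
        return;
    fs->vnops_cnt--;
    t->in_vop = 0;
}

/* A VOP that is reached from inside another VOP. */
void sim_inner_vop(struct sim_fs *fs, struct sim_thread *t)
{
    int counted = sim_vop_enter(fs, t);
    sim_vop_exit(fs, t, counted);
}

/* A VOP on outer_fs that calls a VOP on inner_fs (same or different). */
void sim_outer_vop(struct sim_fs *outer_fs, struct sim_fs *inner_fs,
                   struct sim_thread *t)
{
    int counted = sim_vop_enter(outer_fs, t);
    sim_inner_vop(inner_fs, t);
    sim_vop_exit(outer_fs, t, counted);
}
```

In this model, only the outermost operation of a thread is counted, whether the nested call targets the same file system or a different one.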

Here is the basic ordering to initiate and complete the lock protocol when operating on an inode in UFS.

1) Acquire i_rwlock (from the vnode layer in most cases).
2) Begin the UFS lock protocol by calling ufs_lockfs_begin().
3) Open UFS logging transactions if necessary now.
4) Acquire inode and quota locks (vfs_dqrwlock, i_contents, i_tlock, ...).
5) [work on inode]
6) Drop inode and quota locks (i_tlock, i_contents, vfs_dqrwlock, ...).
7) Close logging transactions.
8) End the UFS lock protocol by calling ufs_lockfs_end().
9) Release i_rwlock.


When working with directories, you need to make one minor change. i_rwlock is acquired after the logging transaction is initialized, and i_rwlock is released before the transaction is ended. Here are the steps.

1) Begin the UFS lock protocol by calling ufs_lockfs_begin().
2) Open UFS logging transactions if necessary now.
3) Acquire i_rwlock.
4) Acquire inode and quota locks (vfs_dqrwlock, i_contents, i_tlock, ...).
5) [work on inode]
6) Drop inode and quota locks (i_tlock, i_contents, vfs_dqrwlock, ...).
7) Release i_rwlock.
8) Close logging transactions.
9) End the UFS lock protocol by calling ufs_lockfs_end().
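
The file and directory sequences differ only in where i_rwlock sits relative to the lockfs protocol and the logging transaction. The following user-space sketch encodes both orders as tables and checks acquisitions against them. It is an illustration only: the enum values, sim_file_order, sim_dir_order, and the sim_* functions are hypothetical names, and no real locking takes place.

```c
/* Steps that are entered and left in a nested fashion. */
enum sim_step {
    SIM_I_RWLOCK, SIM_LOCKFS, SIM_TRANS,
    SIM_DQRWLOCK, SIM_I_CONTENTS, SIM_I_TLOCK,
    SIM_NSTEPS
};

/* Acquisition order for regular files, per the first sequence. */
const enum sim_step sim_file_order[SIM_NSTEPS] = {
    SIM_I_RWLOCK, SIM_LOCKFS, SIM_TRANS,
    SIM_DQRWLOCK, SIM_I_CONTENTS, SIM_I_TLOCK
};

/* Acquisition order for directories, per the second sequence. */
const enum sim_step sim_dir_order[SIM_NSTEPS] = {
    SIM_LOCKFS, SIM_TRANS, SIM_I_RWLOCK,
    SIM_DQRWLOCK, SIM_I_CONTENTS, SIM_I_TLOCK
};

/* Stack of steps currently held by one simulated thread. */
struct sim_held {
    enum sim_step stack[SIM_NSTEPS];
    int depth;
};

/* Position of step s in the given order, or -1 if absent. */
static int sim_rank(const enum sim_step *order, enum sim_step s)
{
    for (int i = 0; i < SIM_NSTEPS; i++)
        if (order[i] == s)
            return i;
    return -1;
}

/* Returns 0 if s may be taken now, -1 if that would violate the order. */
int sim_acquire(struct sim_held *h, const enum sim_step *order,
                enum sim_step s)
{
    if (h->depth > 0 &&
        sim_rank(order, s) <= sim_rank(order, h->stack[h->depth - 1]))
        return -1;
    h->stack[h->depth++] = s;
    return 0;
}

/* Releases must mirror acquisitions: only the newest step may go. */
int sim_release(struct sim_held *h, enum sim_step s)
{
    if (h->depth == 0 || h->stack[h->depth - 1] != s)
        return -1;
    h->depth--;
    return 0;
}

/* Walk a full sequence: take every step in order, then unwind. */
int sim_run(const enum sim_step *order)
{
    struct sim_held h = { {0}, 0 };

    for (int i = 0; i < SIM_NSTEPS; i++)
        if (sim_acquire(&h, order, order[i]) != 0)
            return -1;
    /* [work on inode] */
    for (int i = SIM_NSTEPS - 1; i >= 0; i--)
        if (sim_release(&h, order[i]) != 0)
            return -1;
    return 0;
}
```

With these tables, taking i_rwlock after the lockfs protocol has begun is rejected under the file order but accepted under the directory order, which is exactly the one change the directory case makes.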





Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)
ISBN: 0131482092
Year: 2004
Pages: 244
