The concept of mutual exclusion is a crucial one in operating systems development. It refers to the guarantee that one, and only one, thread can access a particular resource at a time. Mutual exclusion is necessary when a resource doesn't lend itself to shared access or when sharing would result in an unpredictable outcome. For example, if two threads copy a file to a printer port at the same time, their output could be interspersed. Similarly, if one thread reads a memory location while another one writes to it, the first thread will receive unpredictable data. In general, writable resources can't be shared without restrictions, whereas resources that aren't subject to modification can be shared. Figure 3-23 illustrates what happens when two threads running on different processors both write data to a circular queue.

Figure 3-23. Incorrect sharing of memory

Because the second thread got the value of the queue tail pointer before the first thread had finished updating it, the second thread inserted its data into the same location that the first thread had used, overwriting data and leaving one queue location empty. Even though this figure illustrates what could happen on a multiprocessor system, the same error could occur on a single-processor system if the operating system were to perform a context switch to the second thread before the first thread updated the queue tail pointer.

Sections of code that access a nonshareable resource are called critical sections. To ensure correct code, only one thread at a time can execute in a critical section. While one thread is writing to a file, updating a database, or modifying a shared variable, no other thread can be allowed to access the same resource. The pseudocode shown in Figure 3-23 is a critical section that incorrectly accesses a shared data structure without mutual exclusion.
The issue of mutual exclusion, although important for all operating systems, is especially important (and intricate) for a tightly coupled, symmetric multiprocessing (SMP) operating system such as Windows, in which the same system code runs simultaneously on more than one processor, sharing certain data structures stored in global memory. In Windows, it is the kernel's job to provide mechanisms that system code can use to prevent two threads from modifying the same structure at the same time. The kernel provides mutual-exclusion primitives that it and the rest of the executive use to synchronize their access to global data structures. Because the scheduler synchronizes access to its data structures at DPC/dispatch level IRQL, the kernel and executive cannot rely on synchronization mechanisms that would result in a page fault or reschedule operation to synchronize access to data structures when the IRQL is DPC/dispatch level or higher (levels known as an elevated or high IRQL). In the following sections, you'll find out how the kernel and executive use mutual exclusion to protect their global data structures when the IRQL is high and what mutual-exclusion and synchronization mechanisms the kernel and executive use when the IRQL is low (below DPC/dispatch level).

High-IRQL Synchronization

At various stages during its execution, the kernel must guarantee that one, and only one, processor at a time is executing within a critical section. Kernel critical sections are the code segments that modify a global data structure such as the kernel's dispatcher database or its DPC queue. The operating system can't function correctly unless the kernel can guarantee that threads access these data structures in a mutually exclusive manner. The biggest area of concern is interrupts. For example, the kernel might be updating a global data structure when an interrupt occurs whose interrupt-handling routine also modifies the structure.
Simple single-processor operating systems sometimes prevent such a scenario by disabling all interrupts each time they access global data, but the Windows kernel has a more sophisticated solution. Before using a global resource, the kernel temporarily masks those interrupts whose interrupt handlers also use the resource. It does so by raising the processor's IRQL to the highest level used by any potential interrupt source that accesses the global data. For example, an interrupt at DPC/dispatch level causes the dispatcher, which uses the dispatcher database, to run. Therefore, any other part of the kernel that uses the dispatcher database raises the IRQL to DPC/dispatch level, masking DPC/dispatch-level interrupts before using the dispatcher database.

This strategy is fine for a single-processor system, but it's inadequate for a multiprocessor configuration. Raising the IRQL on one processor doesn't prevent an interrupt from occurring on another processor. The kernel also needs to guarantee mutually exclusive access across several processors.

Interlocked Operations

The simplest form of synchronization mechanism relies on hardware support for multiprocessor-safe manipulation of integer values and for performing comparisons. These mechanisms include functions such as InterlockedIncrement, InterlockedDecrement, InterlockedExchange, and InterlockedCompareExchange. The InterlockedDecrement function, for example, uses the x86 lock instruction prefix (for example, lock xadd) to lock the multiprocessor bus during the subtraction operation so that another processor that's also modifying the memory location being decremented won't be able to modify it between the decrement's read of the original value and its write of the decremented value. This form of basic synchronization is used by the kernel and drivers.

Spinlocks

The mechanism the kernel uses to achieve multiprocessor mutual exclusion is called a spinlock.
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue shown in Figure 3-24.

Figure 3-24. Using a spinlock

Before entering either critical section shown in the figure, the kernel must acquire the spinlock associated with the protected DPC queue. If the spinlock isn't free, the kernel keeps trying to acquire the lock until it succeeds. The spinlock gets its name from the fact that the kernel (and thus, the processor) is held in limbo, "spinning," until it gets the lock.

Spinlocks, like the data structures they protect, reside in global memory. The code to acquire and release a spinlock is written in assembly language for speed and to exploit whatever locking mechanism the underlying processor architecture provides. On many architectures, spinlocks are implemented with a hardware-supported test-and-set operation, which tests the value of a lock variable and acquires the lock in one atomic instruction. Testing and acquiring the lock in one instruction prevents a second thread from grabbing the lock between the time when the first thread tests the variable and the time when it acquires the lock.

All kernel-mode spinlocks in Windows have an associated IRQL that is always at DPC/dispatch level or higher. Thus, when a thread is trying to acquire a spinlock, all other activity at the spinlock's IRQL or lower ceases on that processor. Because thread dispatching happens at DPC/dispatch level, a thread that holds a spinlock is never preempted because the IRQL masks the dispatching mechanisms. This masking allows code executing in a critical section protected by a spinlock to continue executing so that it will release the lock quickly. The kernel uses spinlocks with great care, minimizing the number of instructions it executes while it holds a spinlock.

Note
The kernel makes spinlocks available to other parts of the executive through a set of kernel functions, including KeAcquireSpinlock and KeReleaseSpinlock. Device drivers, for example, require spinlocks in order to guarantee that device registers and other global data structures are accessed by only one part of a device driver (and from only one processor) at a time. Spinlocks are not for use by user programs; user programs should use the objects described in the next section.

Kernel spinlocks carry with them restrictions for code that uses them. Because spinlocks always have an IRQL of DPC/dispatch level or higher, as explained earlier, code holding a spinlock will crash the system if it attempts to make the scheduler perform a dispatch operation or if it causes a page fault.

Queued Spinlocks

A special type of spinlock called a queued spinlock is used in some circumstances instead of a standard spinlock. A queued spinlock is a form of spinlock that scales better on multiprocessors than a standard spinlock. In general, Windows will use standard spinlocks only when it expects there to be low contention for the lock.

A queued spinlock works like this: When a processor wants to acquire a queued spinlock that is currently held, it places its identifier in a queue associated with the spinlock. When the processor that's holding the spinlock releases it, it hands the lock over to the first processor identified in the queue. In the meantime, a processor waiting for a busy spinlock checks the status not of the spinlock itself but of a per-processor flag that the processor ahead of it in the queue sets to indicate that the waiting processor's turn has arrived.

The fact that queued spinlocks result in spinning on per-processor flags rather than global spinlocks has two effects. The first is that the multiprocessor's bus isn't as heavily trafficked by interprocessor synchronization.
The second is that instead of a random processor in a waiting group acquiring the spinlock, the queued spinlock enforces first-in, first-out (FIFO) ordering to the lock. FIFO ordering means more consistent performance across processors accessing the same locks. Windows defines a number of global queued spinlocks by storing pointers to them in an array contained in each processor's processor control region (PCR). A global spinlock can be acquired by calling KeAcquireQueuedSpinlock with the index into the PCR array at which the pointer to the spinlock is stored. The number of global spinlocks has grown in each release of the operating system, and the table of index definitions for them is published in the DDK header file Ntddk.h.
Instack Queued Spinlocks

In addition to using static queued spinlocks that are globally defined, the Windows XP and Windows Server 2003 kernels support dynamically allocated queued spinlocks with the KeAcquireInStackQueuedSpinlock and KeReleaseInStackQueuedSpinlock functions. Several components, including the cache manager, executive pool manager, and NTFS, take advantage of these types of locks, and the functions are documented in the DDK for use by third-party driver writers. KeAcquireInStackQueuedSpinlock takes a pointer to a spinlock data structure and a spinlock queue handle. The spinlock queue handle is actually a data structure in which the kernel stores information about the lock's status, including the lock's ownership and the queue of processors that might be waiting for the lock to become available.

Executive Interlocked Operations

The kernel supplies a number of simple synchronization functions constructed on spinlocks for more advanced operations, such as adding and removing entries from singly and doubly linked lists. Examples include ExInterlockedPopEntryList and ExInterlockedPushEntryList for singly linked lists, and ExInterlockedInsertHeadList and ExInterlockedRemoveHeadList for doubly linked lists. All these functions require a standard spinlock as a parameter and are used throughout the kernel and device drivers.

Low-IRQL Synchronization

Executive software outside the kernel also needs to synchronize access to global data structures in a multiprocessor environment. For example, the memory manager has only one page frame database, which it accesses as a global data structure, and device drivers need to ensure that they can gain exclusive access to their devices. By calling kernel functions, the executive can create a spinlock, acquire it, and release it. Spinlocks only partially fill the executive's needs for synchronization mechanisms, however.
Because waiting for a spinlock literally stalls a processor, spinlocks can be used only under the following strictly limited circumstances:

- The protected resource must be accessed quickly and without complicated interactions with other code.
- The critical section code can't be paged out of memory, make references to pageable data, call external procedures (including system services), or generate interrupts or exceptions.
These restrictions are confining and can't be met under all circumstances. Furthermore, the executive needs to perform other types of synchronization in addition to mutual exclusion, and it must also provide synchronization mechanisms to user mode. There are several additional synchronization mechanisms for use when spinlocks are not suitable:

- Kernel dispatcher objects
- Fast mutexes and guarded mutexes
- Executive resources
- Push locks
Table 3-9 serves as a reference that compares and contrasts the capabilities of these mechanisms and their interaction with kernel-mode APC delivery.
Kernel Dispatcher Objects

The kernel furnishes additional synchronization mechanisms to the executive in the form of kernel objects, known collectively as dispatcher objects. The user-visible synchronization objects acquire their synchronization capabilities from these kernel dispatcher objects. Each user-visible object that supports synchronization encapsulates at least one kernel dispatcher object. The executive's synchronization semantics are visible to Windows programmers through the WaitForSingleObject and WaitForMultipleObjects functions, which the Windows subsystem implements by calling analogous system services that the object manager supplies. A thread in a Windows application can synchronize with a Windows process, thread, event, semaphore, mutex, waitable timer, I/O completion port, or file object.

One other type of executive synchronization object worth noting is the executive resource. Executive resources provide both exclusive access (like a mutex) as well as shared read access (multiple readers sharing read-only access to a structure). However, they're available only to kernel-mode code and thus aren't accessible from the Windows API. Executive resources are not dispatcher objects but rather data structures allocated directly from nonpaged pool that have their own specialized services to initialize, lock, release, query, and wait for them. The executive resource structure is defined in Ntddk.h, and the executive support routines are documented in the DDK reference documentation.

The remaining subsections describe the implementation details of waiting for dispatcher objects.

Waiting for Dispatcher Objects

A thread can synchronize with a dispatcher object by waiting for the object's handle. Doing so causes the kernel to suspend the thread and change its dispatcher state from running to waiting, as shown in Figure 3-25. The kernel removes the thread from the dispatcher ready queue and no longer considers it for execution.

Figure 3-25. Waiting for a dispatcher object

Note
At any given moment, a synchronization object is in one of two states: either the signaled state or the nonsignaled state. A thread can't resume its execution until the kernel changes its dispatcher state from waiting to ready. This change occurs when the dispatcher object whose handle the thread is waiting for also undergoes a state change, from the nonsignaled state to the signaled state (when a thread sets an event object, for example). To synchronize with an object, a thread calls one of the wait system services the object manager supplies, passing a handle to the object it wants to synchronize with. The thread can wait for one or several objects and can also specify that its wait should be canceled if it hasn't ended within a certain amount of time. Whenever the kernel sets an object to the signaled state, the kernel's KiWaitTest function checks to see whether any threads are waiting for the object and not also waiting for other objects to become signaled. If there are, the kernel releases one or more of the threads from their waiting state so that they can continue executing. The following example of setting an event illustrates how synchronization interacts with thread dispatching:
What Signals an Object

The signaled state is defined differently for different objects. A thread object is in the nonsignaled state during its lifetime and is set to the signaled state by the kernel when the thread terminates. Similarly, the kernel sets a process object to the signaled state when the process's last thread terminates. In contrast, the timer object, like an alarm, is set to "go off" at a certain time. When its time expires, the kernel sets the timer object to the signaled state. When choosing a synchronization mechanism, a program must take into account the rules governing the behavior of different synchronization objects. Whether a thread's wait ends when an object is set to the signaled state varies with the type of object the thread is waiting for, as Table 3-10 illustrates.
When an object is set to the signaled state, waiting threads are generally released from their wait states immediately. Some of the kernel dispatcher objects and the system events that induce their state changes are shown in Figure 3-26.

Figure 3-26. Selected kernel dispatcher objects

For example, a notification event object (called a manual reset event in the Windows API) is used to announce the occurrence of some event. When the event object is set to the signaled state, all threads waiting for the event are released. The exception is any thread that is waiting for more than one object at a time; such a thread might be required to continue waiting until additional objects reach the signaled state. In contrast to an event object, a mutex object has ownership associated with it. It is used to gain mutually exclusive access to a resource, and only one thread at a time can hold the mutex. When the mutex object becomes free, the kernel sets it to the signaled state and then selects one waiting thread to execute. The thread selected by the kernel acquires the mutex object, and all other threads continue waiting.
This brief discussion wasn't meant to enumerate all the reasons and applications for using the various executive objects but rather to list their basic functionality and synchronization behavior. For information on how to put these objects to use in Windows programs, see the Windows reference documentation on synchronization objects or Jeffrey Richter's Programming Applications for Microsoft Windows.

Data Structures

Two data structures are key to tracking who is waiting for what: dispatcher headers and wait blocks. Both of these structures are publicly defined in the DDK include file Ntddk.h. The definitions are reproduced here for convenience:

typedef struct _DISPATCHER_HEADER {
    UCHAR Type;
    UCHAR Absolute;
    UCHAR Size;
    UCHAR Inserted;
    LONG SignalState;
    LIST_ENTRY WaitListHead;
} DISPATCHER_HEADER;

typedef struct _KWAIT_BLOCK {
    LIST_ENTRY WaitListEntry;
    struct _KTHREAD *RESTRICTED_POINTER Thread;
    PVOID Object;
    struct _KWAIT_BLOCK *RESTRICTED_POINTER NextWaitBlock;
    USHORT WaitKey;
    USHORT WaitType;
} KWAIT_BLOCK, *PKWAIT_BLOCK, *RESTRICTED_POINTER PRKWAIT_BLOCK;

The dispatcher header contains the object type, signaled state, and a list of the threads waiting for that object. The wait block represents a thread waiting for an object. Each thread that is in a wait state has a list of the wait blocks that represent the objects the thread is waiting for. Each dispatcher object has a list of the wait blocks that represent which threads are waiting for the object. This list is kept so that when a dispatcher object is signaled, the kernel can quickly determine who is waiting for that object. The wait block has a pointer to the object being waited for, a pointer to the thread waiting for the object, and a pointer to the next wait block (if the thread is waiting for more than one object).
It also records the type of wait (any or all) as well as the position of that entry in the array of handles passed by the thread on the WaitForMultipleObjects call (position zero if the thread was waiting for only one object). Figure 3-27 shows the relationship of dispatcher objects to wait blocks to threads. In this example, thread 1 is waiting for object B, and thread 2 is waiting for objects A and B. If object A is signaled, the kernel will see that because thread 2 is also waiting for another object, thread 2 can't be readied for execution. On the other hand, if object B is signaled, the kernel can ready thread 1 for execution right away because it isn't waiting for any other objects. Figure 3-27. Wait data structures
Fast Mutexes and Guarded Mutexes

Fast mutexes, which are also known as executive mutexes, usually offer better performance than mutex objects because, although they are built on dispatcher event objects, they avoid waiting for the event object (and therefore the spinlocks on which an event object is based) if there's no contention for the fast mutex. This gives the fast mutex especially good performance in a multiprocessor environment. Fast mutexes are used widely throughout the kernel and device drivers. However, fast mutexes are suitable only when normal kernel-mode APC delivery (described earlier in this chapter) can be disabled. The executive defines two functions for acquiring them: ExAcquireFastMutex and ExAcquireFastMutexUnsafe. The former function blocks all APC delivery by raising the IRQL of the processor to APC_LEVEL; the latter expects to be called with normal kernel-mode APC delivery already disabled, which can be done by raising the IRQL to APC level or by calling KeEnterCriticalRegion. Another limitation of fast mutexes is that they can't be acquired recursively, as mutex objects can.

Guarded mutexes are new to Windows Server 2003 and are essentially the same as fast mutexes (although they use a different synchronization object, the KGATE, internally). They are acquired with the KeAcquireGuardedMutex function, which, instead of disabling only normal kernel-mode APCs as KeEnterCriticalRegion does, disables all kernel-mode APC delivery by calling KeEnterGuardedRegion. Guarded mutexes are not exposed for use outside of the kernel and are used primarily by the memory manager, which uses them to protect global operations such as creating paging files, deleting certain types of shared memory sections, and performing paged pool expansion. (See Chapter 7 for more information on the memory manager.)
Executive Resources

Executive resources are a synchronization mechanism that supports shared and exclusive access, and, like fast mutexes, they require that normal kernel-mode APC delivery be disabled before they are acquired. They are also built on dispatcher objects that are used only when there is contention. Executive resources are used throughout the system, especially in file-system drivers.

Threads waiting to acquire a resource for shared access wait for a semaphore associated with the resource, and threads waiting to acquire a resource for exclusive access wait for an event. A semaphore with unlimited count is used for shared waiters because they can all be woken and granted access to the resource when an exclusive holder releases the resource simply by signaling the semaphore. When a thread waits for exclusive access to a resource that is currently owned, it waits on a synchronization event object because only one of the waiters will wake when the event is signaled. Because of the flexibility that shared and exclusive access offer, there are a number of functions for acquiring resources: ExAcquireResourceSharedLite, ExAcquireResourceExclusiveLite, ExAcquireSharedStarveExclusive, ExAcquireWaitForExclusive, and ExTryToAcquireResourceExclusiveLite. These functions are documented in the DDK.
Push Locks

Push locks, which were introduced in Windows XP, are another optimized synchronization mechanism built on the event object (in Windows Server 2003 they are built on the internal KGATE synchronization object), and, like fast mutexes, they wait for an event object only when there's contention on the lock. They offer advantages over the fast mutex in that they can be acquired in shared or exclusive mode. They are not documented or exported by the kernel and are therefore reserved for use by the operating system.

There are two types of push locks: normal and cache aware. Normal push locks require only the size of a pointer in storage (4 bytes on 32-bit systems and 8 bytes on 64-bit systems). When a thread acquires a normal push lock, the push lock code marks the push lock as owned if it is not currently owned. If the push lock is owned exclusively, or the thread wants to acquire the push lock exclusively and it is owned on a shared basis, the thread allocates a wait block on the thread's stack, initializes an event object in the wait block, and adds the wait block to the wait list associated with the push lock. When a thread releases a push lock, the thread wakes a waiter, if any are present, by signaling the event in the waiter's wait block.

A cache-aware push lock layers on the basic push lock by allocating a push lock for each processor in the system and associating it with the cache-aware push lock. When a thread wants to acquire a cache-aware push lock for shared access, it simply acquires the push lock allocated for its current processor in shared mode; to acquire a cache-aware push lock exclusively, it acquires the push lock for each processor in exclusive mode.

Areas where push locks are used include the object manager, where they protect global object manager data structures and object security descriptors, and the memory manager, where they protect AWE data structures.