Cancelling IO Requests | Programming the Microsoft Windows Driver Model

Cancelling I/O Requests

Just as happens with people in real life, programs sometimes change their mind about the I/O requests they've asked you to perform for them. We're not talking about simple fickleness here. Applications might terminate after issuing requests that will take a long time to complete, leaving requests outstanding. Such an occurrence is especially likely in the WDM world, where the insertion of new hardware might require you to stall requests while the Configuration Manager rebalances resources or where you might be told at any moment to power down your device.

To cancel a request in kernel mode, someone calls IoCancelIrp. The operating system automatically calls IoCancelIrp for every IRP that belongs to a thread that's terminating with requests still outstanding. A user-mode application can call CancelIo to cancel all outstanding asynchronous operations issued by a given thread on a file handle. IoCancelIrp would like to simply complete the IRP it's given with STATUS_CANCELLED, but there's a hitch: IoCancelIrp doesn't know where you have salted away pointers to the IRP, and it doesn't know for sure whether you're currently processing the IRP. So it relies on a cancel routine you provide to do most of the work of cancelling an IRP.

It turns out that a call to IoCancelIrp is more of a suggestion than a demand. It would be nice if every IRP that somebody tried to cancel really got completed with STATUS_CANCELLED. But it's OK if a driver wants to go ahead and finish the IRP normally if that can be done relatively quickly. You should provide a way to cancel I/O requests that might spend significant time waiting in a queue between a dispatch routine and a StartIo routine. How long is significant is a matter for your own sound judgment; my advice is to err on the side of providing for cancellation because it's not that hard to do and makes your driver fit better into the operating system.

If It Weren't for Multitasking

An intricate synchronization problem is associated with cancelling IRPs. Before I explain the problem and the solution, I want to describe the way cancellation would work in a world where there was no multitasking and no concern with multiprocessor computers. In that utopia, several pieces of the I/O Manager would fit together with your StartIo routine and with a cancel routine you'd provide, as follows:

When you queue an IRP, you set the CancelRoutine pointer in the IRP to the address of your cancel routine. When you dequeue the IRP, you set CancelRoutine to NULL.
IoCancelIrp unconditionally sets the Cancel flag in the IRP. Then it checks to see whether the CancelRoutine pointer in the IRP is NULL. While the IRP is in your queue, CancelRoutine will be non-NULL. In this case, IoCancelIrp calls your cancel routine. Your cancel routine removes the IRP from the queue where it currently resides and completes the IRP with STATUS_CANCELLED.
Once you dequeue the IRP, IoCancelIrp finds the CancelRoutine pointer set to NULL, so it doesn't call your cancel routine. You process the IRP to completion with reasonable promptness (a concept that calls for engineering judgment), and it doesn't matter to anyone that you didn't actually cancel the IRP.
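
The naive scheme can be sketched as a small user-mode model. This is only an illustration under stated assumptions: toy_irp, toy_enqueue, toy_dequeue, toy_on_cancel, and toy_cancel_irp are hypothetical stand-ins for the IRP, your queue logic, your cancel routine, and IoCancelIrp, and nothing here is thread safe (which is exactly the assumption of this section).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy, user-mode stand-ins for the real kernel types (hypothetical). */
typedef struct toy_irp toy_irp;
typedef void (*toy_cancel_fn)(toy_irp *irp);

struct toy_irp {
    bool          cancel;          /* models the Cancel flag */
    toy_cancel_fn cancel_routine;  /* models the CancelRoutine pointer */
    bool          queued;          /* is the IRP sitting in our queue? */
    int           status;          /* 0 means "not completed yet" */
};

enum { TOY_CANCELLED = 1 };

/* Cancel routine: remove the IRP from the queue and complete it as cancelled. */
static void toy_on_cancel(toy_irp *irp)
{
    irp->queued = false;
    irp->status = TOY_CANCELLED;
}

/* Queuing an IRP installs the cancel routine. */
static void toy_enqueue(toy_irp *irp)
{
    irp->cancel_routine = toy_on_cancel;
    irp->queued = true;
}

/* Dequeuing an IRP resets the cancel routine to NULL. */
static void toy_dequeue(toy_irp *irp)
{
    irp->cancel_routine = NULL;
    irp->queued = false;
}

/* Naive IoCancelIrp: set the flag unconditionally, then call the
 * cancel routine if one is installed. */
static bool toy_cancel_irp(toy_irp *irp)
{
    irp->cancel = true;
    if (irp->cancel_routine != NULL) {
        irp->cancel_routine(irp);
        return true;
    }
    return false;
}
```

Cancelling a queued IRP completes it through the cancel routine; cancelling one that has already been dequeued only sets the flag, and the IRP runs to completion normally.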

Synchronizing Cancellation

Unfortunately for us as programmers, we write code for a multiprocessing, multitasking environment in which effects can sometimes appear to precede causes. There are many possible race conditions between the queue insertion, queue removal, and cancel routines in the naive scenario I just described. For example, what would happen if IoCancelIrp called your cancel routine to cancel an IRP that happened to be at the head of your queue? If you were simultaneously removing an IRP from the queue on another CPU, you can see that your cancel routine would probably conflict with your queue removal logic. But this is just the simplest of the possible races.

In earlier times, driver programmers dealt with the cancel races by using a global spin lock, the cancel spin lock. Because you shouldn't use this spin lock for synchronization in your own driver, I've explained it briefly in the sidebar. Read the sidebar for its historical perspective, but don't plan to use this lock.

The Global Cancel Spin Lock

The original Microsoft scheme for synchronizing IRP cancellation revolved around a global cancel spin lock. Routines named IoAcquireCancelSpinLock and IoReleaseCancelSpinLock acquire and release this lock. The Microsoft queuing routines IoStartPacket and IoStartNextPacket acquire and release the lock to guard their access to the cancel fields in an IRP and to the CurrentIrp field of the device object. IoCancelIrp acquires the lock before calling your cancel routine but doesn't release the lock. Your cancel routine runs briefly under the protection of the lock and must call IoReleaseCancelSpinLock before returning.

In this scheme, your own StartIo routine must also acquire and release the cancel spin lock to safely test the Cancel flag in the IRP and to reset the CancelRoutine pointer to NULL.

Hardly anyone was able to craft queuing and cancel logic that approached being bulletproof using this original scheme. Even the best algorithms actually have a residual flaw arising from a coincidence in IRP pointer values. In addition, the fact that every driver in the system needed to use a single spin lock two or three times in the normal execution path created a measurable performance problem. Consequently, Microsoft now recommends that drivers either use the cancel-safe queue routines or else copy someone else's proven queue logic. Neither Microsoft nor I would recommend that you try to design your own queue logic with cancellation because getting it right is very hard.

Nowadays, we handle the cancel races in one of two ways. We can implement our own IRP queue (or, more probably, cut and paste someone else's). Or, in certain kinds of drivers, we can use the IoCsqXxx family of functions. You don't need to understand how the IoCsqXxx functions handle IRP cancellation because Microsoft intends these functions to be a black box. I'll discuss in detail how my own DEVQUEUE handles cancellation, but I first need to tell you a bit more about the internal workings of IoCancelIrp.

Some Details of IRP Cancellation

Here is a sketch of IoCancelIrp. You need to know this to correctly write IRP-handling code. (This isn't a copy of the Windows XP source code; it's an abridged excerpt.)

    BOOLEAN IoCancelIrp(PIRP Irp)
    {
        IoAcquireCancelSpinLock(&Irp->CancelIrql);                      // 1
        Irp->Cancel = TRUE;                                             // 2
        PDRIVER_CANCEL CancelRoutine = IoSetCancelRoutine(Irp, NULL);   // 3
        if (CancelRoutine) {
            PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
            (*CancelRoutine)(stack->DeviceObject, Irp);                 // 4
            return TRUE;
        } else {
            IoReleaseCancelSpinLock(Irp->CancelIrql);                   // 5
            return FALSE;
        }
    }

1. IoCancelIrp first acquires the global cancel spin lock. As you know if you read the sidebar earlier, lots of old drivers contend for the use of this lock in their normal IRP-handling path. New drivers hold this lock only briefly while handling the cancellation of an IRP.
2. Setting the Cancel flag to TRUE alerts any interested party that IoCancelIrp has been called for this IRP.
3. IoSetCancelRoutine performs an interlocked exchange to retrieve the existing CancelRoutine pointer and set the field to NULL in one atomic operation.
4. IoCancelIrp calls the cancel routine, if there is one, without first releasing the global cancel spin lock. The cancel routine must release the lock! Note also that the device object argument to the cancel routine comes from the current stack location, where IoCallDriver is supposed to have left it.
5. If there is no cancel routine, IoCancelIrp itself releases the global cancel spin lock. Good idea, huh?
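
The property that the rest of this discussion leans on is the atomic exchange performed by IoSetCancelRoutine. A minimal user-mode sketch, assuming C11 atomics as a stand-in for the kernel's interlocked exchange, looks like this; cancel_slot, my_cancel, and set_cancel_routine are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*cancel_fn)(void);

/* Hypothetical stand-in for the CancelRoutine field of one IRP.
 * Static storage, so it starts out as NULL. */
static _Atomic(cancel_fn) cancel_slot;

/* A placeholder cancel routine; a real one would dequeue and complete the IRP. */
static void my_cancel(void) { }

/* Models IoSetCancelRoutine: atomically store the new pointer and
 * return whatever pointer was stored there before. */
static cancel_fn set_cancel_routine(cancel_fn new_routine)
{
    return atomic_exchange(&cancel_slot, new_routine);
}
```

If two parties both try to reset the pointer to NULL, only the first one gets back the non-NULL address. That single winner is the one that takes responsibility for the IRP.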

How the DEVQUEUE Handles Cancellation

As I promised, I'll now show you how the major DEVQUEUE routines work so you can see how they safely cope with IRP cancellation.

DEVQUEUE Internals: Initialization

The DEVQUEUE object has this declaration in my DEVQUEUE.H and GENERIC.H header files:

    typedef struct _DEVQUEUE {
        LIST_ENTRY head;            // 1
        KSPIN_LOCK lock;            // 2
        PDRIVER_STARTIO StartIo;    // 3
        LONG stallcount;            // 4
        PIRP CurrentIrp;            // 5
        KEVENT evStop;              // 6
        NTSTATUS abortstatus;       // 7
    } DEVQUEUE, *PDEVQUEUE;

InitializeQueue initializes one of these objects like this:

    VOID NTAPI InitializeQueue(PDEVQUEUE pdq, PDRIVER_STARTIO StartIo)
    {
        InitializeListHead(&pdq->head);
        KeInitializeSpinLock(&pdq->lock);
        pdq->StartIo = StartIo;
        pdq->stallcount = 1;
        pdq->CurrentIrp = NULL;
        KeInitializeEvent(&pdq->evStop, NotificationEvent, FALSE);
        pdq->abortstatus = (NTSTATUS) 0;
    }

1. We use an ordinary (noninterlocked) doubly-linked list to queue IRPs. We don't need to use an interlocked list because we'll always access it within the protection of our own spin lock.
2. This spin lock guards access to the queue and other fields in the DEVQUEUE structure. It also takes the place of the global cancel spin lock for guarding nearly all of the cancellation process, thereby improving system performance.
3. Each queue has its own associated StartIo function that we call automatically in the appropriate places.
4. The stall counter indicates how many times somebody has requested that IRP delivery to StartIo be stalled. Initializing the counter to 1 means that the IRP_MN_START_DEVICE handler must call RestartRequests to release an IRP. I'll discuss this issue more fully in Chapter 6.
5. The CurrentIrp field records the IRP most recently sent to the StartIo routine. Initializing this field to NULL indicates that the device is initially idle.
6. We use this event when necessary to block WaitForCurrentIrp, one of the DEVQUEUE routines involved in handling PnP requests. We'll set the event inside StartNextPacket, which should always be called when the current IRP completes.
7. We reject incoming IRPs in two situations. The first situation occurs after we irrevocably commit to removing the device, when we must start causing new IRPs to fail with STATUS_DELETE_PENDING. The second situation occurs during a period of low power, when, depending on the type of device we're managing, we might choose to cause new IRPs to fail with the STATUS_DEVICE_POWERED_OFF code. The abortstatus field records the status code we should use in rejecting IRPs in these situations.

In the steady state after all PnP initialization finishes, each DEVQUEUE will have a zero stallcount and abortstatus.

DEVQUEUE Internals: Queuing and Cancellation

Here is the complete implementation of the three DEVQUEUE routines whose usage I just showed you. I cut and pasted the source code directly from GENERIC.SYS and did some minor formatting for the sake of readability on the printed page. I also removed some power management code from StartNextPacket because it would just confuse this presentation.

    // The numbered comments (1 to 17) mark the points referred to in the
    // discussion that follows.

    VOID StartPacket(PDEVQUEUE pdq, PDEVICE_OBJECT fdo, PIRP Irp,
        PDRIVER_CANCEL cancel)
    {
        KIRQL oldirql;
        KeAcquireSpinLock(&pdq->lock, &oldirql);                    // 1
        NTSTATUS abortstatus = pdq->abortstatus;
        if (abortstatus) {                                          // 2
            KeReleaseSpinLock(&pdq->lock, oldirql);
            Irp->IoStatus.Status = abortstatus;
            IoCompleteRequest(Irp, IO_NO_INCREMENT);
        }
        else if (pdq->CurrentIrp || pdq->stallcount) {              // 3
            IoSetCancelRoutine(Irp, cancel);                        // 4
            if (Irp->Cancel && IoSetCancelRoutine(Irp, NULL)) {     // 5
                KeReleaseSpinLock(&pdq->lock, oldirql);
                Irp->IoStatus.Status = STATUS_CANCELLED;
                IoCompleteRequest(Irp, IO_NO_INCREMENT);
            }
            else {                                                  // 6
                InsertTailList(&pdq->head, &Irp->Tail.Overlay.ListEntry);
                KeReleaseSpinLock(&pdq->lock, oldirql);
            }
        }
        else {                                                      // 7
            pdq->CurrentIrp = Irp;
            KeReleaseSpinLockFromDpcLevel(&pdq->lock);
            (*pdq->StartIo)(fdo, Irp);
            KeLowerIrql(oldirql);
        }
    }

    VOID StartNextPacket(PDEVQUEUE pdq, PDEVICE_OBJECT fdo)
    {
        KIRQL oldirql;
        KeAcquireSpinLock(&pdq->lock, &oldirql);                    // 8
        pdq->CurrentIrp = NULL;                                     // 9
        while (!pdq->stallcount && !pdq->abortstatus
            && !IsListEmpty(&pdq->head)) {                          // 10
            PLIST_ENTRY next = RemoveHeadList(&pdq->head);          // 11
            PIRP Irp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
            if (!IoSetCancelRoutine(Irp, NULL)) {                   // 12
                InitializeListHead(&Irp->Tail.Overlay.ListEntry);
                continue;
            }
            pdq->CurrentIrp = Irp;                                  // 13
            KeReleaseSpinLockFromDpcLevel(&pdq->lock);
            (*pdq->StartIo)(fdo, Irp);
            KeLowerIrql(oldirql);
            return;     // lock already released; the device is busy again
        }
        KeReleaseSpinLock(&pdq->lock, oldirql);
    }

    VOID CancelRequest(PDEVQUEUE pdq, PIRP Irp)
    {
        KIRQL oldirql = Irp->CancelIrql;
        IoReleaseCancelSpinLock(DISPATCH_LEVEL);                    // 14
        KeAcquireSpinLockAtDpcLevel(&pdq->lock);                    // 15
        RemoveEntryList(&Irp->Tail.Overlay.ListEntry);              // 16
        KeReleaseSpinLock(&pdq->lock, oldirql);
        Irp->IoStatus.Status = STATUS_CANCELLED;                    // 17
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
    }

Now I'll explain in detail how these functions work together to provide cancel-safe queuing. I'll do this by describing a series of scenarios that involve all of the code paths.

1. The Normal Case for StartPacket

The normal case for StartPacket occurs in the steady state when an IRP, which we assume has not been cancelled, arrives after all PnP processing has taken place and at a time when the device was fully powered. In this situation, stallcount and abortstatus will both be 0. The path through StartPacket depends on whether the device is busy, as follows:

We first acquire the spin lock associated with the queue. (See point 1.) Nearly all the DEVQUEUE routines acquire this lock (see points 8 and 15), so we can be sure that no other code on any other CPU can do anything to the queue that would invalidate the decisions we're about to make.
If the device is busy, the if statement at point 3 will find CurrentIrp not set to NULL. The if statement at point 5 will also fail (I'll explain later exactly why), so we'll get to point 6 to put the IRP in the queue. Releasing the spin lock is the last thing we do in this code path.
If the device is idle, the if statement at point 3 will find CurrentIrp set to NULL. I've already assumed that stallcount is 0, so we'll get to point 7 in order to process this IRP. Note how we manage to call StartIo at DISPATCH_LEVEL after releasing the spin lock.

2. The Normal Case for StartNextPacket

The normal case for StartNextPacket is similar to that for StartPacket. The stallcount and abortstatus members are 0, and the IRP at the head of the queue hasn't been cancelled. StartNextPacket executes these steps:

Acquires the queue spin lock (point 8). This protects the queue from simultaneous access by other CPUs trying to execute StartPacket or CancelRequest. No other CPU can be trying to execute StartNextPacket because the only caller of StartNextPacket is someone who has just finished processing some other IRP. We allow only one IRP to be active at a time, so there should never be more than one such entity.
If the list is empty, we just release the spin lock and return. If StartPacket had been waiting for the lock, it will now find that the device isn't busy and will call StartIo.
If the list isn't empty, the while test at point 10 will succeed, and we'll enter a loop looking for the next uncancelled IRP.
The first step in the loop (point 11) is to remove the next IRP from the list. Note that RemoveHeadList returns the address of a LIST_ENTRY built into the IRP. We use CONTAINING_RECORD to get the address of the IRP.
IoSetCancelRoutine (point 12) will return the non-NULL address of the cancel routine originally supplied to StartPacket. This is because nothing, least of all IoCancelIrp, has changed this pointer since StartPacket set it. Consequently, we'll get to point 13, where we'll send this IRP to the StartIo routine at DISPATCH_LEVEL.
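
The step from a LIST_ENTRY address back to the enclosing IRP can be illustrated in portable C. The sketch below is a user-mode model, not the DDK definition: list_entry, toy_irp, CONTAINER_OF, and irp_from_link are hypothetical names, and the macro uses offsetof to do what CONTAINING_RECORD does in the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-mode copy of a doubly-linked list entry (for illustration). */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Toy IRP with an embedded list entry that is not the first member. */
typedef struct toy_irp {
    int        id;
    list_entry link;   /* plays the role of Tail.Overlay.ListEntry */
} toy_irp;

/* Same idea as CONTAINING_RECORD: subtract the member's offset from the
 * member's address to recover the address of the enclosing structure. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given the list entry handed back by the queue, recover the IRP. */
static toy_irp *irp_from_link(list_entry *entry)
{
    return CONTAINER_OF(entry, toy_irp, link);
}
```

Because the list links live inside the IRP itself, queuing an IRP needs no separate allocation, and the conversion back is a constant-offset subtraction.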

3. IRP Cancelled Prior to StartPacket; Device Idle

Suppose StartPacket receives an IRP that was cancelled some time ago. At the time IoCancelIrp executed, there wouldn't have been a cancel routine for the IRP. (If there had been, it would have belonged to a driver higher up the stack than us. That other driver would have completed the IRP instead of sending it down to us.) All that IoCancelIrp would have done, therefore, is to set the Cancel flag in the IRP.

If the device is idle, the if test at point 3 fails and we once again go directly to point 7, where we send the IRP to StartIo. In effect, we're going to ignore the Cancel flag. This is fine so long as we process the IRP relatively quickly, which is an engineering judgment. If we won't process the IRP with reasonable dispatch, StartIo and the downstream logic for handling the IRP should have code to detect the Cancel flag and to complete the IRP early.

4. IRP Cancelled During StartPacket; Device Idle

In this scenario, someone calls IoCancelIrp while StartPacket is running. Just as in scenario 3, IoCancelIrp will set the Cancel flag and return. We'll ignore the flag and send the IRP to StartIo.

5. IRP Cancelled Prior to StartPacket; Device Busy

The initial conditions are the same as in scenario 3 except that now the device is busy and the if test at point 3 succeeds. We'll set the cancel routine (point 4) and then test the Cancel flag (point 5). Because the Cancel flag is TRUE, we'll go on to call IoSetCancelRoutine a second time. The function will return the non-NULL address we just installed, whereupon we'll complete the IRP with STATUS_CANCELLED.

6. IRP Cancelled During StartPacket; Device Busy

This is the first sticky wicket we encounter in the analysis. Assume the same initial conditions as scenario 3, but now the device is busy and someone calls IoCancelIrp at about the same time StartPacket is running. There are several possible situations now:

Suppose we test the Cancel flag (point 5) before IoCancelIrp manages to set that flag. Since we find the flag set to FALSE, we go to point 6 and queue the IRP. What happens next depends on how IoCancelIrp, CancelRequest, and StartNextPacket interact. StartPacket is in a not-my-problem field at this point, however, and needn't worry about this IRP any more.
Suppose we test the Cancel flag (point 5) after IoCancelIrp sets the flag. We have already set the cancel pointer (point 4). What happens next depends on whether IoCancelIrp or we are first to execute the IoSetCancelRoutine call that changes the cancel pointer back to NULL. Recall that IoSetCancelRoutine is an atomic operation based on an InterlockedExchangePointer. If we execute our call first, we get back a non-NULL value and complete the IRP. IoCancelIrp gets back NULL and therefore doesn't call any cancel routine.
On the other hand, if IoCancelIrp executes its IoSetCancelRoutine first, we will get back NULL from our call. We'll go on to queue the IRP (point 6) and to enter that not-my-problem field I just referred to. IoCancelIrp will call our cancel routine, which will block (point 15) until we release the queue spin lock. Our cancel routine will eventually complete the IRP.
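
The three situations just described can be replayed deterministically in a user-mode model. The sketch below is an illustration under stated assumptions, not DEVQUEUE code: it uses C11 atomics in place of the kernel's interlocked exchange, leaves out the queue spin lock, and splits each routine into hypothetical steps (sp_install, sp_decide, cancel_set_flag, cancel_takes_routine) so that a caller can interleave them by hand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*cancel_fn)(void);

typedef struct toy_irp {
    atomic_bool        cancel;          /* models the Cancel flag */
    _Atomic(cancel_fn) cancel_routine;  /* models the CancelRoutine pointer */
} toy_irp;

typedef enum { QUEUED, COMPLETED_CANCELLED } outcome;

/* Placeholder for the cancel routine handed to StartPacket. */
static void queue_cancel(void) { }

static void irp_init(toy_irp *irp)
{
    atomic_init(&irp->cancel, false);
    atomic_init(&irp->cancel_routine, NULL);
}

/* StartPacket, point 4: install the cancel routine. */
static void sp_install(toy_irp *irp)
{
    atomic_store(&irp->cancel_routine, queue_cancel);
}

/* StartPacket, points 5 and 6: complete the IRP only if it is flagged
 * as cancelled AND we are the one who took the routine back. */
static outcome sp_decide(toy_irp *irp)
{
    if (atomic_load(&irp->cancel) &&
        atomic_exchange(&irp->cancel_routine, NULL) != NULL)
        return COMPLETED_CANCELLED;
    return QUEUED;
}

/* IoCancelIrp model, first half: set the Cancel flag. */
static void cancel_set_flag(toy_irp *irp)
{
    atomic_store(&irp->cancel, true);
}

/* IoCancelIrp model, second half: returns true if it obtained a
 * cancel routine and would therefore call it. */
static bool cancel_takes_routine(toy_irp *irp)
{
    return atomic_exchange(&irp->cancel_routine, NULL) != NULL;
}
```

In every interleaving, exactly one side ends up responsible for completing the IRP: either sp_decide reports COMPLETED_CANCELLED, or cancel_takes_routine returns true and the cancel routine does the job after the IRP has been queued.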

7. Normal IRP Cancellation

IRPs don't get cancelled very often, so I'm not sure it's really right to use the word "normal" in this context. But if there were a normal scenario for IRP cancellation, this would be it: someone calls IoCancelIrp to cancel an IRP that's in our queue, but the cancel process runs to conclusion before StartNextPacket can possibly try to reach it. The potential race between StartNextPacket and CancelRequest therefore can't materialize. Events will unfold this way:

IoCancelIrp acquires the global cancel spin lock, sets the Cancel flag, and executes IoSetCancelRoutine to simultaneously retrieve the address of our cancel routine and set the cancel pointer in the IRP to NULL. (Refer to the earlier sketch of IoCancelIrp.)
IoCancelIrp calls our cancel routine without releasing the lock. The cancel routine locates the correct DEVQUEUE and calls CancelRequest. CancelRequest immediately releases the global cancel spin lock (point 14).
CancelRequest acquires the queue spin lock (point 15). Past this point, there can be no more races with other DEVQUEUE routines.
CancelRequest removes the IRP from the queue (point 16) and then releases the spin lock. If StartNextPacket were to run now, it wouldn't find this IRP on the queue.
CancelRequest completes the IRP with STATUS_CANCELLED (point 17).

8. Pathological IRP Cancellation

The most difficult IRP cancellation scenario to handle occurs when IoCancelIrp tries to cancel the IRP at the head of our queue while StartNextPacket is active. At point 12, StartNextPacket will nullify the cancel pointer. If the return value from IoSetCancelRoutine is not NULL, we've beaten IoCancelIrp to the punch and can go on to process the IRP (point 13).

If the return value from IoSetCancelRoutine is NULL, however, it means that IoCancelIrp has gotten there first. CancelRequest is probably waiting right now on another CPU for us to release the queue spin lock, whereupon it will dequeue the IRP and complete it. The trouble is, we've already removed the IRP from the queue. I'm a bit proud of the trick I devised for coping with the situation: we simply initialize the linking field of the IRP as if it were the anchor of a list! The call to RemoveEntryList at point 16 in CancelRequest will perform several motions with no net result to remove the IRP from the degenerate list it now inhabits.
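
To see why removing an entry from a degenerate one-element list is harmless, here is a user-mode sketch of the list primitives. The functions are hypothetical re-implementations written for this illustration (init_list_head, insert_tail, remove_head, and remove_entry model InitializeListHead, InsertTailList, RemoveHeadList, and RemoveEntryList); they are not the DDK macros themselves.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct list_entry {
    struct list_entry *flink;   /* forward link */
    struct list_entry *blink;   /* backward link */
} list_entry;

/* An empty list is an anchor whose links both point at itself. */
static void init_list_head(list_entry *head)
{
    head->flink = head->blink = head;
}

static bool is_list_empty(const list_entry *head)
{
    return head->flink == head;
}

static void insert_tail(list_entry *head, list_entry *entry)
{
    entry->flink = head;
    entry->blink = head->blink;
    head->blink->flink = entry;
    head->blink = entry;
}

/* Unlinks and returns the first entry; the entry's own links are left stale. */
static list_entry *remove_head(list_entry *head)
{
    list_entry *entry = head->flink;
    head->flink = entry->flink;
    entry->flink->blink = head;
    return entry;
}

/* Unlinks an entry by patching its two neighbors. On a self-linked
 * entry the "neighbors" are the entry itself, so nothing else changes. */
static void remove_entry(list_entry *entry)
{
    entry->blink->flink = entry->flink;
    entry->flink->blink = entry->blink;
}
```

After remove_head, the entry's links still point into the queue. Re-initializing the entry as its own list anchor means that a later remove_entry only rewrites the entry's own fields and never touches the real queue, whatever has happened to the queue in the meantime.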

9. Things That Can't Happen or Won't Matter

The preceding list exhausts the possibilities for conflict between these DEVQUEUE routines and IoCancelIrp. (There is still a race between IRP_MJ_CLEANUP and IRP cancellation, but I'll discuss that a bit later in this chapter.) Here is a list of things that might be causing you needless worry:

Could CancelRoutine be non-NULL when StartPacket gets control? It better not be, because a driver is supposed to remove its cancel routine from an IRP before sending the IRP to another driver. StartPacket contains an ASSERT to this effect. If you engage the Driver Verifier for your driver, it will verify that you nullify the cancel routine pointer in IRPs that you pass down the stack, but it will not verify that the drivers above you have done this for IRPs they pass to you.
Could the cancel argument to StartPacket be NULL? It better not be: you might have noticed that much of the cancel logic I described hinges on whether the IRP's CancelRoutine pointer is NULL. StartPacket contains an ASSERT to test this assumption.
Could someone call IoCancelIrp twice? The thing to think about is that the Cancel flag might be set in an IRP because of some number of primeval calls to IoCancelIrp and that someone might call IoCancelIrp one more time (getting a little impatient, are we?) while StartPacket is active. This wouldn't matter because our first test of the Cancel flag occurs after we install our cancel pointer. We would find the flag set to TRUE in this hypothetical situation and would therefore execute the second call to IoSetCancelRoutine. Either IoCancelIrp or we win the race to reset the cancel pointer to NULL, and whoever wins ends up completing the IRP. The residue from the primeval calls is simply irrelevant.

Cancelling IRPs You Create or Handle

Sometimes you'll want to cancel an IRP that you've created or passed to another driver. Great care is required to avoid an obscure, low-probability problem. Just for the sake of illustration, suppose you want to impose an overall 5-second timeout on a synchronous I/O operation. If the time period elapses, you want to cancel the operation. Here is some naive code that, you might suppose, would execute this plan:

    SomeFunction()
    {
        KEVENT event;
        IO_STATUS_BLOCK iosb;
        KeInitializeEvent(&event, ...);
        PIRP Irp = IoBuildSynchronousFsdRequest(..., &event, &iosb);
        NTSTATUS status = IoCallDriver(DeviceObject, Irp);
        if (status == STATUS_PENDING) {
            LARGE_INTEGER timeout;
            timeout.QuadPart = -5 * 10000000;
            if (KeWaitForSingleObject(&event, Executive, KernelMode,
                    FALSE, &timeout) == STATUS_TIMEOUT) {           // A
                IoCancelIrp(Irp);   // <== don't do this!
                KeWaitForSingleObject(&event, Executive, KernelMode,
                    FALSE, NULL);                                   // B
            }
        }
    }

The first call (A) to KeWaitForSingleObject waits until one of two things happens. First, someone might complete the IRP, and the I/O Manager's cleanup code will then run and signal event.

Alternatively, the timeout might expire before anyone completes the IRP. In this case, KeWaitForSingleObject will return STATUS_TIMEOUT. The IRP should now be completed quite soon in one of two paths. The first completion path is taken when whoever was processing the IRP was really just about done when the timeout happened and has, therefore, already called (or will shortly call) IoCompleteRequest. The other completion path is through the cancel routine that, we must assume, the lower driver has installed. That cancel routine should complete the IRP. Recall that we have to trust other kernel-mode components to do their jobs, so we have to rely on whomever we sent the IRP to complete it soon. Whichever path is taken, the I/O Manager's completion logic will set event and store the IRP's ending status in iosb. The second call (B) to KeWaitForSingleObject makes sure that the event and iosb objects don't pass out of scope too soon. Without that second call, we might return from this function, thereby effectively deleting event and iosb. The I/O Manager might then end up walking on memory that belongs to some other subroutine.
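
A side note on the timeout expression in the sample, -5 * 10000000: kernel wait timeouts are expressed in 100-nanosecond units, and a negative value means an interval relative to the current time rather than an absolute time. The small user-mode sketch below spells out that arithmetic; TICKS_PER_SECOND and relative_timeout_seconds are hypothetical names introduced for illustration, and the result is the value you would store in the QuadPart member.

```c
#include <assert.h>
#include <stdint.h>

/* Kernel timeouts count 100-nanosecond intervals: 10,000,000 per second. */
#define TICKS_PER_SECOND INT64_C(10000000)

/* Builds a relative timeout. Negative values mean "this long from now";
 * a positive value would be interpreted as an absolute system time. */
static int64_t relative_timeout_seconds(int64_t seconds)
{
    return -(seconds * TICKS_PER_SECOND);
}
```

For a 5-second wait this yields -50,000,000, the same value the sample computes inline.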

The problem with the preceding code is truly minuscule. Imagine that someone manages to call IoCompleteRequest for this IRP right around the same time we decide to cancel it by calling IoCancelIrp. Maybe the operation finishes shortly after the 5-second timeout terminates the first KeWaitForSingleObject, for example. IoCompleteRequest initiates a process that finishes with a call to IoFreeIrp. If the call to IoFreeIrp were to happen before IoCancelIrp was done mucking about with the IRP, you can see that IoCancelIrp could inadvertently corrupt memory when it touched the CancelIrql, Cancel, and CancelRoutine fields of the IRP. It's also possible, depending on the exact sequence of events, for IoCancelIrp to call a cancel routine, just before someone clears the CancelRoutine pointer in preparation for completing the IRP, and for the cancel routine to be in a race with the completion process.

It's very unlikely that the scenario I just described will happen. But, as someone (James Thurber?) once said in connection with the chances of being eaten by a tiger on Main Street (one in a million, as I recall), "Once is enough." This kind of bug is almost impossible to find, so you want to prevent it if you can. I'll show you two ways of cancelling your own IRPs. One way is appropriate for synchronous IRPs, the other for asynchronous IRPs.

Don't Do This

A once common but now deprecated technique for avoiding the tiger-on-main-street bug described in the text relies on the fact that, in earlier versions of Windows, the call to IoFreeIrp happened in the context of an APC in the thread that originates the IRP. You could make sure you were in that same thread, raise IRQL to APC_LEVEL, check whether the IRP had been completed yet, and (if not) call IoCancelIrp. You could be sure of blocking the APC and the problematic call to IoFreeIrp.

You shouldn't rely on future releases of Windows always using an APC to perform the cleanup for a synchronous IRP. Consequently, you shouldn't rely on boosting IRQL to APC_LEVEL as a way to avoid a race between IoCancelIrp and IoFreeIrp.

Cancelling Your Own Synchronous IRP

Refer to the example in the preceding section, which illustrates a function that creates a synchronous IRP, sends it to another driver, and then wants to wait no longer than 5 seconds for the IRP to complete. The key thing we need to accomplish in a solution to the race between IoFreeIrp and IoCancelIrp is to prevent the call to IoFreeIrp from happening until after any possible call to IoCancelIrp. We do this by means of a completion routine that returns STATUS_MORE_PROCESSING_REQUIRED, as follows:

    SomeFunction()
    {
        KEVENT event;
        IO_STATUS_BLOCK iosb;
        KeInitializeEvent(&event, ...);
        PIRP Irp = IoBuildSynchronousFsdRequest(..., &event, &iosb);
        IoSetCompletionRoutine(Irp, OnComplete, (PVOID) &event,
            TRUE, TRUE, TRUE);                                      // new
        NTSTATUS status = IoCallDriver(...);
        if (status == STATUS_PENDING) {
            LARGE_INTEGER timeout;
            timeout.QuadPart = -5 * 10000000;
            if (KeWaitForSingleObject(&event, Executive, KernelMode,
                    FALSE, &timeout) == STATUS_TIMEOUT) {           // A
                IoCancelIrp(Irp);   // <== okay in this context
                KeWaitForSingleObject(&event, Executive, KernelMode,
                    FALSE, NULL);                                   // B
            }
        }
        IoCompleteRequest(Irp, IO_NO_INCREMENT);                    // new
    }

    // new: completion routine that halts completion processing
    NTSTATUS OnComplete(PDEVICE_OBJECT junk, PIRP Irp, PVOID pev)
    {
        if (Irp->PendingReturned)
            KeSetEvent((PKEVENT) pev, IO_NO_INCREMENT, FALSE);
        return STATUS_MORE_PROCESSING_REQUIRED;
    }

The new code (the call to IoSetCompletionRoutine, the OnComplete routine, and the final call to IoCompleteRequest) prevents the race. Suppose IoCallDriver returns STATUS_PENDING. In a normal case, the operation will complete normally, and a lower-level driver will call IoCompleteRequest. Our completion routine gains control and signals the event on which our mainline is waiting. Because the completion routine returns STATUS_MORE_PROCESSING_REQUIRED, IoCompleteRequest will then stop working on this IRP. We eventually regain control in our SomeFunction and notice that our wait (the one labeled A) terminated normally. The IRP hasn't yet been cleaned up, though, so we need to call IoCompleteRequest a second time to trigger the normal cleanup mechanism.

Now suppose we decide we want to cancel the IRP and that Thurber's tiger is loose so we have to worry about a call to IoFreeIrp releasing the IRP out from under us. Our first wait (labeled A) finishes with STATUS_TIMEOUT, so we perform a second wait (labeled B). Our completion routine sets the event on which we're waiting. It will also prevent the cleanup mechanism from running by returning STATUS_MORE_PROCESSING_REQUIRED. IoCancelIrp can stomp away to its heart's content on our hapless IRP without causing any harm. The IRP can't be released until the second call to IoCompleteRequest from our mainline, and that can't happen until IoCancelIrp has safely returned.

Notice that the completion routine in this example calls KeSetEvent only when the IRP's PendingReturned flag is set to indicate that the lower driver's dispatch routine returned STATUS_PENDING. Making this step conditional is an optimization that avoids the potentially expensive step of setting the event when SomeFunction won't be waiting on the event in the first place.

I want to mention one last fine point in connection with the preceding code. The call to IoCompleteRequest at the very end of the subroutine will trigger a process that includes setting event and iosb so long as the IRP originally completed with a success status. In the first edition, I had an additional call to KeWaitForSingleObject at this point to make sure that event and iosb could not pass out of scope before the I/O Manager was done touching them. A reviewer pointed out that the routine that references event and iosb will already have run by the time IoCompleteRequest returns; consequently, the additional wait is not needed.
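
The two-stage completion that makes the synchronous case safe can be modeled in a few lines of user-mode C. This is a deliberately simplified sketch, not the I/O Manager's logic: toy_irp, toy_complete_request, and on_complete are hypothetical names, the freed flag stands in for the cleanup that ends in IoFreeIrp, and clearing the routine pointer before calling it is a simplification of the way completion processing does not revisit a completion routine on the second call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { CONTINUE_COMPLETION, MORE_PROCESSING_REQUIRED } completion_result;

typedef struct toy_irp toy_irp;
typedef completion_result (*completion_fn)(toy_irp *irp, void *context);

struct toy_irp {
    completion_fn on_complete;  /* completion routine, consumed when called */
    void         *context;      /* argument passed to the routine */
    bool          freed;        /* set once the final cleanup has run */
};

/* Models IoCompleteRequest: run the completion routine once. If it asks
 * for more processing, stop and leave the IRP alive for its owner. */
static void toy_complete_request(toy_irp *irp)
{
    if (irp->on_complete != NULL) {
        completion_fn routine = irp->on_complete;
        irp->on_complete = NULL;
        if (routine(irp, irp->context) == MORE_PROCESSING_REQUIRED)
            return;
    }
    irp->freed = true;   /* stands in for the cleanup that frees the IRP */
}

/* Completion routine in the style of OnComplete: signal the waiter,
 * then halt completion so the IRP cannot be freed yet. */
static completion_result on_complete(toy_irp *irp, void *context)
{
    (void)irp;
    *(bool *)context = true;   /* stands in for KeSetEvent */
    return MORE_PROCESSING_REQUIRED;
}
```

The first call (made by the lower driver) signals the waiter but leaves the IRP intact, so a concurrent cancel attempt still has valid memory to work on. Only the second call, made by the IRP's owner after any cancel call has returned, lets the cleanup run.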

Cancelling Your Own Asynchronous IRP

To safely cancel an IRP that you've created with IoAllocateIrp or IoBuildAsynchronousFsdRequest, you can follow this general plan. First define a couple of extra fields in your device extension structure:

    typedef struct _DEVICE_EXTENSION {
        PIRP TheIrp;
        ULONG CancelFlag;
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

Initialize these fields just before you call IoCallDriver to launch the IRP:

    pdx->TheIrp = Irp;
    pdx->CancelFlag = 0;
    IoSetCompletionRoutine(Irp, (PIO_COMPLETION_ROUTINE) CompletionRoutine,
        (PVOID) pdx, TRUE, TRUE, TRUE);
    IoCallDriver(..., Irp);

If you decide later on that you want to cancel this IRP, do something like the following:

    VOID CancelTheIrp(PDEVICE_EXTENSION pdx)
    {
        PIRP Irp = (PIRP) InterlockedExchangePointer(
            (PVOID*) &pdx->TheIrp, NULL);                       // 1
        if (Irp) {
            IoCancelIrp(Irp);
            if (InterlockedExchange(&pdx->CancelFlag, 1))       // 2
                IoFreeIrp(Irp);                                 // 3
        }
    }

This function dovetails with the completion routine you install for the IRP:

    NTSTATUS CompletionRoutine(PDEVICE_OBJECT junk, PIRP Irp,
        PDEVICE_EXTENSION pdx)
    {
        if (InterlockedExchangePointer((PVOID*) &pdx->TheIrp, NULL)  // 4
            || InterlockedExchange(&pdx->CancelFlag, 1))             // 5
            IoFreeIrp(Irp);                                          // 6
        return STATUS_MORE_PROCESSING_REQUIRED;
    }

The basic idea underlying this deceptively simple code is that whichever routine sees the IRP last (either CompletionRoutine or CancelTheIrp) will make the requisite call to IoFreeIrp, at point 3 or 6. Here's how it works:

The normal case occurs when you don't ever try to cancel the IRP. Whoever you sent the IRP to eventually completes it, and your completion routine gets control. The first InterlockedExchangePointer (point 4) returns the non-NULL address of the IRP. Since this is not 0, evaluation of the Boolean expression short-circuits and the routine executes the call to IoFreeIrp. Any subsequent call to CancelTheIrp will find the IRP pointer set to NULL at point 1 and won't do anything else.
Another easy case to analyze occurs when CancelTheIrp is called long before anyone gets around to completing this IRP, which means that we don't have any actual race. At point 1, we nullify the TheIrp pointer. Because the IRP pointer was previously not NULL, we go ahead and call IoCancelIrp. In this situation, our call to IoCancelIrp will cause somebody to complete the IRP reasonably soon, and our completion routine runs. It sees TheIrp as NULL and goes on to evaluate the second half of the Boolean expression. Whoever executes the InterlockedExchange on CancelFlag first will get back 0 and skip calling IoFreeIrp. Whoever executes it second will get back 1 and will call IoFreeIrp.
Now for the case we were worried about: suppose someone is completing the IRP right about the time CancelTheIrp wants to cancel it. The worst that can happen is that our completion routine runs before we manage to call IoCancelIrp. The completion routine sees TheIrp as NULL and therefore exchanges CancelFlag with 1. Just as in the previous case, the routine will get 0 as the return value and skip the IoFreeIrp call. IoCancelIrp can safely operate on the IRP. (It will presumably just return without calling a cancel routine because whoever completed this IRP will undoubtedly have set the CancelRoutine pointer to NULL first.)

The appealing thing about the technique I just showed you is its elegance: we rely solely on interlocked operations and therefore don't need any potentially expensive synchronization primitives.
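The three cases above can be replayed in user mode. What follows is a minimal sketch, not driver code: it assumes C11 atomics as a stand-in for the Interlocked functions, and FAKE_IRP, FAKE_EXTENSION, Launch, CancelClaim, CancelFinish, and FakeComplete are hypothetical names invented for this illustration. CancelTheIrp is split into two halves so that the completion routine can be run between claiming the pointer and finishing the cancel, which is the racy ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-mode stand-ins; none of these names come from the DDK. */
typedef struct {
    int freed;       /* counts simulated IoFreeIrp calls */
    int cancelled;   /* set by the simulated IoCancelIrp */
} FAKE_IRP;

typedef struct {
    _Atomic(FAKE_IRP *) TheIrp;   /* plays the role of pdx->TheIrp */
    atomic_long CancelFlag;       /* plays the role of pdx->CancelFlag */
} FAKE_EXTENSION;

/* Corresponds to initializing the fields just before IoCallDriver. */
static void Launch(FAKE_EXTENSION *pdx, FAKE_IRP *irp)
{
    irp->freed = 0;
    irp->cancelled = 0;
    atomic_store(&pdx->CancelFlag, 0);
    atomic_store(&pdx->TheIrp, irp);
}

/* First half of CancelTheIrp (point 1): claim the IRP pointer. */
static FAKE_IRP *CancelClaim(FAKE_EXTENSION *pdx)
{
    return atomic_exchange(&pdx->TheIrp, (FAKE_IRP *)NULL);
}

/* Second half of CancelTheIrp: cancel, then free only if last on the scene. */
static void CancelFinish(FAKE_EXTENSION *pdx, FAKE_IRP *irp)
{
    if (!irp)
        return;                                /* completion got there first */
    irp->cancelled = 1;                        /* stands in for IoCancelIrp */
    if (atomic_exchange(&pdx->CancelFlag, 1))  /* nonzero: completion already ran */
        irp->freed++;                          /* stands in for IoFreeIrp */
}

/* Mirrors CompletionRoutine (points 4 through 6). */
static void FakeComplete(FAKE_EXTENSION *pdx, FAKE_IRP *irp)
{
    if (atomic_exchange(&pdx->TheIrp, (FAKE_IRP *)NULL)
        || atomic_exchange(&pdx->CancelFlag, 1))
        irp->freed++;                          /* stands in for IoFreeIrp */
}
```

Calling Launch and then running the routines in each of the three orders (complete then cancel; cancel then complete; claim, complete, finish) should leave the freed counter at exactly 1 every time, which is the property the real code depends on.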

Cancelling Someone Else's IRP

To round out our discussion of IRP cancellation, suppose someone sends you an IRP that you then forward to another driver. Situations might arise where you'd like to cancel that IRP. For example, perhaps you need that IRP out of the way so you can proceed with a power-down operation. Or perhaps you're waiting synchronously for the IRP to finish and you'd like to impose a timeout as in the first example of this section.

To avoid the IoCancelIrp/IoFreeIrp race, you need to have your own completion routine in place. The details of the coding then depend on whether you're waiting for the IRP.

Cancelling Someone Else's IRP on Which You're Waiting

Suppose your dispatch function passes down an IRP and waits synchronously for it to complete. (See usage scenario 7 at the end of this chapter for the cookbook version.) Use code like this to cancel the IRP if it doesn't finish quickly enough to suit you:

NTSTATUS DispatchSomething(PDEVICE_OBJECT fdo, PIRP Irp)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    KEVENT event;
    KeInitializeEvent(&event, NotificationEvent, FALSE);
    IoSetCompletionRoutine(Irp, OnComplete, (PVOID) &event, TRUE, TRUE, TRUE);
    NTSTATUS status = IoCallDriver(...);
    if (status == STATUS_PENDING) {
        LARGE_INTEGER timeout;
        timeout.QuadPart = -5 * 10000000;
        if (KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, &timeout)
            == STATUS_TIMEOUT) {
            IoCancelIrp(Irp);
            KeWaitForSingleObject(&event, Executive, KernelMode, FALSE, NULL);
        }
    }
    status = Irp->IoStatus.Status;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return status;
}

NTSTATUS OnComplete(PDEVICE_OBJECT junk, PIRP Irp, PVOID pev)
{
    if (Irp->PendingReturned)
        KeSetEvent((PKEVENT) pev, IO_NO_INCREMENT, FALSE);
    return STATUS_MORE_PROCESSING_REQUIRED;
}

This code is almost the same as what I showed earlier for canceling your own synchronous IRP. The only difference is that this example involves a dispatch routine, which must return a status code. As in the earlier example, we install our own completion routine to prevent the completion process from running to its ultimate conclusion before we get past the point where we might call IoCancelIrp.
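The timeout constant in the dispatch routine is easy to misread. Kernel wait timeouts are expressed in 100-nanosecond units, and a negative value means an interval relative to the current time, so -5 * 10000000 asks for a five-second wait. The arithmetic can be sketched in ordinary C as follows; RelativeTimeoutSeconds and HUNDRED_NS_PER_SECOND are hypothetical names for this illustration, not DDK definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a DDK function: builds the value a driver
   would store in LARGE_INTEGER.QuadPart for a relative wait. */
#define HUNDRED_NS_PER_SECOND 10000000LL

static int64_t RelativeTimeoutSeconds(int64_t seconds)
{
    /* Negative means "relative to now"; a positive value would be
       interpreted as an absolute system time instead. */
    return -(seconds * HUNDRED_NS_PER_SECOND);
}
```

With this helper, RelativeTimeoutSeconds(5) yields the same value as the literal in the dispatch routine.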

You might notice that I didn't say anything about whether the IRP itself was synchronous or asynchronous. This is because the difference between the two types of IRP only matters to the driver that creates them in the first place. File system drivers must make distinctions between synchronous and asynchronous IRPs with respect to how they call the system cache manager, but device drivers don't typically have this complication. What matters to a lower-level driver is whether it's appropriate to block a thread in order to handle an IRP synchronously, and that depends on the current IRQL and whether you're in an arbitrary or a nonarbitrary thread.

Cancelling Someone Else's IRP on Which You're Not Waiting

Suppose you've forwarded somebody else's IRP to another driver, but you weren't planning to wait for it to complete. For whatever reason, you decide later on that you'd like to cancel that IRP. Code along these lines handles that case:

typedef struct _DEVICE_EXTENSION {
    PIRP TheIrp;
    LONG CancelFlag;    // LONG so that it can be passed to InterlockedExchange
} DEVICE_EXTENSION, *PDEVICE_EXTENSION;

NTSTATUS DispatchSomething(PDEVICE_OBJECT fdo, PIRP Irp)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    IoCopyCurrentIrpStackLocationToNext(Irp);
    IoSetCompletionRoutine(Irp, (PIO_COMPLETION_ROUTINE) OnComplete,
        (PVOID) pdx, TRUE, TRUE, TRUE);
    pdx->CancelFlag = 0;
    pdx->TheIrp = Irp;
    IoMarkIrpPending(Irp);
    IoCallDriver(pdx->LowerDeviceObject, Irp);
    return STATUS_PENDING;
}

VOID CancelTheIrp(PDEVICE_EXTENSION pdx)
{
    PIRP Irp = (PIRP) InterlockedExchangePointer((PVOID*) &pdx->TheIrp, NULL);
    if (Irp) {
        IoCancelIrp(Irp);
        if (InterlockedExchange(&pdx->CancelFlag, 1))
            IoCompleteRequest(Irp, IO_NO_INCREMENT);   // resume completion
    }
}

NTSTATUS OnComplete(PDEVICE_OBJECT fdo, PIRP Irp, PDEVICE_EXTENSION pdx)
{
    if (InterlockedExchangePointer((PVOID*) &pdx->TheIrp, NULL)
        || InterlockedExchange(&pdx->CancelFlag, 1))
        return STATUS_SUCCESS;                         // let completion continue
    return STATUS_MORE_PROCESSING_REQUIRED;
}

This code is similar to the code I showed earlier for cancelling your own asynchronous IRP. Here, however, allowing IoCompleteRequest to finish completing the IRP takes the place of the call to IoFreeIrp we made when we were dealing with our own IRP. If the completion routine is last on the scene, it returns STATUS_SUCCESS to allow IoCompleteRequest to finish completing the IRP. If CancelTheIrp is last on the scene, it calls IoCompleteRequest to resume the completion processing that the completion routine short-circuited by returning STATUS_MORE_PROCESSING_REQUIRED.

One extremely subtle point regarding this example is the call to IoMarkIrpPending in the dispatch routine. Ordinarily, it would be safe to just do this step conditionally in the completion routine, but not this time. If we should happen to call CancelTheIrp in the context of some thread other than the one in which the dispatch routine runs, the pending flag is needed so that IoCompleteRequest will schedule an APC to clean up the IRP in the proper thread. The easiest way to make that true is simply to always mark the IRP pending.

Handling IRP_MJ_CLEANUP

Closely allied to the subject of IRP cancellation is the I/O request with the major function code IRP_MJ_CLEANUP. To explain how you should process this request, I need to give you a little additional background.

When applications and other drivers want to access your device, they first open a handle to the device. Applications call CreateFile to do this; drivers call ZwCreateFile. Internally, these functions create a kernel file object and send it to your driver in an IRP_MJ_CREATE request. When the entity that opened the handle is done accessing your driver, it will call another function, such as CloseHandle or ZwClose. Internally, these functions send your driver an IRP_MJ_CLOSE request. Just before sending you the IRP_MJ_CLOSE, however, the I/O Manager sends you an IRP_MJ_CLEANUP so that you can cancel any IRPs that belong to the same file object but that are still sitting in one of your queues. From the perspective of your driver, the one thing all the requests have in common is that the stack location you receive points to the same file object in every instance.

Figure 5-10 illustrates your responsibility when you receive IRP_MJ_CLEANUP. You should run through your queues of IRPs, removing those that are tagged as belonging to the same file object. You should complete those IRPs with STATUS_CANCELLED.

Figure 5-10. Driver responsibility for IRP_MJ_CLEANUP.

File Objects

Ordinarily, just one driver (the function driver, in fact) in a device stack implements all three of the following requests: IRP_MJ_CREATE, IRP_MJ_CLOSE, and IRP_MJ_CLEANUP. The I/O Manager creates a file object (a regular kernel object) and passes it in the I/O stack to the dispatch routines for all three of these IRPs. Anybody who sends an IRP to a device should have a pointer to the same file object and should insert that pointer into the I/O stack as well. The driver that handles these three IRPs acts as the owner of the file object in some sense, in that it's the driver that's entitled to use the FsContext and FsContext2 fields of the object. So your DispatchCreate routine can put something into one of these context fields for use by other dispatch routines and for eventual cleanup by your DispatchClose routine.
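The life cycle of a per-handle context can be mocked up in user mode. This is only a sketch under stated assumptions: MOCK_FILE_OBJECT is a stand-in that carries nothing but the FsContext field, HANDLE_CONTEXT is a hypothetical structure a driver might define, and MockCreate, MockRead, and MockClose merely play the roles of the DispatchCreate, read/write, and DispatchClose routines. A real driver would allocate from kernel pool rather than with calloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the kernel FILE_OBJECT; only the field the text mentions. */
typedef struct {
    void *FsContext;
} MOCK_FILE_OBJECT;

/* Hypothetical per-handle state that the owning driver might keep. */
typedef struct {
    unsigned long RequestsSeen;
} HANDLE_CONTEXT;

/* Plays the role of DispatchCreate: attach per-handle state. */
static int MockCreate(MOCK_FILE_OBJECT *fop)
{
    HANDLE_CONTEXT *ctx = calloc(1, sizeof *ctx);
    if (!ctx)
        return -1;              /* allocation failed */
    fop->FsContext = ctx;
    return 0;
}

/* Plays the role of another dispatch routine that finds the state again
   through the file object pointer carried with the request. */
static void MockRead(MOCK_FILE_OBJECT *fop)
{
    HANDLE_CONTEXT *ctx = fop->FsContext;
    ctx->RequestsSeen++;
}

/* Plays the role of DispatchClose: release what create allocated. */
static void MockClose(MOCK_FILE_OBJECT *fop)
{
    free(fop->FsContext);
    fop->FsContext = NULL;
}
```

The point of the sketch is the ownership rule: only the routine that filled in FsContext knows what it points to, so the same driver must also be the one to release it.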

It's easy to get confused about IRP_MJ_CLEANUP. In fact, programmers who have a hard time understanding IRP cancellation sometimes decide (incorrectly) to just ignore this IRP. You need both cancel and cleanup logic in your driver, though:

IRP_MJ_CLEANUP means a handle is being closed. You should purge all the IRPs that pertain to that handle.
The I/O Manager and other drivers cancel individual IRPs for a variety of reasons that have nothing to do with closing handles.
One of the times the I/O Manager cancels IRPs is when a thread terminates. Threads often terminate because their parent process is terminating, and the I/O Manager will also automatically close all handles that are still open when a process terminates. The coincidence between this kind of cancellation and the automatic handle closing contributes to the incorrect idea that a driver can get by with support for just one concept.

In this book, I'll show you two ways of painlessly implementing support for IRP_MJ_CLEANUP, depending on whether you're using one of my DEVQUEUE objects or one of Microsoft's cancel-safe queues.

Cleanup with a DEVQUEUE

If you've used a DEVQUEUE to queue IRPs, your IRP_MJ_CLEANUP dispatch routine will be astonishingly simple:

NTSTATUS DispatchCleanup(PDEVICE_OBJECT fdo, PIRP Irp)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
    PFILE_OBJECT fop = stack->FileObject;
    CleanupRequests(&pdx->dqReadWrite, fop, STATUS_CANCELLED);
    return CompleteRequest(Irp, STATUS_SUCCESS, 0);
}

CleanupRequests will remove all IRPs from the queue that belong to the same file object and will complete those IRPs with STATUS_CANCELLED. Note that you complete the IRP_MJ_CLEANUP request itself with STATUS_SUCCESS.

CleanupRequests contains a wealth of detail:

VOID CleanupRequests(PDEVQUEUE pdq, PFILE_OBJECT fop, NTSTATUS status)
{
    LIST_ENTRY cancellist;
    InitializeListHead(&cancellist);              // 1: private queue, then lock
    KIRQL oldirql;
    KeAcquireSpinLock(&pdq->lock, &oldirql);

    PLIST_ENTRY first = &pdq->head;
    PLIST_ENTRY next;

    for (next = first->Flink; next != first; ) {  // 2: no increment clause
        PIRP Irp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
        PIO_STACK_LOCATION stack =                // 3: find the stack location
            IoGetCurrentIrpStackLocation(Irp);
        PLIST_ENTRY current = next;               // 4: advance before unlinking
        next = next->Flink;
        if (fop && stack->FileObject != fop)
            continue;
        if (!IoSetCancelRoutine(Irp, NULL))       // 5: already being cancelled
            continue;
        RemoveEntryList(current);                 // 6: move to the private queue
        InsertTailList(&cancellist, current);
    }

    KeReleaseSpinLock(&pdq->lock, oldirql);       // 7: unlock, then complete

    while (!IsListEmpty(&cancellist)) {
        next = RemoveHeadList(&cancellist);
        PIRP Irp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
        Irp->IoStatus.Status = status;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
    }
}

1. Our strategy will be to move the IRPs that need to be cancelled into a private queue under protection of the queue's spin lock. Hence, we initialize the private queue and acquire the spin lock before doing anything else.
2. This loop traverses the entire queue until we return to the list head. Notice the absence of a loop increment step (the third clause in the for statement). I'll explain in a moment why it's desirable to have no loop increment.
3. If we're being called to help out with IRP_MJ_CLEANUP, the fop argument is the address of a file object that's about to be closed. We're supposed to isolate the IRPs that pertain to the same file object, which requires us to first find the stack location.
4. If we decide to remove this IRP from the queue, we won't thereafter have an easy way to find the next IRP in the main queue. We therefore perform the loop increment step here.
5. This especially clever statement comes to us courtesy of Jamie Hanrahan. We need to worry that someone might be trying to cancel the IRP that we're currently looking at during this iteration. They could get only as far as the point where CancelRequest tries to acquire the spin lock. Before getting that far, however, they necessarily had to execute the statement inside IoCancelIrp that nullifies the cancel routine pointer. If we find that pointer set to NULL when we call IoSetCancelRoutine, therefore, we can be sure that someone really is trying to cancel this IRP. By simply skipping the IRP during this iteration, we allow the cancel routine to complete it later on.
6. Here's where we take the IRP out of the main queue and put it in the private queue instead.
7. Once we finish moving IRPs into the private queue, we can release our spin lock. Then we cancel all the IRPs we moved.
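The two-phase structure of CleanupRequests, and the reason the loop advances inside its body, can be tried out in user mode. The sketch below is an assumption-laden simplification: NODE, REQUEST, ListInit, ListEmpty, ListRemove, ListAppend, and Cleanup are hypothetical names that only imitate LIST_ENTRY and its helpers, there is no spin lock, and the IoSetCancelRoutine check from the real routine is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, imitating LIST_ENTRY. */
typedef struct NODE {
    struct NODE *Flink, *Blink;
} NODE;

/* Hypothetical stand-in for a queued IRP. */
typedef struct {
    NODE Link;          /* plays the role of Tail.Overlay.ListEntry */
    const void *File;   /* plays the role of stack->FileObject */
    int Status;         /* plays the role of IoStatus.Status */
} REQUEST;

#define REQUEST_FROM_LINK(p) \
    ((REQUEST *)((char *)(p) - offsetof(REQUEST, Link)))

static void ListInit(NODE *h) { h->Flink = h->Blink = h; }
static int ListEmpty(const NODE *h) { return h->Flink == h; }

static void ListRemove(NODE *e)
{
    e->Blink->Flink = e->Flink;
    e->Flink->Blink = e->Blink;
}

static void ListAppend(NODE *h, NODE *e)
{
    e->Flink = h;
    e->Blink = h->Blink;
    h->Blink->Flink = e;
    h->Blink = e;
}

/* Moves matching requests to a private list, then "completes" them.
   A NULL file matches every request. Returns the number completed. */
static int Cleanup(NODE *queue, const void *file, int status)
{
    NODE doomed;
    NODE *next;
    int completed = 0;

    ListInit(&doomed);

    /* Phase 1: a real driver would hold the queue's spin lock here. */
    for (next = queue->Flink; next != queue; ) {
        NODE *current = next;
        next = next->Flink;     /* advance first: current may be unlinked */
        if (file && REQUEST_FROM_LINK(current)->File != file)
            continue;
        ListRemove(current);
        ListAppend(&doomed, current);
    }

    /* Phase 2: lock released; now it is safe to complete the requests. */
    while (!ListEmpty(&doomed)) {
        NODE *e = doomed.Flink;
        ListRemove(e);
        REQUEST_FROM_LINK(e)->Status = status;
        completed++;
    }
    return completed;
}
```

Had the loop used a conventional next = next->Flink increment clause, it would have followed the link of an entry that had already been moved onto the private list and wandered into the wrong queue.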

Cleanup with a Cancel-Safe Queue

To easily clean up IRPs that you've queued by calling IoCsqInsertIrp, simply adopt the convention that the peek context parameter you use with IoCsqRemoveNextIrp, if not NULL, will be the address of a FILE_OBJECT. Your IRP_MJ_CLEANUP routine will look like this (compare with the Cancel sample in the DDK):

NTSTATUS DispatchCleanup(PDEVICE_OBJECT fdo, PIRP Irp)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
    PFILE_OBJECT fop = stack->FileObject;
    PIRP qirp;
    while ((qirp = IoCsqRemoveNextIrp(&pdx->csq, fop)))
        CompleteRequest(qirp, STATUS_CANCELLED, 0);
    return CompleteRequest(Irp, STATUS_SUCCESS, 0);
}

Implement your PeekNextIrp callback routine this way:

PIRP PeekNextIrp(PIO_CSQ csq, PIRP Irp, PVOID PeekContext)
{
    PDEVICE_EXTENSION pdx = GET_DEVICE_EXTENSION(csq);
    PLIST_ENTRY next = Irp ? Irp->Tail.Overlay.ListEntry.Flink
                           : pdx->IrpQueueAnchor.Flink;
    while (next != &pdx->IrpQueueAnchor) {
        PIRP NextIrp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
        PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(NextIrp);
        // A NULL context matches any IRP; otherwise match on the file object.
        if (!PeekContext || (PFILE_OBJECT) PeekContext == stack->FileObject)
            return NextIrp;
        next = next->Flink;
    }
    return NULL;
}
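The contract of the peek callback (start at the head when no IRP is given, otherwise resume after the given one, and treat a NULL context as a wildcard) can be exercised with a small user-mode mock. This sketch assumes a simplified singly linked queue instead of LIST_ENTRY, and MOCK_IRP and MockPeekNext are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued IRP on a singly linked queue. */
typedef struct MOCK_IRP {
    struct MOCK_IRP *Next;
    const void *FileObject;   /* plays the role of stack->FileObject */
} MOCK_IRP;

/* Returns the first request after irp (or from head when irp is NULL)
   that matches peekContext; a NULL peekContext matches any request. */
static MOCK_IRP *MockPeekNext(MOCK_IRP *head, MOCK_IRP *irp,
                              const void *peekContext)
{
    MOCK_IRP *next = irp ? irp->Next : head;
    for (; next; next = next->Next) {
        if (!peekContext || peekContext == next->FileObject)
            return next;
    }
    return NULL;
}
```

Peeking with a file object as the context skips over requests that belong to other handles, which is the behavior the cleanup loop relies on when it passes the file object to IoCsqRemoveNextIrp.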