Completing IO Requests | Programming the Microsoft Windows Driver Model

Every IRP has an urge toward completion. In the standard processing model, you might complete an IRP in at least two circumstances. The DpcForIsr routine would generally complete the request that's responsible for the most recent interrupt. A dispatch function might also complete an IRP in situations like these:

If the request is erroneous in some easily determined way (such as a request to rewind a printer or to eject the keyboard), the dispatch routine should fail the request by completing it with an appropriate status code.

If the request calls for information that the dispatch function can easily determine (such as a control request asking for the driver's version number), the dispatch routine should provide the answer and complete the request with a successful status code.

Completion Mechanics

Mechanically, completing an IRP entails filling in the Status and Information members within the IRP's IoStatus block and calling IoCompleteRequest. The Status value is one of the codes defined by manifest constants in the DDK header file NTSTATUS.H. Refer to Table 5-1 for an abbreviated list of status codes for common situations. The Information value depends on what type of IRP you're completing and on whether you're succeeding or failing the IRP. Most of the time, when you're failing an IRP (that is, completing it with an error status of some kind), you'll set Information to zero. When you succeed an IRP that involves data transfer, you ordinarily set the Information field equal to the number of bytes transferred.

Table 5-1. Some commonly used NTSTATUS codes.

Status Code	Description
STATUS_SUCCESS	Normal completion
STATUS_UNSUCCESSFUL	Request failed, but no other status code describes the reason specifically
STATUS_NOT_IMPLEMENTED	A function hasn't been implemented
STATUS_INVALID_HANDLE	An invalid handle was supplied for an operation
STATUS_INVALID_PARAMETER	A parameter is in error
STATUS_INVALID_DEVICE_REQUEST	The request is invalid for this device
STATUS_END_OF_FILE	End-of-file marker reached
STATUS_DELETE_PENDING	The device is in the process of being removed from the system
STATUS_INSUFFICIENT_RESOURCES	Not enough system resources (often memory) to perform an operation

NOTE
Always be sure to consult the DDK documentation for the correct setting of IoStatus.Information for the IRP you're dealing with. In some flavors of IRP_MJ_PNP, for example, this field is used as a pointer to a data structure that the PnP Manager is responsible for releasing. If you were to overwrite the Information field with zero when failing the request, you would unwittingly cause a resource leak.

Since completing a request is something you do so often, I find it useful to have a helper routine to carry out the mechanics:

    NTSTATUS CompleteRequest(PIRP Irp, NTSTATUS status, ULONG_PTR Information)
    {
        Irp->IoStatus.Status = status;
        Irp->IoStatus.Information = Information;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
        return status;
    }

I defined this routine in such a way that it returns whatever status value you supply as its second argument. That's because I'm such a lazy typist: the return value allows me to use this helper whenever I want to complete a request and then immediately return a status code. For example:

    NTSTATUS DispatchControl(PDEVICE_OBJECT device, PIRP Irp)
    {
        PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
        ULONG code = stack->Parameters.DeviceIoControl.IoControlCode;
        if (code == IOCTL_TOASTER_BOGUS)
            return CompleteRequest(Irp, STATUS_INVALID_DEVICE_REQUEST, 0);
        ...
    }

You might notice that the Information argument to the CompleteRequest function is typed as a ULONG_PTR. In other words, this value can be either a ULONG or a pointer to something (and therefore potentially 64 bits wide).
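Since real dispatch code runs only inside the kernel, here is a minimal user-mode sketch of the same idea that you can compile and experiment with. Everything prefixed MOCK_ or Mock is a hypothetical stand-in I made up for illustration (it is not a DDK definition), and the mock CompleteRequest merely records that the request finished instead of calling IoCompleteRequest. The sketch shows the two dispatch-time cases described earlier: answering an easy query with the byte count in Information, and failing a bogus request with Information set to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-mode stand-ins for kernel types; these are NOT the DDK definitions. */
typedef int32_t   NTSTATUS;
typedef uintptr_t ULONG_PTR;

#define STATUS_SUCCESS                ((NTSTATUS)0x00000000)
#define STATUS_INVALID_DEVICE_REQUEST ((NTSTATUS)0xC0000010)

#define MOCK_IOCTL_GET_VERSION 0x1u         /* hypothetical control code */
#define MOCK_DRIVER_VERSION    0x00010002u  /* hypothetical version number */

typedef struct {
    struct { NTSTATUS Status; ULONG_PTR Information; } IoStatus;
    int Completed;  /* set once the mock request has been completed */
} MOCK_IRP;

/* Same shape as the helper in the text: fill in IoStatus, complete, return status. */
static NTSTATUS CompleteRequest(MOCK_IRP *Irp, NTSTATUS status, ULONG_PTR Information)
{
    assert(!Irp->Completed);        /* a request is completed only once */
    Irp->IoStatus.Status = status;
    Irp->IoStatus.Information = Information;
    Irp->Completed = 1;             /* stands in for IoCompleteRequest */
    return status;
}

/* Hypothetical dispatch routine: answer a version query, fail anything else. */
static NTSTATUS MockDispatchControl(MOCK_IRP *Irp, uint32_t code, uint32_t *outBuffer)
{
    if (code == MOCK_IOCTL_GET_VERSION) {
        *outBuffer = MOCK_DRIVER_VERSION;
        /* Success with data transfer: Information = number of bytes returned. */
        return CompleteRequest(Irp, STATUS_SUCCESS, sizeof *outBuffer);
    }
    /* Failure: Information is zero. */
    return CompleteRequest(Irp, STATUS_INVALID_DEVICE_REQUEST, 0);
}
```

Because the helper returns the status it was given, each branch of the dispatch routine finishes the request and reports the outcome in a single statement.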

When you call IoCompleteRequest, you supply a priority boost value to be applied to whatever thread is currently waiting for this request to complete. You normally choose a boost value that depends on the type of device, as suggested by the manifest constant names listed in Table 5-2. The priority adjustment improves the throughput of threads that frequently wait for I/O operations to complete. Events for which the end user is directly responsible, such as keyboard or mouse operations, result in greater priority boosts in order to give preference to interactive tasks. Consequently, you want to choose the boost value with at least some care. Don't use IO_SOUND_INCREMENT for absolutely every operation a sound card driver finishes, for example—it's not necessary to apply this extraordinary priority increment to a get-driver-version control request.

Table 5-2. Priority boost values for IoCompleteRequest.

Manifest Constant	Numeric Priority Boost
IO_NO_INCREMENT	0
IO_CD_ROM_INCREMENT	1
IO_DISK_INCREMENT	1
IO_KEYBOARD_INCREMENT	6
IO_MAILSLOT_INCREMENT	2
IO_MOUSE_INCREMENT	6
IO_NAMED_PIPE_INCREMENT	2
IO_NETWORK_INCREMENT	2
IO_PARALLEL_INCREMENT	1
IO_SERIAL_INCREMENT	2
IO_SOUND_INCREMENT	8
IO_VIDEO_INCREMENT	1

Don't, by the way, complete an IRP with the special status code STATUS_PENDING. Dispatch routines often return STATUS_PENDING as their return value, but you should never set IoStatus.Status to this value. Just to make sure, the checked build of IoCompleteRequest generates an ASSERT failure if it sees STATUS_PENDING in the ending status. Another popular value for people to use by mistake is apparently "-1", which doesn't have any meaning as an NTSTATUS code at all. There's a checked-build ASSERT to catch that mistake, too.
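If you want a similar safety net in your own code, a sketch along the following lines might do. It's a user-mode model, not the actual checked-build logic: IsValidCompletionStatus and CheckedStatus are hypothetical helper names, and the NTSTATUS typedef is a stand-in for the DDK's. Note that STATUS_PENDING passes the NT_SUCCESS test, so checking for success alone won't catch this mistake.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int32_t NTSTATUS;   /* stand-in for the DDK typedef */

#define NT_SUCCESS(s)       ((NTSTATUS)(s) >= 0)
#define STATUS_SUCCESS      ((NTSTATUS)0x00000000)
#define STATUS_PENDING      ((NTSTATUS)0x00000103)
#define STATUS_UNSUCCESSFUL ((NTSTATUS)0xC0000001)

/* Hypothetical helper: true if the value may be stored in IoStatus.Status
   when completing a request. */
static bool IsValidCompletionStatus(NTSTATUS status)
{
    return status != STATUS_PENDING && status != (NTSTATUS)-1;
}

/* Hypothetical wrapper that asserts on a bad ending status and otherwise
   passes the status through unchanged. */
static NTSTATUS CheckedStatus(NTSTATUS status)
{
    assert(IsValidCompletionStatus(status));
    return status;
}
```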

Using Completion Routines

You often need to know the results of I/O requests that you pass down to lower levels of the driver hierarchy or that you originate. To find out what happened to a request, you install a completion routine by calling IoSetCompletionRoutine:

IoSetCompletionRoutine(Irp, CompletionRoutine, context, InvokeOnSuccess, InvokeOnError, InvokeOnCancel);

Irp is the request whose completion you want to know about. CompletionRoutine is the address of the completion routine you want called, and context is an arbitrary pointer-sized value you want passed as an argument to the completion routine. The InvokeOnXxx arguments are Boolean values indicating whether you want the completion routine called in three different circumstances:

InvokeOnSuccess means you want the completion routine called when something completes the IRP with a status code that passes the NT_SUCCESS test.

InvokeOnError means you want the completion routine called when something completes the IRP with a status code that does not pass the NT_SUCCESS test.

InvokeOnCancel means you want the completion routine called when something calls IoCancelIrp before completing the IRP. I worded this quite delicately: IoCancelIrp will set the Cancel flag in the IRP, and that's the condition that gets tested if you specify this argument. A cancelled IRP might end up being completed with STATUS_CANCELLED (which would fail the NT_SUCCESS test) or with any other status at all. If the IRP gets completed with an error and you specified InvokeOnError, InvokeOnError by itself would cause your completion routine to be called. Conversely, if the IRP gets completed without error and you specified InvokeOnSuccess, InvokeOnSuccess by itself would cause your completion routine to be called. In these cases, InvokeOnCancel would be redundant. But if you left out one or the other (or both) of InvokeOnSuccess or InvokeOnError, the InvokeOnCancel flag would let you see the eventual completion of an IRP whose Cancel flag had been set no matter what status is used for the completion.

At least one of these three flags must be TRUE. Note that IoSetCompletionRoutine is a macro, so you want to avoid arguments that generate side effects. The three flag arguments and the function pointer, in particular, are each referenced twice by the macro.
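The interplay of the three flags is easier to see as a predicate. The following user-mode sketch models the decision as I've described it; it's not the I/O Manager's source code, and MOCK_COMPLETION_FLAGS and ShouldInvokeCompletionRoutine are names I invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int32_t NTSTATUS;   /* stand-in for the DDK typedef */

#define NT_SUCCESS(s)       ((NTSTATUS)(s) >= 0)
#define STATUS_SUCCESS      ((NTSTATUS)0x00000000)
#define STATUS_UNSUCCESSFUL ((NTSTATUS)0xC0000001)
#define STATUS_CANCELLED    ((NTSTATUS)0xC0000120)

/* Hypothetical record of the three Boolean arguments to IoSetCompletionRoutine. */
typedef struct {
    bool InvokeOnSuccess;
    bool InvokeOnError;
    bool InvokeOnCancel;
} MOCK_COMPLETION_FLAGS;

/* Model: is the completion routine called for this ending status and Cancel flag? */
static bool ShouldInvokeCompletionRoutine(MOCK_COMPLETION_FLAGS flags,
                                          NTSTATUS endingStatus,
                                          bool cancelFlagSet)
{
    /* At least one of the three flags must be TRUE. */
    assert(flags.InvokeOnSuccess || flags.InvokeOnError || flags.InvokeOnCancel);

    if (NT_SUCCESS(endingStatus) && flags.InvokeOnSuccess)
        return true;
    if (!NT_SUCCESS(endingStatus) && flags.InvokeOnError)
        return true;
    /* Tests the IRP's Cancel flag, not the ending status. */
    return cancelFlagSet && flags.InvokeOnCancel;
}
```

With only InvokeOnCancel set, for instance, the routine runs for an IRP whose Cancel flag is set even if the IRP ends up completing with STATUS_SUCCESS, and it doesn't run for an IRP that fails without ever having been cancelled.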

IoSetCompletionRoutine installs the completion routine address and context argument in the next IO_STACK_LOCATION—that is, in the stack location in which the next lower driver will find its parameters. Consequently, the lowest-level driver in a particular stack of drivers does not dare attempt to install a completion routine. Doing so would be pretty futile, of course, because—by definition of what it means to be the lowest-level driver—there's no driver left to pass the request on to.

A completion routine looks like this:

    NTSTATUS CompletionRoutine(PDEVICE_OBJECT device, PIRP Irp, PVOID context)
    {
        if (Irp->PendingReturned)
            IoMarkIrpPending(Irp);
        ...
        return <some status code>;
    }

It receives pointers to the device object and the IRP, and it also receives whatever context value you specified in the call to IoSetCompletionRoutine. Completion routines are usually called at DISPATCH_LEVEL and in an arbitrary thread context, but can be called at PASSIVE_LEVEL or APC_LEVEL. To accommodate the usual case (DISPATCH_LEVEL), completion routines therefore need to be in nonpaged memory and must call only service functions that are callable at DISPATCH_LEVEL. To accommodate the possibility of being called at a lower IRQL, however, a completion routine shouldn't call functions like KeAcquireSpinLockAtDpcLevel that assume they're at DISPATCH_LEVEL to start with.

NOTE
The device object pointer argument to a completion routine is the value left in the I/O stack location's DeviceObject pointer. IoCallDriver ordinarily sets this value. People sometimes create an IRP with an extra stack location so that they can pass parameters to a completion routine without creating an extra context structure. Such a completion routine gets a NULL device object pointer unless the creator sets the DeviceObject field.

How Completion Routines Get Called

IoCompleteRequest is responsible for calling all of the completion routines that drivers installed in their respective stack locations. The way the process works, as shown in the flowchart in Figure 5-7, is this: Something calls IoCompleteRequest to signal the end of processing for the IRP. IoCompleteRequest then consults the current stack location to see whether the driver above the current level installed a completion routine. If not, it moves the stack pointer up one level and repeats the test. This process repeats until a stack location is found that does specify a completion routine or until IoCompleteRequest reaches the top of the stack. Then IoCompleteRequest takes steps that eventually result in something releasing the memory occupied by the IRP (among other things).

When IoCompleteRequest finds a stack frame with a completion routine pointer, it calls that routine and examines the return code. If the return code is anything other than STATUS_MORE_PROCESSING_REQUIRED, IoCompleteRequest moves the stack pointer up one level and continues as before. If the return code is STATUS_MORE_PROCESSING_REQUIRED, however, IoCompleteRequest stops dead in its tracks and returns to its caller. The IRP will then be in a sort of limbo state. The driver whose completion routine halted the stack unwinding process is expected to do more work with the IRP.

Within a completion routine, a call to IoGetCurrentIrpStackLocation will retrieve the same stack pointer as was current when something called IoSetCompletionRoutine to install the completion routine pointer. In other words, it returns the stack location above the one which contains the actual pointer to this completion routine. You should not rely in a completion routine on the contents of any lower stack location. To reinforce this rule, IoCompleteRequest zeroes most of the next location just before calling a completion routine.
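To make the unwinding logic concrete, here's a small user-mode model of it. Treat it as a sketch under loud assumptions: MOCK_IRP, MOCK_STACK_LOCATION, and the Mock functions are hypothetical simplifications (the real structures hold far more, and the real routine takes a device object argument too), stack index 0 is the lowest driver, and "freeing" the IRP is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t NTSTATUS;   /* stand-in for the DDK typedef */
#define STATUS_SUCCESS                  ((NTSTATUS)0x00000000)
#define STATUS_MORE_PROCESSING_REQUIRED ((NTSTATUS)0xC0000016)

#define MOCK_MAX_STACK 4

typedef struct MOCK_IRP MOCK_IRP;
typedef NTSTATUS (*MOCK_COMPLETION)(MOCK_IRP *Irp, void *Context);

typedef struct {
    MOCK_COMPLETION CompletionRoutine;  /* installed by the driver one level up */
    void *Context;
} MOCK_STACK_LOCATION;

struct MOCK_IRP {
    MOCK_STACK_LOCATION Stack[MOCK_MAX_STACK];  /* index 0 = lowest driver */
    int StackCount;
    int Current;   /* level of the driver that currently owns the IRP */
    bool Freed;    /* set when the walk reaches the top of the stack */
};

/* Model of IoSetCompletionRoutine: writes into the NEXT (lower) location. */
static void MockSetCompletionRoutine(MOCK_IRP *Irp, MOCK_COMPLETION routine, void *context)
{
    assert(Irp->Current > 0);   /* the lowest driver has no next location */
    Irp->Stack[Irp->Current - 1].CompletionRoutine = routine;
    Irp->Stack[Irp->Current - 1].Context = context;
}

/* Model of IoCallDriver: the next lower driver becomes the owner. */
static void MockPassDown(MOCK_IRP *Irp)
{
    assert(Irp->Current > 0);
    Irp->Current--;
}

/* Model of the walk performed by IoCompleteRequest. */
static void MockIoCompleteRequest(MOCK_IRP *Irp)
{
    while (Irp->Current < Irp->StackCount) {
        MOCK_STACK_LOCATION *loc = &Irp->Stack[Irp->Current];
        MOCK_COMPLETION routine = loc->CompletionRoutine;
        void *context = loc->Context;

        loc->CompletionRoutine = NULL;  /* lower location is wiped before the call */
        loc->Context = NULL;
        Irp->Current++;                 /* routine sees its installer's location */

        if (routine != NULL &&
            routine(Irp, context) == STATUS_MORE_PROCESSING_REQUIRED)
            return;                     /* unwinding halts; IRP is in limbo */
    }
    Irp->Freed = true;                  /* top of stack: final cleanup */
}

/* Two sample routines; Context points to an int call counter. */
static NTSTATUS RecordAndContinue(MOCK_IRP *Irp, void *Context)
{
    (void)Irp;
    ++*(int *)Context;
    return STATUS_SUCCESS;
}

static NTSTATUS RecordAndHold(MOCK_IRP *Irp, void *Context)
{
    (void)Irp;
    ++*(int *)Context;
    return STATUS_MORE_PROCESSING_REQUIRED;
}
```

In this model, a driver whose routine returned STATUS_MORE_PROCESSING_REQUIRED resumes the unwinding simply by calling MockIoCompleteRequest again, at which point the walk continues with the routine installed by the driver above it.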

Figure 5-7. Logic of IoCompleteRequest.

Why Completion Routines Call IoMarkIrpPending

You may have noticed these two lines at the beginning of the skeleton completion routine I just showed you:

    if (Irp->PendingReturned)
        IoMarkIrpPending(Irp);

This particular piece of boilerplate is required in any completion routine that doesn't return STATUS_MORE_PROCESSING_REQUIRED. If you'd like to know why, read the rest of this section. However, be aware that you should not develop drivers that rely on the information related to how the I/O Manager processes pending IRPs—that process is likely to change in future versions of Windows.

This explanation is complicated!

To maximize system throughput, the I/O Manager expects drivers to defer the completion of IRPs that take a long time to complete. A driver indicates that completion will be deferred by calling IoMarkIrpPending and returning STATUS_PENDING from the dispatch routine. Often, though, the original caller of the I/O Manager wants to wait until the operation finishes before proceeding. The I/O Manager will therefore have logic similar to this (not the actual source code of any particular Microsoft Windows NT function) to deal with the deferred completion:

    Irp->UserEvent = pEvent;    // don't do this yourself
    status = IoCallDriver(...);
    if (status == STATUS_PENDING)
        KeWaitForSingleObject(pEvent, ...);

In other words, if IoCallDriver returns STATUS_PENDING, this code will wait on a kernel event. IoCompleteRequest is responsible for setting this event when the IRP finally completes. The address of the event (UserEvent) is in one of the opaque fields of the IRP so that IoCompleteRequest can find it. But there's more to the story than that.

To keep things simple for the moment, suppose that there were just one driver involved in processing this request. Its dispatch function does the two things we've discussed: it calls IoMarkIrpPending, and it returns STATUS_PENDING. That status code will be the return value from IoCallDriver as well, so you can see that something is now going to wait on an event. The eventual call to IoCompleteRequest occurs in an arbitrary thread context, so IoCompleteRequest will schedule a special kernel APC to execute in the context of the original thread (which is currently blocked). The APC (asynchronous procedure call) routine will set the event, thereby releasing whatever is waiting for the operation to finish. There are reasons we don't need to go into right now for why an APC is used for this purpose instead of a simple call to KeSetEvent.

But queuing an APC is relatively expensive. Suppose that, instead of returning STATUS_PENDING, the dispatch routine were to call IoCompleteRequest and return some other status. In this case, the call to IoCompleteRequest is in the same thread context as the caller of IoCallDriver. It's not necessary to queue an APC, therefore. Furthermore, it's not even necessary to call KeSetEvent since the I/O Manager isn't going to be waiting on an event if it doesn't get STATUS_PENDING back from the dispatch routine. If IoCompleteRequest just had a way to know that this case was occurring, it could optimize its processing to avoid the APC, couldn't it? That's where IoMarkIrpPending comes in.

What IoMarkIrpPending does—it's a macro in WDM.H, so you can see this for yourself—is set a flag named SL_PENDING_RETURNED in the current stack location. IoCompleteRequest will set the IRP's PendingReturned flag equal to whatever value it finds in the topmost stack location. Later on, it inspects this flag to see whether the dispatch routine has returned or will return STATUS_PENDING. If you do your job correctly, it won't matter whether the return from the dispatch routine happens before or after IoCompleteRequest makes this determination. "Doing your job correctly," in this particular case, means calling IoMarkIrpPending before you do anything that might result in the IRP getting completed.

So, anyway, IoCompleteRequest looks at the PendingReturned flag. If it's clear, and if the IRP in question is of the kind that normally gets completed asynchronously, IoCompleteRequest simply returns to its caller without queuing the APC. It assumes that it's running in the originator's thread context and that some dispatch routine is shortly going to return a nonpending status code to the originator. The originator, in turn, avoids waiting for the event, which is just as well because no one is ever going to signal that event. So far, so good.

Now let's put some additional drivers into the picture. The top-level driver has no clue what will happen below it. It simply passes the request down using code such as the following. (See the next section, "Passing Requests Down to Lower Levels.")

    IoCopyCurrentIrpStackLocationToNext(Irp);
    IoSetCompletionRoutine(Irp, ...);
    return IoCallDriver(...);

In other words, the top-level driver installs a completion routine, calls IoCallDriver, and then returns whatever status code IoCallDriver happens to return. This process might now repeat additional times as other drivers pass the request down to whatever is really destined to service it. When the request reaches that level, the dispatch routine calls IoMarkIrpPending and returns STATUS_PENDING. The STATUS_PENDING value then percolates all the way back up to the top and out into the originator of the IRP, which will promptly decide to wait for something to signal the event.

But notice that the driver that called IoMarkIrpPending only managed to set SL_PENDING_RETURNED in its own stack location. The drivers above it actually returned STATUS_PENDING, but they didn't call IoMarkIrpPending on their own behalf because they didn't know they'd end up returning STATUS_PENDING as proxies for the guy at the bottom of the stack. Sorting this out is where the boilerplate code in the completion routine comes in, as follows. As IoCompleteRequest walks up the I/O stack, it pauses at each level to set the IRP's PendingReturned flag to the value of the current stack's SL_PENDING_RETURNED flag. If there's no completion routine at this level, it then sets the next higher stack's SL_PENDING_RETURNED if PendingReturned is set and repeats its loop. It doesn't change SL_PENDING_RETURNED if PendingReturned is clear. In this way, SL_PENDING_RETURNED gets propagated from the bottom to the top of the stack, and the IRP's PendingReturned flag ends up TRUE if any of the drivers ever called IoMarkIrpPending.

IoCompleteRequest does not automatically propagate SL_PENDING_RETURNED across a completion routine, however. The completion routine must do this itself by testing the IRP's PendingReturned flag (that is, did the driver below me return STATUS_PENDING?) and then calling IoMarkIrpPending. If every completion routine does its job, the SL_PENDING_RETURNED flag makes its way to the top of the stack just as if IoCompleteRequest had done all of its work.
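The propagation rules in the last two paragraphs can be modeled in a few lines of user-mode C. This is a sketch of the behavior as I've described it, not the I/O Manager's code: the MOCK_ types and Mock functions are hypothetical, a Boolean stands in for the SL_PENDING_RETURNED bit, stack index 0 is the lowest driver, and the scenario assumes three drivers in which the bottom one marks the IRP pending and the top one may have installed a completion routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t NTSTATUS;   /* stand-in for the DDK typedef */
#define STATUS_SUCCESS                  ((NTSTATUS)0x00000000)
#define STATUS_MORE_PROCESSING_REQUIRED ((NTSTATUS)0xC0000016)

#define MOCK_MAX_STACK 4

typedef struct MOCK_IRP MOCK_IRP;
typedef NTSTATUS (*MOCK_COMPLETION)(MOCK_IRP *Irp);

typedef struct {
    MOCK_COMPLETION CompletionRoutine;  /* installed by the driver one level up */
    bool SlPendingReturned;             /* models the SL_PENDING_RETURNED bit */
} MOCK_STACK_LOCATION;

struct MOCK_IRP {
    MOCK_STACK_LOCATION Stack[MOCK_MAX_STACK];  /* index 0 = lowest driver */
    int StackCount;
    int Current;
    bool PendingReturned;
};

/* Model of IoMarkIrpPending: sets the bit in the current stack location. */
static void MockMarkIrpPending(MOCK_IRP *Irp)
{
    assert(Irp->Current < Irp->StackCount);
    Irp->Stack[Irp->Current].SlPendingReturned = true;
}

/* Model of the walk. Returns true if, at the top of the stack, PendingReturned
   is set, meaning the APC that signals the event would be queued. */
static bool MockIoCompleteRequest(MOCK_IRP *Irp)
{
    while (Irp->Current < Irp->StackCount) {
        MOCK_STACK_LOCATION *loc = &Irp->Stack[Irp->Current];
        Irp->PendingReturned = loc->SlPendingReturned;
        Irp->Current++;

        if (loc->CompletionRoutine != NULL) {
            /* No automatic propagation across a completion routine. */
            if (loc->CompletionRoutine(Irp) == STATUS_MORE_PROCESSING_REQUIRED)
                return false;
        } else if (Irp->PendingReturned && Irp->Current < Irp->StackCount) {
            /* No routine at this level: propagate the bit one level up. */
            Irp->Stack[Irp->Current].SlPendingReturned = true;
        }
    }
    return Irp->PendingReturned;
}

/* Completion routine with the boilerplate. */
static NTSTATUS PropagatingCompletion(MOCK_IRP *Irp)
{
    if (Irp->PendingReturned)
        MockMarkIrpPending(Irp);
    return STATUS_SUCCESS;
}

/* Completion routine that omits the boilerplate and breaks the chain. */
static NTSTATUS ForgetfulCompletion(MOCK_IRP *Irp)
{
    (void)Irp;
    return STATUS_SUCCESS;
}

/* Scenario: bottom driver marks the IRP pending and later completes it;
   the top driver's routine (possibly NULL) sits in the middle location. */
static bool WouldQueueApc(MOCK_COMPLETION topDriverRoutine)
{
    MOCK_IRP irp = { .StackCount = 3, .Current = 0 };
    irp.Stack[1].CompletionRoutine = topDriverRoutine;
    MockMarkIrpPending(&irp);
    return MockIoCompleteRequest(&irp);
}
```

In this model, WouldQueueApc returns true for PropagatingCompletion and for no routine at all, but false for ForgetfulCompletion. That last result corresponds to a waiting thread that never gets its event signalled.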

Now that I've explained these intricacies, you can see why it's important for dispatch routines to call IoMarkIrpPending if they're going to explicitly return STATUS_PENDING and why completion routines should conditionally do so. If a completion routine were to break the chain, you'd end up with a thread waiting in vain on an event that's destined never to be signalled. Failing to see PendingReturned, IoCompleteRequest would act as if it were dealing with a same-context completion and therefore would not queue the APC that's supposed to signal the event. The same thing would happen if a dispatch routine were to omit the IoMarkIrpPending call and then return STATUS_PENDING.

On the other hand, it's okay, albeit slightly inefficient, to call IoMarkIrpPending and then complete the IRP synchronously. All that will happen is that IoCompleteRequest will queue an APC to signal an event on which no one will ever wait. (Logic is in place to make sure that the event object can't cease to exist before the call to KeSetEvent, too.) This is slower than need be, but it's not harmful.

By the way, don't be tempted to avoid the boilerplate call to IoMarkIrpPending inside your completion routine by writing code like this:

    status = IoCallDriver(...);
    if (status == STATUS_PENDING)
        IoMarkIrpPending(...);    // DON'T DO THIS!

The reason this is a bad idea is that you must treat the IRP pointer as poison after you give it away by calling IoCallDriver. Whatever receives the IRP can complete it, allowing something to call IoFreeIrp, which will render your pointer invalid long before you regain control from IoCallDriver.