Managing PnP State Transitions | Programming the Microsoft Windows Driver Model

Managing PnP State Transitions

As I said at the outset of this chapter, WDM drivers need to track their devices through the state transitions diagrammed in Figure 6-1. This state tracking also ties in with how you queue and cancel I/O requests. Cancellation in turn implicates the global cancel spin lock, which is a performance bottleneck in a multi-CPU system. The standard model of IRP processing with Microsoft queuing functions can't solve all these interrelated problems. In this section, therefore, I'll describe how my DEVQUEUE object helps you cope with the complications Plug and Play creates.

Figure 6-3 illustrates the states of a DEVQUEUE. In the READY state, the queue accepts and forwards requests to your StartIo routine in such a way that the device stays busy. In the STALLED state, however, the queue doesn't forward IRPs to StartIo, even when the device is idle. In the REJECTING state, the queue doesn't even accept new IRPs. Figure 6-4 illustrates the flow of IRPs through the queue.

Figure 6-3. States of a DEVQUEUE object.

Figure 6-4. Flow of IRPs through a DEVQUEUE.
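
One way to keep the three states straight is to write down what happens to a newly arriving request in each of them. The following is a minimal user-mode C++ sketch of that decision, not the DEVQUEUE code itself; the names QueueState, Disposition, and Dispose are hypothetical and exist only for this illustration.

```cpp
// Hypothetical user-mode model of the three DEVQUEUE states described above.
enum class QueueState { Ready, Stalled, Rejecting };

// What happens to a newly arriving request.
enum class Disposition { SendToStartIo, HoldInQueue, Reject };

// Decide the fate of a new request, given the queue state and whether
// the device is already busy with another request.
Disposition Dispose(QueueState state, bool deviceBusy)
{
    switch (state) {
    case QueueState::Rejecting:
        return Disposition::Reject;           // the queue accepts nothing
    case QueueState::Stalled:
        return Disposition::HoldInQueue;      // queued even if the device is idle
    case QueueState::Ready:
        return deviceBusy ? Disposition::HoldInQueue
                          : Disposition::SendToStartIo;
    }
    return Disposition::Reject;               // not reached; silences warnings
}
```

The only state in which the device's busy flag matters is READY, which is why a stalled queue can safely hold requests while the hardware has no resources.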

Table 6-3 lists the support functions you can use with a DEVQUEUE. I discussed how to use InitializeQueue, StartPacket, StartNextPacket, and CancelRequest in the preceding chapter. Now it's time to discuss all the other functions.

Table 6-3. DEVQUEUE Service Routines
Support Function	Description
AbortRequests	Aborts current and future requests
AllowRequests	Undoes effect of previous AbortRequests
AreRequestsBeingAborted	Are we currently aborting new requests?
CancelRequest	Generic cancel routine
CheckBusyAndStall	Checks for idle device and stalls requests in one atomic operation
CleanupRequests	Cancels all requests for a given file object in order to service IRP_MJ_CLEANUP
GetCurrentIrp	Determines which IRP is currently being processed by associated StartIo routine
InitializeQueue	Initializes DEVQUEUE object
RestartRequests	Restarts a stalled queue
StallRequests	Stalls the queue
StartNextPacket	Dequeues and starts the next request
StartPacket	Starts or queues a new request
WaitForCurrentIrp	Waits for current IRP to finish

The real point of using a DEVQUEUE instead of one of the queue objects defined in the DDK is that a DEVQUEUE makes it easier to manage the transitions between PnP states. In all of my sample drivers, the device extension contains a state variable with the imaginative name state. I also define an enumeration named DEVSTATE whose values correspond to the PnP states. When you initialize your device object in AddDevice, you'll call InitializeQueue for each of your device queues and also indicate that the device is in the STOPPED state:

NTSTATUS AddDevice(...)
{
    ...
    PDEVICE_EXTENSION pdx = ...;
    InitializeQueue(&pdx->dqReadWrite, StartIo);
    pdx->state = STOPPED;
    ...
}

After AddDevice returns, the system sends IRP_MJ_PNP requests to direct you through the various PnP states the device can assume.

NOTE
If your driver uses GENERIC.SYS, GENERIC will initialize your DEVQUEUE object or objects for you. Just be sure to give GENERIC the addresses of those objects in your call to InitializeGenericExtension.

Starting the Device

A newly initialized DEVQUEUE is in a STALLED state, such that a call to StartPacket will queue a request even when the device is idle. You'll keep the queue (or queues) in the STALLED state until you successfully process IRP_MN_START_DEVICE, whereupon you'll execute code like the following:

NTSTATUS HandleStartDevice(...)
{
    status = StartDevice(...);
    if (NT_SUCCESS(status))
    {
        pdx->state = WORKING;
        RestartRequests(&pdx->dqReadWrite, fdo);
    }
}

You record WORKING as the current state of your device, and you call RestartRequests for each of your queues to release any IRPs that might have arrived between the time AddDevice ran and the time you received the IRP_MN_START_DEVICE request.
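
To see why the restart call matters, it can help to model the queue in ordinary user-mode C++. The sketch below is a toy built on stated assumptions, not the real DEVQUEUE: integers stand in for IRPs, the ModelQueue class and its members are hypothetical, and there is no locking or cancellation. It shows only that requests arriving while the queue is stalled are held and that restarting the queue releases the first of them.

```cpp
#include <deque>
#include <vector>

// Hypothetical user-mode model of a stalled queue. Integers stand in for IRPs.
class ModelQueue {
public:
    bool stalled = true;          // assumed: a new queue starts out stalled
    bool busy = false;            // true while a request is "on the device"
    std::deque<int> pending;      // requests waiting their turn
    std::vector<int> started;     // order in which requests reached "StartIo"

    // Model of StartPacket: start the request only when the queue is
    // running and the device is idle; otherwise hold it.
    void StartPacket(int request) {
        if (stalled || busy)
            pending.push_back(request);
        else
            Start(request);
    }

    // Model of StartNextPacket: the current request has finished.
    void StartNextPacket() {
        busy = false;
        StartHeadIfPossible();
    }

    // Model of RestartRequests: leave the stalled state and start the backlog.
    void RestartRequests() {
        stalled = false;
        StartHeadIfPossible();
    }

private:
    void Start(int request) {
        busy = true;
        started.push_back(request);
    }

    void StartHeadIfPossible() {
        if (stalled || busy || pending.empty())
            return;
        int request = pending.front();
        pending.pop_front();
        Start(request);
    }
};
```

In this model, two requests submitted before the restart both sit in pending; the restart starts the first, and the second starts only when the first finishes, which keeps the device busy with one request at a time.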

Is It OK to Stop the Device?

The PnP Manager always asks your permission before sending you an IRP_MN_STOP_DEVICE. The query takes the form of an IRP_MN_QUERY_STOP_DEVICE request that you can cause to succeed or fail as you choose. The query basically means, "Would you be able to immediately stop your device if the system were to send you an IRP_MN_STOP_DEVICE in a few nanoseconds?" You can handle this query in two slightly different ways. Here's the first way, which is appropriate when your device might be busy with an IRP that either finishes quickly or can be easily terminated in the middle:

NTSTATUS HandleQueryStop(PDEVICE_OBJECT fdo, PIRP Irp)
{
    Irp->IoStatus.Status = STATUS_SUCCESS;
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;

    if (pdx->state != WORKING)                  // (1)
        return DefaultPnpHandler(fdo, Irp);

    if (!OkayToStop(pdx))                       // (2)
        return CompleteRequest(Irp, STATUS_UNSUCCESSFUL, 0);

    StallRequests(&pdx->dqReadWrite);           // (3)
    WaitForCurrentIrp(&pdx->dqReadWrite);

    pdx->state = PENDINGSTOP;                   // (4)
    return DefaultPnpHandler(fdo, Irp);
}

1. This statement handles a peculiar situation that can arise for a boot device: the PnP Manager might send you a QUERY_STOP when you haven't initialized yet. You want to ignore such a query, which is tantamount to saying yes.
2. At this point, you perform some sort of investigation to see whether it will be OK to revert to the STOPPED state. I'll discuss factors bearing on the investigation next.
3. StallRequests puts the DEVQUEUE in the STALLED state so that any new IRP just goes into the queue. WaitForCurrentIrp waits until the current request, if there is one, finishes on the device. These two steps make the device quiescent until we know whether the device is really going to stop or not. If the current IRP won't finish quickly of its own accord, you'll do something (such as calling IoCancelIrp to force a lower-level driver to finish the current IRP) to encourage it to finish; otherwise, WaitForCurrentIrp won't return.
4. At this point, we have no reason to demur. We therefore record our state as PENDINGSTOP. Then we pass the request down the stack so that other drivers can have a chance to accept or decline this query.

The other basic way of handling QUERY_STOP is appropriate when your device might be busy with a request that will take a long time and can't be stopped in the middle, such as a tape retension operation that can't be stopped without potentially breaking the tape. In this case, you can use the DEVQUEUE object's CheckBusyAndStall function. That function returns TRUE if the device is busy, whereupon you cause the QUERY_STOP to fail with STATUS_UNSUCCESSFUL. The function returns FALSE if the device is idle, in which case it also stalls the queue. (The operations of checking the state of the device and stalling the queue need to be protected by a spin lock, which is why I wrote this function in the first place.)
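
The reason the check and the stall must happen under one lock is easier to see in a small model. The following user-mode C++ sketch is an illustration only: StallableQueue and QueryStopSucceeds are hypothetical names, and a std::mutex stands in for the spin lock a real driver would use. It isn't the DEVQUEUE implementation.

```cpp
#include <mutex>

// Hypothetical user-mode model. A std::mutex stands in for the spin lock
// that guards the queue in a real driver.
struct StallableQueue {
    std::mutex lock;
    bool busy = false;      // is a request currently on the device?
    bool stalled = false;   // is the queue holding new requests?

    // Returns true if the device is busy (queue left untouched).
    // Returns false if the device is idle, after stalling the queue.
    bool CheckBusyAndStall() {
        std::lock_guard<std::mutex> guard(lock);
        if (busy)
            return true;
        stalled = true;     // no request can start between check and stall
        return false;
    }
};

// The query-stop decision built on top: succeed only if the device was idle.
bool QueryStopSucceeds(StallableQueue& q) {
    return !q.CheckBusyAndStall();
}
```

If the check and the stall were two separate locked operations, a new request could reach the device in between, and the query would succeed while a long-running operation was just getting under way.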

You can cause a stop query to fail for many reasons. Disk devices that are used for paging, for example, cannot be stopped. Neither can devices that are used for storing hibernation or crash dump files. (You'll know about these characteristics as a result of an IRP_MN_DEVICE_USAGE_NOTIFICATION request, which I'll discuss later in "Other Configuration Functionality.") Other reasons may also apply to your device.

Even if you have the query succeed, one of the drivers underneath you might cause it to fail for some reason. Even if all the drivers have the query succeed, the PnP Manager might decide not to shut you down. In any of these cases, you'll receive another PnP request with the minor code IRP_MN_CANCEL_STOP_DEVICE to tell you that your device won't be shut down. You should then clear whatever state you set during the initial query:

NTSTATUS HandleCancelStop(PDEVICE_OBJECT fdo, PIRP Irp)
{
    Irp->IoStatus.Status = STATUS_SUCCESS;
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    if (pdx->state != PENDINGSTOP)
        return DefaultPnpHandler(fdo, Irp);
    NTSTATUS status = ForwardAndWait(fdo, Irp);
    pdx->state = WORKING;
    RestartRequests(&pdx->dqReadWrite, fdo);
    return CompleteRequest(Irp, status);
}

We first check to see whether a stop operation is even pending. Some higher-level driver might have vetoed a query that we never saw, so we'd still be in the WORKING state. If we're not in the PENDINGSTOP state, we simply forward the IRP. Otherwise, we send the CANCEL_STOP IRP synchronously to the lower-level drivers. That is, we use our ForwardAndWait helper function to send the IRP down the stack and await its completion. We wait for low-level drivers because we're about to resume processing IRPs, and the drivers might have work to do before we send them an IRP. We then change our state variable to indicate that we're back in the WORKING state, and we call RestartRequests to unstall the queues we stalled when we caused the query to succeed.

While the Device Is Stopped

If, on the other hand, all device drivers have the query succeed and the PnP Manager decides to go ahead with the shutdown, you'll get an IRP_MN_STOP_DEVICE next. Your subdispatch function will look like this one:

NTSTATUS HandleStopDevice(PDEVICE_OBJECT fdo, PIRP Irp)
{
    Irp->IoStatus.Status = STATUS_SUCCESS;
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;

    if (pdx->state != PENDINGSTOP)              // (1)
    {
        <complicated stuff>
    }

    StopDevice(fdo, pdx->state == WORKING);     // (2)

    pdx->state = STOPPED;                       // (3)

    return DefaultPnpHandler(fdo, Irp);         // (4)
}

1. We expect the system to send us a QUERY_STOP before it sends us a STOP, so we should already be in the PENDINGSTOP state with all of our queues stalled. There is, however, a bug in Windows 98 such that we can sometimes get a STOP (without a QUERY_STOP) instead of a REMOVE. You need to take some action at this point that causes you to reject any new IRPs, but you mustn't really remove your device object or do the other things you do when you really receive a REMOVE request.
2. StopDevice is the helper function I've already discussed that deconfigures the device.
3. We now enter the STOPPED state. We're in almost the same situation as we were when AddDevice was done. That is, all queues are stalled, and the device has no I/O resources. The only difference is that we've left our registered interfaces enabled, which means that applications won't have received removal notifications and will leave their handles open. Applications can also open new handles in this situation. Both aspects are just as they should be because the stop condition won't last long.
4. As I previously discussed, the last thing we do to handle IRP_MN_STOP_DEVICE is pass the request down to the lower layers of the driver hierarchy.

Is It OK to Remove the Device?

Just as the PnP Manager asks your permission before shutting your device down with a stop device request, it also might ask your permission before removing your device. This query takes the form of an IRP_MN_QUERY_REMOVE_DEVICE request that you can, once again, cause to succeed or fail as you choose. And, just as with the stop query, the PnP Manager will use an IRP_MN_CANCEL_REMOVE_DEVICE request if it changes its mind about removing the device.

NTSTATUS HandleQueryRemove(PDEVICE_OBJECT fdo, PIRP Irp)
{
    Irp->IoStatus.Status = STATUS_SUCCESS;
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;

    if (OkayToRemove(fdo))                      // (1)
    {
        StallRequests(&pdx->dqReadWrite);       // (2)
        WaitForCurrentIrp(&pdx->dqReadWrite);

        pdx->prevstate = pdx->state;            // (3)
        pdx->state = PENDINGREMOVE;
        return DefaultPnpHandler(fdo, Irp);
    }
    return CompleteRequest(Irp, STATUS_UNSUCCESSFUL, 0);
}

NTSTATUS HandleCancelRemove(PDEVICE_OBJECT fdo, PIRP Irp)
{
    Irp->IoStatus.Status = STATUS_SUCCESS;
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;

    if (pdx->state != PENDINGREMOVE)            // (4)
        return DefaultPnpHandler(fdo, Irp);
    NTSTATUS status = ForwardAndWait(fdo, Irp);

    pdx->state = pdx->prevstate;                // (5)
    RestartRequests(&pdx->dqReadWrite, fdo);
    return CompleteRequest(Irp, status);
}

1. This OkayToRemove helper function provides the answer to the question, "Is it OK to remove this device?" In general, this answer includes some device-specific ingredients, such as whether the device holds a paging or hibernation file, and so on.
2. Just as I showed you for IRP_MN_QUERY_STOP_DEVICE, you want to stall the request queue and wait for a short period, if necessary, until the current request finishes.
3. If you look at Figure 6-1 carefully, you'll notice that it's possible to get a QUERY_REMOVE when you're in either the WORKING or the STOPPED state. The right thing to do if the current query is later cancelled is to return to the original state. Hence, I have a prevstate variable in the device extension to record the prequery state.
4. We get the CANCEL_REMOVE request when someone either above or below us vetoes a QUERY_REMOVE. If we never saw the query, we'll still be in the WORKING state and don't need to do anything with this IRP. Otherwise, we need to forward it to the lower levels before we process it because we want the lower levels to be ready to process the IRPs we're about to release from our queues.
5. Here we undo the steps we took when we succeeded the QUERY_REMOVE. We revert to the previous state. We stalled the queues when we handled the query and need to unstall them now.
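
The role of the saved state can be checked with a tiny user-mode model. This is a sketch under stated assumptions rather than driver code: DevState and ModelDevice are hypothetical names, and the model tracks only the state variable, ignoring the queues and the IRPs themselves.

```cpp
// Hypothetical user-mode model of the state bookkeeping around
// QUERY_REMOVE and CANCEL_REMOVE. Only the state variable is modeled.
enum class DevState { Stopped, Working, PendingStop, PendingRemove, Removed };

struct ModelDevice {
    DevState state = DevState::Stopped;
    DevState prevstate = DevState::Stopped;

    // A successful remove query remembers where we came from.
    void QueryRemove() {
        prevstate = state;
        state = DevState::PendingRemove;
    }

    // A cancelled query returns to the remembered state. If we never
    // saw the query, there is nothing to undo.
    void CancelRemove() {
        if (state != DevState::PendingRemove)
            return;
        state = prevstate;
    }
};
```

A device that was STOPPED when the query arrived goes back to STOPPED, and one that was WORKING goes back to WORKING. Hard-coding WORKING in the cancel path, as the stop case can do, would get the first case wrong.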

Synchronizing Removal

It turns out that the I/O Manager can send you PnP requests simultaneously with other substantive I/O requests, such as requests that involve reading or writing. It's entirely possible, therefore, for you to receive an IRP_MN_REMOVE_DEVICE at a time when you're still processing another IRP. It's up to you to prevent untoward consequences, and the standard way to do that involves using an IO_REMOVE_LOCK object and several associated kernel-mode support routines.

The basic idea behind the standard scheme for preventing premature removal is that you acquire the remove lock each time you start processing a request that you will pass down the PnP stack, and you release the lock when you're done. Before you remove your device object, you make sure that the lock is free. If not, you wait until all references to the lock are released. Figure 6-5 illustrates the process.

Figure 6-5. Operation of an IO_REMOVE_LOCK.

To handle the mechanics of this process, you define a variable in the device extension:

struct DEVICE_EXTENSION
{
    ...
    IO_REMOVE_LOCK RemoveLock;
    ...
};

You initialize the lock object during AddDevice:

NTSTATUS AddDevice(PDRIVER_OBJECT DriverObject, PDEVICE_OBJECT pdo)
{
    ...
    IoInitializeRemoveLock(&pdx->RemoveLock, 0, 0, 0);
    ...
}

The last three parameters to IoInitializeRemoveLock are, respectively, a tag value, an expected maximum lifetime for a lock, and a maximum lock count, none of which is used in the free build of the operating system.

These preliminaries set the stage for what you do during the lifetime of the device object. Whenever you receive an I/O request that you plan to forward down the stack, you call IoAcquireRemoveLock. IoAcquireRemoveLock will return STATUS_DELETE_PENDING if a removal operation is under way. Otherwise, it will acquire the lock and return STATUS_SUCCESS. Whenever you finish such an I/O operation, you call IoReleaseRemoveLock, which will release the lock and might unleash a heretofore pending removal operation. In the context of some purely hypothetical dispatch function that synchronously forwards an IRP, the code might look like this:

NTSTATUS DispatchSomething(PDEVICE_OBJECT fdo, PIRP Irp)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    NTSTATUS status = IoAcquireRemoveLock(&pdx->RemoveLock, Irp);
    if (!NT_SUCCESS(status))
        return CompleteRequest(Irp, status, 0);
    status = ForwardAndWait(fdo, Irp);
    if (!NT_SUCCESS(status))
    {
        IoReleaseRemoveLock(&pdx->RemoveLock, Irp);
        return CompleteRequest(Irp, status, 0);
    }
    ...
    IoReleaseRemoveLock(&pdx->RemoveLock, Irp);
    return CompleteRequest(Irp, <some code>, <info value>);
}

The second argument to IoAcquireRemoveLock and IoReleaseRemoveLock is just a tag value that a checked build of the operating system can use to match up acquisition and release calls, by the way.

The calls to acquire and release the remove lock dovetail with additional logic in the PnP dispatch function and the remove device subdispatch function. First, DispatchPnp has to obey the rule about locking and unlocking the device, so it will contain the following code, which I didn't show you earlier in "IRP_MJ_PNP Dispatch Function":

NTSTATUS DispatchPnp(PDEVICE_OBJECT fdo, PIRP Irp)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    NTSTATUS status = IoAcquireRemoveLock(&pdx->RemoveLock, Irp);
    if (!NT_SUCCESS(status))
        return CompleteRequest(Irp, status, 0);
    ...
    status = (*fcntab[fcn])(fdo, Irp);
    if (fcn != IRP_MN_REMOVE_DEVICE)
        IoReleaseRemoveLock(&pdx->RemoveLock, Irp);
    return status;
}

In other words, DispatchPnp locks the device, calls the subdispatch routine, and then (usually) unlocks the device afterward. The subdispatch routine for IRP_MN_REMOVE_DEVICE has additional special logic that you also haven't seen yet:

NTSTATUS HandleRemoveDevice(PDEVICE_OBJECT fdo, PIRP Irp)
{
    Irp->IoStatus.Status = STATUS_SUCCESS;
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;

    AbortRequests(&pdx->dqReadWrite, STATUS_DELETE_PENDING);    // (1)
    DeregisterAllInterfaces(pdx);
    StopDevice(fdo, pdx->state == WORKING);
    pdx->state = REMOVED;

    NTSTATUS status = DefaultPnpHandler(fdo, Irp);              // (2)

    IoReleaseRemoveLockAndWait(&pdx->RemoveLock, Irp);          // (3)
    RemoveDevice(fdo);
    return status;
}

1. Windows 98/Me doesn't send the SURPRISE_REMOVAL request, so this REMOVE IRP may be the first indication you have that the device has disappeared. Calling StopDevice allows you to release all your I/O resources in case you didn't get an earlier IRP that caused you to release them. Calling AbortRequests causes you to complete any queued IRPs and to start rejecting any new IRPs.
2. We pass this request to the lower layers now that we've done our work.
3. The PnP dispatch routine acquired the remove lock. We now call the special function IoReleaseRemoveLockAndWait to release that lock reference and wait until all references to the lock are released. Once you call IoReleaseRemoveLockAndWait, any subsequent call to IoAcquireRemoveLock will elicit a STATUS_DELETE_PENDING status to indicate that device removal is under way.
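
The bookkeeping behind these calls can be modeled in a few lines of user-mode C++. The sketch below is an assumption-based illustration, not the kernel's IO_REMOVE_LOCK: ModelRemoveLock and its members are hypothetical, and the model never blocks. Where the real IoReleaseRemoveLockAndWait would wait, the model's BeginRemoval just reports how many references are still outstanding.

```cpp
// Hypothetical user-mode model of the remove-lock bookkeeping.
// It does not block; it only tracks what the real routines would do.
struct ModelRemoveLock {
    int outstanding = 0;     // references currently held
    bool removing = false;   // set once removal has begun

    // Model of IoAcquireRemoveLock: fails once removal is under way.
    bool Acquire() {
        if (removing)
            return false;    // corresponds to STATUS_DELETE_PENDING
        ++outstanding;
        return true;
    }

    // Model of IoReleaseRemoveLock.
    void Release() { --outstanding; }

    // Model of the first half of IoReleaseRemoveLockAndWait: mark removal,
    // drop the caller's own reference, and report how many references the
    // real routine would still have to wait for.
    int BeginRemoval() {
        removing = true;
        --outstanding;
        return outstanding;
    }
};
```

Suppose a read IRP and the PnP dispatch routine each hold a reference. BeginRemoval then returns 1, meaning the real routine would wait for the read to finish, and any Acquire attempted in the meantime fails.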

NOTE
You'll notice that the IRP_MN_REMOVE_DEVICE handler might block while an IRP finishes. This is certainly OK in Windows 98/Me and Windows XP, which were designed with this possibility in mind: the IRP gets sent in the context of a system thread that's allowed to block. Some WDM functionality (a Microsoft developer even called it "embryonic") is present in OEM releases of Microsoft Windows 95, but you can't block a remove device request there. Consequently, if your driver needs to run in Windows 95, you need to discover that fact and avoid blocking. That discovery process is left as an exercise for you.

It bears repeating that you need to use the remove lock only for an IRP that you pass down the PnP stack. If you have the stamina, you can read the next section to understand exactly why this conclusion is true and note that it differs from the conventional wisdom that I and others have been espousing for several years. If someone sends you an IRP that you handle entirely inside your own driver, you can rely on whoever sent you the IRP to make sure your driver remains in memory until you both complete the IRP and return from your dispatch routine. If you send an IRP to someone outside your PnP stack, you'll use other means (such as a referenced file or device object) to keep the target driver in memory until it both completes the IRP and returns from its dispatch routine.

Why Do I Need This @#$! Remove Lock, Anyway?

A natural question at this point is why, in the context of a robust and full-featured modern operating system, you even need to worry about somebody unloading a driver when it knows, or should know, that it's busy handling an IRP. This question is hard to answer, but here goes.

The remove lock isn't necessary to guard against having your device object removed out from under you while you're processing an IRP. Rather, it protects you from sending an IRP down your PnP stack to a lower device object that no longer exists or that might cease to exist before the IRP finishes. To make this clear, I need to explain rather fully how the PnP Manager and the Object Manager work together to keep drivers and device objects around while they're needed. I'm grossly oversimplifying here in order to emphasize the basic things you need to understand.

First of all, every object that the Object Manager manages carries a reference count. When someone creates such an object, the Object Manager initializes the reference count to 1. Thereafter, anyone can call ObReferenceObject to increment the reference count and ObDereferenceObject to decrement it. For each type of object, there is a routine that you can call to destroy the object. For example, IoDeleteDevice is the routine you call to delete a DEVICE_OBJECT. That routine never directly releases the memory occupied by the object. Instead, it directly or indirectly calls ObDereferenceObject to release the original reference. Only when the reference count drops to 0 will the Object Manager actually destroy the object.
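
The counting rule in this paragraph can be written down as a short user-mode C++ model. It's a sketch for illustration only: ModelObject, Reference, Dereference, and DeleteObject are hypothetical stand-ins for an Object Manager object, ObReferenceObject, ObDereferenceObject, and a type-specific delete routine such as IoDeleteDevice, and a destroyed flag replaces actually freeing memory.

```cpp
// Hypothetical user-mode model of Object Manager reference counting.
struct ModelObject {
    int refs = 1;            // creation leaves the count at 1
    bool destroyed = false;  // stands in for freeing the object's memory
};

// Model of ObReferenceObject.
void Reference(ModelObject& o) { ++o.refs; }

// Model of ObDereferenceObject: destroy only when the count reaches 0.
void Dereference(ModelObject& o) {
    if (--o.refs == 0)
        o.destroyed = true;
}

// Model of a delete routine: it never frees the object directly,
// it just gives up the original creation reference.
void DeleteObject(ModelObject& o) { Dereference(o); }
```

If someone took an extra reference before the delete call, the object survives the delete and disappears only when that last reference is released.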

NOTE
In Chapter 5, I advised you to take an extra reference to a file object or device object discovered via IoGetDeviceObjectPointer around the call to IoCallDriver for an asynchronous IRP. The reason for the advice may now be clear: you want to be sure the target driver for the IRP is pinned in memory until its dispatch routine returns regardless of whether your completion routine releases the reference taken by IoGetDeviceObjectPointer. Dang, but this is getting complicated!

IoDeleteDevice makes some checks before it releases the last reference to a device object. In both operating systems, it checks whether the AttachedDevice pointer is NULL. This field in the device object points upward to the device object for the next upward driver. This field is set by IoAttachDeviceToDeviceStack and reset by IoDetachDevice, which are functions that WDM drivers call in their AddDevice and RemoveDevice functions, respectively.

You want to think about the entire PnP stack of device objects as being the target of IRPs that the I/O Manager and drivers outside the stack send to your device. This is because the driver for the topmost device object in the stack is always first to process any IRP. Before anyone sends an IRP to your stack, however, they will have a referenced pointer to this topmost device object, and they won't release the reference until after the IRP completes. So if a driver stack contains just one device object, there will never be any danger of having a device object or driver code disappear while the driver is processing an IRP: the IRP sender's reference pins the device object in memory, even if someone calls IoDeleteDevice before the IRP completes, and the device object pins the driver code in memory.

WDM driver stacks usually contain two or more device objects, so you have to wonder about the second and lower objects in a stack. After all, whoever sends an IRP to the device has a reference only to the topmost device object, not to the objects lower down in the stack. Imagine the following scenario, then. Someone sends an IRP_MJ_SOMETHING (a made-up major function to keep us focused on the remove lock) to the topmost filter device object (FiDO), whose driver sends it down the stack to your function driver. You plan to send this IRP down to the filter driver underneath you. But, at about the same time on another CPU, the PnP Manager has sent your driver stack an IRP_MN_REMOVE_DEVICE request.

Before the PnP Manager sends REMOVE_DEVICE requests, it takes an extra reference to every device object in the stack. Then it sends the IRP. Each driver passes the IRP down the stack and then calls IoDetachDevice followed by IoDeleteDevice. At each level, IoDeleteDevice sees that AttachedDevice is not (yet) NULL and decides that the time isn't quite right to dereference the device object. When the driver at the next higher level calls IoDetachDevice, however, the time is right, and the I/O Manager dereferences the device object. Without the PnP Manager's extra reference, the object would then disappear, and that might trigger unloading the driver at that level of the stack. Once the REMOVE_DEVICE request is complete, the PnP Manager will release all the extra references. That will allow all but the topmost device object to disappear because only the topmost object is protected by the reference owned by the sender of the IRP_MJ_SOMETHING.
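
The sequence just described can be traced with a user-mode C++ model. Like the text, this sketch grossly oversimplifies, and every name in it (StackDevice, Deref, DeleteDevice, DetachFrom, RemoveStack) is hypothetical. It assumes each driver finishes its removal work after the drivers below it, and it models destruction with a flag rather than freeing memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical user-mode model of one device object in a PnP stack.
struct StackDevice {
    int refs = 1;                      // creation reference
    StackDevice* attached = nullptr;   // next higher device, like AttachedDevice
    bool deletePending = false;        // delete was requested while still attached
    bool destroyed = false;
};

void Deref(StackDevice& d) {
    if (--d.refs == 0)
        d.destroyed = true;
}

// Model of IoDeleteDevice: defer while something is still attached above.
void DeleteDevice(StackDevice& d) {
    if (d.attached)
        d.deletePending = true;
    else
        Deref(d);
}

// Model of IoDetachDevice, called by the upper driver on the device below it.
void DetachFrom(StackDevice& lower) {
    lower.attached = nullptr;
    if (lower.deletePending) {
        lower.deletePending = false;
        Deref(lower);                  // the deferred delete now proceeds
    }
}

// Model of REMOVE_DEVICE for a whole stack (index 0 is the bottom).
void RemoveStack(std::vector<StackDevice>& stack) {
    for (auto& d : stack)
        ++d.refs;                      // PnP Manager's extra references
    for (std::size_t i = 0; i < stack.size(); ++i) {
        if (i > 0)
            DetachFrom(stack[i - 1]);  // detach from the device below
        DeleteDevice(stack[i]);
    }
    for (auto& d : stack)
        Deref(d);                      // REMOVE_DEVICE is complete
}
```

With a two-device stack in which an IRP sender holds one reference to the top device, the lower device is destroyed when RemoveStack returns, while the top device survives until the sender lets go. That surviving top device is exactly the driver that might still try to send an IRP down to a lower device that no longer exists.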

IMPORTANT
Every driver I've ever seen or written processes REMOVE_DEVICE synchronously. That is, no driver ever pends a REMOVE_DEVICE request. Consequently, the calls to IoDetachDevice and IoDeleteDevice at any level of the PnP stack always happen after the lower-level drivers have already performed those calls. This fact doesn't impact our analysis of the remove lock because the PnP Manager won't release its extra reference to the stack until after REMOVE_DEVICE actually completes, which requires IoCompleteRequest to run to conclusion.

Can you see why the Microsoft folks who understand the PnP Manager deeply are fond of saying, "Game Over" at this point? We're going to trust whoever is above us in the PnP stack to keep our device object and driver code in memory until we're done handling the IRP_MJ_SOMETHING that I hypothesized. But we haven't (yet) done anything to keep the next lower device object and driver in memory. While we were getting ready to send the IRP down, the IRP_MN_REMOVE_DEVICE ran to completion, and the lower driver is now gone!

And that's the problem that the remove lock solves: we simply don't want to pass an IRP down the stack if we've already returned from handling an IRP_MN_REMOVE_DEVICE. Conversely, we don't want to return from IRP_MN_REMOVE_DEVICE (and thereby allow the PnP Manager to release what might be the last reference to the lower device object) until we know the lower driver is done with all the IRPs that we've sent to it.

Armed with this understanding, let's look again at an IRP-handling scenario in which the remove lock is helpful. This is an example of my IRP-handling scenario 1 (pass down with completion routine) from Chapter 5:

NTSTATUS DispatchSomething(PDEVICE_OBJECT fdo, PIRP Irp)
{
    PDEVICE_EXTENSION pdx = (PDEVICE_EXTENSION) fdo->DeviceExtension;
    NTSTATUS status = IoAcquireRemoveLock(&pdx->RemoveLock, Irp);    // point A
    if (!NT_SUCCESS(status))
        return CompleteRequest(Irp, status, 0);
    IoCopyCurrentIrpStackLocationToNext(Irp);
    IoSetCompletionRoutine(Irp, (PIO_COMPLETION_ROUTINE) CompletionRoutine,
        pdx, TRUE, TRUE, TRUE);
    return IoCallDriver(pdx->LowerDeviceObject, Irp);
}

NTSTATUS CompletionRoutine(PDEVICE_OBJECT fdo, PIRP Irp, PDEVICE_EXTENSION pdx)
{
    if (Irp->PendingReturned)
        IoMarkIrpPending(Irp);
    <desired completion processing>
    IoReleaseRemoveLock(&pdx->RemoveLock, Irp);                      // point B
    return STATUS_SUCCESS;
}

In summary, we acquire the remove lock for this IRP in the dispatch routine, and we release it in the completion routine. Suppose this IRP is racing an IRP_MN_REMOVE_DEVICE down the stack. If our HandleRemoveDevice function has gotten to the point of calling IoReleaseRemoveLockAndWait before we get to point A, perhaps all the device objects in the stack are teetering on the edge of extinction because the REMOVE_DEVICE may have finished long ago. If we're the topmost device object, somebody's reference is keeping us alive. If we're lower down the stack, the driver above us is keeping us alive. Either way, it's certainly OK for us to execute instructions. We'll find that our call to IoAcquireRemoveLock returns STATUS_DELETE_PENDING, so we'll just complete the IRP and return.

Suppose instead that we win the race by calling IoAcquireRemoveLock before our HandleRemoveDevice function calls IoReleaseRemoveLockAndWait. In this case, we'll pass the IRP down the stack. IoReleaseRemoveLockAndWait will block until our completion routine (at point B) releases the lock. At this exact instant, we fall back on the IRP sender's reference or the driver above us to keep us in memory long enough for our completion routine to return.

At this point in the analysis, I have to raise an alarming point that everyone who writes WDM drivers or writes or lectures about them, including me, has missed until now. Passing an IRP down without a completion routine is actually unsafe because it allows us to send an IRP down to a driver that isn't pinned in memory. Anytime you see a call to IoSkipCurrentIrpStackLocation (there are 204 of them in the Windows XP DDK), your antennae should twitch. We've all been getting away with this because some redundant protections are in place and because the coincidence of an IRP_MN_REMOVE_DEVICE with some kind of problem IRP is very rare. Refer to the sidebar for a discussion.

The Redundant Guards Against Early Removal

As the text says, Windows XP contains some redundant protections against early removal of device objects. In both Windows XP and Windows 2000, the PnP Manager won't send an IRP_MN_REMOVE_DEVICE if any file objects exist that point to any device object in the stack. Many IRPs are handle based in that they originate in callers that hold a referenced pointer to a file object. Consequently, there is never a concern with these handle-based IRPs that your lower device object might disappear. You can dispense with the remove lock altogether for these IRPs if you trust all the drivers who send them to you to either have a referenced file object or hold their own remove lock while they're outstanding.

There is a large class of IRP that device drivers never see because these IRPs involve file system operations on volumes. Thus, worrying about what might happen as a device driver handles an IRP_MJ_QUERY_VOLUME_INFORMATION, for example, isn't practical.

Only a few IRPs aren't handle based or aimed at file system drivers, and most of them carry their own built-in safeguards. To get an IRP_MJ_SHUTDOWN, you have to specifically register with the I/O Manager by calling IoRegisterShutdownNotification. IoDeleteDevice automatically deregisters you if you happen to forget, and you won't be getting REMOVE_DEVICE requests while shutdown notifications are in progress. The DDK doesn't say so, but you shouldn't pass this IRP down the PnP stack: any driver in the stack that wants to receive this IRP has to register separately.

IRP_MJ_SYSTEM_CONTROL is another special case. The Windows Management Instrumentation (WMI) subsystem uses this request to perform WMI query and set operations. Part of your StopDevice processing ought to be deregistering with WMI, and the deregistration call doesn't return until all of these IRPs have drained through your device. After the deregistration call, you won't get any more WMI requests.

The PnP Manager itself is the source of most IRP_MJ_PNP requests, and you can be sure that it won't overlap a REMOVE_DEVICE request with another PnP IRP. You can't, however, be sure there's no overlap with PnP IRPs sent by other drivers, such as a QUERY_DEVICE_RELATIONS to get the physical device object (PDO) address or a QUERY_INTERFACE to locate a direct-call interface.

Finally, there's IRP_MJ_POWER, which is a potential problem because the Power Manager doesn't lock an entire device stack and doesn't hold a file object pointer.

The window of vulnerability is actually pretty small. Consider the following fragment of dispatch routines in two drivers:

NTSTATUS DriverA_DispatchSomething(...)
{
    ...
    NTSTATUS status = IoAcquireRemoveLock(...);
    if (!NT_SUCCESS(status))
        return CompleteRequest(...);
    IoSkipCurrentIrpStackLocation(...);
    status = IoCallDriver(...);
    IoReleaseRemoveLock(...);
    return status;
}

NTSTATUS DriverB_DispatchSomething(...)
{
    ...
    return ??;
}

Driver A's use of the remove lock protects Driver B until Driver B's dispatch routine returns. Thus, if Driver B completes the IRP or itself passes the IRP down using IoSkipCurrentIrpStackLocation, Driver B's involvement with the IRP will certainly be finished by the time Driver A is able to release the remove lock. If Driver B were to pend the IRP, Driver A wouldn't be holding the remove lock by the time Driver B got around to completing the IRP. We can assume, however, that Driver B will have some mechanism in place for purging its queues of pending IRPs before returning from its own HandleRemoveDevice function. Driver A won't call IoDetachDevice or return from its own HandleRemoveDevice function until afterwards.

The only time there will be a problem is if Driver B passes the IRP down with a completion routine installed via the original IoSetCompletionRoutine macro. Even here, if the lowest driver that handles this IRP does so correctly, its HandleRemoveDevice function won't return until the IRP is completed. We'll have just a slim chance that Driver B could be unloaded before its completion routine runs.

There is, unfortunately, no way for a driver to completely protect itself from being unloaded while processing an IRP. Any scheme you or I can devise will inevitably risk executing at least one instruction (a return) after the system removes the driver image from memory. You can, however, hope that the drivers above you minimize the risk by using the techniques I've outlined here.
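The remove-lock bookkeeping that this discussion leans on can be modeled in a few lines of user-mode C. This is only a sketch under stated assumptions, not the I/O Manager's implementation: RemoveLockModel and the rl_* functions are hypothetical names, the model is single-threaded, and where a real driver would block inside IoReleaseRemoveLockAndWait, the model merely reports whether removal could proceed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of remove-lock bookkeeping. */
typedef struct {
    long holders;   /* outstanding successful acquisitions */
    bool removing;  /* set once removal has begun */
} RemoveLockModel;

/* Models IoAcquireRemoveLock: fails once removal is pending
   (the real call returns STATUS_DELETE_PENDING in that case). */
static bool rl_acquire(RemoveLockModel *l)
{
    if (l->removing)
        return false;
    l->holders++;
    return true;
}

/* Models IoReleaseRemoveLock. */
static void rl_release(RemoveLockModel *l)
{
    assert(l->holders > 0);
    l->holders--;
}

/* Models the start of removal. Returns true if removal may proceed
   immediately; a real driver would block until the count drains. */
static bool rl_begin_remove(RemoveLockModel *l)
{
    l->removing = true;
    return l->holders == 0;
}

/* True once removal has begun and every holder has released. */
static bool rl_drained(const RemoveLockModel *l)
{
    return l->removing && l->holders == 0;
}
```

In this model, an acquisition made by Driver A's dispatch routine keeps rl_begin_remove from succeeding until the matching rl_release, which is the window during which Driver B is protected.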

How the DEVQUEUE Works with PnP

In contrast with other examples in this book, I'm going to show you the full implementation of the DEVQUEUE object, even though the source code is in the companion content. I'm making an exception in this case because I think an annotated listing of the functions will make it easier for you to understand how to use it. We've already discussed the major routines in the preceding chapter, so I can focus here on the routines that dovetail with IRP_MJ_PNP.

Stalling the Queue

Stalling the IRP queue involves two DEVQUEUE functions:

VOID NTAPI StallRequests(PDEVQUEUE pdq)
{
    InterlockedIncrement(&pdq->stallcount);        // 1
}

BOOLEAN NTAPI CheckBusyAndStall(PDEVQUEUE pdq)
{
    KIRQL oldirql;
    KeAcquireSpinLock(&pdq->lock, &oldirql);       // 2
    BOOLEAN busy = pdq->CurrentIrp != NULL;        // 3
    if (!busy)
        InterlockedIncrement(&pdq->stallcount);    // 4
    KeReleaseSpinLock(&pdq->lock, oldirql);
    return busy;
}

1. To stall requests, we just need to set the stall counter to a nonzero value. It's unnecessary to protect the increment with a spin lock because any thread that might be racing with us to change the value will also be using an interlocked increment or decrement.
2. Since CheckBusyAndStall needs to operate as an atomic function, we first take the queue's spin lock.
3. CurrentIrp being non-NULL is the signal that the device is busy handling one of the requests from this queue.
4. If the device is currently idle, this statement starts stalling the queue, thereby preventing the device from becoming busy later on.

Recall that StartPacket and StartNextPacket don't send IRPs to the queue's StartIo routine while the stall counter is nonzero. In addition, InitializeQueue initializes the stall counter to 1, so the queue begins life in the stalled state.

Restarting the Queue

RestartRequests is the function that unstalls a queue. This function is quite similar to StartNextPacket, which I showed you in Chapter 5.

VOID RestartRequests(PDEVQUEUE pdq, PDEVICE_OBJECT fdo)
{
    KIRQL oldirql;
    KeAcquireSpinLock(&pdq->lock, &oldirql);               // 1
    if (InterlockedDecrement(&pdq->stallcount) > 0)        // 2
    {
        KeReleaseSpinLock(&pdq->lock, oldirql);
        return;
    }
    while (!pdq->stallcount && !pdq->CurrentIrp            // 3
        && !pdq->abortstatus && !IsListEmpty(&pdq->head))
    {
        PLIST_ENTRY next = RemoveHeadList(&pdq->head);
        PIRP Irp = CONTAINING_RECORD(next, IRP, Tail.Overlay.ListEntry);
        if (!IoSetCancelRoutine(Irp, NULL))
        {
            InitializeListHead(&Irp->Tail.Overlay.ListEntry);
            continue;
        }
        pdq->CurrentIrp = Irp;
        KeReleaseSpinLockFromDpcLevel(&pdq->lock);
        (*pdq->StartIo)(fdo, Irp);
        KeLowerIrql(oldirql);
        return;
    }
    KeReleaseSpinLock(&pdq->lock, oldirql);
}

1. We acquire the queue spin lock to prevent interference from a simultaneous invocation of StartPacket.
2. Here we decrement the stall counter. If it's still nonzero, the queue remains stalled, and we return.
3. This loop duplicates a similar loop inside StartNextPacket. We need to duplicate the code here to accomplish all of this function's actions within one invocation of the spin lock.
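The way stall and restart calls nest can be seen in a small user-mode model. This is a sketch under assumptions rather than the DEVQUEUE source: StallModel and the stall_model_* functions are hypothetical names, C11 atomics stand in for the Interlocked functions, a simple count stands in for the list of queued IRPs, and the model is single-threaded, so it omits the spin lock entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-mode model of the DEVQUEUE stall counter. */
typedef struct {
    atomic_long stallcount;  /* nonzero means the queue is STALLED */
    int pending;             /* stands in for the list of queued IRPs */
    bool busy;               /* stands in for CurrentIrp != NULL */
} StallModel;

static void stall_model_init(StallModel *q)
{
    atomic_init(&q->stallcount, 1);  /* the queue begins life stalled */
    q->pending = 0;
    q->busy = false;
}

/* Models StallRequests: each call adds one level of stalling. */
static void stall_model_stall(StallModel *q)
{
    atomic_fetch_add(&q->stallcount, 1);
}

/* Models StartPacket: returns true if the request went straight to
   the device, false if it had to be queued. */
static bool stall_model_start_packet(StallModel *q)
{
    if (atomic_load(&q->stallcount) != 0 || q->busy) {
        q->pending++;
        return false;
    }
    q->busy = true;
    return true;
}

/* Models RestartRequests: returns true only when this call brought the
   counter to zero, in which case one queued request is started. */
static bool stall_model_restart(StallModel *q)
{
    long now = atomic_fetch_sub(&q->stallcount, 1) - 1;
    assert(now >= 0);
    if (now > 0)
        return false;            /* still stalled by someone else */
    if (!q->busy && q->pending > 0) {
        q->pending--;
        q->busy = true;
    }
    return true;
}
```

In the model, a queue that has been stalled twice needs two restart calls before a queued request reaches the device, which mirrors the counter semantics described above.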

NOTE
True confession: The first edition described a much simpler and incorrect implementation of RestartRequests. A reader pointed out a race between the earlier implementation and StartPacket, which was corrected on my Web site as shown here.

Awaiting the Current IRP

The handler for IRP_MN_STOP_DEVICE might need to wait for the current IRP, if any, to finish by calling WaitForCurrentIrp:

VOID NTAPI WaitForCurrentIrp(PDEVQUEUE pdq)
{
    KeClearEvent(&pdq->evStop);                    // 1
    ASSERT(pdq->stallcount != 0);                  // 2
    KIRQL oldirql;
    KeAcquireSpinLock(&pdq->lock, &oldirql);       // 3
    BOOLEAN mustwait = pdq->CurrentIrp != NULL;
    KeReleaseSpinLock(&pdq->lock, oldirql);
    if (mustwait)
        KeWaitForSingleObject(&pdq->evStop, Executive, KernelMode, FALSE, NULL);
}

1. StartNextPacket signals the evStop event each time it's called. We want to be sure that the wait we're about to perform doesn't complete because of a now-stale signal, so we clear the event before doing anything else.
2. It doesn't make sense to call this routine without first stalling the queue. Otherwise, StartNextPacket will just start the next IRP if there is one, and the device will become busy again.
3. If the device is currently busy, we'll wait on the evStop event until someone calls StartNextPacket to signal that event. We need to protect our inspection of CurrentIrp with the spin lock because, in general, testing a pointer for NULL isn't an atomic event. If the pointer is NULL now, it can't change later because we've assumed that the queue is stalled.
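The reason for clearing the event first can be illustrated with a single-threaded model. This is a sketch only: WaitModel and the wait_model_* functions are hypothetical names, and the model assumes that evStop behaves like an event that stays signaled until someone clears it. No real blocking takes place; the model just reports whether a wait would be needed and whether it would be satisfied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state WaitForCurrentIrp looks at. */
typedef struct {
    bool ev_stop;     /* models evStop: stays signaled until cleared */
    bool busy;        /* models CurrentIrp != NULL */
    long stallcount;  /* must be nonzero while waiting */
} WaitModel;

/* Models the part of StartNextPacket that matters here: the current
   request is finished and the event is signaled. Because the queue is
   stalled, no new request is started. */
static void wait_model_finish_current(WaitModel *q)
{
    q->busy = false;
    q->ev_stop = true;
}

/* Models the decision made by WaitForCurrentIrp. Returns true if the
   caller would have to block on the event. */
static bool wait_model_must_wait(WaitModel *q)
{
    q->ev_stop = false;           /* discard any stale signal first */
    assert(q->stallcount != 0);   /* waiting only makes sense when stalled */
    return q->busy;
}

/* True if a wait on the event would complete right now. */
static bool wait_model_wait_satisfied(const WaitModel *q)
{
    return q->ev_stop;
}
```

If the event were left signaled from an earlier completion, the wait in this model would be satisfied immediately even though the device was still busy. Clearing it first means only a later completion can end the wait.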

Aborting Requests

Surprise removal of the device demands that we immediately halt every outstanding IRP that might try to touch the hardware. In addition, we want to make sure that all further IRPs are rejected. The AbortRequests function helps with these tasks:

VOID NTAPI AbortRequests(PDEVQUEUE pdq, NTSTATUS status)
{
    pdq->abortstatus = status;
    CleanupRequests(pdq, NULL, status);
}

Setting abortstatus puts the queue in the REJECTING state so that all future IRPs will be rejected with the status value our caller supplied. Calling CleanupRequests at this point empties the queue: we pass a NULL file object pointer so that CleanupRequests will process every queued IRP.

We don't dare try to do anything with the IRP, if any, that's currently active on the hardware. Drivers that don't use the hardware abstraction layer (HAL) to access the hardware (USB drivers, for example, which rely on the hub and host-controller drivers) can count on another driver to cause the current IRP to fail. Drivers that use the HAL might, however, need to worry about hanging the system or, at the very least, leaving an IRP in limbo because the nonexistent hardware can't generate the interrupt that would let the IRP finish. To deal with situations such as this, you call AreRequestsBeingAborted:

NTSTATUS AreRequestsBeingAborted(PDEVQUEUE pdq)
{
    return pdq->abortstatus;
}

It would be silly, by the way, to use the queue spin lock in this routine. Suppose we capture the instantaneous value of abortstatus in a thread-safe and multiprocessor-safe way. The value we return can become obsolete as soon as we release the spin lock.

NOTE
If your device might be removed in such a way that an outstanding request simply hangs, you should also have some sort of watchdog timer running that will let you kill the IRP after a specified period of time. See the "Watchdog Timers" section in Chapter 14.

Sometimes we need to undo the effect of a previous call to AbortRequests. AllowRequests lets us do that:

VOID NTAPI AllowRequests(PDEVQUEUE pdq)
{
    pdq->abortstatus = (NTSTATUS) 0;
}
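The REJECTING state can be pictured with a small user-mode model. This is a sketch under assumptions, not the DEVQUEUE source: AbortModel and the abort_model_* functions are hypothetical names, the status codes are placeholders rather than real NTSTATUS values, and a count stands in for the list of queued IRPs.

```c
#include <assert.h>

/* Placeholder status codes for the model; not real NTSTATUS values. */
enum {
    MODEL_SUCCESS = 0,
    MODEL_PENDING = 1,
    MODEL_DELETE_PENDING = -1
};

/* Hypothetical model of the abort-related part of a DEVQUEUE. */
typedef struct {
    int abortstatus;  /* nonzero means the queue is REJECTING */
    int pending;      /* stands in for the list of queued IRPs */
} AbortModel;

/* Models how a new request is treated: rejected with the abort status
   while the queue is REJECTING, queued otherwise. */
static int abort_model_start_packet(AbortModel *q)
{
    if (q->abortstatus != MODEL_SUCCESS)
        return q->abortstatus;
    q->pending++;
    return MODEL_PENDING;
}

/* Models AbortRequests: records the status and empties the queue.
   Returns how many queued requests were failed. */
static int abort_model_abort(AbortModel *q, int status)
{
    int failed = q->pending;
    q->abortstatus = status;
    q->pending = 0;
    return failed;
}

/* Models AreRequestsBeingAborted. */
static int abort_model_being_aborted(const AbortModel *q)
{
    return q->abortstatus;
}

/* Models AllowRequests: the queue accepts requests again. */
static void abort_model_allow(AbortModel *q)
{
    q->abortstatus = MODEL_SUCCESS;
}
```

In the model, a request submitted between the abort and allow calls comes back with the status that was passed to the abort call, and requests submitted after the allow call are queued normally again.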