One of the privileges, and chores, that comes with owning a core kernel component in Microsoft Windows is that I get to analyze a lot of operating system crashes that appear in that component. As the owner of the I/O Manager, I have had the opportunity to debug many driver-related issues. I learned a lot from these crashes. As I debugged the crash dumps, patterns began to emerge.
To understand the problems holistically, I decided I needed a better understanding of the various device stacks (such as storage, audio, and display) and the interconnects (such as USB and 1394). So I launched what we called Driver Stack Reviews with the development leads of the device teams in the Windows Division. After numerous reviews, we concluded that our underlying driver model was too complex. We did not have the right abstractions, and we were putting too much burden on the driver developer.
The Windows Driver Model (WDM) grew organically over 14 years of development and was showing its age. Although WDM is very flexible and can support many different devices, it has a fairly low level of abstraction. It was built for a small number of developers who had either a deep understanding of the Windows kernel or access to the kernel developers. It was not built for what is now a large pool of driver developers, who currently number in the thousands.
Too many of the rules were not well understood and were extremely difficult to describe clearly. Fundamental operating system changes like support for Plug and Play and power management were not integrated well with the Windows I/O subsystem, mainly because we wanted to be able to run both Plug and Play and non-Plug and Play drivers side by side. This meant that the operating system design pushed onto the drivers the huge burden of synchronizing Plug and Play and power events with I/O requests. The rules for synchronization are complex, difficult to understand, and not well documented. In addition, most drivers have not properly handled asynchronous I/O and I/O cancellation, even though asynchronous, cancelable I/O was designed into the operating system from the start.
Although these conclusions seemed intuitively obvious, we needed to validate them against external data. Microsoft Windows XP included a great feature called Windows Error Reporting (WER), which allows Online Crash Analysis (OCA). When Windows stops unexpectedly and displays an error message on a blue screen, the system creates a minidump of the crash, which we receive when users choose to send crash data to Microsoft. When we saw the high number of crashes, we knew that we needed to make some fundamental changes in how drivers were developed.
We also conducted a survey and held face-to-face sessions with third-party driver developers to validate our findings and present our proposal for simplifying the driver model. These discussions were eye-opening. A majority of driver developers found our driver model complex and difficult to use, especially the components related to Plug and Play, power management, and cancellation of asynchronous requests. The developers were strongly in favor of a simpler driver model. In addition, they added a few requirements that we had not considered before.
First, a simpler driver model had to work over a range of operating system platforms. Hardware vendors wanted to write and maintain a single driver for a range of operating system versions. A new driver model that worked only on the latest Windows version was not acceptable.
Second, driver developers could not be restricted to using a small set of APIs, an approach we had used in some of our device-class-specific driver models. They had to be able to escape out of the driver model to the underlying platform.
With this input, we started the work on the Windows Driver Foundation (WDF). The goal was to build a next-generation driver model that met the needs of all device classes.
For WDF, we used a different developmental methodology: We got external driver developers involved in the design right from the start by holding design reviews. As soon as we developed the specifications, we invited some developers to a roundtable discussion (the first was in November 2002), so we got useful comments even before we started writing code. We sponsored an e-mail alias and discussion groups where we debated design choices. Several internal and external early adopters used our framework to write drivers and gave us great feedback. We also sought and received feedback at WinHEC and through the driver developer newsgroups.
WDF went through several iterations as it developed into what it is today. Based on the feedback we got during development, we redid our Plug and Play and power management implementation as well as our synchronization logic. In particular, the Plug and Play and power management implementation was redesigned to use state machines. This helped to make the operations explicit, so that it was easy to comprehend the relationships between I/O and Plug and Play. As more WDF drivers were developed, we discovered more rules related to Plug and Play and power management and incorporated the rules into the state machines. One of the key benefits of using WDF is that every driver automatically gets a copy of this well-tested, well-engineered Plug and Play and power management implementation.
The OCA data also indicated that we should address the problem with crashes in another, more radical way. OCA data showed that 85 percent of unexpected system stops were caused by drivers and not by core Windows kernel components. Analysis showed that drivers for many device classes (notably USB, Bluetooth, and 1394 interconnects) did not need to be in kernel mode. Moving drivers to user mode has many benefits. For example, crashes in user-mode drivers can be fully isolated and the system can recover without rebooting. The programming environment in user mode is considerably simpler than in kernel mode. Developers have access to many tools and rich languages to write their code. Debugging is much simpler. A significant advancement with WDF is that we provide the same driver model in both user mode and kernel mode.
Although driver model simplifications address many issues that cause system crashes, they do not address programmer errors like buffer overruns, uninitialized variables, incorrect usage of system routines (such as completing a request more than once), and so forth. The work at Microsoft Research (MSR) in the area of static analysis tools addressed this piece of the puzzle. MSR had developed prototypes of tools that could understand the rules of a driver model and formally analyze source code. We decided to turn two of these ideas into tools that would become part of WDF: Static Driver Verifier (SDV) and PREfast for Drivers (PFD).
With the release of Windows Vista, both the first version of WDF and our static tools became available to driver developers in the Windows Driver Kit (WDK). WDF and the static tools have laid a good foundation for our driver development platform. The initial release of Windows Vista included about 17 KMDF drivers, covering a wide variety of device classes. In user mode, both Microsoft Windows SideShow and Windows Portable Media technologies support UMDF drivers. Microsoft will continue to build on this foundation to meet the needs of current and future device classes.
This book captures the essentials of the WDF frameworks and static tools, and it makes available for the first time a single source for all information related to WDF. The book should help any driver developer, even a novice, get up to speed quickly on WDF. You will find that WDF enables you to develop a higher-quality driver in significantly less time than the older driver models.
Windows Device Experience Group