How DMA Works under Windows 2000

Just as the Windows 2000 operating system abstracts all other pieces of system hardware, DMA operations also follow a strict abstract model. Drivers that perform DMA within the framework of this abstraction can ignore many of the hardware-specific aspects of the system platform. This section presents the major features of the Windows 2000 DMA framework.

Hiding DMA Hardware Variations with Adapter Objects

The purpose of using DMA is to minimize the CPU's involvement in data transfer operations. To do this, DMA devices use an auxiliary processor, called a DMA controller, to move data between memory and a peripheral device. This allows the CPU to continue doing other useful work in parallel with the I/O operation.

Although the exact details vary, most DMA controllers have a very similar architecture. In its simplest form, this consists of an address register for the starting address of the DMA buffer and a count register for the number of bytes or words to transfer. When these registers are properly programmed and the device started, the DMA controller begins moving data on its own. With each transfer, it increments the memory address register and decrements the count register. When the count register reaches zero, the DMA controller generates an interrupt, and the device is ready for another transfer.
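
To make this model concrete, the fragment below programs a hypothetical controller of this simple form. The register layout (DMA_ADDR, DMA_COUNT, DMA_CTRL) and the helper routine are invented for illustration only; real hardware defines its own registers and protocol.

    /* Hypothetical register offsets for a simple DMA controller --
       invented for this example only. */
    #define DMA_ADDR   0x00    /* starting physical address of buffer */
    #define DMA_COUNT  0x04    /* number of bytes to transfer */
    #define DMA_CTRL   0x08    /* bit 0 = start */

    VOID StartSimpleDmaTransfer(
        PUCHAR PortBase,           /* mapped base of device registers */
        PHYSICAL_ADDRESS Buffer,   /* physical address of DMA buffer */
        ULONG ByteCount )
    {
        /* Program the address and count registers... */
        WRITE_PORT_ULONG( (PULONG)(PortBase + DMA_ADDR), Buffer.LowPart );
        WRITE_PORT_ULONG( (PULONG)(PortBase + DMA_COUNT), ByteCount );

        /* ...then start the controller. From here on, the hardware
           increments the address, decrements the count, and interrupts
           when the count reaches zero. */
        WRITE_PORT_ULONG( (PULONG)(PortBase + DMA_CTRL), 1 );
    }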

Unfortunately, the needs of real-world hardware design complicate the simple picture. Consider the DMA implementation on ISA-based machines, described in Chapter 2. These systems use a pair of Intel 8237 controller chips cascaded to provide four primary and three secondary DMA data channels. The primary channels (identified as zero through three) can perform single-byte transfers, while the secondary channels (five through seven) always transfer two bytes at a time. Since the 8237 uses a 16-bit transfer counter, the primary and secondary channels can handle only 64 KB and 128 KB per operation, respectively. Due to limitations of the ISA architecture, the DMA buffer must be located in the first 16 megabytes of physical memory.

Contrast this with the DMA architecture used by EISA systems. The Intel 82357 EISA I/O controller extends ISA capabilities by supporting one-, two-, and four-byte transfers on any DMA channel, as well as allowing DMA buffers to be located anywhere in the 32-bit address space. In addition, EISA introduces three new DMA bus-cycle formats (known as types A, B, and C) to give peripheral designers the ability to work with faster devices.

Even on the same ISA or EISA bus, different devices can use different DMA techniques. For example, slave DMA devices compete for shareable system DMA hardware on the motherboard, while bus masters avoid bottlenecks by using their own built-in DMA controllers.

The problem with all this variety is that it tends to make DMA drivers very platform-dependent. To avoid this trap, Windows 2000 drivers don't manipulate DMA hardware directly. Instead, they work with an abstract representation of the hardware in the form of an Adapter object. Chapter 4 briefly introduced these objects and said they help with orderly sharing of system DMA resources. It turns out that Adapter objects also simplify the task of writing platform-independent drivers by hiding many of the details of setting up the DMA hardware. The rest of this section explains more about what Adapter objects do and how to use them in a driver.

The Scatter/Gather Problem

Although virtual memory simplifies the lives of application developers, it introduces two major complications for DMA-based drivers. The first problem is that the buffer address passed to the I/O Manager is a virtual address. Since the DMA controller works with physical addresses, DMA drivers need some way to determine the physical pages making up a virtual buffer. The next section explains how Memory Descriptor Lists perform this translation.

The other problem (illustrated in Figure 12.1) is that a process doesn't necessarily occupy consecutive pages of physical memory, and what appears to be a contiguous buffer in virtual space is probably scattered throughout physical memory. The Windows 2000 Virtual Memory Manager uses the platform's address translation hardware (represented by a generic page table in the diagram) to give the process the illusion of a single, unbroken virtual address space. Unfortunately, the DMA controller doesn't participate in this illusion.

Figure 12.1. Address spaces involved in DMA operations.
graphics/12fig01.gif

Since most DMA controllers can only generate sequential physical addresses, buffers that span virtual page boundaries present a serious challenge. Consider what happens if a DMA controller starts at the top of a multipage buffer and simply increments its way through successive pages of physical memory. It's unlikely that any page after the first actually corresponds to one of the caller's virtual buffer pages. In fact, the pages touched by the DMA controller probably won't even belong to the process issuing the I/O request.

All virtual memory systems have to deal with the problem of scattering and gathering physical buffer pages during a DMA operation. Support for scatter/gather capabilities can come either from system DMA hardware or from hardware built into a smart bus master device. Once again, Windows 2000 tries to simplify the process by presenting drivers with a unified, abstract view of whatever scatter/gather hardware happens to exist on the system. This model consists of a contiguous range of addresses, called logical space, used by the DMA hardware and a set of mapping registers to translate logical space addresses into physical space addresses.

Referring to Figure 12.1, each mapping register corresponds to one page of DMA logical space, and a group of consecutively numbered registers represents a contiguous range of logical addresses. To perform a DMA transfer, a driver first allocates enough contiguous mapping registers to account for all the pages in the caller's buffer. It then loads consecutive mapping registers with the physical addresses of the caller's buffer pages. This has the effect of mapping the physically noncontiguous user buffer into a contiguous area of logical space. Finally, the driver loads the DMA controller with the starting address of the buffer in logical space and starts the device. While the operation is in progress, the DMA controller generates sequential, logical addresses that the scatter/gather hardware maps to appropriate physical page references.
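
As a preview of the full treatment later in this chapter, the sketch below shows roughly how this sequence looks with the Windows 2000 DMA support routines. It assumes the driver has already obtained a DMA_ADAPTER object with IoGetDmaAdapter and that the I/O Manager has just called the driver's AdapterControl callback with a block of map registers; the device-extension fields and the StartDevice helper are hypothetical, and error handling is omitted.

    /* Called by the I/O Manager once the adapter channel and map
       registers have been granted. pDevExt's fields and StartDevice
       are hypothetical driver helpers. */
    IO_ALLOCATION_ACTION AdapterControl(
        PDEVICE_OBJECT pDevObj, PIRP pIrp,
        PVOID MapRegisterBase, PVOID Context )
    {
        PDEVICE_EXTENSION pDevExt =
            (PDEVICE_EXTENSION) pDevObj->DeviceExtension;
        PMDL pMdl = pIrp->MdlAddress;
        ULONG length = MmGetMdlByteCount( pMdl );
        PHYSICAL_ADDRESS logicalAddr;

        /* Load consecutive map registers with the physical pages of
           the caller's buffer. MapTransfer returns the starting
           address of the buffer in logical space. */
        logicalAddr = pDevExt->pDmaAdapter->DmaOperations->MapTransfer(
            pDevExt->pDmaAdapter,
            pMdl,
            MapRegisterBase,
            MmGetMdlVirtualAddress( pMdl ),
            &length,
            TRUE );              /* TRUE = memory-to-device (output) */

        /* Hand the logical address and transfer count to the device
           and start it (device-specific, not shown). */
        StartDevice( pDevExt, logicalAddr, length );

        return KeepObject;  /* hold the channel until the transfer ends */
    }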

While the conceptual model of mapping registers is nothing more than page tables for DMA devices, the actual implementation depends on the platform, the bus, and the I/O device. To minimize the driver's awareness of these details, Windows 2000 includes the mapping registers with the Adapter object and provides a set of routines for their management.

Memory Descriptor Lists

As described, loading physical addresses into mapping registers is an important part of setting up a DMA transfer. To make this process easier, the I/O Manager uses a structure called a Memory Descriptor List (MDL). An MDL keeps track of physical pages associated with a virtual buffer. The buffer described by an MDL can be in either user- or system-address space.

Direct I/O operations require the use of MDLs. If a Device object has the DO_DIRECT_IO bit set in its Flags field, the I/O Manager automatically builds an MDL describing the caller's buffer each time an I/O request is sent to the device. It stores the address of this MDL in the IRP's MdlAddress field, and a driver uses it to prepare the DMA hardware for a transfer.
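
As a brief sketch, a driver opts into this behavior once, when it creates its Device object, and then simply picks up the MDL from each arriving IRP (pDevObj and pIrp stand in for the driver's own variables):

    /* In AddDevice (or DriverEntry): ask the I/O Manager to build an
       MDL for the caller's buffer on every read and write request. */
    pDevObj->Flags |= DO_DIRECT_IO;

    /* In a read or write dispatch routine: the MDL is already built. */
    PMDL pMdl = pIrp->MdlAddress;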

As seen in Figure 12.2, the MDL consists of a header describing the virtual buffer followed by an array that lists the physical pages associated with the buffer. Given a virtual address within the buffer, the MDL data describes the corresponding physical page. Some of the fields in the header help clarify the use of an MDL.

Figure 12.2. Structure of a Memory Descriptor List (MDL).
graphics/12fig02.gif

StartVa and ByteOffset.

The StartVa field contains the address of the buffer described by the MDL, rounded down to the nearest virtual page boundary. Since the buffer doesn't necessarily start on a page boundary, the ByteOffset field specifies the distance from this page boundary to the actual beginning of the buffer. Keep in mind that if the buffer is in user space, a driver can use the StartVa field to calculate indexes into the MDL, but not as an actual address pointer.
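
For example, under these definitions the index of the page-array entry covering a given address within the buffer can be computed as follows (using the access macros listed later in Table 12.1; va is assumed to lie inside the buffer the MDL describes):

    /* MmGetMdlVirtualAddress returns StartVa + ByteOffset, so adding
       ByteOffset back gives the distance from the page-aligned StartVa. */
    ULONG pageIndex = (ULONG)
        (( (ULONG_PTR)va - (ULONG_PTR)MmGetMdlVirtualAddress( pMdl )
           + MmGetMdlByteOffset( pMdl ) ) >> PAGE_SHIFT);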

MappedSystemVa.

If the buffer described by the MDL is in user space and its contents must be accessed, the buffer must first be mapped into system space with MmGetSystemAddressForMdl. This field of the MDL is used to hold the system-space address where the user-space buffer has been mapped.

ByteCount and Size.

These fields contain the number of bytes in the buffer described by the MDL and the size of the MDL itself, respectively.

Process.

If the buffer lives in user space, the Process field points to the process object that owns the buffer. The I/O Manager uses this information when it cleans up the I/O operation.

Keep in mind that MDLs are opaque data objects defined by the Virtual Memory Manager. Their actual contents may vary from platform to platform, and they might also change in future versions of the operating system. Consequently, access to an MDL should be performed using system support functions. Any other approach could lead to future (if not present) disaster. Table 12.1 lists the common MDL functions that a driver is most likely to use. Some of the functions in this table are implemented as macros.

Table 12.1. Functions That Operate on Memory Descriptor Lists

Function                     Description
IoAllocateMdl                Allocates an empty MDL
IoFreeMdl                    Releases an MDL allocated by IoAllocateMdl
MmBuildMdlForNonPagedPool    Builds an MDL for an existing nonpaged pool buffer
MmGetSystemAddressForMdl     Returns a nonpaged system-space address for the buffer described by an MDL
IoBuildPartialMdl            Builds an MDL describing part of a buffer
MmGetMdlByteCount            Returns the count of bytes in the buffer described by an MDL
MmGetMdlByteOffset           Returns the page offset of the buffer described by an MDL
MmGetMdlVirtualAddress       Returns the starting virtual address of the buffer described by an MDL

MDLs give drivers a convenient, platform-independent way of describing buffers located either in user- or system-address space. For drivers that perform DMA operations, MDLs are important because they make it easier to set up an Adapter object's mapping registers. Later parts of this chapter show the use of MDLs to set up DMA transfers.
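
As a small example of the Table 12.1 functions working together, a driver might describe a buffer it allocated from nonpaged pool like this (BUFFER_SIZE is a hypothetical constant, and error checking is omitted):

    /* Describe a driver-allocated, nonpaged buffer with an MDL. */
    PVOID buffer = ExAllocatePool( NonPagedPool, BUFFER_SIZE );

    PMDL pMdl = IoAllocateMdl( buffer, BUFFER_SIZE,
                               FALSE,   /* not a secondary buffer */
                               FALSE,   /* don't charge quota */
                               NULL );  /* no associated IRP */

    /* Fill in the physical page array -- legal here because nonpaged
       pool is always resident. */
    MmBuildMdlForNonPagedPool( pMdl );

    /* ...use the MDL to set up a DMA transfer... */

    IoFreeMdl( pMdl );
    ExFreePool( buffer );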

Maintaining Cache Coherency

The final consideration is the impact of various caches on DMA operations. During a DMA transfer, data may be cached in various places, and if everything isn't coordinated properly, a device or CPU might end up with stale data. Figure 12.3 demonstrates the concern.

Figure 12.3. Caches involved in DMA processing.
graphics/12fig03.gif

CPU DATA CACHE

Modern CPUs support both on-chip and external caches for holding copies of recently used data. When the CPU wants something from physical memory, it first looks for the data in the cache. If the CPU finds what it wants, it doesn't have to make the long, slow trip down the system memory bus. For write operations, data moves from the CPU to the cache, where (depending on the cache and policy) it may stay for a while before making its way out to main memory.

The problem is that on some architectures, the CPU's cache controller and the DMA hardware are unaware of each other. This lack of awareness can lead to incoherent views of memory. For instance, if the CPU cache is holding part of the buffer, and that buffer is overwritten in physical memory by a DMA input, the CPU cache will contain stale data. Similarly, if modified data hasn't been flushed from the CPU cache when a DMA output begins, the DMA controller will be sending stale data from physical memory out to the device.

One way of handling this problem is to make sure that any portion of the DMA buffer residing in the CPU's data cache is flushed before a DMA operation begins. A driver can do this by calling KeFlushIoBuffers and giving it the MDL describing the DMA buffer. This function flushes any pages in the MDL from the data cache of every processor on the system.
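
For a transfer described by the IRP's MDL, the call itself is a one-liner; the second argument is TRUE when the device will be writing into memory:

    /* Flush any cached portions of the DMA buffer out of every
       processor's data cache before starting the transfer. */
    KeFlushIoBuffers( pIrp->MdlAddress,
                      TRUE,     /* ReadOperation: device -> memory */
                      TRUE );   /* DmaOperation */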

Of course, the casual use of KeFlushIoBuffers can seriously impact system performance. Since many platforms automatically maintain cache coherency between CPU and DMA hardware, the call to KeFlushIoBuffers is not always necessary. On such systems, the call is defined to be a no-op. To ensure platform independence, drivers should always include the call.

ADAPTER OBJECT CACHE

The Adapter object is another place where data may be cached during a DMA transfer. Unlike the CPU cache, which is always a real piece of hardware, the Adapter object cache is an abstraction representing platform-dependent hardware or software. It might be an actual cache in a system DMA controller or a software buffer maintained by the I/O Manager. In fact, for some combinations of hardware, there might not even be a cache, but a driver still needs to use the Adapter object in order to guarantee portability.

Another benefit of using the Adapter object is that problems presented by certain buses are transparently handled for the driver. For example, the DMA controller for an ISA bus can access only the first 16 megabytes of physical memory. If any pages of a user buffer are outside this range, the I/O Manager allocates another buffer in low memory when the driver sets up the DMA mapping registers of the Adapter object. For output operations, the I/O Manager also copies the contents of the user buffer pages into this Adapter object buffer.

The Adapter object cache must be explicitly flushed after an input operation; performing the flush after any transfer also notifies the I/O Manager that it can release the memory in the adapter buffer. The function that performs the flush and release is FlushAdapterBuffers, a method of the Adapter object.
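
A minimal sketch of the call, typically made from the driver's DpcForIsr routine once the device interrupts, follows; the map register base is assumed to have been saved in the (hypothetical) device extension during the AdapterControl callback:

    /* After the transfer completes: flush the adapter's cache and let
       the I/O Manager release any intermediate buffer it allocated. */
    pDevExt->pDmaAdapter->DmaOperations->FlushAdapterBuffers(
        pDevExt->pDmaAdapter,
        pIrp->MdlAddress,
        pDevExt->mapRegisterBase,    /* saved in AdapterControl */
        MmGetMdlVirtualAddress( pIrp->MdlAddress ),
        MmGetMdlByteCount( pIrp->MdlAddress ),
        FALSE );                     /* FALSE = input (device -> memory) */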

Packet-Based and Common Buffer DMA

The Windows 2000 DMA model divides drivers into two categories, based on the location of the DMA buffer itself: packet-based DMA and common buffer DMA.

In packet-based DMA, data moves directly between the device and the locked-down pages of a user-space buffer. This is the type of DMA associated with direct I/O operations. The significant point is that each new I/O request will probably use a different set of physical pages for its buffer. This impacts the kind of setup and cleanup steps the driver has to take for each I/O.

In common buffer DMA, the device uses a single nonpaged buffer from system space and all DMA transfers occur through this buffer.
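
A typical setup, done once when the device starts, uses the Adapter object's AllocateCommonBuffer routine, sketched below; the device-extension fields and COMMON_BUFFER_SIZE constant are hypothetical:

    /* Allocate a single contiguous, nonpaged buffer visible to both
       the CPU and the device. The driver uses the returned virtual
       address; the device is programmed with the logical address. */
    PHYSICAL_ADDRESS logicalAddr;

    pDevExt->commonBuffer =
        pDevExt->pDmaAdapter->DmaOperations->AllocateCommonBuffer(
            pDevExt->pDmaAdapter,
            COMMON_BUFFER_SIZE,   /* hypothetical size constant */
            &logicalAddr,         /* receives device-visible address */
            FALSE );              /* noncached */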

Packet-based and common buffer DMA are not mutually exclusive categories. Some complex devices perform both kinds of DMA. One example is the Adaptec Ultra160 family of SCSI host adapters. It uses packet-based DMA to transfer data between SCSI devices and user buffers. The same controller exchanges command and status information with its driver using a set of mailboxes kept in a common buffer area.

Although all DMA drivers have similar characteristics, certain implementation details depend on whether packet-based or common buffer DMA is utilized. Later sections of this chapter present the specifics of writing each kind of driver.

Limitations of the Windows 2000 DMA Architecture

While the use of the Windows 2000 DMA abstraction simplifies driver construction, it does impose some restrictions. For one, the model is somewhat biased toward slave DMA devices. A driver is burdened with additional work to force the Adapter object model to fit a master DMA device.

More significantly, the Windows 2000 DMA model does not support device-to-device data transfers. Since modern buses such as PCI promote the concept of peer-to-peer relationships between devices, it is unfortunate that the Adapter model does not extend to nonsystem-hosted DMA.
