21.1. Introduction

The best way to understand kmdb is by first understanding how mdb does things. We begin with an overview of the portions of mdb that are relevant to our later discussion of kmdb. For more information about mdb and its operation, consult the Modular Debugger AnswerBook. Having set the stage, we next discuss the major design goals behind kmdb. With those goals in mind, we return to the list of components we discussed from an mdb perspective, analyzing them this time from the point of view of kmdb, showing how their implementation fulfills kmdb's design goals. Finally, we embark on a whirlwind tour of some of the lower-level components of kmdb that weren't described in earlier sections.

21.1.1. MDB Components

In this section, we review the parts of MDB that are particularly relevant for our later discussion of kmdb, focusing on how those components are implemented in mdb. That is, we concentrate only on those components whose implementation changes significantly in kmdb. The design of MDB is sufficiently modular that we could replace the components requiring change without disrupting the remainder of the debugger. The components described are shown in Figure 21.1.

Figure 21.1. MDB Components


21.1.1.1. The Target Layer

The MDB AnswerBook describes targets as follows:

The target is the program being inspected by the debugger. [...] Each target exports a standard set of properties, including one or more address spaces, one or more symbol tables, a set of load objects, and a set of threads.

Targets are implemented by means of an ops vector, with each target implementing a subset of the functions in the vector. In-situ targets, such as the user process or proc, implement virtually all operations. Targets that debug entities whose execution cannot be controlled, such as the kvm target used for crash dump analysis, implement a smaller subset of the operations. As with many other parts of MDB, the targets are modular and are designed to be easily replaceable depending on the requirements of the debugging environment.
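
To make the ops-vector idea concrete, here is a minimal sketch in C. The structure and member names are hypothetical (this is not mdb's actual target ops layout); the point is simply that a target supplies the operations it supports and leaves the rest NULL, which the debugger core checks before calling.

    #include <stddef.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <sys/types.h>

    /* Hypothetical subset of a target ops vector. */
    typedef struct tgt_ops {
        ssize_t (*t_vread)(void *buf, size_t nbytes, uintptr_t addr);
        int (*t_setbrk)(uintptr_t addr); /* NULL: can't control execution */
        int (*t_cont)(void);             /* NULL: can't resume the target */
    } tgt_ops_t;

    /* A postmortem-style target supplies only the inspection operations. */
    static ssize_t
    dump_vread(void *buf, size_t nbytes, uintptr_t addr)
    {
        (void) buf; (void) addr;
        /* ... copy nbytes from the dump image at addr ... */
        return ((ssize_t)nbytes);
    }

    static const tgt_ops_t dump_ops = {
        .t_vread  = dump_vread,
        .t_setbrk = NULL,   /* a crash dump cannot be continued */
        .t_cont   = NULL,
    };

    /* The debugger core probes the vector before using an operation. */
    static int
    tgt_set_breakpoint(const tgt_ops_t *ops, uintptr_t addr)
    {
        if (ops == NULL || ops->t_setbrk == NULL) {
            (void) fprintf(stderr, "target does not support breakpoints\n");
            return (-1);
        }
        return (ops->t_setbrk(addr));
    }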

Figure 21.1 shows three of the targets used by MDB. The first is the proc target, which is used for the debugging and control of user processes as well as the analysis of user core dumps. The proc target is implemented on top of libproc, which provides the primitives used for process control. The interfaces provided by libproc simplify the implementation of the proc target by hiding the differences between in-situ and postmortem debugging (one is done with a live process, whereas the other uses a corefile). The target itself is largely concerned with mapping the requests of the debugger to the interfaces exposed by libproc.

Also shown in Figure 21.1 is the kvm target, which is used for both live and postmortem kernel debugging. Like the proc target, the kvm target uses a support library (libkvm) to abstract the differences between live and postmortem debugging. While the capabilities of the kvm and proc targets are largely the same when used for postmortem debugging, they differ when the subjects are live. The proc target fully controls process execution, whereas the kvm target allows only the inspection and alteration of kernel state. Allowing the debugger to control the execution of the kernel that is responsible for running the debugger would be difficult at best. Consequently, most debugging done with the kvm target is of the postmortem variety.

The third target shown in Figure 21.1 is used for the "debugging" of raw files. This allows the data-presentation abilities of MDB to be brought to bear upon flat (usually binary) files. This target lays the foundation for the eventual replacement of something like fsdb, the filesystem debugger.

21.1.1.2. Debugger Module Management

Today's kernels are made up of a great many modules, each implementing a different subsystem and each requiring different tools for analysis and debugging. The same can be said for modern, large-scale user processes, which can incorporate tens or even hundreds of shared libraries and subsystems. A modern modular debugger should, therefore, allow for the augmentation of its basic tool set as needed. MDB allows subsystem-specific debugging facilities to be provided through shared objects known as debugger modules, or dmods. Each dmod provides debugging commands (also known as dcmds) and walkers (iterators) that debug a given subsystem. These modules interface with MDB through the module API layer and use well-defined interfaces for data retrieval and analysis. This is enforced by the fact that, in the case of both major targets (kvm and proc), the debugger runs in a separate address space from the entity being analyzed. The dcmds are therefore forced to use the module API to access the target. While some dmods link with other support libraries to reduce the duplication of code, most dmods stand alone, consuming only the header files from the subsystems they support.
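
As an illustration of the shape of a dmod, the sketch below uses the documented MDB module API (the _mdb_init() entry point, mdb_modinfo_t, mdb_vread(), and friends); the dcmd name and the counter it pretends to read are made up.

    #include <sys/mdb_modapi.h>

    /*
     * A made-up dcmd ("foocount") that reads a hypothetical 32-bit counter
     * out of the target.  All target access goes through the module API.
     */
    static int
    foo_dcmd(uintptr_t addr, uint_t flags, int argc, const mdb_arg_t *argv)
    {
        uint32_t count;

        (void) argc; (void) argv;

        if (!(flags & DCMD_ADDRSPEC))
            return (DCMD_USAGE);

        if (mdb_vread(&count, sizeof (count), addr) == -1) {
            mdb_warn("failed to read counter at %p", addr);
            return (DCMD_ERR);
        }

        mdb_printf("count = %u\n", count);
        return (DCMD_OK);
    }

    static const mdb_dcmd_t dcmds[] = {
        { "foocount", ":", "print a hypothetical counter", foo_dcmd },
        { NULL }
    };

    static const mdb_walker_t walkers[] = {
        { NULL }
    };

    static const mdb_modinfo_t modinfo = { MDB_API_VERSION, dcmds, walkers };

    const mdb_modinfo_t *
    _mdb_init(void)
    {
        return (&modinfo);
    }

In mdb such a module is built as a shared object and loaded with ::load; the same dcmd source works in kmdb, where dmods are delivered as kernel modules (as described later).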

While the core debugger uses its own code for the management of debugger modules and their metadata, it relies upon a system library, libdl, for the mechanics of module loading and unloading. It is libdl, for example, that knows how to load the dmod into memory, and it is libdl that knows how to integrate that dmod into the debugger's address space.

21.1.1.3. Terminal I/O

MDB was designed with an eye toward the eventual implementation of something like kmdb and thus performs most terminal interaction directly. Having built up a list of terminal attributes, MDB handles cursor and character manipulation directly. The MDB subsystem that performs terminal I/O is known as termio.

While termio handles a great deal itself, there is one aspect of terminal management that is provided by a support library. MDB uses libcurses to retrieve the list of terminal attributes for the current terminal from the terminfo database. The current terminal type is retrieved from the environment variable TERM.

21.1.1.4. Other Stuff

MDB is a large program, with many more subsystems than are described here. One of the benefits arising from the modular design of the debugger is that these other subsystems don't need to change even when used in an environment as radically different as kmdb is from MDB. For example, MDB implements its own routines for the management of ELF symbol tables. ELF being ELF regardless of source, the same subsystem can be used, as is, in both MDB and kmdb. A description of the MDB subsystems unaffected by kmdb is beyond the scope of this document.

21.1.2. Major kmdb Design Decisions

In this section we explore the rationale behind the major design decisions.

21.1.2.1. The Kernel/Debugger Interface (KDI)

When we implement an in-situ kernel debugger, we must determine the extent to which the debugger will be intermingled with the kernel being debugged. Should the debugger call kernel functions to accomplish its duties, or should the debugger be entirely self-contained? The legacy Solaris in-situ kernel debugger, kadb, hewed to the latter philosophy to a significant extent. The kadb module was as self-contained as possible, to the point where it contained copies of certain low-level kernel routines. That said, there were some kernel routines to which kadb needed access. During debugger startup, it would search for a number of functions by name, saving pointers to them for later use.

There are a number of problems with kadb's approach. First, duplicating low-level kernel code in the debugger creates a maintenance burden. Worse, because of the layout of the Solaris source code, the copies end up significantly separated. It's hard enough to maintain code rife with duplication when the duplicates are co-located; maintaining duplicates in wildly disparate locations is next to impossible. During the initial analysis of kadb for the kmdb project, we discovered several duplicated functions in kadb that had not kept up with hardware-specific changes to the versions in the kernel. The second problem concerns the means by which kadb gained access to the kernel functions it did use. Searching for those functions by name is dangerous because it leaves the debugger vulnerable to changes in the kernel. A change in the signature of a kernel function used by kadb, for example, would not be caught until kadb failed while trying to use that function.

To some extent, the nature of a kernel debugger requires duplication. The kernel debugger cannot, for example, hold locks, and therefore requires lock-free versions of any kernel code that it must call. The lock-free version of a function may not be safe when used in a running kernel context and therefore must be kept separate from the normal version. Rather than placing that duplicate copy within the debugger itself, we decided to co-locate the duplicate with the original. This reduces the chances of code rot, since an engineer changing the normal version is much more likely to notice the debugger-specific version sitting right next to it.

Access to kernel functionality was formalized through an interface known as the KDI, or Kernel/Debugger Interface. The KDI is an ops vector through which all kernel function calls must pass; each function called by the debugger has a member in this vector. Whereas an assessment of the kernel functionality used by kadb required a search for symbol lookup routines and their consumers, a similar assessment in kmdb simply requires a review of the single ops vector. Our use of an ops vector also allows the compiler to monitor the evolution of kernel functions used by kmdb: any change to a KDI function significant enough to change the function signature will be caught by the compiler during the initialization of the KDI ops vector. Furthermore, the initialization of that vector is easily visible to code analysis tools such as cscope, allowing engineers to quickly determine whether kmdb is a consumer of a given function. With kadb, such a determination would require inspection of the symbol lookup routines, something that is not automatically done by the code analysis tools used today.
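
A sketch of the mechanism follows; the member names are hypothetical stand-ins for the real kdi_t contents, and the kernel-side functions are stubbed out, but the key property is the same: the vector is filled in at a single, statically checked point.

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical KDI-style ops vector; member names are illustrative only. */
    typedef struct kdi {
        void (*kdi_flush_caches)(void);
        int (*kdi_pread)(void *buf, size_t nbytes, uint64_t pa, size_t *nread);
        void (*kdi_stop_other_cpus)(void);
    } kdi_t;

    /* Stand-ins for the kernel-side implementations (unix, genunix, platmod). */
    static void stub_flush_caches(void) { }
    static void stub_stop_other_cpus(void) { }
    static int
    stub_pread(void *buf, size_t nbytes, uint64_t pa, size_t *nread)
    {
        (void) buf; (void) pa;
        *nread = nbytes;
        return (0);
    }

    /*
     * The vector is initialized in one place.  If a kernel function's
     * signature changes, this initialization draws a compiler diagnostic:
     * exactly the early warning that kadb's by-name symbol lookup could not
     * provide.  It is also a single spot that tools such as cscope can find.
     */
    static const kdi_t kdi_ops = {
        .kdi_flush_caches    = stub_flush_caches,
        .kdi_pread           = stub_pread,
        .kdi_stop_other_cpus = stub_stop_other_cpus,
    };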

21.1.2.2. Implementation as a Kernel Module

kadb was implemented as a stand-alone module. In Solaris, this means that the kadb module was an executable, directly loadable by the boot loader. It had no static dependencies on other modules, thus leading to the symbol lookup problems discussed above. When the use of kadb was requested, the boot process ran something like this:

  1. Boot loader loads kadb.

  2. kadb initializes.

  3. kadb loads the normal stand-alone, UNIX.

  4. kadb loads the UNIX interpreter, krtld.

  5. kadb passes control to krtld.

  6. krtld loads the UNIX dependencies (genunix, CPU module, platform module, etc.).

  7. krtld transfers control to UNIX.

While this allowed the debugger to take early control of the system (it could debug from the first instruction in krtld), that ability came with some significant penalties. Because the decision to load a 32-bit or a 64-bit kernel was made after kadb had loaded and initialized, kadb had to be prepared to debug either variety. The need for kadb to execute prior to the loading of UNIX itself meant that it could not use any functions located in the kernel until the kernel was loaded. While some essential functions were dynamically located later, the result of this restriction was that many low-level kernel functions were located in the debugger itself. A further penalty comes in the form of increased debugger complexity. kadb's need to load UNIX and krtld required that it know how to process ELF files and how to load modules into the address space. The boot loader already needs to know how to do that, as does krtld. With kadb as a stand-alone module, the number of separate copies of ELF-processing and module-loading code goes up to three.

The remaining limitations have to do with the timing of the decision to load kadb. As stated above, kadb was a stand-alone module and as such could only be loaded at boot. Moreover, an administrator was required to decide, before rebooting, whether to load kadb. Once loaded, it could not be unloaded. While the inability to unload the debugger isn't a major limitation, the inability to load it dynamically is. Not knowing whether kadb would be needed during the life of a given system boot, administrators were faced with an unfortunate choice. On the one hand, they could always load kadb at boot. This kept it ready for use, but at the cost of wiring down a chunk of kernel address space. This could be avoided, of course, by making the other choice: not loading the debugger at boot. Administrators then ran the risk of not having the debugger around when they needed it.

The implementation of kmdb as a normal kernel module solves all of these problems, with only a minor activation-time penalty compared to kadb. When kmdb is loaded at boot, the boot process looks something like this:

  1. Boot loader loads UNIX.

  2. Boot loader loads the UNIX interpreter, krtld.

  3. Boot loader passes control to krtld.

  4. krtld loads the UNIX dependencies (genunix, CPU module, platform module, etc.).

  5. krtld loads kmdb.

  6. krtld transfers control to UNIX.

As shown above, kmdb loads after the primary kernel modules have been selected and loaded. kmdb can therefore assume that it will be running with the same bit width as that of the underlying kernel. That is, a 32-bit kmdb will never have to deal with a 64-bit kernel, and vice versa.

By loading after the primaries, kmdb can have static symbol dependencies on the other primary kernel modules. It is this ability that allows the KDI to exist. Even better, kmdb can rely on krtld's selection of the proper CPU and platform modules for this machine. Rather than having to carry around several processor-specific implementations of the same function (or compiling one module for each of four platform types, as kadb did), kmdb can, using the KDI, simply use the proper implementation of a given function from the proper module. When a new platform-specific KDI function is implemented, the developer implements it in a platform-specific way in each platform module. krtld selects the proper platform module on boot, and kmdb automatically ends up using the proper version for the host machine.

Last but certainly not least, the implementation of kmdb as a normal kernel module allows it to be dynamically loaded and unloaded. It can still be loaded at boot, but it can also be loaded on-demand by the administrator. If dynamically loaded, it can also be unloaded when no longer needed. This can be a consolation to wary administrators who would otherwise object to the running of a kernel debugger on certain types of machines.

The only disadvantage of the use of a normal kernel module versus a standalone one is the loss of the ability to debug the early stages of krtld. In practice, this has not turned out to be a problem, because the early stages of krtld are fairly straightforward and stable.

Every attempt has been made to minimize the effects of the two load types (boot and runtime). Initialization obviously differs in some respects; a number of common kernel subsystems simply won't be available during the initialization of boot-loaded kmdb. Largely, though, these differences are dealt with under the covers and are not visible to the user.

21.1.3. The Structure of kmdb

We can best understand the inner workings of kmdb by first reviewing the debugger's external structure. kmdb's external structure is dictated, to some extent, by the environments in which it will be used. Those requirements are

  • The debugger must be loadable at boot.

  • The debugger must be loadable at runtime.

  • The debugger must restrict its contact with the running kernel to a set of operations defined in advance.

To satisfy the first two requirements, kmdb exists as two separate kernel modules. The first, misc/kmdbmod, contains the meat of the debugger; it is the module loaded by krtld when kmdb is loaded at boot. The second module, drv/kmdb, exists solely to gather property values from the device tree and to present an ioctl-based interface to controlling userland programs such as mdb(1). When kmdb is to be loaded at runtime, mdb opens /dev/kmdb and uses the ioctl interface to command it to activate. The opening of /dev/kmdb causes drv/kmdb to load. drv/kmdb has a dependency on misc/kmdbmod, which gets loaded as well. Upon receipt of the appropriate ioctl, drv/kmdb calls into misc/kmdbmod, and the debugger is initialized.
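
From the userland side, the runtime activation path can be sketched as below. The ioctl command name (KMDB_IOC_START) and its argument are hypothetical stand-ins for drv/kmdb's actual private interface; the open/ioctl sequence is the part described above.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    /* Hypothetical private ioctl; the real command and argument differ. */
    #define KMDB_IOC_START (('k' << 8) | 1)

    int
    main(void)
    {
        /*
         * Opening the device is what causes drv/kmdb (and, through its
         * dependency, misc/kmdbmod) to be loaded.
         */
        int fd = open("/dev/kmdb", O_RDONLY);

        if (fd == -1) {
            perror("open /dev/kmdb");
            return (1);
        }

        /* Ask drv/kmdb to call into misc/kmdbmod and activate the debugger. */
        if (ioctl(fd, KMDB_IOC_START, 0) == -1) {
            perror("KMDB_IOC_START");
            (void) close(fd);
            return (1);
        }

        (void) close(fd);
        return (0);
    }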

If the debugger was loaded at boot, only misc/kmdbmod will be loaded. The module loading subsystem is not fully initialized at that point. Userland does not exist yet, and given that drv/kmdb exists only to convey ioctl requests from userland to misc/kmdbmod, there is no need to force drv/kmdb to load until an attempt is made to open /dev/kmdb. When someone does attempt to control the debugger through ioctls to /dev/kmdb, drv/kmdb is loaded. It then sends commands to misc/kmdbmod as in the runtime case above.

We now focus our attention more closely on misc/kmdbmod, which itself is composed of two parts. The first, referred to as the debugger, contains the core debugger functionality, as well as the primary subsystems needed to allow the core to control the kernel. The second, referred to as the controller, interacts with the running kernel.

The debugger interacts with the outside world only through a set of well-defined interfaces. One of these is the KDI; the other is composed of a set of functions passed during initialization by the controller. Aside from these interactions, the debugger must, by nature, function as a fully self-contained entity. Put in compilation terms, the debugger, which is built separately from the controller, must not have any unresolved symbols at link time. It is the debugger, and only the debugger, that is active when kmdb has control of the machine.

Behind the scenes, as it were, the controller works to ensure that the debugger's runtime needs are met. The debugger has a limited set of direct interactions with the kernel, and it can be active only when the world has stopped. Those two facts necessarily limit the sorts of things the debugger can do. For example, it can neither perform the early stages of kmdb initialization nor load or unload kernel modules.

The former takes place before debugger initialization starts and is taken care of by the controller. A memory region, known as Oz, is allocated and is set aside for use by the debugger. Other initialization tasks performed by the controller include the creation of trap tables or IDTs, as appropriate, after which control is passed to the debugger for the completion of initialization.

Kernel module loading and unloading, which is discussed in more detail below, is a task that must be performed by the running kernel. The debugger must rely on the controller to perform these sorts of tasks for it.

In the text that follows, we use the words driver, debugger, and controller to refer to the components we've just discussed. These three components are indicated in Figure 21.2 by regions surrounded by dotted lines. When we discuss the entire entity, we refer to it as kmdb. References to the core debugger refer to the set of shaded boxes labeled MDB. One unfortunate note: The term "controller" is a relatively recent invention. In many instances, the source code refers to the driver when it means the controller. This doesn't cause nearly as many issues as one might imagine because of the minor role played by the entity we refer to as the driver.

21.1.4. MDB Components and Their Implementation in kmdb

We now use our earlier discussion of mdb to motivate our review of the major subsystems used by kmdb. Recall that the three subsystems discussed were the target layer, module management, and terminal management (termio). The implementation of kmdb is largely the story of the replacement of support libraries with subsystems designed to work in kmdb's unique environment. Figure 21.2 shows how these replacement subsystems relate to the core debugger.

Figure 21.2. KMDB Structure


21.1.4.1. The Target Layer

The target layer itself is unchanged in kmdb; what changes is the target implementation. Gone are the proc, kvm, and file targets, replaced with a single target called kmdb_kvm. (We call it kmdb_kvm to avoid confusion with the kvm target used by mdb.)

kmdb_kvm can be thought of as a hybrid of the proc and kvm targets. It includes the execution control aspects of proc, such as the ability to set breakpoints and watchpoints, as well as support for single-stepping, continuation, and so forth. This functionality is coupled with the kernel-oriented aspects of the kvm target. The kmdb_kvm target is common between SPARC and x86 machines and for the most part handles the bits of kernel analysis, management, and control that are generic to the two architectures. With the exceptions of stack trace construction and the display of saved registers, all architecture-specific functionality is abstracted into the DPI. The DPI's relationship to kmdb_kvm is very similar to that of libkvm to the kvm target or to that of libproc to the proc target.

A significant portion of kmdb_kvm is devoted to the monitoring of kernel state. As an example, target implementations are required to provide symbol lookup routines for use by the core debugger. Provision of this information requires access to kernel module symbol tables, which are easily accessed by kmdb_kvm. What is not so simple, however, is dealing with the constant churn in the set of loaded modules. Whenever kmdb regains control of the machine, kmdb_kvm scans the entire module list, looking for modules that have loaded or unloaded. kmdb_kvm's tracking state (symbol table references and so forth) for modules that have unloaded is destroyed, while new state is created for modules that have been loaded. Challenges arise when a module has been unloaded and then reloaded since kmdb last had control. This churn must be detected, and the tracking state rebuilt.

The tracking of module movement, for lack of a better term, illustrates the interaction between the debugger and the controller. While the debugger could certainly rescan the entire list upon every entry, that approach would be wasteful. Instead, the controller subscribes to the kernel's module change notification service and bumps a counter whenever a change has occurred. kmdb_kvm can, upon reentry, check the value of that counter. If the value has changed since kmdb_kvm last saw it, a module list rescan is necessary.
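
In outline, and with hypothetical names, the handshake looks like the sketch below: the controller's callback runs while the world is turning and only bumps a counter; kmdb_kvm compares a saved copy on each entry.

    #include <stdint.h>

    /*
     * Shared between the controller (world running) and kmdb_kvm (world
     * stopped).  Names are hypothetical.  Only the controller writes, and
     * the reader merely compares for inequality, so a volatile counter is
     * sufficient; no lock is needed (or possible).
     */
    static volatile uint64_t kmdb_modchg_gen;

    /* Controller: registered with the kernel's module change notification. */
    void
    controller_mod_change_cb(void)
    {
        kmdb_modchg_gen++;
    }

    /* Debugger (kmdb_kvm): called on each debugger entry. */
    void
    kmdbkvm_check_modules(void)
    {
        static uint64_t seen_gen;
        uint64_t gen = kmdb_modchg_gen;

        if (gen != seen_gen) {
            /* ... rescan the module list, rebuild symbol tracking ... */
            seen_gen = gen;
        }
        /* Otherwise the cached module state is still valid. */
    }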

While this interaction with the controller results in a useful optimization for module state management, it becomes crucial for the management of deferred breakpoints. Deferred breakpoints are breakpoints requested for modules that haven't yet loaded. The user's expectation is that the breakpoint will activate when the named module loads. The debugger is responsible for the creation, deletion, enabling, disabling, activation, and deactivation of breakpoints. The user creates the breakpoint by using the breakpoint command (::bp). This being a deferred breakpoint for a module that hasn't been loaded, the debugger leaves the breakpoint in a disabled state. When that module has loaded, the breakpoint is enabled. Enabled breakpoints are activated by the debugger when the world is resumed. The activation is what makes the breakpoint actually happen. In kmdb_kvm, the DPI installs a breakpoint instruction at the specified virtual address. The key design question: How do we detect the loading of the requested module?

The simplest, cleanest, and slowest approach would be to have kmdb_kvm place an internal breakpoint on the kernel's module loading routine. Whenever a module is loaded, the debugger would activate, would check the identity of the loaded module, and would decide whether to enable the breakpoint. Debugger entry isn't cheap. All CPUs must be stopped, and their state must be saved. This particular stop would happen after a module load, so we would need to rescan the module list. All in all, this is something that we really don't want to have to do every time a module is loaded or unloaded.

If we involve the controller, we can eliminate the unnecessary debugger activations, entering the debugger only when a module named in a deferred breakpoint is loaded or unloaded. How do we do this? We bend the boundaries between the debugger and controller slightly, exposing the list of deferred breakpoints to code that runs when the world is turning. Tie this into the controller's registration with the kernel's module change notification service, and we end up entering the debugger only when a change has occurred in a module named in a deferred breakpoint. We use a quasi-lock-free data structure to allow access to the deferred breakpoint list both from within the debugger (when the world is stopped) and within the module change check (when the world is running).

Like the proc and kvm targets, kmdb_kvm is also home to dcmds that could not be implemented elsewhere. Implemented in the target, they have access to everything the target does and can thus do things that dcmds implemented in dmods could only dream of doing. As implied above, kmdb_kvm (as well as kvm and proc) implements dcmds that provide stack tracing and register access.

21.1.4.2. Debugger Module Management

As discussed earlier, mdb uses libdl for the management of dmods, which are implemented as shared objects. The implementation in kmdb is similar, but without libdl; nor does the debugger have any way to load or unload modules itself. Other than that, kmdb and mdb are the same.

We decompose module management into two pieces: the requesting of module loads and unloads, and the implementation of a libdl replacement atop the results of the loading and unloading.

21.1.4.3. Module Loads and Unloads: The Work Request Queue (WR)

kmdb implements debugger modules as kernel modules. While we engage in some sleight of hand to keep the dmods off the kernel's main module list, the mechanics of loading and unloading dmods are largely the same as those used for "normal" kernel modules. The primary difference is in the means by which a load or unload is requested. Recall that the debugger, which will receive the load or unload request from the user, can only run when the world is stopped. Also note that the loading or unloading of a kernel module is a process that uses many different kernel subsystems. The kernel runtime linker (krtld), the disk driver, the VM system, the file system, and many others come into play. Use of these subsystems of course entails the use of locks, threads, and various other things that are anathema to the debugger.

To load a dmod, the debugger must therefore ask the controller to do it. The controller runs when the world is turning and is more than capable of loading and unloading kernel modules. The only thing we need is a channel for communication between the two. That channel is provided by the Work Request Queue, or WR. The WR consists of two queues: one for messages from the debugger to the controller and one for messages from the controller to the debugger. The rough sequence of events for a module load is as follows:

  1. User requests a dmod load with ::load.

  2. The kmdb module layer receives the request and passes it to the WR's debugger-to-controller queue.

  3. The controller receives the request.

  4. The controller loads the module.

  5. The controller returns the request to the debugger as a (successful) reply on the controller-to-debugger queue.

  6. The debugger receives the reply and makes the contents of the dmod available to the debugger core.

A few details bear mentioning. The debugger can be activated at any time, even in the midst of the controller's processing of a load request. The controller must keep this in mind when checking and manipulating the WR queues. The queues themselves are lock-free and have very strict rules regarding the methods used to access them. For example, the controller may only add to the end of the controller-to-debugger queue. It sets the next pointer on its request and updates the tail pointer for the queue. Even though the queue is doubly linked, there's no easy way for the controller, which may be interrupted at any time by the debugger, to set the prev pointer. Accordingly, the debugger's first action upon preparing to process the controller-to-debugger queue is to traverse it, from tail to head, building the prev pointers. The debugger doesn't have to worry about being interrupted by the controller and can thus take its time. Similar rules are in place for the debugger-to-controller queue.
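
The producer-side append and the consumer-side prev-pointer rebuild can be sketched as follows; the structure and function names are hypothetical, and the rebuild is shown simply walking the next chain (the real code's traversal details differ).

    #include <stddef.h>

    typedef struct wr_msg {
        struct wr_msg *wr_next;
        struct wr_msg *wr_prev;   /* rebuilt by the consumer */
        /* ... request payload ... */
    } wr_msg_t;

    typedef struct wr_queue {
        wr_msg_t *wq_head;
        wr_msg_t *wq_tail;
    } wr_queue_t;

    /*
     * Producer side (e.g., the controller): may be interrupted by the
     * debugger at any instruction, so it only appends at the tail and never
     * touches wr_prev or any existing links.
     */
    void
    wr_append(wr_queue_t *wq, wr_msg_t *msg)
    {
        msg->wr_next = NULL;
        msg->wr_prev = NULL;

        if (wq->wq_tail != NULL)
            wq->wq_tail->wr_next = msg;
        else
            wq->wq_head = msg;
        wq->wq_tail = msg;
    }

    /*
     * Consumer side (e.g., the debugger): runs while the producer is
     * quiesced, so it can take its time filling in the prev links the
     * producer could not safely maintain.
     */
    void
    wr_rebuild_prev(wr_queue_t *wq)
    {
        wr_msg_t *m, *prev = NULL;

        for (m = wq->wq_head; m != NULL; m = m->wr_next) {
            m->wr_prev = prev;
            prev = m;
        }
    }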

Every request must be tracked and sent back as a reply at some point. Even fire-and-forget requests, such as those establishing new module search paths, must be returned as replies, even if those replies don't come until the debugger is unloaded. To see why this is necessary, consider the source of the memory underlying the requests. Requests from the debugger are allocated from debugger memory by the debugger's allocator and can thus only be freed by the debugger. Requests initiated by the controller (for example, an automatic dmod load triggered by the loading of the corresponding kernel module) are allocated by the controller from kernel memory and can thus be freed only by the kernel. Replies therefore serve a dual purpose: they provide status to the requester and also return the request to the requester for freeing.

We'd like to minimize the impact of the debugger on the running system to the extent practicable and so don't want the controller to poll for updates to the WR queues. Instead, we want the debugger to tell the controller when work is available for processing. This isn't as simple as it may seem. In the real world, we would use semaphores or condition variables to signal the availability of work. To use kernel synchronization objects, however, the debugger would need to call into the kernel to release them. The kernel is most definitely not prepared for a cv_broadcast() call with every CPU stuck in the debugger. Unpleasantness would ensue. The lightest-weight way to communicate with the controller is to post a soft interrupt, the implementation of which is essentially the setting of a bit in the kernel's cpu_t structure. When the world has resumed, normal interrupt processing will encounter this bit and will call the soft interrupt handler registered by the controller. That handler bangs on a semaphore, which triggers the controller's WR processing. Note that these problems apply only to communications from the debugger to the controller. The debugger can simply poll for messages sent in the opposite direction. Since the debugger is activated relatively infrequently, the occasional check of a message-waiting bit doesn't impose a burden. When users request a debugger activation, the last thing on their mind is whether the debugger is wasting a few cycles to check for messages.

libdl supplies a synchronous loading and unloading interface to mdb, thus considerably simplifying its management of dmods. kmdb has no such luxury. As the reader might surmise from the preceding discussion, kmdb's loading and unloading of dmods is decidedly asynchronous. Every attempt is made to preserve the user's illusion of a blocking load, but the asynchronous nature occasionally pokes its head into the open. A breakpoint encountered before the completion of the load, for example, causes an early debugger reentry. The user is told that a load or an unload is still pending and is told how to allow it to complete.

21.1.4.4. libdl Wrapper

MDB's dmod management code uses the libdl interfaces for manipulating dmods. dlopen() loads modules, dlclose() unloads them, and dlsym() looks up symbols. The debugger implements its own versions of these functions (using the same function signatures) to support the illusion of libdl. Underneath, the debugger's symbol table facilities are retargeted to implement dlsym()'s searches of dmod symbol tables.
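
In outline, the wrapper keeps libdl's signatures but routes the work to kmdb's own machinery; only the signatures are the point here, and the internal helpers shown are hypothetical (stubbed out so the sketch stands alone).

    #include <stddef.h>

    /* Hypothetical handle for a loaded dmod. */
    typedef struct kmdb_dmod kmdb_dmod_t;

    /* Hypothetical helpers backed by the WR queue and the debugger's own
     * symbol table code; stubbed out here. */
    static kmdb_dmod_t *
    kmdb_module_lookup(const char *name)
    {
        (void) name;
        return (NULL);
    }

    static void *
    kmdb_symtab_lookup(kmdb_dmod_t *dmod, const char *name)
    {
        (void) dmod; (void) name;
        return (NULL);
    }

    /*
     * Same signatures as libdl, so MDB's dmod management code is unchanged.
     * dlopen() here only finds a module that the controller has already
     * loaded on the debugger's behalf; the load itself went through the WR.
     */
    void *
    dlopen(const char *path, int mode)
    {
        (void) mode;
        return (kmdb_module_lookup(path));
    }

    void *
    dlsym(void *handle, const char *name)
    {
        return (kmdb_symtab_lookup(handle, name));
    }

    int
    dlclose(void *handle)
    {
        (void) handle;   /* the unload request is queued to the controller */
        return (0);
    }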

21.1.4.5. Terminal I/O

To implement terminal I/O handling, we need three things: access to the terminal type, the ability to manipulate that terminal, and routines for actually sending I/O to and from that terminal. The second of these can be further subdivided into the retrieval of terminal characteristics and the use of that knowledge to manipulate the terminal. mdb implements the most difficult of these: the routines that actually manipulate the terminal according to the gathered characteristics. mdb handles the tracking of cursor position, in-line editing, and the implementation of a parser and knows how to use the individual terminal attributes (echo this to make the cursor move right, echo that to enable bold, etc.) to accomplish those tasks.

Left to mdb and kmdb are terminal type determination, attribute retrieval, and I/O to the terminal itself. For mdb, this is relatively straightforward. The terminal type can be gathered from the environment, attributes can be retrieved from the terminfo database with libcurses, and I/O accomplished with stdin, stdout, and stderr.

kmdb, as is its wont, has a more difficult time of things. There is no environment from which to gather the current terminal type. There's no easy access to the terminfo database. Completing the trifecta, the I/O methods vary with the type of platform, progress of the boot process, and phase of the moon. As a bonus, kmdb's termio implementation handles interrupt (^C) processing. We discuss each in turn. While the preceding sections had happy endings, in that pleasing solutions were found for the enumerated problems, the reader is warned that there are no happy endings in terminal management. Tales of wading through terminal types, to say nothing of the terminfo/termcap databases, are generally suitable only for frightening small children and always end in woe and the gnashing of teeth.

21.1.4.6. Retrieving the Terminal Type

At first glance, gaining access to the terminal type would seem straightforward. Sadly, no. kmdb can be loaded at boot or at runtime. It can be used on a locally attached console/framebuffer, or it can be used through a serial console. If loaded at runtime, the invocation could be made from a console login, or it could be made from an rsh (or telnet or ...) session. Boot-loaded kmdb on a serial console is the worst case because we have no information regarding the type of terminal attached to the other end of the serial connection. We end up assuming the worst, which is an 80x24 VT100. Boot-loaded kmdb on a machine with a locally attached console or framebuffer is easier because we know the terminal type and terminal dimensions for SPARC and x86 consoles. Also easy is a runtime-loaded kmdb from a console login. Assuming that the user set the terminal type correctly, we can use the value of the TERM environment variable. Unfortunately, we can't trust $TERM to be set correctly, so we ignore $TERM if the console is locally attached. We end up with a pile of heuristics, which generally come up with the right answer. If they don't, they can always be overridden.

21.1.4.7. Terminal Attributes

After considering the mess that is access to $TERM, retrieval of terminfo data is almost trivial. We don't want to compile in a copy of the terminfo database, and we can't rely on the ability to gain access to it while the debugger is running. We compromise by hard-coding a selection of terminal types into the debugger. The build process extracts the attributes for each selected terminal from the terminfo database and compiles them into the debugger. Terminal type selection in kmdb is thus limited to the types selected during the build. It turns out, though, that the vast majority of common terminal types can be covered by a set of 15 terminal types.
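
Conceptually, the generated file amounts to a small table like the sketch below; the structure, member names, and attribute strings shown are illustrative, not the actual generated source.

    /*
     * Illustrative shape of the build-generated table: a handful of terminal
     * types, each carrying the attribute strings termio needs.  The real
     * generated source has many more attributes per entry.
     */
    typedef struct kmdb_term_attrs {
        const char *ta_name;   /* terminal type ($TERM value) */
        const char *ta_cub1;   /* move cursor left one column */
        const char *ta_smso;   /* enter standout (reverse video) mode */
        const char *ta_rmso;   /* exit standout mode */
    } kmdb_term_attrs_t;

    static const kmdb_term_attrs_t kmdb_terms[] = {
        { "vt100", "\b", "\033[7m", "\033[m"   },
        { "xterm", "\b", "\033[7m", "\033[27m" },
        { "sun",   "\b", "\033[7m", "\033[m"   },
    };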

21.1.4.8. Console I/O

Access to the terminal entails the reading of input, the writing of output, and the retrieval of hardware parameters (terminal size and so forth), generally through an ioctl-based interface. MDB's modular I/O subsystem makes our job somewhat easier. Each I/O module provides an ops vector, exposing interfaces for reading, writing, ioctls, and so forth. kmdb has its own I/O module, called promio. promio acts as a front end for promif, which we discuss in a moment. For the most part, promio is a pass-through, with the exception of the ioctl function. promio interprets the ioctls sent from termio and invokes the appropriate promif functions to gather the necessary information. In addition to the aforementioned terminal size ioctl (TIOCGWINSZ), promio's ioctl handler is prepared to deal with requests to get (TCGETS) and set (TCSETSW) hardware parameters. The parameters of interest to kmdb are largely concerned with echoing and newlines.
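
The ioctl-handling portion of promio amounts to a small switch like the sketch below. The function name and the promif helper are hypothetical (and stubbed), but TIOCGWINSZ, TCGETS, and TCSETSW are the standard terminal ioctls named above.

    #include <errno.h>
    #include <string.h>
    #include <sys/termios.h>   /* TIOCGWINSZ, TCGETS, TCSETSW on Solaris */

    /* Hypothetical promif helper; the real code asks the PROM or polled I/O. */
    static void
    promif_get_term_size(int *rows, int *cols)
    {
        *rows = 24;
        *cols = 80;
    }

    static struct termios promio_modes;   /* echo and newline settings */

    static int
    promio_ioctl(int req, void *arg)
    {
        switch (req) {
        case TIOCGWINSZ: {
            struct winsize *wp = arg;
            int rows, cols;

            promif_get_term_size(&rows, &cols);
            (void) memset(wp, 0, sizeof (*wp));
            wp->ws_row = (unsigned short)rows;
            wp->ws_col = (unsigned short)cols;
            return (0);
        }
        case TCGETS:
            *(struct termios *)arg = promio_modes;
            return (0);
        case TCSETSW:
            promio_modes = *(struct termios *)arg;
            return (0);
        default:
            return (ENOTTY);   /* termio only sends the requests above */
        }
    }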

promif interfaces the debugger with the system's OpenBoot PROM (OBP). While x86 systems don't have PROMs, Solaris (and thus kmdb) tries very hard to pretend that they do. For the most part, this means that functions called prom_something() are named to mimic their SPARC counterparts. Whereas the SPARC versions jump into OBP, the x86 versions do whatever is necessary to implement the same functionality without a PROM. promif exposes two classes of interface: those that deal with console (terminal) I/O and those that are merely wrappers around PROM routines. We cover the former group here.

Both SPARC and x86 systems get help from the boot loader (OBP on SPARC) for console I/O during the initial stages of boot. SPARC systems without USB keyboards can use OBP for console I/O even after boot. x86 systems and SPARC systems with USB keyboards use a kernel subsystem known as polled I/O. Exposed to kmdb through the KDI, polled I/O is a method for interacting directly, and without blocking, with the I/O hardware, be it a serial driver, the USB stack, or something completely different. Rather than waiting for interrupts, as can be done while the world is turning, the polled I/O subsystem is designed to poll I/O devices until input is available or output has been sent. The bottom line is that the method used for console I/O changes during the boot process. The portion of promif dedicated to console I/O hides this complexity from consumers, exposing only routines for reading and writing bytes. Consumers need not concern themselves with where those bytes come from or go to.

21.1.4.9. Interrupt (^C) Management

Given that kmdb console I/O is synchronous, there is no easy way for an interrupt (^C) from a user to get to the core debugger. In userland, the kernel detects interrupts asynchronously, generates a signal, and inflicts it upon the process. There is no parallel in kmdb. The debugger doesn't know about pending interrupts until it reads the interrupt character from the keyboard. With a simplistic I/O implementation, reading only when we need to, a user would never be able to interrupt anything.

promif works around this limitation by implementing a read-ahead buffer. That buffer is drained when the debugger needs input from the user and is filled by a nonblocking reader whenever input is available. Attempts are made to fill the buffer whenever input is requested, when data is to be output, or when an attempt is made to read or write the kernel's address space. If an interrupt character is discovered during a buffer fill, control passes to the interrupt-handling routine, which halts the command that was executing. Debugger commands that aren't constantly writing to the console, reading from the kernel, or writing to the kernel are very rare (and probably of questionable utility). In practice, this means that a buffer fill attempt will be made soon after the user presses ^C. As a future enhancement, we could, barring the implementation of an asynchronous interrupt-delivery mechanism, expand the number of fill points. In practice, though, this doesn't seem to be necessary.
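
A simplified version of the fill path is sketched below; the names are hypothetical, the nonblocking reader and interrupt hook are stubbed out, and ^C is the byte 0x03.

    #include <stddef.h>

    #define KMDB_IOBUF_SZ 64
    #define CTRL_C        0x03

    static unsigned char kmdb_iobuf[KMDB_IOBUF_SZ];
    static size_t kmdb_iobuf_len;

    /* Hypothetical nonblocking read: returns a character, or -1 if none is
     * pending.  The real code asks the PROM or the polled I/O subsystem. */
    static int
    promif_mayget(void)
    {
        return (-1);
    }

    /* Hypothetical hook that aborts the currently executing command. */
    static void
    kmdb_raise_interrupt(void)
    {
    }

    /*
     * Called when input is requested, when output is written, and when the
     * kernel's address space is read or written.  Drains any pending
     * characters into the read-ahead buffer; a ^C discovered here interrupts
     * the running command.
     */
    void
    kmdb_iobuf_fill(void)
    {
        int c;

        while (kmdb_iobuf_len < KMDB_IOBUF_SZ && (c = promif_mayget()) != -1) {
            if (c == CTRL_C) {
                kmdb_iobuf_len = 0;   /* discard typed-ahead input */
                kmdb_raise_interrupt();
                return;
            }
            kmdb_iobuf[kmdb_iobuf_len++] = (unsigned char)c;
        }
    }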

21.1.5. Conclusion

A significant portion of the design and implementation of kmdb was spent filling in the gaping holes left when mdb was separated from its supporting libraries. Certainly, we didn't realize how much is provided by those supporting libraries until we attempted to take them away. These gaps were filled by replacement subsystems whose operations were complicated by the restrictive environment in which kmdb operates. The balance of kmdb's implementation was spent in the development of the KDI functions and in the implementation of the DPI, more on which below. The DPI provides the low-level code that allows the remainder of kmdb to be largely architecture neutral.

21.1.6. Remaining Components

In this section, we cover some remaining discussion items related to the implementation of kmdb.

21.1.6.1. The Debugger/PROM Interface (DPI)

The DPI has a somewhat sordid history, the twists and turns of which have influenced the way it appears today.

kadb on x86, having no PROM, did everything itself. The SPARC version, on the other hand, depended on a great many services provided by OBP. OBP provided trap handling for the debugger. It also took care of debugger entry and the saving of a portion of processor state, among other things.

kmdb was initially planned to be released in conjunction with an enhanced OBP. This new OBP would provide more sophisticated debugging facilities, thus freeing kmdb from having to deal with many low-level, hardware-specific details. For example, the new OBP would manage software breakpoints itself. It would capture and park processors during debugger execution. It would also manage watchpoints.

Recognizing that not all systems would have this new OBP, we initially designed kmdb with a pluggable interface that would allow for its use on systems with both types of OBP. That interface is called the Debugger/PROM interface, or DPI. SPARC would have one module for the old-style OBP interface, which we called the kadb-style interface (or kaif). SPARC would have a second module for the new-style OBP interface, the name for which has been buried in the sands of time. The debugger would choose between the two modules according to an assessment of OBP features. x86 systems would have a single module, also called kaif.

Some time into the implementation of kmdb (well after the terms DPI and kaif had cemented themselves throughout the source code), the plans for the new-style OBP were dropped. This turned out to be for the best, the reasons for which are beyond the scope of this document. As a result, modern-day kmdb has one module for each architecture. The intervening layer, the DPI, is not strictly necessary. It may not have been invented had it not been for our earlier plans to accommodate multiple styles of OBP interaction. It remains, though, and serves as a useful repository for some functionality common to the two kaif implementations.

The bulk of the kaif module is devoted to the performance of the following five tasks:

  1. Coordination of debugger entry

  2. Manipulation of processor state

  3. Source analysis for execution control

  4. Management of breakpoints and watchpoints

  5. Trap handling

21.1.6.2. Coordination of Debugger Entry

kmdb is single threaded and establishes a master-slave relationship between the CPUs on the machine. The first CPU to encounter an event that triggers debugger entry, such as a breakpoint, watchpoint, or deliberate entry, becomes the master. The master then cross-traps the remaining CPUs, causing them to enter the debugger as slaves. Slaves spin in busy loops until the world is resumed or until one of them switches places with the master. If multiple CPUs encounter debugger entry events at the same time and thus race for debugger entry, only one will win. The first to grab the master lock wins, with the remainder becoming slaves.
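
The race for the master role amounts to an atomic test-and-set on a master lock. The sketch below shows the shape of it with hypothetical names, using C11 atomics in place of the kernel's own primitives; the real code's handshaking (including master/slave switching) is more involved.

    #include <stdatomic.h>

    static atomic_flag kaif_master_lock = ATOMIC_FLAG_INIT;
    static atomic_uint kaif_session;   /* bumped when the master resumes the world */

    /* Hypothetical per-CPU entry point, reached via a trap or a cross-trap. */
    void
    kaif_debugger_entry(int cpuid)
    {
        unsigned int session = atomic_load(&kaif_session);

        if (!atomic_flag_test_and_set(&kaif_master_lock)) {
            /*
             * We won the race: become the master.  Cross-trap the other
             * CPUs so that they enter as slaves, run the debugger main
             * loop, and then release everyone.
             */
            atomic_fetch_add(&kaif_session, 1);
            atomic_flag_clear(&kaif_master_lock);
        } else {
            /*
             * Slave: spin until the master finishes this debugger session.
             * (A slave may also switch places with the master; omitted.)
             */
            while (atomic_load(&kaif_session) == session)
                ;
        }
        (void) cpuid;
    }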

21.1.6.3. Manipulation of Processor State

When processors enter the debugger, they save their register state into per-processor save areas. This state is then exposed to the user of the debugger. The kaif module coordinates the saving of this state and also implements the search routines that allow for its retrieval.

21.1.6.4. Source Analysis for Execution Control

MDB supports a number of execution control primitives. In addition to breakpoints and watchpoints, which we discuss shortly, it provides for single-step, step-over, step-out, and continue. Single-step halts execution at the next instruction. Step-over is similar, except that it does not step into subroutines. That is, it steps to the next instruction in the current routine. Step-out steps to the next instruction in the calling routine. Continue resumes system execution on all processors (single-step resumes execution only on the processor being stepped).

Single-step is implemented directly by the kaif module. On x86, this entails the setting of EFLAGS.TF. On SPARC, we set breakpoints at the next possible execution points. If the next instruction is a branch, for example, we may have to set two breakpoints to cover both possible results of the branch.

Step-over and step-out are implemented independently of single-step. For step-over, MDB calls into the target, which calls into the DPI and kaif, asking whether the next instruction requires special processing. If the next instruction is a call, kaif returns with the address of the instruction after the call. MDB places a breakpoint at that location and uses continue to "step" over the call. If the next instruction is not a call, the kaif module so indicates, and MDB uses normal single-step. When the user requests a step-out, MDB requests, through the target and the DPI, that the kaif module locate the next instruction in the calling function.

Whereas single-step releases a single processor to execute a single instruction, continue releases all processors and fully resumes the world. Continue also posts the soft interrupt to the controller if necessary, in support of debugger module management.

21.1.6.5. Management of Breakpoints and Watchpoints

Both SPARC and x86 rely on software breakpoints. That is, a specific instruction (int $3 on x86, and ta 0x7e on SPARC) is written at a given location. When control reaches that location, the debugger is entered. Breakpoints are activated by installation of one of these instructions and are deactivated by restoration of the original instruction.
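
Activation and deactivation reduce to saving and patching the instruction at the breakpoint address. A minimal sketch for the x86 case follows (hypothetical names; int $3 is the single byte 0xcc, whereas SPARC patches a full 4-byte ta 0x7e instruction).

    #include <stdint.h>

    #define KMDB_BKPT_INSTR 0xcc   /* x86 int $3 */

    typedef struct kaif_brkpt {
        uint8_t *bp_addr;   /* patched instruction address */
        uint8_t  bp_saved;  /* original byte at bp_addr */
    } kaif_brkpt_t;

    /*
     * Activate: save the original instruction byte and patch in the
     * breakpoint instruction.  (The real code goes through the debugger's
     * own memory access path, which also handles making kernel text
     * writable and flushing the instruction cache.)
     */
    void
    kaif_brkpt_activate(kaif_brkpt_t *bp)
    {
        bp->bp_saved = *bp->bp_addr;
        *bp->bp_addr = KMDB_BKPT_INSTR;
    }

    /* Deactivate: restore the original instruction byte. */
    void
    kaif_brkpt_deactivate(const kaif_brkpt_t *bp)
    {
        *bp->bp_addr = bp->bp_saved;
    }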

Watchpoints are implemented by hardware on both platforms. Space on processors being at a premium and watchpoints being relatively rarely used (though oh-so-helpful), processors don't provide many of them and impose restrictions on the ones they do. SPARC, for example, has two watchpoints: one physical and one virtual. SPARC watchpoint sizes are restricted to 8 bytes or any non-zero power of 256. x86 implements four watchpoints, even allowing watchpoints on individual I/O port numbers, but imposes restrictions on their size and access type. Watchpoints are activated by writing to the appropriate hardware registers and are deactivated by clearing those registers. The kaif ensures that the target activates only the supported number of watchpoints. It also checks to make sure that the watchpoints requested meet the hardware limitations. No attempt is made to synthesize more flexible watchpoints.

21.1.6.6. Trap Handling

On SPARC, kmdb has drastically reduced its dependency upon OBP as the project has progressed. This is somewhat ironic in light of our earlier attempts to increase that dependency. Whereas kadb allowed OBP to handle traps and to coordinate entrance into the debugger, kmdb has its own trap table, handles its own debugger entry, and even handles its own MMU misses.

kmdb also installs its own trap table on x86, although the trap table there is called an IDT. Not having ever had an OBP upon which to become dependent, Solaris x86 in-situ debuggers have always handled their own traps and debugger entry.

When kmdb gains control of the machine, it switches to its trap table. When the world resumes, the trap table used prior to debugger entry is restored. While kmdb is running, traps that are immediately resolvable by the handler (MMU misses to valid addresses, for example) are handled, and control is returned to the execution stream that caused the trap. Traps that are not resolvable by the handler cause a debugger reentry. In some cases, such as when an access is being made to the kernel's address space, the debugger takes precautions against traps resulting from those accesses. Reentry caused by such a trap results in control being transferred back to the code that initiated the access, with a return code set indicating that an error occurred. Unexpected traps are signs that something has gone wrong and are grounds for entry into a debugger fault state. The stack trace leading up to the access is displayed, and the user is offered the option to induce a crash dump.



