14.4. kmdb, the Kernel Modular Debugger

The userland debugger, mdb, debugs the running kernel and kernel crash dumps. It can also control and debug live user processes as well as user core dumps. kmdb extends the debugger's functionality to include instruction-level execution control of the kernel; mdb, by contrast, can only observe the running kernel.

The goal for kmdb is to bring the advanced debugging functionality of mdb, to the maximum extent practicable, to in-situ kernel debugging. This includes loadable debugger module support, debugger commands, the ability to process symbolic debugging information, and the various other features that make mdb so powerful.

kmdb is often compared with tracing tools like DTrace. DTrace is designed for tracing in the large: for safely examining kernel and user process execution at a function level, with minimal impact upon the running system. kmdb, on the other hand, grabs the system by the throat, stopping it in its tracks. It then allows for micro-level (per-instruction) analysis, letting users observe the execution of individual instructions and observe and change processor state. Whereas DTrace spends a great deal of energy trying to be safe, kmdb scoffs at safety, letting developers wreak unpleasantness upon the machine in furtherance of the debugging of their code.

14.4.1. Diagnosing with kmdb and moddebug

Diagnosing problems with kmdb builds on the techniques used with mdb. In this section, we cover some basic examples of how to start kmdb and how to boot the system with it.

14.4.1.1. Starting kmdb from the Console

kmdb can be started from the command line of the console login with mdb and the -K option.

# mdb -K
Welcome to kmdb
Loaded modules: [ audiosup cpc uppc ptm ufs unix zfs krtld s1394 sppp nca lofs genunix ip logindmux usba specfs pcplusmp nfs md random sctp ]
[0]> $c
kmdbmod`kaif_enter+8()
kdi_dvec_enter+0x13()
kmdbmod`kctl_modload_activate+0x112(0, fffffe85ad938000, 1)
kmdb`kdrv_activate+0xfa(4c6450)
kmdb`kdrv_ioctl+0x32(ab00000000, db0001, 4c6450, 202001, ffffffff8b483570, fffffe8000c48edc)
cdev_ioctl+0x55(ab00000000, db0001, 4c6450, 202001, ffffffff8b483570, fffffe8000c48edc)
specfs`spec_ioctl+0x99(ffffffffbc4cc880, db0001, 4c6450, 202001, ffffffff8b483570, fffffe8000c48edc)
fop_ioctl+0x2d(ffffffffbc4cc880, db0001, 4c6450, 202001, ffffffff8b483570, fffffe8000c48edc)
ioctl+0x180(4, db0001, 4c6450)
sys_syscall+0x17b()
[0]> :c

14.4.2. Booting with the Kernel Debugger

If you experience hangs or panics during Solaris boot, whether during installation or after you've already installed, using the kernel debugger can be a big help in collecting the first set of "what happened" information. You invoke the kernel debugger by supplying the -k switch in the kernel boot arguments. A common first request from a kernel engineer starting to examine a problem is therefore "try booting with kmdb."

Sometimes it's useful either to set a breakpoint to pause the kernel startup and examine something, or just to set a kernel variable to enable or disable a feature or to enable debugging output. If you use -k to invoke kmdb but also supply the -d switch, the debugger will be entered before the kernel really starts to do anything of consequence, so you can set kernel variables or breakpoints.

To enter the debugger at boot with Solaris 10, enter b -kd at the appropriate prompt; the exact procedure differs slightly depending on whether you're installing or booting an already installed system.

ok boot kmdb -d
Loading kmdb...
Welcome to kmdb
[0]>

If, instead, you're doing this with a system where GRUB boots Solaris, add -kd to the "kernel" line in the GRUB menu entry. (You can edit GRUB menu entries for the current boot from the GRUB menu interface by pressing the "e" (edit) key.)

kernel /platform/i86pc/multiboot -kd -B console=ttya

Either way, you'll drop into the kernel debugger in short order, which will announce itself with this prompt:

[0]>

Now we're in the kernel debugger. The number in square brackets is the CPU that is running the kernel debugger; that number might change for later entries into the debugger.

14.4.3. Configuring a tty Console on x86

Solaris uses a bitmap screen and keyboard by default. To facilitate remote debugging, it is often desirable to configure the system to use a serial tty console. To do this, change the bootenv.rc and GRUB boot configuration.

setprop ttya-rts-dtr-off true
setprop console 'ttya'

See /boot/solaris/bootenv.rc

Edit the GRUB boot configuration to include -B console=ttya, either via the GRUB menu at boot time or via bootadm(1M).

kernel /platform/i86pc/multiboot -kd -B console=ttya

14.4.4. Investigating Hangs

For investigating hangs, try turning on module debugging output. You can set the value of a kernel variable by using the /W command ("write a 32-bit value"). Here's how you set moddebug to 0x80000000 and then continue execution of the kernel.

[0]> moddebug/W 80000000
[0]> :c

This command gives you debug output for each kernel module that loads. The bit masks for moddebug are shown below. Often, 0x80000000 is sufficient for the majority of initial exploratory debugging.

/*
 * bit definitions for moddebug.
 */
#define MODDEBUG_LOADMSG 0x80000000 /* print "[un]loading..."
msg */
#define MODDEBUG_ERRMSG 0x40000000 /* print detailed error msgs */
#define MODDEBUG_LOADMSG2 0x20000000 /* print 2nd level msgs */
#define MODDEBUG_FINI_EBUSY 0x00020000 /* pretend fini returns EBUSY */
#define MODDEBUG_NOAUL_IPP 0x00010000 /* no Autounloading ipp mods */
#define MODDEBUG_NOAUL_DACF 0x00008000 /* no Autounloading dacf mods */
#define MODDEBUG_KEEPTEXT 0x00004000 /* keep text after unloading */
#define MODDEBUG_NOAUL_DRV 0x00001000 /* no Autounloading Drivers */
#define MODDEBUG_NOAUL_EXEC 0x00000800 /* no Autounloading Execs */
#define MODDEBUG_NOAUL_FS 0x00000400 /* no Autounloading File sys */
#define MODDEBUG_NOAUL_MISC 0x00000200 /* no Autounloading misc */
#define MODDEBUG_NOAUL_SCHED 0x00000100 /* no Autounloading scheds */
#define MODDEBUG_NOAUL_STR 0x00000080 /* no Autounloading streams */
#define MODDEBUG_NOAUL_SYS 0x00000040 /* no Autounloading syscalls */
#define MODDEBUG_NOCTF 0x00000020 /* do not load CTF debug data */
#define MODDEBUG_NOAUTOUNLOAD 0x00000010 /* no autounloading at all */
#define MODDEBUG_DDI_MOD 0x00000008 /* ddi_mod{open,sym,close} */
#define MODDEBUG_MP_MATCH 0x00000004 /* dev_minorperm */
#define MODDEBUG_MINORPERM 0x00000002 /* minor perm modctls */
#define MODDEBUG_USERDEBUG 0x00000001 /* bpt after init_module() */

See sys/modctl.h

14.4.5. Collecting Information about Panics

When the kernel panics, it drops into the debugger and prints some interesting information. Usually, however, the most interesting thing is the stack backtrace, which shows, in reverse order, all the functions that were active at the time of the panic. To generate a stack backtrace, use the following:

[0]> $c

A few other useful informational commands during a panic are ::msgbuf and ::status, as shown in Section 14.1.

[0]> ::msgbuf

shows the last things the kernel printed onscreen, and

[0]> ::status

shows a summary of the state of the machine in panic.
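
Returning briefly to the moddebug bit masks from Section 14.4.4: because each definition is a single bit, several can be enabled at once by OR-ing them together before writing the result with /W. The following is a minimal Python sketch of that arithmetic, not part of kmdb itself. The bit values are copied from the listing in Section 14.4.4; the helper names (compose, decode, kmdb_write_command) are hypothetical and exist only for illustration.

```python
# Subset of the moddebug bit definitions quoted in Section 14.4.4 (sys/modctl.h).
MODDEBUG_BITS = {
    "MODDEBUG_LOADMSG": 0x80000000,       # print "[un]loading..." msg
    "MODDEBUG_ERRMSG": 0x40000000,        # print detailed error msgs
    "MODDEBUG_LOADMSG2": 0x20000000,      # print 2nd level msgs
    "MODDEBUG_NOAUTOUNLOAD": 0x00000010,  # no autounloading at all
    "MODDEBUG_USERDEBUG": 0x00000001,     # bpt after init_module()
}


def compose(*names):
    """OR the named bits together into a single moddebug value."""
    value = 0
    for name in names:
        value |= MODDEBUG_BITS[name]
    return value


def decode(value):
    """Return the names of the known bits that are set in value."""
    return [name for name, bit in MODDEBUG_BITS.items() if value & bit]


def kmdb_write_command(*names):
    """Format the /W command text, using bare hex as in the text's example."""
    return f"moddebug/W {compose(*names):x}"


if __name__ == "__main__":
    # Load/unload messages plus detailed error messages.
    print(kmdb_write_command("MODDEBUG_LOADMSG", "MODDEBUG_ERRMSG"))
    # Which known bits does 0xa0000000 enable?
    print(decode(0xA0000000))
```

For example, enabling both MODDEBUG_LOADMSG and MODDEBUG_ERRMSG yields the value c0000000, which would be written at the kmdb prompt in the same way as the single-bit 80000000 example.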

If you're running the kernel with the kernel debugger loaded and you experience a hang, you may be able to break into the debugger to examine the system state; you can do this by pressing the <F1> and <A> keys at the same time (a sort of "F1-shifted-A" keypress). (On SPARC systems, this key sequence is <Stop>-<A>.) This should give you the same debugger prompt as above, although on a multi-CPU system you may see that the CPU number in the prompt is something other than 0. Once in the kernel debugger, you can get a stack backtrace as above; you can also use ::switch to change the CPU and get stack backtraces on the other CPUs, which might shed more light on the hang. For instance, if you break into the debugger on CPU 1, you could switch to CPU 0 with the following:

[1]> 0::switch

14.4.6. Working with Debugging Targets

For the most part, the execution control facilities provided by kmdb for the kernel mirror those provided by the mdb process target. Breakpoints (::bp), watchpoints (::wp), ::continue, and the various flavors of ::step can be used. We discuss more about debugging targets in Section 13.3 and Section 14.1. The common commands for controlling kmdb targets are summarized in Table 14.1.

14.4.7. Setting Breakpoints

Setting breakpoints with kmdb is done in the same way as with generic mdb targets, using the :b dcmd. Refer to Table 13.12 for a complete list of debugger dcmds.

# mdb -K
Loaded modules: [ crypto ]
kmdb: target stopped at:
kmdbmod`kaif_enter+8: popfq
[0]> resume:b
[0]> :c
kmdb: stop at resume
kmdb: target stopped at:
resume: movq %gs:0x18,%rax
[0]> :z
[0]> :c
#

14.4.8. Forcing a Crash Dump with halt -d

The following example shows how to force a crash dump and reboot of an x86-based system by using the halt -d and boot commands. After the dump completes, reboot the system manually.

# halt -d
May 30 15:35:15 wacked.Central.Sun.COM halt: halted by user

panic[cpu0]/thread=ffffffff83246ec0: forced crash dump initiated at user request

fffffe80006bbd60 genunix:kadmin+4c1 ()
fffffe80006bbec0 genunix:uadmin+93 ()
fffffe80006bbf10 unix:sys_syscall32+101 ()

syncing file systems... done
dumping to /dev/dsk/c1t0d0s1, offset 107675648, content: kernel
NOTICE: adpu320: bus reset
100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded

Welcome to kmdb
Loaded modules: [ audiosup crypto ufs unix krtld s1394 sppp nca uhci lofs genunix ip usba specfs nfs md random sctp ]
[0]>
kmdb: Do you really want to reboot? (y/n) y

14.4.9. Forcing a Dump with kmdb

If you cannot use the reboot -d or the halt -d command, you can use the kernel debugger, kmdb, to force a crash dump. The kernel debugger must have been loaded, either at boot or with the mdb -K command, for the following procedure to work. Enter kmdb by using L1-A on SPARC, F1-A on x86, or break on a tty.

[0]> $<systemdump

panic[cpu0]/thread=ffffffff83246ec0: forced crash dump initiated at user request

fffffe80006bbd60 genunix:kadmin+4c1 ()
fffffe80006bbec0 genunix:uadmin+93 ()
fffffe80006bbf10 unix:sys_syscall32+101 ()

syncing file systems...
done
dumping to /dev/dsk/c1t0d0s1, offset 107675648, content: kernel
NOTICE: adpu320: bus reset
100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded
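
The final line of the dump output can be turned into a rough size estimate, which is handy when checking whether the dump device and savecore directory are large enough. The sketch below is illustrative only and rests on two assumptions not stated in the output: a 4 KB page size (typical for x86), and a compression ratio defined as uncompressed size divided by compressed size. The function name estimate_dump_size is hypothetical.

```python
import re


def estimate_dump_size(line, page_size=4096):
    """Parse a dump summary line; return (uncompressed_bytes, est_compressed_bytes).

    page_size is an assumption (4 KB); pass the real value for your platform.
    """
    match = re.search(
        r"(\d+) pages dumped, compression ratio (\d+(?:\.\d+)?)", line
    )
    if match is None:
        raise ValueError("not a dump summary line")
    pages = int(match.group(1))
    ratio = float(match.group(2))
    uncompressed = pages * page_size
    # Assumes ratio = uncompressed / compressed.
    return uncompressed, uncompressed / ratio


if __name__ == "__main__":
    summary = "100% done: 38438 pages dumped, compression ratio 4.29, dump succeeded"
    raw, packed = estimate_dump_size(summary)
    print(f"uncompressed: {raw / 2**20:.1f} MiB")
    print(f"compressed (estimate): {packed / 2**20:.1f} MiB")
```

Under those assumptions, the 38438 pages in the example correspond to roughly 150 MiB of kernel memory and about 35 MiB written to the dump device.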