Possible solutions


For every programmer who tackles a given problem, you can expect a different solution. This is also true when writing macros. It's important to remember that the only "wrong" solutions are those that generate incorrect results.

The rest of this chapter shows you the macros we used to generate the example output for each exercise. We also offer some comments about the problems you may have encountered along the way, as well as some thoughts behind our own programming techniques.

Solution to exercise 1: Initial information

Did you find using the utsname macro a bit troublesome when specifying your own macro directory via the -I option of adb? If so, welcome to the group! It would be great if we could specify a pathlist with -I, but that's not the case, at least not yet!

adb does look for macros in the current working directory before going to the macro directory, so sometimes it's easier to just copy your macros to the directory where the system crash dump files reside.

Here is the macro we used to generate the example output shown earlier.

Example 14-1 The msgbuf macro
 ="Initial Dump Information"
 ="========================"
 $<</usr/kvm/lib/adb/utsname
 srpc_domain/s15t"Domain name"
 time/Y15t"Time of crash"
 lbolt>a
 *time-(*<a%0t100)=Y15t"Time of boot"
 *panicstr/n"Panic string:"ts
 =n"Stack traceback"
 =n
 $c

Note that we specified the full pathname of the utsname macro so that adb would find it.
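
The "Time of boot" line subtracts the system's uptime, in seconds, from the time of the crash. The arithmetic can be sketched in Python (not adb) as follows. This is only an illustration: it assumes lbolt counts clock ticks since boot at 100 ticks per second, which is what the macro's division by decimal 100 implies, and the function name and sample values are made up.

```python
from datetime import datetime, timezone

TICKS_PER_SECOND = 100  # assumption: matches the macro's division by 0t100


def boot_time(crash_time, lbolt, hz=TICKS_PER_SECOND):
    """Return the boot time in seconds since the epoch.

    crash_time: value of the kernel's time variable (seconds since the epoch)
    lbolt:      clock ticks counted since boot
    """
    uptime_seconds = lbolt // hz  # integer division, like adb's % operator
    return crash_time - uptime_seconds


if __name__ == "__main__":
    crash = 757_382_400  # hypothetical crash time
    ticks = 8_640_000    # hypothetical lbolt: 86,400 seconds, or one day, of uptime
    booted = boot_time(crash, ticks)
    print("Time of boot:", datetime.fromtimestamp(booted, tz=timezone.utc))
```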

Solution to exercise 2: DNLC, the directory name lookup cache

In our macro, the formula used to calculate the hit rate percentage is divided into two statements to make it easier to follow. The result of the first step is stored in variable n, which is then read into the formula in the second step.

Example 14-2 The dnlcstats macro
 ="**  Directory Name Lookup Cache Statistics  **"
 ="----------------------------------------------"
 ncsize/D"Directory name cache size"
 ncstats/D"# of cache hits that we used"
 +/D"# of misses"
 +/D"# of enters done"
 +/D"# of enters tried when already cached"
 +/D"# of long names tried to enter"
 +/D"# of long name tried to look up"
 +/D"# of times LRU list was empty"
 +/D"# of purges of cache"
 *ncstats+*(ncstats+4)+*(ncstats+14)>n
 *ncstats*0t100%<n=D"Hit rate percentage"
 ="(See /usr/include/sys/dnlc.h for more information)"

The last line simply prints a reminder of where to read about the DNLC.

The interesting thing about this script is that it will produce inaccurate results when the values used in the formula are too large. The integer math, multiplying by decimal 100, then dividing, will eventually cause overflows and precision errors that adb will not catch. You'll find many versions of the UNIX vmstat command that also report negative or otherwise bizarre percentages once the values reach a certain size.

If your values become too large, consider dividing the values by an equal amount, such as 100,000, then working out the percentage. An example of this second variation of dnlcstats can be found on the Panic! CD-ROM.
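
To see why the percentage goes wrong, and why scaling the counters down first helps, here is a small Python sketch of the same two-step formula. It is not the macro from the CD-ROM. It assumes the arithmetic wraps at signed 32 bits, and the helper names and sample counts are invented for illustration.

```python
def to_int32(value):
    """Wrap an integer to signed 32-bit two's complement (assumed word size)."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def trunc_div(a, b):
    """Integer division that truncates toward zero, as C does."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def hit_rate_naive(hits, misses, long_look):
    """hits * 100 / (hits + misses + long_look), with 32-bit wraparound."""
    total = to_int32(hits + misses + long_look)
    if total == 0:
        return 0
    return trunc_div(to_int32(hits * 100), total)


def hit_rate_scaled(hits, misses, long_look, scale=100_000):
    """Same formula, but every counter is first divided by the same amount."""
    hits, misses, long_look = hits // scale, misses // scale, long_look // scale
    total = hits + misses + long_look
    if total == 0:
        return 0  # counters too small for this scale; use the naive form
    return trunc_div(to_int32(hits * 100), total)


if __name__ == "__main__":
    # Hypothetical counters from a long-running system.
    hits, misses, long_look = 30_000_000, 5_000_000, 0
    print("naive :", hit_rate_naive(hits, misses, long_look))
    print("scaled:", hit_rate_scaled(hits, misses, long_look))
```

With these sample counts, hits multiplied by 100 no longer fits in 31 bits, so the naive form goes negative, while the scaled form stays close to the true rate of roughly 85 percent. The price of scaling is lost precision, and it is useless when the counters are smaller than the scale factor.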

Rebooting your system will reset all of the DNLC statistics back to zero.

Solution to exercise 3: Swap information

We used two macros to display the swapinfo data shown in the example output. Here they are.

Example 14-3 The swapinfo macro
 *swapinfo>c
 <c,##(<c)$<swapinfo2
 ="There is no swapinfo"

Example 14-4 The swapinfo2 macro
 <c+0x10>n
 *(<c+0x24)/"Swap file: "s
 *<c>v
 <v+24/"Type:  "tD
 *(<v+0x28)%3ffff="Major: "D
 *(<v+0x28)&3ffff="Minor: "D
 *(<c+0x18)*8="Blocks: "D
 *(<c+0x1c)*8="Free: "Dnn
 *<n>c
 <c,##(<c)$<swapinfo2

In swapinfo2, we set variable v to point to the vnode structure. This step is not strictly necessary; we did it only to make life a bit easier for the macro author, making it more apparent which values are being collected from the vnode structure and which are from the swapinfo structure.

The use of 0x in address offsets is also added for readability and is unnecessary because hexadecimal is the default in adb. However, since we chose to use variable c to point to the current swapinfo structure, we didn't want to create confusion when we used the 1c address offset.

Why are we multiplying the value of "Blocks" and "Free" by 8? We wanted to report the same numbers that the UNIX swap -l command would report. The swapinfo structure maintains swap space and free space as counts of memory pages. The swap -l command shows the number of disk blocks. Most Sun systems have memory pages that are 4096 bytes in size. Disk blocks are 512 bytes in size. Thus, there are 8 blocks per page. To find out the memory page size of your Solaris 2 system, use adb to examine the kernel symbol pagesize.

Those with access to source will find that the swap command multiplies the swapinfo "Blocks" and "Free" values by "disk blocks per page," 8, just as we have done.
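
The page-to-block conversion is simple enough to sketch in a few lines of Python. The function name is made up for this illustration, and the 4096-byte default is only the common case described above; on a system with a different page size, the multiplier of 8 hard-coded in swapinfo2 would need to change accordingly.

```python
DISK_BLOCK_SIZE = 512  # bytes per disk block, as reported by swap -l


def pages_to_blocks(pages, page_size=4096):
    """Convert a count of memory pages to a count of 512-byte disk blocks."""
    if page_size % DISK_BLOCK_SIZE != 0:
        raise ValueError("page size must be a multiple of the disk block size")
    blocks_per_page = page_size // DISK_BLOCK_SIZE  # 8 when pages are 4096 bytes
    return pages * blocks_per_page


if __name__ == "__main__":
    # Hypothetical swapinfo values, in pages.
    print("Blocks:", pages_to_blocks(16_384))
    print("Free:  ", pages_to_blocks(12_000))
```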

Looking at swapinfo2, you see that we display the major and minor numbers of the swap file by using the following lines. Were you able to figure out what these two adb command lines accomplish?

 *(<v+0x28)%3ffff="Major: "D
 *(<v+0x28)&3ffff="Minor: "D

The major and minor numbers are kept in the vnode structure as a single 32-bit word. The major number is 14 bits wide and the minor number is 18 bits wide. The following snippet from /usr/include/sys/mkdev.h confirms this.

 #define NBITSMAJOR   14       /* # of SVR4 major device bits */
 #define NBITSMINOR   18       /* # of SVR4 minor device bits */
 #define MAXMAJ       0x7f     /* SVR4 max major value, max 128 dev's */
 #define MAXMIN       0x3ffff  /* SVR4 max minor value */

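The line *(<v+0x28)%3ffff="Major: "D grabs the 11th full word of the vnode structure, v+0x28, which is where the major/minor value is kept, and divides it by hexadecimal 3ffff (18 binary bits set). In effect, this shifts the value 18 bits right, leaving only the high-order 14 bits, thus, the major number. Strictly speaking, an exact 18-bit shift is a division by hexadecimal 40000. Dividing by 3ffff gives the same result as long as the major and minor numbers add up to less than 3ffff, which holds for typical device numbers.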

The second line ANDs the same major/minor value with 0x3ffff, so that only the low-order 18 bits are used to display a decimal value, thus, the minor number.
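
For comparison, here is the same split written in Python with an explicit shift and mask, next to the divide-by-3ffff form the macro uses. This is a sketch for illustration only. The function names and the sample device numbers are invented, and the constants are taken from the mkdev.h snippet shown earlier.

```python
NBITSMINOR = 18     # from mkdev.h: number of minor device bits
MAXMIN = 0x3FFFF    # from mkdev.h: 18 low-order bits set


def split_dev(dev):
    """Split a 32-bit device number into (major, minor) by shift and mask."""
    major = (dev >> NBITSMINOR) & 0x3FFF  # high-order 14 bits
    minor = dev & MAXMIN                  # low-order 18 bits
    return major, minor


def major_by_division(dev):
    """The macro's approach: integer division by 0x3ffff."""
    return dev // MAXMIN


if __name__ == "__main__":
    dev = (32 << NBITSMINOR) | 3  # hypothetical device: major 32, minor 3
    print(split_dev(dev), major_by_division(dev))

    # The division drifts once major + minor reaches 0x3ffff.
    edge = (1 << NBITSMINOR) | MAXMIN  # major 1, largest possible minor
    print(split_dev(edge), major_by_division(edge))
```

For everyday device numbers, both methods agree. The second case shows the corner where dividing by 3ffff reports a major number one too high.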

Extra Credit Challenge: Which process on which CPU?

If you were able to write macros that succeeded in meeting the requirements of this challenge, congratulations! We are well aware that this was not an easy task!

We used two macros to generate the example output we showed earlier. Here they are.

Example 14-5 The proconcpu macro
 ncpus/"Number of CPUS:  "Xnn
 *cpu_list>c
 <c>e
 <c,#(#(<c))$<proconcpu.nxt
 cpu_list/X"(cpu_list ptr is NULL)"n

Example 14-6 The proconcpu.nxt macro
 *(<c+0t28)>n
 <c+8/X"Thread address"
 *(<c+8)>p
 <p+a0/X"Proc address"
 *(<p+a0)>j
 <j+260/s
 .,#((*(<c+8))-(*(<c+c)))="This CPU was idle"
 0,#(#(<n))&#(#(<n-<e))=n"Next CPU..."n
 <n>c
 <n,#(#(<n))&#(#(<n-<e))$<proconcpu.nxt

Unless your job is centered around writing adb macros every day, it is unlikely that you will just whip out complicated macros such as these without some trial and error. In fact, we don't mind admitting that these macros tripped us up a bit!

Should you ever get confused, remember to refer to the macros that already exist.

As you discover new and exciting things in system crash dumps (and live systems), modify your own macros. The time and effort you put into writing and maintaining your own set of macros will pay off in the long run.

Now, before we move on to a new subject, assembly language, go update your resume, adding "adb macro programmer" to your list of skills!



PANIC! UNIX System Crash Dump Analysis Handbook (Bk/CD-ROM)
ISBN: 0131493868
Year: 1994
Pages: 289
Authors: Chris Drake
