How about a software problem?


The other possibility (yes, you knew it was coming) is that another program placed a big fat zero into the ipcaccess() routine. Modules are loaded into memory with writable pages! Yes, a program running in kernel mode can write onto another module's pages, changing its instructions and data.

A dive through the SunSolve databases turns up (so far) no recorded instances of modules merrily stomping on other modules the way we have seen in this crash dump. That leaves us the following options.

  1. Search every thread that was running to see if it had very recently written a zero into ipcaccess+18. Sounds like a good idea, but how would we do this without becoming very old in the process? We will come back to this idea in a few minutes.

  2. We could query nm and see if any of the symbol values happens to match the address of ipcaccess+18. A long shot, at best.

  3. We notify the programmer who wrote the custom device driver, letting him know everything we found in the crash dump, so that he can decide if maybe his new driver stomped on the semsys module.

  4. We put the latest and greatest fixes to semsys onto the customer's system and hope that he never sees another crash like this one. Sure, it's a shotgun approach, but if he's using the semaphore code, we might as well help him avoid any other known bugs in the code. (Proactive support is a good thing!)
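Option 2 can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the analysis: it assumes nm prints BSD-style three-column lines (hex value, type letter, name), that the +18 offset is hexadecimal as adb displays it, and the function name symbols_at, the sample listing, and the addresses are all hypothetical.

```python
def symbols_at(nm_output, target):
    """Return the names of symbols whose value equals target."""
    matches = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # e.g. undefined symbols, which have no value column
        value, _type, name = fields
        try:
            if int(value, 16) == target:
                matches.append(name)
        except ValueError:
            continue  # first column was not a hex number
    return matches


# Hypothetical listing and addresses, for illustration only.
SAMPLE = """\
f5b6e000 T ipcaccess
f5b6e018 D suspicious_ptr
         U kmem_alloc
f5b6f200 T semsys
"""
IPCACCESS = 0xF5B6E000  # assumed address of ipcaccess

print(symbols_at(SAMPLE, IPCACCESS + 0x18))
```

A hit would only be a lead worth following up, and an empty result proves nothing, which is why this option remains a long shot.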

Option 1 sounds too much like looking for a needle in a haystack. But if we had the energy to wander through all of the threads, we would probably start with the threads running the unknown custom drivers. At the very least, we could look for a store instruction that writes to ipcaccess+18. While we will reject this option (in favor of option 3), we do have another trick up our sleeve, which we will try shortly.

Option 3 takes care of option 1. The programmer will be doing the same thing, and he knows the code to the driver we are most worried about.

Option 4 has been done, with the customer fully aware of the analysis, our findings, and our concerns about his system. He knows it will crash again someday unless the root cause is located.



PANIC! UNIX System Crash Dump Analysis Handbook (Bk/CD-ROM)
ISBN: 0131493868
EAN: 2147483647
Year: 1994
Pages: 289
Authors: Chris Drake
