Using adb, let's find out more about this mutex that seven threads are waiting to use.

    0xe00ecb34$<mutex
    poll_lock:
    poll_lock:      owner
                    f66d2800
    poll_lock:
    poll_lock:      lock    type    waiters
                    ff      0       fe02

This mutex structure, poll_lock, is currently owned by thread f66d2800. What is that thread and what is it doing? Let's find out by looking in the allthreads file we created.

    thread_id f66d2800
    ?() + dffff7a8
    data address not found

That output looks a bit suspicious. Since the threadlist macro calls $c, we may not have gotten all of the information we need. Let's walk this thread's stack by hand. Better yet, let's write our own macros to walk a stack for us, given a stack pointer. Here is a set of macros designed to do the job.

Example strace macro

    .="Easy to read stack traceback"n
    .,.$<strace.nxt

Example strace.nxt macro

    .>f
    .=2n
    *(<f+0t60)/i
    .="First 6 arguments..."n
    (<f+20)/6Xn
    (<f+38)/"Next stack --> "Xn
    *(<f+0t56),.$<strace.nxt

On Solaris 2.3, the thread structure maintains the stack pointer at thread+0x28. Remember to always check the header file, /usr/include/sys/thread.h, to verify the offsets. Using this, let's look at the stack for thread f66d2800, which is holding the mutex that seven other threads are waiting to access.

    *(f66d2800+28)$<strace
    Easy to read stack traceback

    abc_readit+0x90:        call    mutex_enter
    First 6 arguments...
    0xe1ed1718:     f6080f4c f6080f52 f6080f4c c8676ff f5a98e7b f5a98e60
    0xe1ed1730:     Next stack --> e1ed1758

    abc_pollit+0xd8:        call    abc_readit
    First 6 arguments...
    0xe1ed1778:     2a5d 4 b36940ff 40 c8 0
    0xe1ed1790:     Next stack --> e1ed17c8

    poll+0x348:     jmpl    %g1, %o7
    First 6 arguments...
    0xe1ed17e8:     1b80000 40 0 f67911d6 40 f5fdd2f4
    0xe1ed1800:     Next stack --> e1ed1830

    syscall+0x3d8:  jmpl    %g1, %o7
    First 6 arguments...
    0xe1ed1850:     40 f67e6160 f67911d0 0 f5d43d94 f67f2e00
    0xe1ed1868:     Next stack --> e1ed18b8

    _sys_rtt+0x4d4: call    syscall
    First 6 arguments...
    0xe1ed18d8:     e00de9f0 e1ed1eb4 0 e1ed1e90 fffffffc ffffffff
    0xe1ed18f0:     Next stack --> e1ed1938
    data address not found

Fascinating! Thread f66d2800 is trying to access a mutex lock. Let's find out more about that lock.

    f6080f4c$<mutex
    0xf6080f4c:     owner
                    190cec0
    0xf6080f4c:     lock    type    waiters
                    ff      0       e602

The owner of this unnamed mutex is actually thread e190cec0.

Note: The mutex macro on Solaris 2.3 prefixes an "f" to owner addresses when appropriate, but not the "e." Read the macro and see if you can figure out why!

Next, let's find out what thread e190cec0 is doing. This time, we'll just look at the top of its stack, then at the whole stack if we feel it's necessary.

    *(e190cec0+28)/16X
    0xe190cab8:     f66d2800 e00fd240 e00de794 e0000000
                    e0000000 ffffffe0 f5fbbcf4 f5fbbcf9
                    f5fbbcf4 f5fbbcfa f5fbbcf4 b00760ff
                    f5a98edb f5a98ec0 e190cb18 f5fd6008
    f5fd6008/i
    abc_alloc+0x54: call    mutex_enter

Oh my! Another call to mutex_enter(). We'll keep going. At this point, you should be able to follow what we are doing.

    f5fbbcf4$<mutex
    0xf5fbbcf4:     owner
                    f600ec00
    0xf5fbbcf4:     lock    type    waiters
                    ff      0       ec02
    *(f600ec00+28)/16X
    0xe1d1c368:     e1937ec0 e00fd240 e00de794 e0000000
                    e0000000 ffffffe0 f5fde918 f5fde91d
                    f5fde918 f5fde91e f5fde918 b36940ff
                    f5a98e9b f5a98e80 e1d1c3c8 f5fd9cd0
    f5fd9cd0/i
    abc_opensocket+0x40c:   call    mutex_enter
    f5fde918$<mutex
    abc_top_mutex:
    abc_top_mutex:  owner
                    f66d2800
    abc_top_mutex:
    abc_top_mutex:  lock    type    waiters
                    ff      0       e802

Does the thread address f66d2800 look familiar to you? That's the same thread that owns the poll_lock mutex that seven threads are waiting to use.

We've seen the stack tracebacks of all but one of the threads, e190cec0, which are waiting for mutex locks to become available. Checking e190cec0 in the allthreads file, we again see the "data address not found," so we might want to consider checking out the other threads that failed to be traced.
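The chain of owners traced above can also be written down as a small wait-for table and followed mechanically. The sketch below is only an illustration and is not part of the adb session: the thread and mutex addresses are copied from the output above (with e190cec0 read as the owner of the unnamed mutex, as the text does), while the `WAITS` table and the `find_cycle` helper are hypothetical names introduced here.

```python
# Wait-for edges transcribed by hand from the adb output above:
#   blocked thread -> (mutex passed to mutex_enter, owner shown by $<mutex)
# WAITS and find_cycle are illustrative names, not part of adb or Solaris.
WAITS = {
    "f66d2800": ("f6080f4c", "e190cec0"),  # abc_readit -> mutex_enter
    "e190cec0": ("f5fbbcf4", "f600ec00"),  # abc_alloc -> mutex_enter
    "f600ec00": ("f5fde918", "f66d2800"),  # abc_opensocket -> mutex_enter (abc_top_mutex)
}


def find_cycle(start, waits):
    """Follow owner links from `start`.

    Returns the list of threads forming a cycle (first thread repeated at
    the end), or None if the chain ends at a thread that is not waiting.
    """
    path = []   # threads visited, in order
    seen = {}   # thread -> index in path
    thread = start
    while thread in waits:
        if thread in seen:
            return path[seen[thread]:] + [thread]
        seen[thread] = len(path)
        path.append(thread)
        thread = waits[thread][1]  # move on to the owner of the wanted mutex
    return None


cycle = find_cycle("f66d2800", WAITS)
if cycle:
    print(" -> ".join(cycle))
```

With the three edges recorded above, following the owners from f66d2800 leads back to f66d2800. Any thread whose stack could not be traced would simply be missing from the table, so an incomplete table can hide a cycle but cannot invent one.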