Section 9.3. Analyzing Kernel Locks


9.3. Analyzing Kernel Locks

Locks are used in the kernel to serialize access to critical regions and data structures. If contention occurs around a lock, a performance problem or scalability limitation can result. Two main tools analyze lock contention in the kernel:lockstat(1M) and the DTrace lockstat provider.

9.3.1. Adaptive Locks

Adaptive locks enforce mutual exclusion to a critical section and can be acquired in most contexts in the kernel. Because adaptive locks have few context restrictions, they constitute the vast majority of synchronization primitives in the Solaris kernel. These locks are adaptive in their behavior with respect to contention. When a thread attempts to acquire a held adaptive lock, it determines if the owning thread is currently running on a CPU. If the owner is running on another CPU, the acquiring thread spins. If the owner is not running, the acquiring thread blocks.

To observe adaptive locks, first consider the spin behavior. Locks that spin excessively burn CPU cycles, behavior that is manifested as high system time. If you notice high system time with mpstat(1M), spin locks might be a contributor. You can confirm the amount of system time that results from spinning lock contention by looking at the kernel function profile; spinning locks show up as mutex_* functions high in the profile. To identify which lock is spinning and which functions are causing the lock contention, use lockstat(1M) and the DTrace lockstat provider.

Adaptive locks that block yield the CPU, and excessive blocking results in idle time and nonscalability. To identify which lock is blocking and which functions are causing the lock contention, again use lockstat(1M) and DTrace.

9.3.2. Spin Locks

Threads cannot block in some kernel contexts, such as high-level interrupt context and any context-manipulating dispatcher state. In these contexts, this restriction prevents the use of adaptive locks. Spin locks are instead used to effect mutual exclusion to critical sections in these contexts. As the name implies, the behavior of these locks in the presence of contention is to spin until the lock is released by the owning thread.

Locks that spin excessively burn CPU cycles, manifested as high system time. If you notice high system time with mpstat(1M), spin locks might be a contributor. You can confirm the amount of system time that results from spinning lock contention by looking at the kernel function profile; spinning locks show up as mutex_* functions high in the profile. To identify which lock is spinning and which functions are causing the lock contention, use lockstat(1M) and the DTrace lockstat provider.

9.3.3. Reader/Writer Locks

Readers/writer locks enforce a policy of allowing multiple readers or a single writerbut not bothto be in a critical section. These locks are typically used for structures that are searched more frequently than they are modified and for which there is substantial time in the critical section. If critical section times are short, readers/writer locks implicitly serialize over the shared memory used to implement the lock, giving them no advantage over adaptive locks. See rwlock(9F) for more details on readers/writer locks.

Reader/writer locks that block yield the CPU, and excessive blocking results in idle time and nonscalability. To identify which lock is blocking and which functions are causing the lock contention, use lockstat(1M) and the DTrace lockstat provider.

9.3.4. Thread Locks

A thread lock is a special kind of spin lock that locks a thread in order to change thread state.

9.3.5. Analyzing Locks with lockstat

The lockstat command provides summary or detail information about lock events in the kernel. By default (without the -I as previously demonstrated), it provides a systemwide summary for lock contention events for the duration of a command that is supplied as an argument. For example, to make lockstat sample for 30 seconds, we often use sleep 30 as the command. Note that lockstat doesn't actually introspect the sleep command; it's only there to control the sample window.

We recommend starting with the -P option, which sorts by the product of the number of contention events with the cost of the contention event (this puts the most resource expensive events at the top of the list).

# lockstat -P sleep 30 Adaptive mutex spin: 3486197 events in 30.031 seconds (116088 events/sec) Count indv cuml rcnt     spin Lock                   Caller ------------------------------------------------------------------------------- 1499963 43%  43% 0.00       84 pr_pidlock             pr_p_lock+0x29 1101187 32%  75% 0.00       24 0xffffffff810cdec0     pr_p_lock+0x50 285012   8%  83% 0.00       27 0xffffffff827a9858     rdip+0x506 ... 


For each type of lock, the total number of events during the sample and the length of the sample period are displayed. For each record within the lock type, the following information is provided:

  • Count. The number of contention events for this lock.

  • indv. The percentage that this record contributes to the total sample set.

  • cuml. A cumulative percentage of samples contributing to the total sample set.

  • rcnt. Average reference count. This will always be 1 for exclusive locks (mutexes, spin locks, rwlocks held as writer) but can be greater than 1 for shared locks (rwlocks held as reader).

  • nsec or spin. The average amount of time the contention event occurred for block events or the number of spins (spin locks).

  • Lock. The address or symbol name of the lock object.

  • CPU+PIL. The CPU ID and the processor interrupt level at the time of the sample. For example, if CPU 4 is interrupted while at PIL 6, this is reported as cpu[4]+6.

  • Caller. The calling function and the instruction offset within the function.

To estimate the impact of a lock, multiply Count by the cost. For example, if a blocking event on average costs 48, 944, 759 ns and the event occurs 1, 929 times in a 30-second window, we can assert that the lock is blocking threads for a total of 94 seconds during that period (30 seconds). How is this greater than 30 seconds? Multiple threads are blocking, so because of overlapping blocking events, the total blocking time can be larger than the elapsed time of the sample.

The full output from this example with the -P option follows.

# lockstat -P sleep 30 Adaptive mutex spin: 3486197 events in 30.031 seconds (116088 events/sec) Count indv cuml rcnt     spin Lock                   Caller ------------------------------------------------------------------------------- 1499963  43%  43% 0.00       84 pr_pidlock             pr_p_lock+0x29 1101187  32%  75% 0.00       24 0xffffffff810cdec0     pr_p_lock+0x50 285012   8%  83% 0.00       27 0xffffffff827a9858     rdip+0x506 212621   6%  89% 0.00       29 0xffffffff827a9858     rdip+0x134 98531   3%  92% 0.00      103 0xffffffff9321d480     releasef+0x55 92486   3%  94% 0.00       19 0xffffffff8d5c4990     ufs_lockfs_end+0x81 89404   3%  97% 0.00       27 0xffffffff8d5c4990     ufs_lockfs_begin+0x9f 83186   2%  99% 0.00       96 0xffffffff9321d480     getf+0x5d  6356   0%  99% 0.00      186 0xffffffff810cdec0     clock+0x4e9  1164   0% 100% 0.00      141 0xffffffff810cdec0     post_syscall+0x352   294   0% 100% 0.00       11 0xffffffff801a4008     segmap_smapadd+0x77   279   0% 100% 0.00       11 0xffffffff801a41d0     segmap_getmapflt+0x275   278   0% 100% 0.00       11 0xffffffff801a48f0     segmap_smapadd+0x77   276   0% 100% 0.00       11 0xffffffff801a5010     segmap_getmapflt+0x275   276   0% 100% 0.00       11 0xffffffff801a4008     segmap_getmapflt+0x275 ... Adaptive mutex block: 3328 events in 30.031 seconds (111 events/sec) Count indv cuml rcnt     nsec Lock                   Caller -------------------------------------------------------------------------------  1929  58%  58% 0.00 48944759 pr_pidlock             pr_p_lock+0x29   263   8%  66% 0.00    47017 0xffffffff810cdec0     pr_p_lock+0x50   255   8%  74% 0.00 53392369 0xffffffff9321d480     getf+0x5d   217   7%  80% 0.00    26133 0xffffffff810cdec0     clock+0x4e9   207   6%  86% 0.00   227146 0xffffffff827a9858     rdip+0x134   197   6%  92% 0.00    64467 0xffffffff8d5c4990     ufs_lockfs_begin+0x9f   122   4%  96% 0.00    64664 0xffffffff8d5c4990     ufs_lockfs_end+0x81   112   3%  99% 0.00   164559 0xffffffff827a9858     rdip+0x506 Spin lock spin: 3491 events in 30.031 seconds (116 events/sec) Count indv cuml rcnt     spin Lock                   Caller -------------------------------------------------------------------------------  2197  63%  63% 0.00     2151 turnstile_table+0xbd8  disp_lock_enter+0x35   314   9%  72% 0.00     3129 turnstile_table+0xe28  disp_lock_enter+0x35   296   8%  80% 0.00     3162 turnstile_table+0x888  disp_lock_enter+0x35   211   6%  86% 0.00     2032 turnstile_table+0x8a8  disp_lock_enter+0x35   127   4%  90% 0.00      856 turnstile_table+0x9f8  turnstile_interlock+0x171   114   3%  93% 0.00      269 turnstile_table+0x9f8  disp_lock_enter+0x35    44   1%  95% 0.00       90 0xffffffff827f4de0     disp_lock_enter_high+0x13    37   1%  96% 0.00      581 0xffffffff827f4de0     disp_lock_enter+0x35 ... Thread lock spin: 1104 events in 30.031 seconds (37 events/sec) Count indv cuml rcnt     spin Lock                   Caller -------------------------------------------------------------------------------   487  44%  44% 0.00     1671 turnstile_table+0xbd8  ts_tick+0x26   219  20%  64% 0.00     1510 turnstile_table+0xbd8  turnstile_block+0x387    92   8%  72% 0.00     1941 turnstile_table+0x8a8  ts_tick+0x26    77   7%  79% 0.00     2037 turnstile_table+0xe28  ts_tick+0x26    74   7%  86% 0.00     2296 turnstile_table+0x888  ts_tick+0x26    36   3%  89% 0.00      292 cpu[0]+0xf8            ts_tick+0x26    27   2%  92% 0.00       55 cpu[1]+0xf8            ts_tick+0x26    11   1%  93% 0.00       26 cpu[3]+0xf8            ts_tick+0x26    10   1%  94% 0.00       11 cpu[2]+0xf8            post_syscall+0x556 





Solaris Performance and Tools(c) Dtrace and Mdb Techniques for Solaris 10 and Opensolaris
Solaris Performance and Tools: DTrace and MDB Techniques for Solaris 10 and OpenSolaris
ISBN: 0131568191
EAN: 2147483647
Year: 2007
Pages: 180

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net