Re: How to debug what causes too much __mtx_lock_sleep in the system

From: John Baldwin <jhb_at_freebsd.org>
Date: Wed, 23 Oct 2013 10:25:01 -0400
On Monday, October 21, 2013 8:59:49 am Vitalij Satanivskij wrote:
> Hello.
> 
> Have 10.0-BETA1 #7 r256765 with a terrible load: "load averages: 23.31, 30.53, 31",
> 
> which degrades further over time.
> 
> The kernel is compiled with DTrace support; using the hotkernel script from DTraceToolkit-0.99 I found some strange statistics:
> 
> zfs.ko`lz4_compress                                      5045   0.2%
> kernel`0xffffffff80                                      5185   0.2%
> kernel`uma_zalloc_arg                                    5302   0.2%
> kernel`bcopy                                             5322   0.2%
> kernel`_sx_xlock                                         7310   0.3%
> kernel`_sx_xunlock                                       7434   0.3%
> zfs.ko`l2arc_feed_thread                                 9797   0.4%
> zfs.ko`lzjb_compress                                     9912   0.4%
> zfs.ko`list_prev                                        17894   0.7%
> kernel`__rw_wlock_hard                                  30522   1.2%
> kernel`spinlock_exit                                    31310   1.3%
> kernel`acpi_cpu_c1                                     103495   4.1%
> kernel`_sx_xlock_hard                                  138743   5.5%
> kernel`vmem_xalloc                                     175869   7.0%
> kernel`cpu_idle                                        371159  14.8%
> kernel`__mtx_lock_sleep                               1345815  53.8%
> 
> 
> 
> There is another machine with the same hardware, with similar data and usage, but running an older CURRENT, r245701,
> 
> which has no problems with load:
> 
> zfs.ko`fletcher_4_native                                 2366   0.1%
> kernel`uma_zfree_arg                                     2387   0.1%
> zfs.ko`lzjb_decompress                                   2392   0.1%
> kernel`__rw_rlock                                        2477   0.1%
> zfs.ko`dmu_zfetch                                        2553   0.1%
> kernel`bcopy                                             3035   0.1%
> kernel`vm_page_splay                                     3089   0.1%
> kernel`_mtx_trylock_flags_                               3346   0.2%
> kernel`bzero                                             3411   0.2%
> kernel`0xffffffff80                                      3665   0.2%
> kernel`_sx_xunlock                                       3818   0.2%
> kernel`uma_zalloc_arg                                    4216   0.2%
> kernel`vmtotal                                           4702   0.2%
> kernel`_sx_xlock                                         5117   0.2%
> kernel`free                                              5476   0.2%
> zfs.ko`lzjb_compress                                     6674   0.3%
> kernel`spinlock_exit                                    21590   1.0%
> kernel`__mtx_lock_sleep                                 40819   1.9%
> kernel`acpi_cpu_c1                                     311077  14.1%
> kernel`cpu_idle                                       1639418  74.6%
> 
> 
> 
> Both servers have the same hardware and the same software, though of course not the same system version.
> 
> So what is the right way to investigate the problem and find a resolution?

You need to determine which mutex(es) are being contested.  There is a
LOCK_PROFILING kernel option you can use to investigate this further.
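For reference, a minimal sketch of using it, assuming a stock FreeBSD 10.x kernel: add the option to the kernel config, rebuild, then toggle the profiler via sysctl while the contention is happening (option name and sysctl nodes as documented in the LOCK_PROFILING(9) manual page; the 60-second window is an arbitrary choice):

```shell
# In the kernel configuration file, before rebuilding:
#   options LOCK_PROFILING

# At run time, clear any old data, then profile while the load is high:
sysctl debug.lock.prof.reset=1
sysctl debug.lock.prof.enable=1
sleep 60
sysctl debug.lock.prof.enable=0

# Dump the per-lock contention statistics; the lines with the largest
# wait totals identify the contested mutexes and their acquisition sites:
sysctl debug.lock.prof.stats
```

Note that lock profiling adds measurable overhead, so it is usually enabled only for a short sampling window like this rather than left on permanently.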

-- 
John Baldwin
Received on Wed Oct 23 2013 - 12:30:53 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:43 UTC