Freebsd-10.0-CURRENT problem in the bottom half

From: Venkata Duvvuru <VenkatKumar.Duvvuru_at_Emulex.Com>
Date: Thu, 31 Oct 2013 14:00:02 +0000
Hi,
In Freebsd 10.0-current with Emulex's OCE driver, I observe that the bottom half is hogging all the CPU which is leading to system sluggishness. I used the same hardware to check the behavior on 9.1-RELEASE, everything is fine, bottom half is not taking more than 10% of the CPU even at the line rate speed. But with half the throughput of line rate in Freebsd-10.0-current all the CPUs peak and "top -aSCHIP" shows that it's all bottom half who is eating the CPU. I also tried with Intel's 10g NIC to see if it has got the same problem and interestingly it is also having the same problem, where bottom half is hogging all the CPU.

We did some profiling to check where the problem is. I'm pasting the profiling results for reference here. Observation from the results is that  uma_zalloc_arg==> __mtx_lock_sleep is taking quite a bit of the CPU. We sleep at a few places in the bottom half and I think the bottom half scheduling is what is causing the problem. Please let me know if you see something very obvious from the profiling data.

I'm seeing similar problem with ixgbe driver as well uma_zalloc_arg==> __mtx_lock_sleep is taking quite a bit of the CPU.

Top command results are also pasted below for both 10.0-current (where the problem is seen) and 9.1-RELEASE (where the problem is not seen).

42.99%  [372357]   __mtx_lock_sleep _at_ /boot/kernel/kernel
100.0%  [372357]    __mtx_lock_flags
  80.37%  [299245]     sosend_generic
   100.0%  [299245]      soo_write
  17.27%  [64322]      uma_zalloc_arg
   50.43%  [32438]       m_copym
   45.85%  [29490]       m_getm2
   03.72%  [2393]        tcp_output
   00.00%  [1]           oce_alloc_rx_bufs _at_ /boot/kernel/oce.ko
  02.35%  [8762]       _sleep _at_ /boot/kernel/kernel
   100.0%  [8762]        sbwait
  00.00%  [11]         in_matroute
   100.0%  [11]          rtalloc1_fib
  00.00%  [10]         uma_zfree_arg
   60.00%  [6]           m_freem
   40.00%  [4]           sbdrop_internal
  00.00%  [3]          rtalloc1_fib
   100.0%  [3]           rtalloc_ign_fib
  00.00%  [3]          tcp_output
   100.0%  [3]           tcp_usr_send
  00.00%  [1]          _cv_wait
   100.0%  [1]           usb_process

34.98%  [303027]   trash_ctor _at_ /boot/kernel/kernel
87.44%  [264956]    mb_ctor_clust
  100.0%  [264956]     uma_zalloc_arg
   100.0%  [264956]      m_getm2
12.51%  [37921]     mb_ctor_mbuf
  100.0%  [37921]      uma_zalloc_arg
   48.28%  [18308]       m_getm2
   48.18%  [18272]       m_copym
   03.54%  [1341]        tcp_output
00.05%  [150]       mb_ctor_pack
  100.0%  [150]        uma_zalloc_arg
   96.67%  [145]         m_getm2
   03.33%  [5]           oce_alloc_rx_bufs _at_ /boot/kernel/oce.ko

04.36%  [37786]    __rw_wlock_hard _at_ /boot/kernel/kernel
100.0%  [37786]     _rw_wlock_cookie
  99.88%  [37741]      tcp_usr_send
   100.0%  [37741]       sosend_generic
  00.11%  [43]         in_pcblookup_hash
   100.0%  [43]          tcp_input


Top -aSCHIP with Freebsd-10.0-CURRENT (throughput 4.5 Gbps)
  PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    CPU COMMAND
    0 root       -92    0     0K   960K -      11   4:20 100.00% [kernel{oce1 taskq}]
2325 root       103    0 70524K  6240K CPU5    5   0:39 100.00% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
    0 root       -92    0     0K   960K CPU8    8   5:17 99.37% [kernel{oce1 taskq}]
    0 root       -92    0     0K   960K CPU10  10   4:27 99.37% [kernel{oce1 taskq}]
    0 root       -92    0     0K   960K CPU9    9   3:35 98.88% [kernel{oce1 taskq}]
    0 root       -92    0     0K   960K CPU7    7   7:25 98.29% [kernel{oce1 taskq}]
   11 root       155 ki31     0K   256K RUN    12  14:55 93.90% [idle{idle: cpu12}]
   11 root       155 ki31     0K   256K CPU13  13  14:48 91.70% [idle{idle: cpu13}]
   11 root       155 ki31     0K   256K CPU14  14  18:35 89.79% [idle{idle: cpu14}]
   11 root       155 ki31     0K   256K RUN    15  20:55 87.99% [idle{idle: cpu15}]
   11 root       155 ki31     0K   256K RUN     4  19:35 73.68% [idle{idle: cpu4}]
   11 root       155 ki31     0K   256K RUN     0  19:47 66.89% [idle{idle: cpu0}]
   11 root       155 ki31     0K   256K CPU2    2  18:53 64.89% [idle{idle: cpu2}]
   11 root       155 ki31     0K   256K CPU6    6  19:55 61.18% [idle{idle: cpu6}]
   11 root       155 ki31     0K   256K CPU1    1  18:46 55.66% [idle{idle: cpu1}]
   11 root       155 ki31     0K   256K RUN     5  19:41 42.48% [idle{idle: cpu5}]
2325 root        52    0 70524K  6240K sbwait  6   0:15 38.77% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
   11 root       155 ki31     0K   256K CPU3    3  19:06 36.38% [idle{idle: cpu3}]
2325 root        49    0 70524K  6240K CPU1    1   0:14 35.06% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root        44    0 70524K  6240K sbwait  0   0:13 33.59% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root        42    0 70524K  6240K sbwait  2   0:13 33.15% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root        41    0 70524K  6240K sbwait  6   0:12 29.98% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root        36    0 70524K  6240K sbwait 15   0:09 25.10% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
2325 root        34    0 70524K  6240K CPU13  13   0:09 24.46% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}

Top -aSCHIP with Freebsd-9.1-RELEASE (throughput 9.6 Gbps)
last pid: 16802;  load averages:  0.68,  0.15,  0.05                                                                                                 up 3+06:17:33  07:35:29
187 processes: 15 running, 128 sleeping, 44 waiting
CPU 0:   0.0% user,  0.0% nice, 13.8% system,  0.4% interrupt, 85.8% idle
CPU 1:   0.0% user,  0.0% nice,  9.4% system,  0.8% interrupt, 89.8% idle
CPU 2:   0.0% user,  0.0% nice, 10.2% system,  0.0% interrupt, 89.8% idle
CPU 3:   0.0% user,  0.0% nice, 12.6% system,  0.0% interrupt, 87.4% idle
CPU 4:   0.0% user,  0.0% nice, 13.4% system,  0.0% interrupt, 86.6% idle
CPU 5:   0.4% user,  0.0% nice, 11.0% system,  0.0% interrupt, 88.6% idle
CPU 6:   0.0% user,  0.0% nice, 16.9% system,  0.0% interrupt, 83.1% idle
CPU 7:   0.0% user,  0.0% nice, 32.3% system,  0.0% interrupt, 67.7% idle
CPU 8:   0.0% user,  0.0% nice, 21.7% system,  0.0% interrupt, 78.3% idle
CPU 9:   0.0% user,  0.0% nice, 24.4% system,  0.0% interrupt, 75.6% idle
CPU 10:  0.4% user,  0.0% nice,  6.3% system,  0.0% interrupt, 93.3% idle
CPU 11:  0.0% user,  0.0% nice, 16.1% system,  0.0% interrupt, 83.9% idle
Mem: 21M Active, 2201M Inact, 1094M Wired, 1236M Buf, 8552M Free
Swap: 15G Total, 15G Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    CPU COMMAND
   11 root     155 ki31     0K   192K CPU1    1  78.1H 95.75% [idle{idle: cpu1}]
   11 root     155 ki31     0K   192K RUN    10  78.1H 95.07% [idle{idle: cpu10}]
   11 root     155 ki31     0K   192K CPU4    4  78.2H 94.29% [idle{idle: cpu4}]
   11 root     155 ki31     0K   192K CPU2    2  78.2H 94.19% [idle{idle: cpu2}]
   11 root     155 ki31     0K   192K CPU5    5  78.2H 94.09% [idle{idle: cpu5}]
   11 root     155 ki31     0K   192K CPU3    3  78.2H 94.09% [idle{idle: cpu3}]
   11 root     155 ki31     0K   192K CPU6    6  78.1H 86.77% [idle{idle: cpu6}]
   11 root     155 ki31     0K   192K CPU11  11  78.1H 86.57% [idle{idle: cpu11}]
   11 root     155 ki31     0K   192K CPU0    0  77.5H 86.28% [idle{idle: cpu0}]
   11 root     155 ki31     0K   192K RUN     8  78.1H 81.79% [idle{idle: cpu8}]
   11 root     155 ki31     0K   192K CPU9    9  78.1H 78.47% [idle{idle: cpu9}]
   11 root     155 ki31     0K   192K RUN     7  78.1H 76.86% [idle{idle: cpu7}]
    0 root     -92    0     0K   784K -       7   0:08 26.37% [kernel{oce2 taskq}]
    0 root     -92    0     0K   784K CPU4    9   0:06 22.36% [kernel{oce2 taskq}]
    0 root     -92    0     0K   784K -       8   0:07 20.65% [kernel{oce2 taskq}]
    0 root     -92    0     0K   784K CPU3    6   0:06 15.48% [kernel{oce2 taskq}]
    0 root     -92    0     0K   784K -       0   0:04 10.99% [kernel{oce2 taskq}]
    0 root     -92    0     0K   784K -      11   0:04 10.89% [kernel{oce2 taskq}]
16801 root      26    0 65196K  5292K sbwait  4   0:03 10.50% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      23    0 65196K  5292K sbwait  0   0:02  6.79% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      23    0 65196K  5292K sbwait 10   0:01  6.05% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      24    0 65196K  5292K sbwait  5   0:02  5.96% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      24    0 65196K  5292K sbwait  2   0:01  5.66% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      22    0 65196K  5292K sbwait 10   0:01  4.79% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      23    0 65196K  5292K sbwait 11   0:01  4.59% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      23    0 65196K  5292K sbwait 10   0:01  3.96% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      22    0 65196K  5292K sbwait 10   0:01  3.96% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
16801 root      21    0 65196K  5292K sbwait  3   0:01  3.27% iperf -c 12.0.238.24 -t 180 -i 3 -P 10{iperf}
Received on Thu Oct 31 2013 - 13:00:09 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:43 UTC