Re: dhclient sucks cpu usage...

From: Alexander V. Chernikov <melifaro_at_FreeBSD.org>
Date: Tue, 10 Jun 2014 21:33:15 +0400
On 10.06.2014 20:24, John-Mark Gurney wrote:
> Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 13:17 +0400:
>> On 10.06.2014 07:03, Bryan Venteicher wrote:
>>> Hi,
>>>
>>> ----- Original Message -----
>>>> So, after finding out that nc has a stupidly small buffer size (2k
>>>> even though there is space for 16k), I was still not getting as good
>>>> as performance using nc between machines, so I decided to generate some
>>>> flame graphs to try to identify issues...  (Thanks to whoever included a
>>>> full set of modules, including dtraceall on memstick!)
>>>>
>>>> So, the first one is:
>>>> https://www.funkthat.com/~jmg/em.stack.svg
>>>>
>>>> As I was browsing around, the em_handle_que was consuming quite a bit
>>>> of cpu usage for only doing ~50MB/sec over gige..  Running top -SH shows
>>>> me that the taskqueue for em was consuming about 50% cpu...  Also pretty
>>>> high for only 50MB/sec...  Looking closer, you'll see that bpf_mtap is
>>>> consuming ~3.18% (under ether_nh_input)..  I know I'm not running tcpdump
>>>> or anything, but I think dhclient uses bpf to be able to inject packets
>>>> and listen in on them, so I kill off dhclient, and instantly, the
>>>> taskqueue
>>>> thread for em drops down to 40% CPU... (transfer rate only marginally
>>>> improves, if it does)
>>>>
>>>> I decide to run another flame graph w/o dhclient running:
>>>> https://www.funkthat.com/~jmg/em.stack.nodhclient.svg
>>>>
>>>> and now _rxeof drops from 17.22% to 11.94%, pretty significant...
>>>>
>>>> So, if you care about performance, don't run dhclient...
>>>>
>>> Yes, I've noticed the same issue. It can absolutely kill performance
>>> in a VM guest. It is much more pronounced on only some of my systems,
>>> and I haven't tracked it down yet. I wonder if this is fallout from
>>> the callout work, or if there was some bpf change.
>>>
>>> I've been using the kludgey workaround patch below.
>> Hm, pretty interesting.
>> dhclient should set up a proper filter (and it looks like it does:
>> 13:10 [0] m_at_ptichko s netstat -B
>>    Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
>>   1224    em0 -ifs--l  41225922         0        11     0     0 dhclient
>> )
>> see "match" count.
>> And BPF itself adds the cost of read rwlock (+ bgp_filter() calls for
>> each consumer on interface).
>> It should not introduce significant performance penalties.
> Don't forget that it has to process the returning ack's... So, you're
Well, it can still be captured with a proper filter like "ip && udp && 
(port 67 or port 68)".
We're using tcpdump at high packet rates (>1M pps) and it does not 
influence processing _much_.
We should probably convert its rwlock to an rmlock and use per-CPU 
counters for statistics, but that's a different story.
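For reference, the read filter dhclient installs on its listening socket 
(the one with Match=11 in the netstat -B output above) amounts to only a 
handful of classic BPF instructions. A minimal sketch, assuming an 
untagged Ethernet header and checking only the UDP destination port, so 
it illustrates the idea rather than the exact program dhclient builds:

#include <sys/types.h>
#include <sys/ioctl.h>
#include <net/bpf.h>
#include <net/ethernet.h>
#include <netinet/in.h>
#include <err.h>

/*
 * Accept IPv4/UDP packets whose destination port is 67 or 68
 * (BOOTP/DHCP) and drop everything else.  Offsets assume an untagged
 * Ethernet header; non-first IP fragments are dropped as well.
 */
static struct bpf_insn dhcp_filter[] = {
	BPF_STMT(BPF_LD  | BPF_H | BPF_ABS, 12),		/* EtherType */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETHERTYPE_IP, 0, 9),
	BPF_STMT(BPF_LD  | BPF_B | BPF_ABS, 23),		/* IP protocol */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, IPPROTO_UDP, 0, 7),
	BPF_STMT(BPF_LD  | BPF_H | BPF_ABS, 20),		/* flags+frag offset */
	BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, 0x1fff, 5, 0),	/* fragment -> drop */
	BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 14),		/* X = IP header len */
	BPF_STMT(BPF_LD  | BPF_H | BPF_IND, 16),		/* UDP dst port */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 68, 1, 0),
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 67, 0, 1),
	BPF_STMT(BPF_RET | BPF_K, (u_int)-1),			/* accept packet */
	BPF_STMT(BPF_RET | BPF_K, 0),				/* drop */
};

static void
set_dhcp_filter(int bpf_fd)	/* hypothetical helper name */
{
	struct bpf_program prog = {
		.bf_len = sizeof(dhcp_filter) / sizeof(dhcp_filter[0]),
		.bf_insns = dhcp_filter,
	};

	if (ioctl(bpf_fd, BIOCSETF, &prog) == -1)
		err(1, "BIOCSETF");
}

With a program like that installed via BIOCSETF, bpf_filter() rejects 
almost everything and only the handful of matching DHCP packets get 
queued on the descriptor.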
> looking around 10k+ pps that you have to handle and pass through the
> filter...  That's a lot of packets to process...
>
> Just for a bit more "double check", instead of using the HD as a
> source, I used /dev/zero...   I ran a netstat -w 1 -I em0 when
> running the test, and I was getting ~50.7MiB/s w/ dhclient running and
> then I killed dhclient and it instantly jumped up to ~57.1MiB/s.. So I
> launched dhclient again, and it dropped back to ~50MiB/s...
dhclient uses different BPF sockets for reading and writing (and it 
moves the write socket to a privileged child process via fork()).
The problem we're facing is that dhclient does not set _any_ read 
filter on the write socket:
21:27 [0] zfscurr0# netstat -B
   Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
  1529    em0 --fs--l     86774     86769     86784  4044  3180 dhclient
--------------------------------------- ^^^^^ --------------------------
  1526    em0 -ifs--l     86789         0         1     0     0 dhclient

so all inbound traffic matches and is pushed down to it, introducing 
contention on the BPF descriptor mutex.

(That's why I've asked for netstat -B output.)
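The attached patch is not reproduced here; purely to illustrate the 
shape of a userland-side mitigation (my own sketch, not what dhclient 
or the patch actually does), a write-only consumer can keep inbound 
traffic off its descriptor by installing a one-instruction reject-all 
read filter:

#include <sys/types.h>
#include <sys/ioctl.h>
#include <net/bpf.h>
#include <err.h>

/*
 * One-instruction BPF program that rejects every packet.  Installed on
 * the write-only descriptor, bpf_filter() returns 0 for all inbound
 * traffic, so nothing is queued and the per-descriptor lock is not
 * taken on the receive path.
 */
static struct bpf_insn reject_all[] = {
	BPF_STMT(BPF_RET | BPF_K, 0),
};

static void
make_write_only(int bpf_fd)	/* hypothetical helper name */
{
	struct bpf_program prog = {
		.bf_len = 1,
		.bf_insns = reject_all,
	};

	if (ioctl(bpf_fd, BIOCSETF, &prog) == -1)
		err(1, "BIOCSETF (reject-all)");
}

With that in place the Match/Sblen/Hblen counters for the write socket 
stay at zero and the descriptor is left alone on the receive path.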

Please try the attached patch to fix this. It is not the right way to 
fix the problem; we'd be better off changing BPF behavior so that 
write-only consumers are not attached to the interface's readers.
This has been partially implemented as the net.bpf.optimize_writers 
hack, but it does not work for all direct BPF consumers (those not 
using the pcap(3) API).
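If you want to check or flip that knob programmatically rather than 
with sysctl(8), a trivial sketch using sysctlbyname(3); only the sysctl 
name comes from above, the rest is mine:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <err.h>
#include <stdio.h>

int
main(void)
{
	int cur, on = 1;
	size_t len = sizeof(cur);

	/* Read the current value of the knob. */
	if (sysctlbyname("net.bpf.optimize_writers", &cur, &len, NULL, 0) == -1)
		err(1, "sysctlbyname(read)");
	printf("net.bpf.optimize_writers = %d\n", cur);

	/* Enable it (requires root); same as sysctl net.bpf.optimize_writers=1. */
	if (cur == 0 &&
	    sysctlbyname("net.bpf.optimize_writers", NULL, NULL, &on, sizeof(on)) == -1)
		err(1, "sysctlbyname(write)");
	return (0);
}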

>
> and some of this slowness is due to nc using small buffers which I will
> fix shortly..
>
> And with witness disabled it goes from 58MiB/s to 65.7MiB/s..  In
> both cases, that's a 13% performance improvement by running w/o
> dhclient...
>
> This is using the latest memstick image, r266655 on a (Lenovo T61):
> FreeBSD 11.0-CURRENT #0 r266655: Sun May 25 18:55:02 UTC 2014
>      root_at_grind.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
> WARNING: WITNESS option enabled, expect reduced performance.
> CPU: Intel(R) Core(TM)2 Duo CPU     T7300  @ 2.00GHz (1995.05-MHz K8-class CPU)
>    Origin="GenuineIntel"  Id=0x6fb  Family=0x6  Model=0xf  Stepping=11
>    Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>    Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
>    AMD Features=0x20100800<SYSCALL,NX,LM>
>    AMD Features2=0x1<LAHF>
>    TSC: P-state invariant, performance statistics
> real memory  = 2147483648 (2048 MB)
> avail memory = 2014019584 (1920 MB)
>


Received on Tue Jun 10 2014 - 15:35:15 UTC