On 10.06.2014 22:56, John-Mark Gurney wrote:
> Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 21:33 +0400:
>> On 10.06.2014 20:24, John-Mark Gurney wrote:
>>> Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 13:17 +0400:
>>>> On 10.06.2014 07:03, Bryan Venteicher wrote:
>>>>> Hi,
>>>>>
>>>>> ----- Original Message -----
>>>>>> So, after finding out that nc has a stupidly small buffer size (2k
>>>>>> even though there is space for 16k), I was still not getting as good
>>>>>> performance using nc between machines, so I decided to generate some
>>>>>> flame graphs to try to identify issues... (Thanks to whoever included a
>>>>>> full set of modules, including dtraceall, on the memstick!)
>>>>>>
>>>>>> So, the first one is:
>>>>>> https://www.funkthat.com/~jmg/em.stack.svg
>>>>>>
>>>>>> As I was browsing around, em_handle_que was consuming quite a bit
>>>>>> of CPU for only doing ~50MB/sec over GigE. Running top -SH showed
>>>>>> me that the taskqueue for em was consuming about 50% CPU, also pretty
>>>>>> high for only 50MB/sec. Looking closer, you'll see that bpf_mtap is
>>>>>> consuming ~3.18% (under ether_nh_input). I know I'm not running tcpdump
>>>>>> or anything, but I think dhclient uses bpf to be able to inject packets
>>>>>> and listen in on them, so I killed off dhclient, and instantly the
>>>>>> taskqueue thread for em dropped to 40% CPU (the transfer rate only
>>>>>> marginally improves, if it does).
>>>>>>
>>>>>> I decided to run another flame graph w/o dhclient running:
>>>>>> https://www.funkthat.com/~jmg/em.stack.nodhclient.svg
>>>>>>
>>>>>> and now _rxeof drops from 17.22% to 11.94%, which is pretty significant...
>>>>>>
>>>>>> So, if you care about performance, don't run dhclient...
>>>>>>
>>>>> Yes, I've noticed the same issue. It can absolutely kill performance
>>>>> in a VM guest. It is much more pronounced on only some of my systems,
>>>>> and I hadn't tracked it down yet. I wonder if this is fallout from
>>>>> the callout work, or if there was some bpf change.
>>>>>
>>>>> I've been using the kludgey workaround patch below.
>>>> Hm, pretty interesting.
>>>> dhclient should set up a proper filter (and it looks like it does so:
>>>> 13:10 [0] m_at_ptichko s netstat -B
>>>>   Pid Netif   Flags      Recv  Drop Match Sblen Hblen Command
>>>>  1224   em0 -ifs--l  41225922     0    11     0     0 dhclient
>>>> )
>>>> see the "Match" count.
>>>> And BPF itself adds the cost of a read rwlock (+ bpf_filter() calls for
>>>> each consumer on the interface).
>>>> It should not introduce significant performance penalties.
>>> Don't forget that it has to process the returning ACKs... So, you're
>> Well, it can still be captured with a proper filter like "ip && udp &&
>> port 67 or port 68".
>> We're using tcpdump at high packet rates (>1M pps) and it does not
>> influence the process _much_.
>> We should probably convert its rwlock to an rmlock and use per-CPU
>> counters for statistics, but that's a different story.
>>> looking at around 10k+ pps that you have to handle and pass through the
>>> filter... That's a lot of packets to process...
>>>
>>> Just for a bit more of a double check, instead of using the HD as a
>>> source, I used /dev/zero... I ran netstat -w 1 -I em0 while running
>>> the test, and I was getting ~50.7MiB/s w/ dhclient running; then I
>>> killed dhclient and it instantly jumped up to ~57.1MiB/s. So I
>>> launched dhclient again, and it dropped back to ~50MiB/s...
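A quick illustration of the "proper filter" quoted above: a read filter such
as "ip && udp && port 67 or port 68" is what keeps the kernel's Match count
(and the per-packet filtering work) low on dhclient's read descriptor. The
following is only a rough, hypothetical sketch of how such an expression can
be compiled and inspected with libpcap; dhclient itself installs its own
precompiled BPF instruction array rather than calling pcap_compile(), so
nothing here is dhclient's actual code. Build with: cc -o dumpfilter
dumpfilter.c -lpcap

/*
 * Hedged sketch: compile the DHCP-style expression mentioned in the
 * thread into a BPF program and print its instructions, to see what a
 * well-filtered read descriptor actually has to evaluate per packet.
 */
#include <pcap.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	pcap_t *p;
	struct bpf_program prog;
	const char *expr = "ip and udp and (port 67 or port 68)";

	/* A dead handle is enough to compile a filter for Ethernet. */
	p = pcap_open_dead(DLT_EN10MB, 262144);
	if (p == NULL)
		exit(1);
	if (pcap_compile(p, &prog, expr, 1, PCAP_NETMASK_UNKNOWN) == -1) {
		fprintf(stderr, "pcap_compile: %s\n", pcap_geterr(p));
		exit(1);
	}
	bpf_dump(&prog, 1);	/* print the compiled BPF instructions */
	pcap_freecode(&prog);
	pcap_close(p);
	return (0);
}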
>> dhclient uses different BPF sockets for reading and writing (and it
>> moves the write socket to a privileged child process via fork()).
>> The problem we're facing is that dhclient does not set _any_ read
>> filter on the write socket:
>> 21:27 [0] zfscurr0# netstat -B
>>   Pid Netif   Flags    Recv   Drop  Match Sblen Hblen Command
>>  1529   em0 --fs--l   86774  86769  86784  4044  3180 dhclient
>> --------------------------------------- ^^^^^ --------------------------
>>  1526   em0 -ifs--l   86789      0      1     0     0 dhclient
>>
>> so all traffic is pushed down to it, introducing contention on the
>> BPF descriptor mutex.
>>
>> (That's why I asked for netstat -B output.)
>>
>> Please try the attached patch to fix this. It is not the right way to
>> fix the problem; we should really change BPF behavior so that
>> write-only consumers are not attached to the interface's reader list.
>> This has been partially implemented as the net.bpf.optimize_writers
>> hack, but it does not work for all direct BPF consumers (those not
>> using the pcap(3) API).
>
> Ok, looks like this patch helps the issue...
>
> netstat -B; sleep 5; netstat -B:
>   Pid Netif   Flags     Recv  Drop  Match Sblen Hblen Command
>   958   em0 --fs--l  3880000    14     35  3868  2236 dhclient
>   976   em0 -ifs--l  3880014     0      1     0     0 dhclient
>   Pid Netif   Flags     Recv  Drop  Match Sblen Hblen Command
>   958   em0 --fs--l  4178525    14     35  3868  2236 dhclient
>   976   em0 -ifs--l  4178539     0      1     0     0 dhclient
>
> and now the rate only drops from ~66MiB/s to ~63MiB/s when dhclient is
> running... Still a significant drop (~5%), but better than before...

Interesting. Can you provide some traces (pmc or dtrace ones)?

I'm unsure if this will help, but it's worth trying: could you revert my
previous patch, apply the attached kernel patch, reboot, set
net.bpf.optimize_writers to 1, and try again?
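For context on the fix being discussed: a BPF descriptor with no filter
matches every packet, which is why the write-only socket above shows a Match
count tracking the full traffic rate. A direct BPF consumer that only ever
writes can avoid this itself by installing a single "return 0" instruction as
its read filter, so the kernel's receive path never matches or buffers
packets for that descriptor. The sketch below is illustrative only and is not
dhclient's actual code; the interface name "em0" (as a caller argument) and
the helper name open_bpf_writeonly() are made up for the example.

/*
 * Hedged sketch: open a BPF descriptor intended only for writing
 * crafted frames, bind it to an interface, and install a reject-all
 * read filter so no inbound traffic is queued to it.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/bpf.h>
#include <err.h>
#include <fcntl.h>
#include <string.h>

int
open_bpf_writeonly(const char *ifname)
{
	struct ifreq ifr;
	struct bpf_program prog;
	/* Single instruction: return 0 bytes, i.e. accept no packets. */
	static struct bpf_insn reject_all[] = {
		BPF_STMT(BPF_RET + BPF_K, 0),
	};
	int fd;

	/* Opened read/write, but the caller only ever write(2)s to it. */
	if ((fd = open("/dev/bpf", O_RDWR)) == -1)
		err(1, "open /dev/bpf");

	memset(&ifr, 0, sizeof(ifr));
	strlcpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name));
	if (ioctl(fd, BIOCSETIF, &ifr) == -1)
		err(1, "BIOCSETIF");

	prog.bf_len = sizeof(reject_all) / sizeof(reject_all[0]);
	prog.bf_insns = reject_all;
	if (ioctl(fd, BIOCSETF, &prog) == -1)
		err(1, "BIOCSETF");

	return (fd);
}

The net.bpf.optimize_writers sysctl mentioned above aims for a similar effect
inside the kernel for descriptors that only ever write, without each consumer
having to install such a filter itself.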