Re: dhclient sucks cpu usage...

From: John-Mark Gurney <jmg_at_funkthat.com>
Date: Tue, 10 Jun 2014 11:49:20 -0700
Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 22:21 +0400:
> On 10.06.2014 22:11, Bryan Venteicher wrote:
> >
> >----- Original Message -----
> >>On 10.06.2014 07:03, Bryan Venteicher wrote:
> >>>Hi,
> >>>
> >>>----- Original Message -----
> >>>>So, after finding out that nc has a stupidly small buffer size (2k
> >>>>even though there is space for 16k), I was still not getting as good
> >>>>performance as expected using nc between machines, so I decided to
> >>>>generate some flame graphs to try to identify issues...  (Thanks to
> >>>>whoever included a full set of modules, including dtraceall, on the
> >>>>memstick image!)
> >>>>
> >>>>So, the first one is:
> >>>>https://www.funkthat.com/~jmg/em.stack.svg
> >>>>
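(For anyone wanting to reproduce this: the graphs were made with DTrace plus
Brendan Gregg's FlameGraph scripts.  A rough sketch follows; the probe rate,
duration, and file names are placeholders, not necessarily what was actually
used:)

kldload dtraceall
dtrace -x stackframes=100 \
    -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-30s { exit(0); }' \
    -o out.kern_stacks
stackcollapse.pl out.kern_stacks > out.folded    # from the FlameGraph repo
flamegraph.pl out.folded > em.stack.svg
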
> >>>>As I was browsing around, em_handle_que was consuming quite a bit
> >>>>of cpu usage for only doing ~50MB/sec over gige...  Running top -SH
> >>>>shows me that the taskqueue for em was consuming about 50% cpu...
> >>>>Also pretty high for only 50MB/sec...  Looking closer, you'll see
> >>>>that bpf_mtap is consuming ~3.18% (under ether_nh_input)...  I know
> >>>>I'm not running tcpdump or anything, but I think dhclient uses bpf
> >>>>to be able to inject packets and listen in on them, so I kill off
> >>>>dhclient, and instantly, the taskqueue thread for em drops down to
> >>>>40% CPU... (the transfer rate only marginally improves, if it does)
> >>>>
> >>>>I decide to run another flame graph w/o dhclient running:
> >>>>https://www.funkthat.com/~jmg/em.stack.nodhclient.svg
> >>>>
> >>>>and now _rxeof drops from 17.22% to 11.94%, pretty significant...
> >>>>
> >>>>So, if you care about performance, don't run dhclient...
> >>>>
> >>>Yes, I've noticed the same issue. It can absolutely kill performance
> >>>in a VM guest. It is much more pronounced on some of my systems than
> >>>others, and I haven't tracked it down yet. I wonder if this is fallout
> >>>from the callout work, or if there was some bpf change.
> >>>
> >>>I've been using the kludgey workaround patch below.
> >>Hm, pretty interesting.
> >>dhclient should set up a proper filter (and it looks like it does so:
> >>13:10 [0] m_at_ptichko s netstat -B
> >>    Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
> >>   1224    em0 -ifs--l  41225922         0        11     0     0 dhclient
> >>)
> >>see "match" count.
> >>And BPF itself adds the cost of a read rwlock (+ bpf_filter() calls for
> >>each consumer on the interface).
> >>It should not introduce significant performance penalties.
> >>
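(Side note: the "proper filter" here is the small BPF program dhclient
attaches so it only gets handed DHCP traffic.  Something roughly equivalent
can be inspected with tcpdump's -d flag, which prints the compiled BPF
instructions; "udp and dst port 68" is my approximation of dhclient's filter,
not lifted from its source:)

tcpdump -i em0 -d 'udp and dst port 68'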
> >
> >It will be a bit before I'm able to capture that. Here's a Flamegraph from
> >earlier in the year showing an absurd amount of time spent in bpf_mtap():
> Can you briefly describe test setup?

For mine, one machine is the sink:
nc -l 2387 > /dev/null

The machine w/ dhclient is the source:
nc carbon 2387 < /dev/zero
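
(If you want a bytes/sec figure printed when the transfer finishes, piping
through dd on the sink works; the block size here is arbitrary:)

nc -l 2387 | dd of=/dev/null bs=64k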

> (Actually I'm interested in overall pps rate, bpf filter used and match 
> ratio).

The overall rate is ~26k pps both in and out (so ~52k pps total)...
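
(One way to watch that, assuming em0 is the interface carrying the traffic,
is the per-second interface counters:)

netstat -w 1 -I em0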

So, netstat -B; sleep 5; netstat -B gives:
  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
  919    em0 --fs--l   6275907   6275938   6275961  4060  2236 dhclient
  937    em0 -ifs--l   6275992         0         1     0     0 dhclient
  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
  919    em0 --fs--l   6539717   6539752   6539775  4060  2236 dhclient
  937    em0 -ifs--l   6539806         0         1     0     0 dhclient
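
(For what it's worth, the deltas over the 5-second interval line up with the
~52k pps above:)

echo $(( (6539717 - 6275907) / 5 ))	# 52762 packets/sec seen by pid 919's descriptor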

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."