On 10.06.2014 22:56, John-Mark Gurney wrote:
> Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 21:33 +0400:
>> On 10.06.2014 20:24, John-Mark Gurney wrote:
>>> Alexander V. Chernikov wrote this message on Tue, Jun 10, 2014 at 13:17 +0400:
>>>> On 10.06.2014 07:03, Bryan Venteicher wrote:
>>>>> Hi,
>>>>>
>>>>> ----- Original Message -----
>>>>>> So, after finding out that nc has a stupidly small buffer size (2k
>>>>>> even though there is space for 16k), I was still not getting as good
>>>>>> performance using nc between machines, so I decided to generate some
>>>>>> flame graphs to try to identify issues... (Thanks to whoever included a
>>>>>> full set of modules, including dtraceall, on the memstick!)
>>>>>>
>>>>>> So, the first one is:
>>>>>> https://www.funkthat.com/~jmg/em.stack.svg
>>>>>>
>>>>>> As I was browsing around, em_handle_que was consuming quite a bit
>>>>>> of CPU for only doing ~50MB/sec over GigE. Running top -SH showed
>>>>>> me that the taskqueue for em was consuming about 50% CPU, also pretty
>>>>>> high for only 50MB/sec. Looking closer, you'll see that bpf_mtap is
>>>>>> consuming ~3.18% (under ether_nh_input). I know I'm not running tcpdump
>>>>>> or anything, but I think dhclient uses bpf to be able to inject packets
>>>>>> and listen in on them, so I killed off dhclient, and instantly the
>>>>>> taskqueue thread for em dropped to 40% CPU (the transfer rate only
>>>>>> marginally improves, if it does).
>>>>>>
>>>>>> I decided to run another flame graph w/o dhclient running:
>>>>>> https://www.funkthat.com/~jmg/em.stack.nodhclient.svg
>>>>>>
>>>>>> and now _rxeof drops from 17.22% to 11.94%, which is pretty significant...
>>>>>>
>>>>>> So, if you care about performance, don't run dhclient...
>>>>>>
>>>>> Yes, I've noticed the same issue. It can absolutely kill performance
>>>>> in a VM guest. It is much more pronounced on only some of my systems,
>>>>> and I hadn't tracked it down yet. I wonder if this is fallout from
>>>>> the callout work, or if there was some bpf change.
>>>>>
>>>>> I've been using the kludgey workaround patch below.
>>>> Hm, pretty interesting.
>>>> dhclient should set up a proper filter (and it looks like it does so:
>>>> 13:10 [0] m_at_ptichko s netstat -B
>>>>   Pid Netif   Flags      Recv  Drop Match Sblen Hblen Command
>>>>  1224   em0 -ifs--l  41225922     0    11     0     0 dhclient
>>>> )
>>>> see the "Match" count.
>>>> And BPF itself adds the cost of a read rwlock (+ bpf_filter() calls for
>>>> each consumer on the interface).
>>>> It should not introduce significant performance penalties.
>>> Don't forget that it has to process the returning ACKs... So, you're
>> Well, it can still be captured with a proper filter like "ip && udp &&
>> port 67 or port 68".
>> We're using tcpdump at high packet rates (>1M pps) and it does not
>> influence the process _much_.
>> We should probably convert its rwlock to an rmlock and use per-CPU
>> counters for statistics, but that's a different story.
>>> looking at around 10k+ pps that you have to handle and pass through the
>>> filter... That's a lot of packets to process...
>>>
>>> Just for a bit more of a double check, instead of using the HD as a
>>> source, I used /dev/zero... I ran netstat -w 1 -I em0 while running
>>> the test, and I was getting ~50.7MiB/s w/ dhclient running; then I
>>> killed dhclient and it instantly jumped up to ~57.1MiB/s. So I
>>> launched dhclient again, and it dropped back to ~50MiB/s...
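A quick illustration of the "proper filter" quoted above: a read filter such
as "ip && udp && port 67 or port 68" is what keeps the kernel's Match count
(and the per-packet filtering work) low on dhclient's read descriptor. The
following is only a rough, hypothetical sketch of how such an expression can
be compiled and inspected with libpcap; dhclient itself installs its own
precompiled BPF instruction array rather than calling pcap_compile(), so
nothing here is dhclient's actual code. Build with: cc -o dumpfilter
dumpfilter.c -lpcap

/*
 * Hedged sketch: compile the DHCP-style expression mentioned in the
 * thread into a BPF program and print its instructions, to see what a
 * well-filtered read descriptor actually has to evaluate per packet.
 */
#include <pcap.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	pcap_t *p;
	struct bpf_program prog;
	const char *expr = "ip and udp and (port 67 or port 68)";

	/* A dead handle is enough to compile a filter for Ethernet. */
	p = pcap_open_dead(DLT_EN10MB, 262144);
	if (p == NULL)
		exit(1);
	if (pcap_compile(p, &prog, expr, 1, PCAP_NETMASK_UNKNOWN) == -1) {
		fprintf(stderr, "pcap_compile: %s\n", pcap_geterr(p));
		exit(1);
	}
	bpf_dump(&prog, 1);	/* print the compiled BPF instructions */
	pcap_freecode(&prog);
	pcap_close(p);
	return (0);
}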
>> dhclient uses different BPF sockets for reading and writing (and it
>> moves the write socket to a privileged child process via fork()).
>> The problem we're facing is that dhclient does not set _any_ read
>> filter on the write socket:
>> 21:27 [0] zfscurr0# netstat -B
>>   Pid Netif   Flags    Recv   Drop  Match Sblen Hblen Command
>>  1529   em0 --fs--l   86774  86769  86784  4044  3180 dhclient
>> --------------------------------------- ^^^^^ --------------------------
>>  1526   em0 -ifs--l   86789      0      1     0     0 dhclient
>>
>> so all traffic is pushed down to it, introducing contention on the
>> BPF descriptor mutex.
>>
>> (That's why I asked for netstat -B output.)
>>
>> Please try the attached patch to fix this. It is not the right way to
>> fix the problem; we should really change BPF behavior so that
>> write-only consumers are not attached to the interface's reader list.
>> This has been partially implemented as the net.bpf.optimize_writers
>> hack, but it does not work for all direct BPF consumers (those not
>> using the pcap(3) API).
>
> Ok, looks like this patch helps the issue...
>
> netstat -B; sleep 5; netstat -B:
>   Pid Netif   Flags     Recv  Drop  Match Sblen Hblen Command
>   958   em0 --fs--l  3880000    14     35  3868  2236 dhclient
>   976   em0 -ifs--l  3880014     0      1     0     0 dhclient
>   Pid Netif   Flags     Recv  Drop  Match Sblen Hblen Command
>   958   em0 --fs--l  4178525    14     35  3868  2236 dhclient
>   976   em0 -ifs--l  4178539     0      1     0     0 dhclient
>
> and now the rate only drops from ~66MiB/s to ~63MiB/s when dhclient is
> running... Still a significant drop (~5%), but better than before...

Interesting. Can you provide some traces (pmc or dtrace ones)?

I'm unsure if this will help, but it's worth trying: could you revert my
previous patch, apply the attached kernel patch, reboot, set
net.bpf.optimize_writers to 1, and try again?
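For context on the fix being discussed: a BPF descriptor with no filter
matches every packet, which is why the write-only socket above shows a Match
count tracking the full traffic rate. A direct BPF consumer that only ever
writes can avoid this itself by installing a single "return 0" instruction as
its read filter, so the kernel's receive path never matches or buffers
packets for that descriptor. The sketch below is illustrative only and is not
dhclient's actual code; the interface name "em0" (as a caller argument) and
the helper name open_bpf_writeonly() are made up for the example.

/*
 * Hedged sketch: open a BPF descriptor intended only for writing
 * crafted frames, bind it to an interface, and install a reject-all
 * read filter so no inbound traffic is queued to it.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/bpf.h>
#include <err.h>
#include <fcntl.h>
#include <string.h>

int
open_bpf_writeonly(const char *ifname)
{
	struct ifreq ifr;
	struct bpf_program prog;
	/* Single instruction: return 0 bytes, i.e. accept no packets. */
	static struct bpf_insn reject_all[] = {
		BPF_STMT(BPF_RET + BPF_K, 0),
	};
	int fd;

	/* Opened read/write, but the caller only ever write(2)s to it. */
	if ((fd = open("/dev/bpf", O_RDWR)) == -1)
		err(1, "open /dev/bpf");

	memset(&ifr, 0, sizeof(ifr));
	strlcpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name));
	if (ioctl(fd, BIOCSETIF, &ifr) == -1)
		err(1, "BIOCSETIF");

	prog.bf_len = sizeof(reject_all) / sizeof(reject_all[0]);
	prog.bf_insns = reject_all;
	if (ioctl(fd, BIOCSETF, &prog) == -1)
		err(1, "BIOCSETF");

	return (fd);
}

The net.bpf.optimize_writers sysctl mentioned above aims for a similar effect
inside the kernel for descriptors that only ever write, without each consumer
having to install such a filter itself.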