On Mon, Dec 2, 2013 at 12:29 PM, Oleg Moskalenko <mom040267_at_gmail.com> wrote:

> Sepherosa, while reading your description I noticed another long-standing
> problem for UDP application developers: the UDP sockets are always hashed
> with 2-tuple. But UDP sockets can be "connected", too, to a remote address,
> with connect(...)
>

The connected UDP sockets will be in the connect hash, which is hashed using
faddr/laddr/fport/lport. SO_REUSEPORT only affects wildcard sockets.

> function. Unfortunately, with 2-tuple hashing, that pattern is useless for
> large-scale applications: if a large number of UDP sockets on the same
> local port are "connected" to remote address, then the kernel has to go
> through the long list of UDP sockets with the same hash value.
>
> If the connected UDP sockets would use 4-tuples, then it would be very
> helpful for the new generation of the UDP-based media applications. For
> example, servers which use DTLS protocol would become simpler and more
> efficient.
>
>

If you are talking about RSS, then igb, ixgbe and mxge (and maybe other
drivers) support RSS extension (mxge is not using RSS, but still 4-tuple
hash), which will include UDP fport/lport into Toeplitz hash calculation.
Well, for fragments of a UDP datagram, if the ports are taken into
consideration the RSS hash will be different for leading fragment and rest
of the fragments; I think that's why MS didn't include ports for UDP.

Best Regards,
sephe

> Thanks
> Oleg
>
>
>
> On Sun, Dec 1, 2013 at 8:17 PM, Sepherosa Ziehau <sepherosa_at_gmail.com> wrote:
>
>>
>>
>>
>> On Sat, Nov 30, 2013 at 2:42 AM, Ermal Luçi <eri_at_freebsd.org> wrote:
>>
>>> Well seems Dragonfly has some version of it already from commit [1].
>>>
>>>
>> The distribution algorithm was changed a little bit after initial commit
>> to gain more idle time (bnx(4) output has already been maxed out):
>>
>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/c275f18d832361be28b150d3f4fd518914bdeba6
>>
>> Well, I also addressed a reasonable concern from nginx folks (I am not
>> quite sure about Linux's position on it; Linux original implementation of
>> SO_REUSEPORT from Google had this drawback, which I mentioned in the commit
>> message):
>>
>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/02ad2f0b874fb0a45eb69750219f79f5e8982272
>>
>> As for nginx, SO_REUSEPORT patch for nginx (both 1.4.x and 1.5.x) is in
>> dports; should be easier to be back ported to FreeBSD's ports. I failed to
>> convince nginx folks to merge it into mainline and I am currently onto
>> other stuff, will come back to them later. If FreeBSD is going to
>> implement Linux's style of SO_REUSEPORT, pushing the patch to the nginx
>> mainline will be easier.
>>
>> I also put up a brief description of SO_REUSEPORT in dfly; may be useful
>> to you:
>> http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt
>>
>> Best Regards,
>> sephe
>>
>>
>>> In FreeBSD there is the framework for this by defining PCBGROUP.
>>> Also the explanation of it at [2] and [3].
>>> It can achieve approximately the same features of SO_REUSEPORT of linux.
>>> The only thing missing is the marketing behind it and, I think, better
>>> RSS support.
>>> By looking at dates the support is there before linux so all you guys
>>> looking for it can experiment with it.
>>>
>>> What I was trying to accomplish was something else from performance
>>> improvement and
>>> maybe put a sysctl behind it to make it more acceptable..
>>>
>>> [1]
>>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
>>> [2]
>>> http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
>>> [3]
>>> http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>>>
>>>
>>> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko <mom040267_at_gmail.com> wrote:
>>>
>>> > Tim, you are wrong. Read what is "multicast" definition, and read how UDP
>>> > and TCP sockets work in Linux 3.9+ kernels.
>>> >
>>> > Oleg
>>> >
>>> >
>>> > On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle <kientzle_at_freebsd.org> wrote:
>>> >
>>> >>
>>> >> On Nov 29, 2013, at 4:04 AM, Ermal Luçi <eri_at_freebsd.org> wrote:
>>> >>
>>> >> > Hello,
>>> >> >
>>> >> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
>>> >> > share the same port and possibly listening ip …
>>> >>
>>> >> These flags are used with TCP-based servers.
>>> >>
>>> >> I’ve used them to make software upgrades go more smoothly.
>>> >> Without them, the following often happens:
>>> >>
>>> >> * Old server stops. In the process, all of its TCP connections are
>>> >> closed.
>>> >>
>>> >> * Connections to old server remain in the TCP connection table until the
>>> >> remote end can acknowledge.
>>> >>
>>> >> * New server starts.
>>> >>
>>> >> * New server tries to open port but fails because that port is “still in
>>> >> use” by connections in the TCP connection table.
>>> >>
>>> >> With these flags, the new server can open the port even though
>>> >> it is “still in use” by existing connections.
>>> >>
>>> >>
>>> >> > This is not the case today.
>>> >> > Only multicast sockets seem to have the behaviour of broadcasting the data
>>> >> > to all sockets sharing the same properties through these options!
>>> >>
>>> >> That is what multicast is for.
>>> >>
>>> >> If you want the same data sent to all listeners, then
>>> >> that is multicast behavior and you should be using
>>> >> a multicast socket.
>>> >>
>>> >> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>>> >>
>>> >> You’re trying to turn all UDP sockets with those options
>>> >> into multicast sockets.
>>> >>
>>> >> If you want a multicast socket, you should ask for one.
>>> >>
>>> >> Tim
>>> >>
>>> >> _______________________________________________
>>> >> freebsd-net_at_freebsd.org mailing list
>>> >> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> >> To unsubscribe, send any mail to "freebsd-net-unsubscribe_at_freebsd.org"
>>> >>
>>> >
>>> >
>>>
>>>
>>> --
>>> Ermal
>>> _______________________________________________
>>> freebsd-current_at_freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
>>>
>>
>>
>>
>> --
>> Tomorrow Will Never Die
>
>
--
Tomorrow Will Never Die

Received on Mon Dec 02 2013 - 10:36:48 UTC
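
For readers following the hashing point raised at the top of this message, here is a toy sketch of why the lookup key matters for connected UDP sockets. It is not kernel code: the bucket function (CRC32 over the key), the table size, the addresses, the port numbers, and the helper names `bucket` and `longest_chain` are all assumptions made for illustration. It only compares the longest hash chain when many sockets on one local port are keyed by laddr/lport versus faddr/laddr/fport/lport.

```python
import zlib
from collections import Counter


def bucket(key, nbuckets):
    """Map a lookup key to a bucket with a stable, non-kernel hash (CRC32)."""
    return zlib.crc32(repr(key).encode("ascii")) % nbuckets


def longest_chain(sockets, key_fn, nbuckets=256):
    """Length of the longest hash chain when sockets are keyed by key_fn."""
    counts = Counter(bucket(key_fn(s), nbuckets) for s in sockets)
    return max(counts.values())


# Hypothetical population: 5000 UDP sockets bound to one local addr/port,
# each connect()ed to a different remote (faddr, fport).
LADDR, LPORT = "192.0.2.1", 3478
sockets = [
    (LADDR, LPORT, "198.51.100.%d" % (i % 250 + 1), 10000 + i)
    for i in range(5000)
]

# Keyed by local address/port only: every socket shares one key.
two_tuple = longest_chain(sockets, lambda s: (s[0], s[1]))
# Keyed by the full faddr/laddr/fport/lport tuple: keys are all distinct.
four_tuple = longest_chain(sockets, lambda s: s)

print("longest chain, laddr/lport key:", two_tuple)
print("longest chain, faddr/laddr/fport/lport key:", four_tuple)
```

With the 2-tuple key every socket lands in the same chain, so a lookup may walk all of them; with the 4-tuple key the sockets spread across the buckets. That is the property the reply attributes to the connect hash.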