Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

From: Sepherosa Ziehau <sepherosa_at_gmail.com>
Date: Fri, 6 Dec 2013 11:04:29 +0800

On Tue, Dec 3, 2013 at 5:41 AM, Adrian Chadd <adrian_at_freebsd.org> wrote:
>
> On 2 December 2013 03:45, Sepherosa Ziehau <sepherosa_at_gmail.com> wrote:
> >
> > On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd <adrian_at_freebsd.org> wrote:
> >
> >> Ok, so given this, how do you guarantee the UTHREAD stays on the given
> >> CPU? You assume it stays on the CPU that the initial listen socket was
> >> created on, right? If it's migrated to another CPU core then the
> >> listen queue still stays in the original hash group that's in a netisr
> >> on a different CPU?
> >
> > As I wrote in the brief introduction above, Dfly currently relies on the
> > scheduler doing the proper thing (the scheduler did a very good job
> > during my tests).  I need to export some kind of socket option to make
> > that information available to user space programs.  Forcing UTHREAD binding
> > in the kernel is not helpful, given that things are different in a reverse
> > proxy application.  And even if that kind of binding information were
> > exported to user space, a user space program would still have to poll it
> > periodically (in Dfly at least), since other programs binding to the same
> > addr/port could come and go, which causes the inp localgroup to be
> > reorganized in the current Dfly implementation.
>
> Right. I kinda gathered that. It's fine, I was conceptually thinking
> of doing some thread pinning for this anyway.
>
> How do you see this scaling on massively multi-core machines? Like 32,
> 48, 64, 128 cores? I had some vague handwav-y notion of maybe limiting

We do have a 48-core box.  It is mainly used for package building and
other things.  I haven't run network stress tests on it.  However, we
have addressed some message passing problems on it that would not
show up on 8-cpu boxes.

> the concept of pcbgroup hash / netisr threads to a subset of CPUs, or
> have them be able to float between sockets but only have 1 (or n,

Floating around may be good, but by pinning netisr to a specific CPU
you could enjoy lockless per-cpu data.
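
As a rough userland illustration of the per-cpu idea (this is not Dfly
kernel code): if each worker is the only writer of its own slot, the hot
path needs neither a lock nor an atomic.  The names percpu_stat, worker
and run_workers are hypothetical, plain pthreads stand in for netisr
threads, and the pinning itself is assumed rather than performed here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_CPUS   64
#define CACHE_LINE 64   /* assumed cache line size */

/* One slot per CPU, aligned to a cache line to avoid false sharing. */
struct percpu_stat {
    _Alignas(CACHE_LINE) uint64_t packets;
};

static struct percpu_stat stats[MAX_CPUS];

struct worker_arg {
    int      cpu;       /* slot owned by this worker */
    uint64_t npackets;  /* how many "packets" to count */
};

/* Stand-in for a netisr thread that is assumed to be pinned to arg->cpu. */
static void *worker(void *p)
{
    struct worker_arg *arg = p;

    for (uint64_t i = 0; i < arg->npackets; i++)
        stats[arg->cpu].packets++;  /* sole writer: no lock, no atomic */
    return NULL;
}

/* Start one worker per CPU slot, wait for all, and sum the slots. */
uint64_t run_workers(int ncpus, uint64_t npackets)
{
    pthread_t tid[MAX_CPUS];
    struct worker_arg args[MAX_CPUS];
    uint64_t total = 0;

    assert(ncpus > 0 && ncpus <= MAX_CPUS);
    for (int i = 0; i < ncpus; i++) {
        stats[i].packets = 0;
        args[i].cpu = i;
        args[i].npackets = npackets;
        int rc = pthread_create(&tid[i], NULL, worker, &args[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < ncpus; i++) {
        pthread_join(tid[i], NULL);   /* join orders the read below */
        total += stats[i].packets;
    }
    return total;
}
```

The sum is only exact because every slot has a single writer; once a
thread can float to another CPU's slot, the increment needs an atomic or
a lock again.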

> maybe) per socket. Or just have a fixed, smaller pool. The idea then

We used to have dedicated threads for UDP and TCP processing, but it
turns out that one netisr per cpu works best in Dfly.  You probably
need to try and measure before deciding to move to 1 or N netisrs per
cpu.
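
For anyone experimenting with this, a sketch of the dispatch side may
help: with one netisr per cpu, every connection has to map to exactly
one of them, so that all its packets are processed on the same cpu.  The
code below only shows the shape of such a mapping; the mixing function
is a generic integer hash chosen for the example, not the hash Dfly
uses, and conn_tuple and tuple_to_cpu are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Connection 4-tuple; field names are illustrative only. */
struct conn_tuple {
    uint32_t laddr, faddr;   /* local / foreign IPv4 address */
    uint16_t lport, fport;   /* local / foreign port */
};

/* Generic 32-bit integer mixer (not the kernel's hash). */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* Map a connection to a netisr index in [0, ncpus). */
unsigned tuple_to_cpu(const struct conn_tuple *t, unsigned ncpus)
{
    uint32_t ports = ((uint32_t)t->lport << 16) | t->fport;
    uint32_t h = mix32(t->laddr) ^ mix32(t->faddr + 1) ^ mix32(ports + 2);

    assert(ncpus > 0);
    return mix32(h) % ncpus;
}
```

The mapping is deterministic, so the same connection always lands on the
same netisr; changing ncpus (or limiting netisrs to a subset of cpus)
moves connections around, which is one reason to measure before
changing the layout.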

Best Regards,
sephe

> is the scheduler would need to be told that a given userland
> thread/process belongs to a given netisr thread, and to schedule them
> on the same CPU when possible.
>
> Anyway, thanks for doing this work. I only wish that you'd do it for
> FreeBSD. :-)
>
>
>
> -adrian

-- 
Tomorrow Will Never Die