On 19 February 2014 14:09, Alexander Motin <mav_at_freebsd.org> wrote:
> On 19.02.2014 23:44, Slawa Olhovchenkov wrote:
>> On Wed, Feb 19, 2014 at 11:04:49PM +0200, Alexander Motin wrote:
>>> On 19.02.2014 22:04, Adrian Chadd wrote:
>>>> On 19 February 2014 11:59, Alexander Motin <mav_at_freebsd.org> wrote:
>>>>>
>>>>>> So if we're moving towards supporting (among others) a pcbgroup / RSS
>>>>>> hash style workload distribution across CPUs to minimise
>>>>>> per-connection lock contention, we really don't want the scheduler to
>>>>>> decide it can schedule things on other CPUs under enough pressure.
>>>>>> That'll just make things worse.
>>>>>
>>>>> True, though it is also not obvious that putting a second thread on the
>>>>> CPU run queue is better than executing it right now on another core.
>>>>
>>>> Well, it depends on whether you're trying to optimise for "run all
>>>> runnable tasks as quickly as possible" or "run all runnable tasks in
>>>> contexts that minimise lock contention."
>>>>
>>>> The former sounds great as long as there's no real lock contention
>>>> going on. But as you add more chances for contention (something like
>>>> "100,000 concurrent TCP flows") you may end up having your TCP timer
>>>> firing interfere with further TXing or RXing on the same connection.
>>>
>>> 100K TCP flows probably means 100K locks. That means the chance of a
>>> lock collision on any one of them is effectively zero. More
>>> realistically, it could
>>
>> What about the 100K/N_cpu * PPS timer queue lock acquisitions for
>> removing/inserting TCP timeout callbacks?
>
> I am not sure what this formula means, but yes, per-CPU callout locks
> can much more likely be congested. They are only per-CPU, not per-flow.

It's not just that, but also TX versus RX ACK processing and further TX
being done on different threads.


-a
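For readers following the pcbgroup/RSS point at the top of the thread, the
sketch below shows the basic idea under discussion: the CPU that services a
connection is derived from the packet's RSS hash, so RX, ACK processing, and
timer work for one flow all land on one core and contend only on that flow's
locks. This is a minimal illustration, not FreeBSD's actual RSS code; the
table size and all names here are assumptions.

    #include <stdint.h>

    #define RSS_TABLE_SIZE 128   /* power of two; NIC indirection tables
                                    are often this size */

    /* Indirection table filled at boot, mapping hash buckets to CPU ids.
     * Hypothetical name, not a real kernel structure. */
    static uint8_t rss_cpu_table[RSS_TABLE_SIZE];

    /* Pick the owning CPU for a flow from the Toeplitz hash the NIC
     * already computed for the packet. */
    static inline int
    flow_to_cpu(uint32_t rss_hash)
    {
        return rss_cpu_table[rss_hash & (RSS_TABLE_SIZE - 1)];
    }

The scheme only pays off if the scheduler keeps the servicing thread on
flow_to_cpu(hash); once threads migrate under pressure, per-flow state starts
bouncing between caches and the cross-CPU contention the design exists to
avoid comes back, which is the objection Adrian raises above.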
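To make the per-CPU callout point concrete: with per-flow locks the
acquisitions spread across 100K separate locks, but every timer rearm for
every flow hashed to a CPU funnels through that single CPU's callout lock.
A rough back-of-envelope model, using assumed (not measured) flow, CPU, and
packet-rate numbers:

    #include <stdio.h>

    int main(void)
    {
        const long flows = 100000;     /* "100,000 concurrent TCP flows" */
        const long ncpu = 16;          /* assumed core count */
        const long pps_per_flow = 100; /* assumed packets/sec per flow */

        /* Each arriving segment typically rearms a TCP timer: one remove
         * plus one insert on the callout wheel, both done under the
         * per-CPU callout lock. */
        long flows_per_cpu = flows / ncpu;
        long ops_per_cpu_lock = flows_per_cpu * pps_per_flow * 2;
        long ops_per_flow_lock = pps_per_flow * 2;

        printf("ops/sec on one per-CPU callout lock: %ld\n", ops_per_cpu_lock);
        printf("ops/sec on one per-flow lock:        %ld\n", ops_per_flow_lock);
        return 0;
    }

Under these assumptions a single per-CPU callout lock sees about 1.25 million
acquisitions per second while any individual per-flow lock sees about 200,
which is why the per-CPU locks, not the per-flow ones, are conceded above to
be the more likely congestion point.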