Re: [net] protecting interfaces from races between control and data ?

From: Scott Long <scott4long_at_yahoo.com> Date: Mon, 5 Aug 2013 11:32:02 -0600 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:40 UTC

On Aug 5, 2013, at 11:20 AM, Adrian Chadd <adrian_at_freebsd.org> wrote:

> .. and I bet it's not a design pattern, and this is total conjecture on my part:
> 
> * the original drivers weren't SMP safe;
> * noone really sat down and figured out how to correctly synchronise
> all of this stuff;
> * people did the minimum amount of work to keep the driver from
> immediately crashing, but didn't really think things through at a
> larger scale.
> 
> Almost every driver is this way Luigi. :-)
> 
> 

Yes, but let me expand.  The original work to make NIC drivers SMP focused
around just putting everything under 1 lock.  The lock was acquired in
if_start and held through the transmission of the packet, it was held in
if_ioctl all the way through whatever action was taken, and it was held in
the interrupt handler over all of the RX and TX-complete processing.  This
worked fine up until the RX path called if_input() with the netisr path set
for direct dispatch.  In this mode, the code could flow up the stack and
then immediately back down into the if_start handler for the driver,
where it would try to acquire the same lock again.  Also, it meant that
forwarding packets between to interfaces would have the lock from the
RX context of one interface held into the TX context of the other interface.

Obviously, this was a big mess, so the "solution" was to just drop the
lock around the call to if_input().  No consideration was made for driver
state that might change during the lock release, nor was any consideration
made for the performance impact of dropping the lock on every packet and
then having to contend with top-half TX threads to pick it back up.  But
this became the quick-fix pattern.

As for the original question of why the RX path can operate unlocked, it's
fairly easy.  The RX machinery of most NICs is fairly separate from the TX
machinery, so little synchronization is needed.  When there's a state change
in the driver, in terms of an interface going down, a queue size changing,
or the driver being unloaded, it's up to the driver to pause and drain the
RX processing prior to this state change.  It worked well when I did it in
if_em, and it appears to work well with cxgbe.

Scott