Re: LOR on em in HEAD ( was Re: em driver regression

From: John Baldwin <jhb_at_freebsd.org>
Date: Mon, 12 Apr 2010 12:56:12 -0400
On Monday 12 April 2010 12:26:06 pm Jack Vogel wrote:
> On Mon, Apr 12, 2010 at 7:52 AM, John Baldwin <jhb_at_freebsd.org> wrote:
> 
> > On Friday 09 April 2010 3:09:24 pm Jack Vogel wrote:
> > > Someone else also pointed this out. I'm dubious about its claim.
> > > This happens because there is an RX lock taken in rxeof; it's held
> > > through the call into the stack, where it encounters another lock,
> > > hence this complaint. I've had the RX hold as it is for a long
> > > while and would rather not have to give it up. Can someone look
> > > at it and advise?
> >
> > I've seen it happen with igb.  I suspect it is a transitive lock order.
> > That is, you probably never have the UDP lock acquired before an em/igb
> > RX lock.  However, if you have an em/igb adapter TX lock held when you
> > acquire an em/igb RX lock in one place, and in if_start() you acquire the
> > TX lock while the UDP lock is held, that can trigger the LOR.
> > Specifically, those two paths would give you these two orders:
> >
> > TX -> RX
> > UDP -> TX
> >
> > which implies the order
> >
> > UDP -> RX
> >
> > (lock order relationships are transitive, just like a > b and b > c implies
> > a > c).
> >
> > However, I haven't been able to track down what the raw orders are that
> > might lead to this transitive order.  Attilio added some sysctls to dump
> > all the raw lock orders in one of the debug.witness sysctls.  You can also
> > try hardcoding the 'RX -> UDP' order using WITNESS_DEFINEORDER() before
> > any of the em/igb RX/TX locks are acquired to see what different LOR is
> > triggered.  If that LOR looks valid then you can keep hardcoding valid
> > orders until you find the invalid one.
> >
> Do you think releasing the RX lock before the stack entry would get rid of
> the problem?
> 
> Other ideas?

Well, while that might quiet the LOR, I suspect it would be masking another 
problem that is the "real" LOR.

-- 
John Baldwin
Received on Mon Apr 12 2010 - 14:57:37 UTC
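
The transitive-order argument in the message above can be illustrated with a
small model.  This is only a sketch under stated assumptions: it is not how
WITNESS is implemented, the class LockOrderGraph and the helper replay() are
hypothetical names made up for the illustration, and "TX", "RX" and "UDP" are
plain labels standing in for the em/igb and UDP locks.  The TX -> RX and
UDP -> TX orders are the suspected ones from the message, not orders that
were actually found in the driver.

```python
class LockOrderGraph:
    """Toy lock-order checker (hypothetical; not the WITNESS code)."""

    def __init__(self):
        # lock name -> set of locks that were acquired while it was held
        self.after = {}

    def _reachable(self, src, dst):
        # Depth-first search: is there a recorded chain src -> ... -> dst?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after.get(node, ()))
        return False

    def acquire(self, held, new):
        """Record the order 'held -> new'.

        Returns False (a reversal) if a chain new -> ... -> held is
        already known, even when no single path ever took both locks
        in that order directly.
        """
        if self._reachable(new, held):
            return False
        self.after.setdefault(held, set()).add(new)
        return True


def replay(orders):
    """Feed (held, new) pairs in sequence; return the pairs reported."""
    graph = LockOrderGraph()
    return [pair for pair in orders if not graph.acquire(*pair)]


# Orders in the sequence the message supposes they are first seen.
observed = [("TX", "RX"), ("UDP", "TX"), ("RX", "UDP")]
print(replay(observed))

# Same orders, but with RX -> UDP declared up front, loosely analogous
# to hardcoding it with WITNESS_DEFINEORDER().
print(replay([("RX", "UDP")] + observed[:2]))
```

In the first replay the report lands on RX -> UDP, because UDP -> TX -> RX
is already recorded.  In the second, with RX -> UDP declared first, the
report moves to UDP -> TX, which is the kind of "different LOR" the message
suggests inspecting to decide which raw order is the invalid one.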
