Re: nve locking fixes round 2

From: John Baldwin <jhb_at_freebsd.org> Date: Mon, 28 Nov 2005 12:03:59 -0500

On Friday 25 November 2005 11:09 am, Matthew Dillon wrote:
> :...
> :
> :>    The reason I set sc->pending_txs to 0 in DFly after the reinit is
> :>    because when a watchdog timeout occurs and you reset the device,
> :>    *ALL* mbufs still sitting in the transmit ring are lost.  They will
> :>    never be acknowledged, ever.  So pending_txs will never drop back to
> :>    0 on its own.  This is what led to continuous watchdog timeout reports
> :>    when, in fact, only one timeout actually occurred.
> :
> :the problem is that with some versions of the hardware you are not
> :even able to get the first packet out.
> :
> :--
> :Bjoern A. Zeeb				bzeeb at Zabbadoz dot NeT
>
>     I'm not sure if it's the same as what happened to me, but I believe
>     I have observed this as well.  But at least in my case it turned out
>     to be a bug in if_nv.c (for DFly) that issued ABI calls before
>     resetting the hardware.  I think it had something to do with nv_stop()
>     being called before the initial hardware reset and nv_stop() then making
>     an ABI call or two that expected the hardware to already be in a sane
>     state (when it wasn't).  You'd have to look at the DFly commit to see
>     for sure.
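
For reference, the pending_txs point quoted above can be sketched in a few
lines of C.  This is only a toy model, not code from if_nv.c or if_nve.c:
the struct and the sketch_* functions are made up for illustration, and the
only name taken from the thread is pending_txs.  It assumes a periodic tick
that fires the watchdog whenever transmissions still look outstanding.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver softc.  Only the
 * pending_txs counter comes from the discussion; the rest is invented. */
struct sketch_softc {
    int pending_txs;     /* packets handed to the hardware, not yet acked */
    int watchdog_fires;  /* number of timeouts reported (illustration only) */
};

/* Stand-in for the real reinit path.  After a reset the hardware has
 * forgotten every descriptor that was sitting in the transmit ring. */
static void sketch_reinit(struct sketch_softc *sc)
{
    (void)sc;
}

/* Watchdog handler: reset the device and, if asked to, zero the counter,
 * because none of the lost mbufs will ever be acknowledged. */
static void sketch_watchdog(struct sketch_softc *sc, bool clear_pending)
{
    sc->watchdog_fires++;
    sketch_reinit(sc);
    if (clear_pending)
        sc->pending_txs = 0;
}

/* Periodic tick: reports a timeout while transmissions appear outstanding. */
static void sketch_tick(struct sketch_softc *sc, bool clear_pending)
{
    if (sc->pending_txs > 0)
        sketch_watchdog(sc, clear_pending);
}
```

With three packets outstanding, ticking five times with clear_pending set
reports a single timeout, while ticking without it reports one on every
tick, which matches the continuous watchdog reports described above.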

Yes, I have that patch in my tree, though I'm not sure it is in the patch I
posted.  I'll update the patch to include that.  Actually, try the first hunk
in the patch at http://www.FreeBSD.org/~jhb/patches/nve_dffixes.patch (it is
the change Matt is referring to).
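
To make the ordering problem concrete, here is a rough sketch of one way a
stop routine could avoid touching the hardware before the first reset.  This
is not the hunk from the patch above, nor the DFly commit; the hw_reset_done
flag and all guard_* names are assumptions invented for this example, and
the "ABI call" is simulated with a counter.

```c
#include <stdbool.h>

/* Hypothetical softc fragment; neither field exists in the real driver. */
struct guard_softc {
    bool hw_reset_done;  /* assumed flag: set once the initial reset ran */
    int  abi_calls;      /* counts simulated ABI calls (illustration only) */
};

/* Simulated ABI call that expects the hardware to be in a sane state. */
static void guard_abi_call(struct guard_softc *sc)
{
    sc->abi_calls++;
}

/* Simulated initial hardware reset. */
static void guard_hw_reset(struct guard_softc *sc)
{
    sc->hw_reset_done = true;
}

/* Stop routine: skip the ABI calls if the hardware was never reset. */
static void guard_stop(struct guard_softc *sc)
{
    if (!sc->hw_reset_done)
        return;
    guard_abi_call(sc);
}

/* Init path that calls stop first, as in the scenario described above. */
static void guard_init(struct guard_softc *sc)
{
    guard_stop(sc);      /* harmless on the very first call */
    guard_hw_reset(sc);
}
```

Reordering the init path so the reset happens before the stop would be
another way to get the same effect; the flag is only used here because it
keeps the sketch short.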

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org