Re: LOR route vr0

From: M. Warner Losh <imp_at_bsdimp.com>
Date: Sat, 27 Aug 2005 12:04:48 -0600 (MDT)
In message: <20050827184153.A24510_at_fledge.watson.org>
            Robert Watson <rwatson_at_FreeBSD.org> writes:
: On Sat, 27 Aug 2005, M. Warner Losh wrote:
: 
: > : Generally speaking, network interface device driver locks follow network
: > : stack locks in the lock order.  However, I've not really looked much at
: > : the route table locking so can't speak to whether that is the case
: > : specifically for routing locks.  If it is, the below traces reflect the
: > : correct order, and you might want to add a hard-coded entry to witness in
: > : order to catch the reverse order.
: >
: > Can you post a quick summary of how to do that? I tried last night and
: > was unsuccessful...
: 
: You need to add an entry to subr_witness.c creating a graph edge between 
: the softc lock and the routing lock.  An example of an entry in 
: subr_witness.c:
: 
:          /*
:           * TCP/IP
:           */
:          { "tcp", &lock_class_mtx_sleep },
:          { "tcpinp", &lock_class_mtx_sleep },
:          { "so_snd", &lock_class_mtx_sleep },
:          { NULL, NULL },
: 
: Note that sets of ordered entries are terminated with a double-null.  This
: declares that locks of type "tcp" precede "tcpinp", which precede
: "so_snd".
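To make the ordering rule above concrete, here is a minimal userland sketch of how such a table could be read. It is a simplified model, not the kernel's code: struct order_entry, order_precedes(), the "lock_a"/"lock_b" names, and the convention that two consecutive NULL entries end the whole table are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an order entry: a lock name and a class tag. */
struct order_entry {
	const char *name;
	const char *class_name;		/* placeholder for the lock class */
};

/* Same shape as the quoted list; { NULL, NULL } ends one ordered set. */
const struct order_entry order_list[] = {
	{ "tcp", "sleep mutex" },
	{ "tcpinp", "sleep mutex" },
	{ "so_snd", "sleep mutex" },
	{ NULL, NULL },
	{ "lock_a", "sleep mutex" },	/* hypothetical second set */
	{ "lock_b", "sleep mutex" },
	{ NULL, NULL },
	{ NULL, NULL }			/* assumed end-of-table marker */
};

/*
 * Return 1 if 'before' is listed ahead of 'after' within the same
 * ordered set, 0 otherwise.
 */
int
order_precedes(const struct order_entry *list, const char *before,
    const char *after)
{
	const struct order_entry *e;
	int seen_before;

	assert(list != NULL && before != NULL && after != NULL);
	for (e = list; e->name != NULL; e++) {
		/* Walk one ordered set up to its terminator. */
		seen_before = 0;
		for (; e->name != NULL; e++) {
			if (seen_before && strcmp(e->name, after) == 0)
				return (1);
			if (strcmp(e->name, before) == 0)
				seen_before = 1;
		}
		/* e is at the set terminator; the outer e++ steps past it. */
	}
	return (0);
}
```

In this model, order_precedes(order_list, "tcp", "so_snd") is true and the reverse query is false. Names that sit in different sets have no declared order relative to each other.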

So would I add "ed1" to the list or "network driver"?

: > : Lock order reversals between the
: > : network stack and device drivers tend to occur as a result of the device
: > : driver calling into the network stack while holding the device driver
: > : mutex.
: >
: > I'm as sure as I can be that no locks are held when I call INTO the 
: > network layer.  As far as I can tell, I only do that when I call 
: > ifp->if_input, and I drop the locks to do that.
: 
: If I had to guess, you do a media status update, which can cause routing 
: socket events indicating the link went up or down.

No link monitoring, since the ED card I'm testing has no MII bus.
That might be ANOTHER problem, but it isn't this one :-).

: > : Someone (tm) should work out if the right order is route locks ->
: > : device driver locks, as it's likely a common class of bugs across many
: > : drivers.
: >
: > I just discovered the problem in my code.  I'm not sure where the
: > other order happens, but in my code I do the following:
: >
: > 	ED_LOCK(sc);
: > 	ed_setrcr(sc);
: > 	    ed_ds_getmcst(sc);
: > 		IF_ADDR_LOCK(sc->ifp);
: > 		TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
: > 		...
: > 		IF_ADDR_UNLOCK(sc->ifp);
: > 	ED_UNLOCK(sc);
: >
: > Since the lock for ED should be a leaf lock, this causes problems. I'm
: > guessing that the network layer calls into the driver with this lock 
: > held.  Without hard coding the locking into witness (see above), I'm 
: > unsure where this happens.  A quick grep of the code doesn't reveal 
: > anything obvious...
: 
: I think this case should be OK, and we should document that as being the 
: case using a hard-coded witness entry.

Rearranging the code in this case would be, at the very least, awkward.
Maybe quite difficult, but likely doable.
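For illustration only, one possible shape of such a rearrangement is to snapshot the multicast list under the address lock first, and only then take the driver lock to program the filter, so the two locks are never held together. The sketch below is a userland model using pthread mutexes. Every name in it (model_if, model_softc, snapshot_mcast, model_setrcr, hw_filter) is hypothetical and stands in for the real structures; it is not the ed(4) code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_MCAST 8

struct model_if {
	pthread_mutex_t addr_lock;	/* stands in for IF_ADDR_LOCK */
	unsigned mcast[MAX_MCAST];	/* stands in for if_multiaddrs */
	size_t nmcast;
};

struct model_softc {
	pthread_mutex_t lock;		/* stands in for ED_LOCK */
	struct model_if *ifp;
	unsigned hw_filter;		/* stands in for the multicast filter */
};

/* Phase 1: copy the address list while holding only the address lock. */
static size_t
snapshot_mcast(struct model_if *ifp, unsigned *out, size_t max)
{
	size_t i, n;

	pthread_mutex_lock(&ifp->addr_lock);
	n = ifp->nmcast < max ? ifp->nmcast : max;
	for (i = 0; i < n; i++)
		out[i] = ifp->mcast[i];
	pthread_mutex_unlock(&ifp->addr_lock);
	return (n);
}

/* Phase 2: program the "hardware" while holding only the driver lock. */
void
model_setrcr(struct model_softc *sc)
{
	unsigned copy[MAX_MCAST];
	unsigned filter = 0;
	size_t i, n;

	n = snapshot_mcast(sc->ifp, copy, MAX_MCAST);
	assert(n <= MAX_MCAST);
	for (i = 0; i < n; i++)
		filter |= 1u << (copy[i] % 32);	/* toy hash, not the real one */

	pthread_mutex_lock(&sc->lock);
	sc->hw_filter = filter;
	pthread_mutex_unlock(&sc->lock);
}
```

The cost of this shape is that the snapshot can be stale by the time the filter is written. In the code quoted above, ed_setrcr() is also entered with ED_LOCK already held, so the callers would have to change as well, which is where the awkwardness comes from.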

: > When I comment out the above IF_ADDR locks, I have no more LORs, but I
: > think maybe other problems :-).
: 
: Hmmm.  I was thinking that it was a separate issue.  Could you try adding 
: a graph edge to witness forcing the ifaddrmtx's to fall before the driver 
: mutexes, in order to identify a path by which ifaddrmtx precedes the
: driver mutex?
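As a rough model of why a forced edge helps, the sketch below records "A was held when B was acquired" edges and flags an acquisition that contradicts an existing edge. With the order declared up front, the contradicting acquisition is reported at the point where it happens. This is a toy: it only checks direct edges (no transitive closure), and the lock IDs and graph_* functions are hypothetical names, not the witness API.

```c
#include <assert.h>
#include <string.h>

#define NLOCKS 2

/* Hypothetical lock IDs for the two locks discussed above. */
enum { LK_IFADDR = 0, LK_DRIVER = 1 };

struct order_graph {
	/* before[a][b] != 0 means a is ordered before b. */
	unsigned char before[NLOCKS][NLOCKS];
};

void
graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Hard-code an edge: 'first' must be acquired before 'second'. */
void
graph_declare(struct order_graph *g, int first, int second)
{
	g->before[first][second] = 1;
}

/*
 * Called when 'next' is acquired while 'held' is already held.
 * Returns 1 if this reverses a known order, otherwise records the
 * new edge and returns 0.
 */
int
graph_acquire(struct order_graph *g, int held, int next)
{
	assert(held >= 0 && held < NLOCKS);
	assert(next >= 0 && next < NLOCKS);
	if (g->before[next][held])
		return (1);
	g->before[held][next] = 1;
	return (0);
}
```

After graph_declare(&g, LK_IFADDR, LK_DRIVER), taking the driver lock first and then the address lock makes graph_acquire() return 1, while the declared order returns 0.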

I'll try again.

Warner
Received on Sat Aug 27 2005 - 16:04:42 UTC
