Re: Locking fixes for sf(4)

From: Markus Brueffer <markus_at_freebsd.org>
Date: Mon, 29 Aug 2005 15:42:38 +0200

Hi John,

On Friday 12 August 2005 15:16, John Baldwin wrote:
> On Friday 12 August 2005 08:15 am, Christian Brueffer wrote:
> > On Thu, Aug 11, 2005 at 11:24:10AM -0400, John Baldwin wrote:
> > > On Thursday 11 August 2005 11:00 am, Christian Brueffer wrote:
> > > > On Wed, Aug 10, 2005 at 04:58:09PM -0400, John Baldwin wrote:
> > > > > I've fixed up the locking in sf(4) but do not have the hardware
> > > > > to test the changes.  Can someone please test these patches? 
> > > > > Thanks.
> > > > >
> > > > > http://www.freebsd.org/~jhb/patches/sf_locking.patch
> > > >
> > > > Results in a "recursed on non-recursive mutex" panic. 
> > > > Unfortunately the dump looks busted, I'll get a good one tomorrow
> > > > (can also test the my(4) patch then).
> > >
> > > Ok.  If you could just get the backtrace from ddb that would probably
> > > be sufficient.  Thanks for testing!
> >
> > panic: _mtx_lock_sleep: recursed on non-recursive mutex sf0 @
> > /usr/home/build/src/sys/modules/sf/../../pci/if_sf.c:477
> >
> > CPUID = 1
> > KDB: enter: panic
> > [thread pid 220 tid 100072 ]
> > Stopped at      kdb_enter+0x30: leave
> > db> tr
> > Tracing pid 220 tid 100072 td 0xc1d63480
> > kdb_enter(c079421b,1,c0793681,d8945ab4,c1d63480) at kdb_enter+0x30
> > panic(c0793681,c1ad6ab0,c08ed18a,1dd,1dd) at panic+0x14e
> > _mtx_lock_sleep(c1ac3c4c,c1d63480,0,c08ed18a,1dd) at _mtx_lock_sleep+0x47
> > _mtx_lock_flags(c1ac3c4c,0,c08ed18a,1dd,0) at _mtx_lock_flags+0x9c
> > sf_ifmedia_upd(c1adb800,1000,c08ed18a,4b7,c1ac3c4c) at sf_ifmedia_upd+0x3e
> > sf_init_locked(c1ac3c4c,0,c08ed18a,4aa,c1adb800) at sf_init_locked+0x4bc
> > sf_init(c1ac3c00,740,c07a8534,8020690c,c1ac3c00) at sf_init+0x39
> > ether_ioctl(c1adb800,8020690c,c1d67e00,c05a7cd1,0) at ether_ioctl+0x67
> > sf_ioctl(c1adb800,8020690c,c1d67e00,100,1) at sf_ioctl+0xbb
> > in_ifinit(c1adb800,c1d67e00,c1cef3d0,0,1) at in_ifinit+0x208
> > in_control(c1dfcde8,8040691a,c1cef3c0,c1adb800,c1d63480) at in_control+0x986
> > ifioctl(c1dfcde8,8040691a,c1cef3c0,c1d63480,2) at ifioctl+0x1cd
> > soo_ioctl(c1d59d80,8040691a,c1cef3c0,c19dca80,c1d63480) at soo_ioctl+0x3ef
> > ioctl(c1d63480,d8945d04,c,422,3) at ioctl+0x45d
> > syscall(3b,3b,3b,80beac0,1) at syscall+0x2c0
> > Xint0x80_syscall() at Xint0x80_syscall+0x1f
> > --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x8055473, esp = 0xbfbfe5fc, ebp = 0xbfbfee68 ---
>
> Ah, ok, thanks.  This is the first driver I've seen that calls its
> ifmedia_update routine internally.  I'll fix this and update the patch.
> Thanks!
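
For reference, the recursion in the trace above has a familiar shape: sf_init_locked() runs with the sf0 mutex held and calls sf_ifmedia_upd(), which then tries to take the same mutex again. The following is only a userland sketch of that shape and of the usual locked/unlocked split, written against POSIX threads rather than the kernel mutex(9) API. Every name in it (struct softc, sc_lock, media_upd, media_upd_locked, init_buggy, init_fixed) is made up for illustration and is not taken from if_sf.c or from the patch. An error-checking pthread mutex stands in for the non-recursive kernel mutex and returns EDEADLK where the kernel would panic.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical per-device state; not the real struct sf_softc. */
struct softc {
	pthread_mutex_t	mtx;		/* stands in for the driver mutex */
	int		held;		/* set while mtx is held (sketch only) */
	int		media_updates;	/* counts completed media updates */
};

/* Set up an error-checking mutex so a relock by the owner fails. */
static int
softc_init(struct softc *sc)
{
	pthread_mutexattr_t attr;
	int error;

	sc->held = 0;
	sc->media_updates = 0;
	error = pthread_mutexattr_init(&attr);
	if (error != 0)
		return (error);
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (error == 0)
		error = pthread_mutex_init(&sc->mtx, &attr);
	pthread_mutexattr_destroy(&attr);
	return (error);
}

static int
sc_lock(struct softc *sc)
{
	int error = pthread_mutex_lock(&sc->mtx);

	if (error == 0)
		sc->held = 1;
	return (error);
}

static void
sc_unlock(struct softc *sc)
{
	sc->held = 0;
	pthread_mutex_unlock(&sc->mtx);
}

/* Does the work; the caller must already hold the lock. */
static void
media_upd_locked(struct softc *sc)
{
	assert(sc->held);
	sc->media_updates++;
}

/* External entry point: takes the lock, then does the work. */
static int
media_upd(struct softc *sc)
{
	int error = sc_lock(sc);

	if (error != 0)
		return (error);
	media_upd_locked(sc);
	sc_unlock(sc);
	return (0);
}

/* Shape of the failing path: relocks a mutex this thread already holds. */
static int
init_buggy(struct softc *sc)
{
	int error = sc_lock(sc);

	if (error != 0)
		return (error);
	error = media_upd(sc);		/* fails with EDEADLK */
	sc_unlock(sc);
	return (error);
}

/* Split version: internal callers use the _locked variant directly. */
static int
init_fixed(struct softc *sc)
{
	int error = sc_lock(sc);

	if (error != 0)
		return (error);
	media_upd_locked(sc);
	sc_unlock(sc);
	return (0);
}
```

In this sketch init_buggy() returns EDEADLK without performing the update, while init_fixed() completes it. Whether the updated patch uses this exact split is not stated in the thread.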

I'm getting this LOR with the latest if_sf.c in RELENG_6:

lock order reversal
 1st 0xc1b0d7cc sf0 (network driver) @ /usr/src/sys/pci/if_sf.c:1201
 2nd 0xc07a49e0 Giant (Giant) @ /usr/src/sys/kern/kern_poll.c:460
KDB: stack backtrace:
kdb_backtrace(c0742772,c07a49e0,c074dd5f,c074dd5f,c073e09e) at kdb_backtrace+0x2e
witness_checkorder(c07a49e0,9,c073e09e,1cc,18a) at witness_checkorder+0x6c3
_mtx_lock_flags(c07a49e0,0,c073e09e,1cc,c1b0d780) at _mtx_lock_flags+0x8a
ether_poll_deregister(c1af3000,0,c074def0,5c3,c1af3000) at ether_poll_deregister+0x2e
sf_stop(c1b0d780,1,c074def0,4be,c1b0d780) at sf_stop+0x52
sf_init_locked(c1b0d780,0,c074def0,4b1,c1af3000) at sf_init_locked+0x44
sf_init(c1b0d780,c055f18d,c07ac2c0,8020690c,c1b0d780) at sf_init+0x3a
ether_ioctl(c1af3000,8020690c,c1c19a00,c07423ea,0) at ether_ioctl+0x67
sf_ioctl(c1af3000,8020690c,c1c19a00,c1c19a7c,1) at sf_ioctl+0x270
in_ifinit(c1af3000,c1c19a00,c1c6aa10,0,1) at in_ifinit+0x208
in_control(c1e98de8,8040691a,c1c6aa00,c1af3000,c1c16a80) at in_control+0x986
ifioctl(c1e98de8,8040691a,c1c6aa00,c1c16a80,2) at ifioctl+0x1bc
soo_ioctl(c1c57168,8040691a,c1c6aa00,c19d7a80,c1c16a80) at soo_ioctl+0x3ef
ioctl(c1c16a80,d7827d04,c,422,3) at ioctl+0x45d
syscall(3b,3b,3b,8058aa0,0) at syscall+0x2c0
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x280d17ef, esp = 0xbfbfe99c, ebp = 0xbfbfe9c8 ---

brueffer@galaxy:/usr/src/sys/pci > ident if_sf.c
if_sf.c:
     $FreeBSD: src/sys/pci/if_sf.c,v 1.82.2.3 2005/08/26 14:50:16 jhb Exp $

This results in a stalling network connection (watchdog timeout messages on
the console), sometimes after 5 minutes, sometimes after several hours.
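
To make the report above easier to read: sf0 is already held (taken at if_sf.c:1201) when ether_poll_deregister(), called from sf_stop(), asks for Giant, and WITNESS flags that as the reverse of the order it expects. Below is a toy model of that check, not the WITNESS implementation and not a proposed fix. The rank values, struct lock_tracker and all function names are assumptions made up for this sketch. It only counts a reversal when a lower-ranked lock is requested while a higher-ranked one is held, and shows that calling the Giant-taking routine after the driver lock is released does not trip the check.

```c
/* Hypothetical lock ranks: lower rank must be taken first. */
enum lock_rank { RANK_GIANT = 0, RANK_DRIVER = 1, RANK_COUNT };

/* Toy per-thread bookkeeping; the real WITNESS code is far more involved. */
struct lock_tracker {
	int	held[RANK_COUNT];	/* nonzero while that rank is held */
	int	reversals;		/* number of order violations seen */
};

/* Record an acquisition and count a reversal if a later rank is held. */
static void
tracker_acquire(struct lock_tracker *t, enum lock_rank r)
{
	int i;

	for (i = r + 1; i < RANK_COUNT; i++) {
		if (t->held[i]) {
			t->reversals++;
			break;
		}
	}
	t->held[r] = 1;
}

static void
tracker_release(struct lock_tracker *t, enum lock_rank r)
{
	t->held[r] = 0;
}

/* Stand-in for a routine that takes Giant internally. */
static void
poll_deregister(struct lock_tracker *t)
{
	tracker_acquire(t, RANK_GIANT);
	tracker_release(t, RANK_GIANT);
}

/* Shape of the reported path: Giant requested with the driver lock held. */
static void
stop_with_driver_lock_held(struct lock_tracker *t)
{
	tracker_acquire(t, RANK_DRIVER);
	poll_deregister(t);		/* counted as a reversal */
	tracker_release(t, RANK_DRIVER);
}

/* One possible alternative: call it only after the driver lock is dropped. */
static void
stop_then_deregister(struct lock_tracker *t)
{
	tracker_acquire(t, RANK_DRIVER);
	/* ... driver-private teardown would go here ... */
	tracker_release(t, RANK_DRIVER);
	poll_deregister(t);		/* no driver lock held, no reversal */
}
```

Starting from a zeroed tracker, stop_with_driver_lock_held() leaves reversals at 1 and a following stop_then_deregister() leaves it unchanged. Whether dropping the lock around the deregister call is safe in the real driver depends on what else sf_stop() protects, which this model does not capture.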

Markus

-- 
Markus Brueffer    | GPG-Key: http://people.FreeBSD.org/~markus/markus.asc
markus_at_brueffer.de | FP: 3F9B EBE8 F290 E5CC 1447 8760 D48D 1072 78F8 A8D4
markus_at_FreeBSD.org | FreeBSD: The Power to Serve!

Received on Mon Aug 29 2005 - 11:42:44 UTC
