On Thursday 01 September 2005 08:39 pm, Don Lewis wrote:
> On 1 Sep, John Baldwin wrote:
> > On Thursday 01 September 2005 01:22 pm, Don Lewis wrote:
> >> On 1 Sep, Fredrik Lindberg wrote:
> >> > I'm seeing both the rtentry and the tcpinp LORs on my fxp interface
> >> > on a machine running a few days old -current (Aug 25).
> >> >
> >> > lock order reversal
> >> >  1st 0xc1e30d38 inp (tcpinp) @ /usr/src/sys/netinet/tcp_input.c:742
> >> >  2nd 0xc1b74018 fxp0 (network driver) @ /usr/src/sys/dev/fxp/if_fxp.c:1172
> >> >
> >> > lock order reversal
> >> >  1st 0xc1e06bb8 rtentry (rtentry) @ /usr/src/sys/net/route.c:1269
> >> >  2nd 0xc1b74018 fxp0 (network driver) @ /usr/src/sys/dev/fxp/if_fxp.c:1172
> >> >
> >> > As for their backtraces they are almost identical to the
> >> > ones already posted.
> >>
> >> Are you using any applications that use multicast?  Can you break into
> >> DDB and capture the output of "show witness"?
> >
> > Also, are you using DEVICE_POLLING?
>
> I can reproduce this if I add DEVICE_POLLING to my kernel.  And I see
> Giant under "network driver" in the output of "show witness".
>
> If I apply your witness patch:
>   http://www.FreeBSD.org/~jhb/patches/witness.patch
> then I get the following LOR:
>
> lock order reversal
>  1st 0xc23e2018 fxp0 (network driver) @ /usr/src/sys/dev/fxp/if_fxp.c:1907
>  2nd 0xc09387e0 Giant (Giant) @ /usr/src/sys/kern/kern_poll.c:460
> KDB: stack backtrace:
> kdb_backtrace(0,ffffffff,c0946470,c0947f28,c08d3a84) at kdb_backtrace+0x29
> witness_checkorder(c09387e0,9,c086d0d3,1cc) at witness_checkorder+0x53c
> _mtx_lock_flags(c09387e0,0,c086d0d3,1cc) at _mtx_lock_flags+0x5b
> ether_poll_deregister(c23de000,c23e2000,c23e2018,0,e9295b60) at ether_poll_deregister+0x1d
> fxp_stop(c23e2000,c23e2018,1,c084c9ff,787) at fxp_stop+0x21
> fxp_init_body(c23e2000,c23e2018,0,c084c9ff,773) at fxp_init_body+0x31
> fxp_init(c23e2000,8020690c,c23e2000,c264bb00,e9295bc0) at fxp_init+0x23
> ether_ioctl(c23de000,8020690c,c264bb00,0,c264bb00) at ether_ioctl+0x50
> fxp_ioctl(c23de000,8020690c,c264bb00,1,c0a86503) at fxp_ioctl+0x232
> in_ifinit(c23de000,c264bb00,c24b3490,0,e9295c38) at in_ifinit+0x206
> in_control(c270fde8,8040691a,c24b3480,c23de000,c248e900) at in_control+0x882
> ifioctl(c270fde8,8040691a,c24b3480,c248e900,0) at ifioctl+0x198
> soo_ioctl(c2647dc8,8040691a,c24b3480,c2271d00,c248e900) at soo_ioctl+0x2db
> ioctl(c248e900,e9295d04,3,1,286) at ioctl+0x370
> syscall(3b,3b,3b,8056e40,8059140) at syscall+0x22f
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x48136e4b, esp = 0xbfbfe5ec, ebp = 0xbfbfee38 ---
> fxp0: link state changed to UP

Yeah, because of this bug, DEVICE_POLLING really needs debug.mpsafenet=0.
Perhaps someone should add NET_NEEDS_GIANT(polling) to sys/kern/kern_poll.c
for now?  The problem is that the polling code needs to use something other
than Giant to protect its internal data that it accesses in
ether_poll_deregister() since all the drivers I've seen call
ether_poll_deregister() with the driver lock held.
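
To illustrate the kind of fix suggested above, here is a rough user-space
sketch using pthreads rather than kernel mutexes.  It is not the kern_poll.c
code, and every name in it (poll_mtx, poll_table, poll_register,
poll_deregister, drv_stop, struct drv_softc) is made up for the example.
The assumption is that the poll table gets its own leaf lock, so a driver
can deregister with its own lock held and the order is always
driver lock -> poll lock.  The sketch does not model the poll loop itself,
which would also have to avoid calling into a driver while holding poll_mtx.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POLL_MAX 8                      /* hypothetical table size */

struct drv_softc {                      /* stand-in for a driver softc */
	pthread_mutex_t lock;           /* plays the role of the driver lock */
};

/*
 * Leaf lock protecting only the poll table.  No other lock is ever
 * acquired while it is held, so it cannot be the first lock in a
 * reversal.
 */
static pthread_mutex_t poll_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct drv_softc *poll_table[POLL_MAX];
static int poll_count;

/* Add sc to the table; returns 1 on success, 0 if full or already there. */
static int
poll_register(struct drv_softc *sc)
{
	int i, ok = 0;

	pthread_mutex_lock(&poll_mtx);
	for (i = 0; i < poll_count; i++)
		if (poll_table[i] == sc)
			goto out;
	if (poll_count < POLL_MAX) {
		poll_table[poll_count++] = sc;
		ok = 1;
	}
out:
	pthread_mutex_unlock(&poll_mtx);
	return (ok);
}

/*
 * Remove sc from the table; returns 1 if it was registered.
 * May be called with sc->lock held, since only poll_mtx is taken here.
 */
static int
poll_deregister(struct drv_softc *sc)
{
	int i, found = 0;

	pthread_mutex_lock(&poll_mtx);
	for (i = 0; i < poll_count; i++) {
		if (poll_table[i] == sc) {
			/* Fill the hole with the last entry. */
			poll_table[i] = poll_table[--poll_count];
			poll_table[poll_count] = NULL;
			found = 1;
			break;
		}
	}
	pthread_mutex_unlock(&poll_mtx);
	return (found);
}

/* Same shape as the trace above: deregister with the driver lock held. */
static int
drv_stop(struct drv_softc *sc)
{
	int was_polling;

	assert(sc != NULL);
	pthread_mutex_lock(&sc->lock);
	was_polling = poll_deregister(sc);
	pthread_mutex_unlock(&sc->lock);
	return (was_polling);
}
```

A caller would initialize sc.lock with pthread_mutex_init(), call
poll_register(&sc), and later drv_stop(&sc).  The second lock is always
poll_mtx, never a global lock that other code paths take first.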
> I also get another LOR:
>
> cd0: Attempt to query device size failed: NOT READY, Medium not present
> lock order reversal
>  1st 0xe35e0cc4 g_xdown (g_xdown) @ /usr/src/sys/geom/geom_io.c:465
>  2nd 0xc09387e0 Giant (Giant) @ /usr/src/sys/geom/geom_disk.c:99
> KDB: stack backtrace:
> kdb_backtrace(0,ffffffff,c0945e30,c0947f28,c08d3a84) at kdb_backtrace+0x29
> witness_checkorder(c09387e0,9,c0866bc0,63) at witness_checkorder+0x53c
> _mtx_lock_flags(c09387e0,0,c0866bc0,63) at _mtx_lock_flags+0x5b
> g_disk_start(c2632a50,e35e0cc4,0,c086722e,1d1) at g_disk_start+0x152
> g_io_schedule_down(c2275480) at g_io_schedule_down+0x160
> g_down_procbody(0,e35e0d38,0,c0606960,0) at g_down_procbody+0x5a
> fork_exit(c0606960,0,e35e0d38) at fork_exit+0xa0
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe35e0d6c, ebp = 0 ---
> Trying to mount root from ufs:/dev/da0s1a

Hummmm.  That means if anyone does a msleep(g_xdown) while holding Giant then
it could deadlock on resume since msleep() always acquires Giant first.
Perhaps g_xdown should be an sx lock or some such.

--
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

Received on Fri Sep 02 2005 - 16:51:12 UTC