On Sun, 10 Dec 2006, Maxim Konovalov wrote:

>>> I didn't suggest to turn off mpsafenet forever and forget, I just wanted
>>> to check my guess. I would like to help to debug the problem but I need
>>> some initial instructions to start. There is a firewire console. What do
>>> I need to check?
>>
>> Start with the information in my followup e-mail to Andrew:
>>
>> - Configure WITNESS and see if you get any console output regarding
>>   lock order problems.
>
> Yes, there is one:
>
> lock order reversal
>  1st 0xd0f277c8 inp (rawinp) @ /usr/src/sys/netinet/raw_ip.c
>  2nd 0xd0ecbb54 wi0 (network driver) @ /usr/src/sys/modules/wi/../../dev/wi/if_wi.c
> KDB
> db_trace_self_wrapper(ce626f9d) at db_trace_self_wrapper+0x25
> kdb_backtrace(ffffffff,ce6a6378,ce6a6b20,ce65bd24,ce6e4ed0,...) at kdb_backtrace+0x29
> witness_checkorder(d0ecbb54,9,d0e73d13,388) at witness_checkorder+0x4db
> _mtx_lock_flags(d0ecbb54,0,d0e73d13,388,ce4d8cdd,...) at _mtx_lock_flags+0x1e
> wi_start(d0e05800) at wi_start+0x32
> if_start(d0e05800) at if_start+0x53
> ether_output_frame(d0e05800,d0d18100,0,1,0,...) at ether_output_frame+0x180
> ether_output(d0e05800,d0d18100,d0e652b0,d0e61bb8,ce6e6b18,...) at ether_output+0x3c0
> ieee80211_output(d0e05800,d0d18100,d0e652b0,d0e61bb8,0,...) at ieee80211_output+0x33
> ip_output(d0d18100,0,e1afbb38,20,0,...) at ip_output+0x7f0
> rip_output(d0d18100,d102ee44,1d2722c3,2000,e1afbbf0,...) at rip_output+0x29b
> rip_send(d102ee44,0,d0d18100,0,0,...) at rip_send+0x4f
> sosend_generic(d102ee44,0,0,d0d18100,0,...) at sosend_generic+0x3e1
> sosend(d102ee44,0,0,d0d18100,0,...) at sosend+0x22
> ng_ksocket_rcvdata(d10ab280,d104f750,1,e1afbc78,0,...) at ng_ksocket_rcvdata+0xa3
> ng_apply_item(d10ab200,d104f750,0,0,d10ab200,...) at ng_apply_item+0xf8
> ngintr(0) at ngintr+0x13d
> swi_net(0) at swi_net+0xba
> ithread_execute_handlers(d09acb40,d09dba00) at ithread_execute_handlers+0xce
> ithread_loop(d09dc180,e1afbd38,ce697af0,0,ce622832,328) at ithread_loop+0x4f
> fork_exit(ce4cdf0c,d09dc180,e1afbd38) at fork_exit+0x68
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe1afbd6c, ebp = 0 ---
>
> At this point ifconfig wlan0 hangs, reboot hangs.
>
>> - Try setting net.isr.direct=0 and see if the problem goes away.
>
> This indeed help. LOR has gone and wireless works.
>
>> - Try removing options PREEMPTION and see if the problem goes away.
>
> Haven't try.

As speculated by others, this is a bug in the if_wi driver, which improperly
holds a device driver lock over a call into the network stack. While this can
result in a deadlock under other circumstances, net.isr.direct makes the
chances of that deadlock much greater. It appears also that you have netgraph
in the mix somehow, which might well also increase the chances of the deadlock
triggering.

Someone(tm) needs to fix if_wi to operate properly with respect to the network
stack lock order; another feature likely to trigger the same device driver bug
is IP fast forwarding from a wireless interface. Sam has mentioned to me that
this same bug exists in several wireless drivers.

Robert N M Watson
Computer Laboratory
University of Cambridge

Received on Sun Dec 10 2006 - 11:57:15 UTC
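
For readers unfamiliar with what a lock order reversal report means, the following is a minimal sketch in Python of the idea, not the WITNESS implementation and not the if_wi code. The class OrderChecker and the functions transmit, receive_buggy and receive_fixed are hypothetical names made up for illustration. The only details taken from the message above are the lock names "inp" and "wi0", the send path taking them in that order, and the driver holding its lock over a call into the stack. Placing that call on the receive path is an assumption.

```python
import threading


class OrderChecker:
    """Toy lock-order tracker: remembers which lock was held when another
    was acquired, and records any acquisition that contradicts that order."""

    def __init__(self):
        self.edges = set()       # (held, acquired) pairs observed so far
        self.reversals = []      # (1st, 2nd) pairs that reverse an earlier order
        self.local = threading.local()

    def held(self):
        # Per-thread stack of currently held lock names.
        if not hasattr(self.local, "stack"):
            self.local.stack = []
        return self.local.stack

    def acquire(self, name):
        for h in self.held():
            if (name, h) in self.edges:
                # Earlier we saw `name` held while taking `h`; now it is reversed.
                self.reversals.append((h, name))
            self.edges.add((h, name))
        self.held().append(name)

    def release(self, name):
        self.held().remove(name)


def transmit(chk):
    # Send path as in the backtrace: socket/pcb lock first, then driver lock.
    chk.acquire("inp")
    chk.acquire("wi0")
    chk.release("wi0")
    chk.release("inp")


def receive_buggy(chk):
    # Assumed buggy pattern: driver lock still held while calling into the stack.
    chk.acquire("wi0")
    chk.acquire("inp")
    chk.release("inp")
    chk.release("wi0")


def receive_fixed(chk):
    # Drop the driver lock before handing the packet up the stack.
    chk.acquire("wi0")
    chk.release("wi0")
    chk.acquire("inp")
    chk.release("inp")
```

Running receive_buggy followed by transmit on one checker records the pair ("inp", "wi0"), in the same 1st/2nd order as the report quoted above. Running receive_fixed followed by transmit records nothing, because the driver lock is never held across the call into the stack.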