Re: CURRENT freezes on Laitude D520

From: Tai-hwa Liang <avatar_at_mmlab.cse.yzu.edu.tw> Date: Mon, 11 Dec 2006 21:50:14 +0800 (CST) · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:03 UTC

On Sun, 10 Dec 2006, Robert Watson wrote:
>
> On Sun, 10 Dec 2006, Tai-hwa Liang wrote:
>
>>> which get a bit more to the heart of most problems.  debug.mpsafenet=1 
>>> really exists for the purposes of supporting components which are not 
>>> sufficiently locked to allow the stack to run MPSAFE, rather than as a 
>>> means of disabling direct dispatch and preemption, which speak to 
>>> different types of problems. The main reason that I haven't removed the 
>>> administrator tunable to date is that I suspect it will be quite helpful 
>>> when KAME IPSEC locking happens, but since that appears not to have 
>>> happened yet, debug.mpsafenet as an option is likely causing more harm 
>>> than good by being available as a stand-in sysctl masking other problems, 
>>> causing people to not get to the point of properly identifying the actual 
>>> cause (device driver bugs, etc).
>> 
>> Can the aforementioned tricks(1/2/3) being applied to RELENG_6 as well?
>
> WITNESS is available in RELENG_6, and should be used in combination with 
> INVARIANTS, DDB, KDB, and BREAK_TO_DEBUGGER to debug deadlocks.

   Would a kernel with WITNESS/[KD]DB/BREAK_TO_DEBUGGER enabled but w/o
INVARIANTS compiled adequate to dump useful information through remote
serial console?

> In RELENG_6, net.isr.direct is not enabled by default, so unless you've 
> enabled it yourself (or are using IP fast forwarding, which is functionally 
> similar), that won't apply.
>
> In RELENG_6, PREEMPTION is in GENERIC and hence enabled by default, and it 
> can be disabled by removing it from your kernel configuration.  I'd like it 
> if we could add a run-time sysctl to disable preemption even if PREEMPTION is 
> compiled in, as it would make it easier to explore its stability and 
> performance impact.  However, this is also just a debugging step to see if 
> that quiesces the problem, and not a fix for the actual bug.
>
> Right now, we're discussing removing the manual debug.mpsafenet configuration 
> flag from 7.x, and not 6.x.  I fully recognize the importance of having it in 
> place as a workaround for bugs in production, although it concerns me greatly 
> that we're not getting these problems debugged and fixed, and instead masking 
> them.  Architectural changes are on the way that will require these bugs to 
> be fixed properly, not just masked.
>
>> We are using RELENG_6 as our production server(postfix, squid, pf 
>> firewall/NAT, FAST_IPSEC VPN, ...), which is a dual Athlon MP board with 
>> three NICs(two fxp cards and one onboard xl, connected to three different 
>> networks).
>> 
>> I haven't try WITNESS, yet; however, I'm very sure that net.isr.direct=0 
>> plus that there is no PREEMPTION in current kernel.  The problem is that, 
>> with debug.mpsafenet=1, we'll always run into hard freeze w/o having any 
>> kdb> prompt on console.
>> 
>> Whilst turning debug.mpsafenet off only masks the real problem, I'm still 
>> wondering about if there is any less damaging way to track such problem 
>> down in a _production_ environment.
>
> It sounds like you need to follow the instructions for kernel debugging. 
> Depending on your tolerance of performance loss, downtime, etc, a good 
> starting point is to configure the kernel with INVARIANTS and WITNESS. 
> WITNESS is particularly important, if you can tolerate the performance hit, 
> as it warns of potential deadlocks, not just actual deadlocks.  Also, compile

   With WITNESS enabled(debug.mpsafenet=0), I got another three pf related
warnings in the last 8 hours:

#
# this one looks like lor#046; however, I'm not sure if it is safe to drop
# pf_task_mtx with PF_UNLOCK before invoking pf_test() in pf_check_out().
#
lock order reversal:
1st 0xc06fca2c tcp (tcp) _at_ netinet/tcp_input.c:625
2nd 0xc69a1060 pf task mtx (pf task mtx) _at_ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(ffffffff,c06bc0d0,c06ba6b8,c06847a4,c06f8b00,...) at kdb_backtrace+0x29
witness_checkorder(c69a1060,9,c699e609,18f2) at witness_checkorder+0x4cd
_mtx_lock_flags(c69a1060,0,c699e609,18f2,0,...) at _mtx_lock_flags+0x1e
pf_test(2,c64c6800,e4f6cae0,0,0,...) at pf_test+0x6b
pf_check_out(0,e4f6cae0,c64c6800,2,0,...) at pf_check_out+0x3d
pfil_run_hooks(c06fc5e0,e4f6cb5c,c64c6800,2,0,...) at pfil_run_hooks+0xb3
ip_output(c64e0300,0,e4f6cb28,0,0,...) at ip_output+0x75e
tcp_respond(0,c650a030,c650a044,c64e0300,eb232182,0,14) at tcp_respond+0x22a
tcp_input(c64e0300,14,8,121245cb,1,...) at tcp_input+0x2559
ip_input(c64e0300) at ip_input+0x729
netisr_processqueue(c06fa298) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0x8c
ithread_execute_handlers(c63ff430,c644a080) at ithread_execute_handlers+0xce
ithread_loop(c63b6b10,e4f6cd38,c06aafc0,0,c06514db,...) at ithread_loop+0x4f
fork_exit(c04dd70c,c63b6b10,e4f6cd38) at fork_exit+0x61
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe4f6cd6c, ebp = 0 ---

lock order reversal:
1st 0xc6b3984c inp (tcpinp) _at_ netinet/tcp_usrreq.c:368
2nd 0xc69a1060 pf task mtx (pf task mtx) _at_ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(ffffffff,c06bc0a8,c06ba6b8,c06847a4,c06f8b38,...) at kdb_backtrace+0x29
witness_checkorder(c69a1060,9,c699e609,18f2) at witness_checkorder+0x4cd
_mtx_lock_flags(c69a1060,0,c699e609,18f2,0,...) at _mtx_lock_flags+0x1e
pf_test(2,c6568800,e8ef7b40,0,c6b397bc,...) at pf_test+0x6b
pf_check_out(0,e8ef7b40,c6568800,2,c6b397bc,...) at pf_check_out+0x3d
pfil_run_hooks(c06fc5e0,e8ef7bbc,c6568800,2,c6b397bc,...) at 
pfil_run_hooks+0xb3
ip_output(c66d3400,0,e8ef7b88,0,0,c6b397bc) at ip_output+0x75e
tcp_output(c6cf8740) at tcp_output+0xd51
tcp_usr_connect(c6ca242c,c6c25260,c6b48600) at tcp_usr_connect+0xe3
soconnect(c6ca242c,c6c25260,c6b48600) at soconnect+0x4e
kern_connect(c6b48600,4,c6c25260,c6c25260,0,...) at kern_connect+0xbb
connect(c6b48600,e8ef7d04) at connect+0x2f
syscall(3b,3b,3b,804f380,80521a0,...) at syscall+0x22f
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (98, FreeBSD ELF32, connect), eip = 0x28126fbf, esp = 0xbfbfd26c, ebp = 0xbfbfd3d8 ---

lock order reversal:
1st 0xc6d667d0 ipsec request (ipsec request) _at_ netipsec/xform_esp.c:897
2nd 0xc69a1060 pf task mtx (pf task mtx) _at_ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(ffffffff,c06ba4b0,c06ba6b8,c06847a4,c06f8ac8,...) at kdb_backtrace+0x29
witness_checkorder(c69a1060,9,c699e609,18f2) at witness_checkorder+0x4cd
_mtx_lock_flags(c69a1060,0,c699e609,18f2,0,...) at _mtx_lock_flags+0x1e
pf_test(2,c64c6800,e4f6c7cc,0,0,...) at pf_test+0x6b
pf_check_out(0,e4f6c7cc,c64c6800,2,0,...) at pf_check_out+0x3d
pfil_run_hooks(c06fc5e0,e4f6c848,c64c6800,2,0,...) at pfil_run_hooks+0xb3
ip_output(c64e4e00,0,e4f6c814,2,0,...) at ip_output+0x75e
ipsec_process_done(c64e4e00,c6d66780,c6f59e88,c6f4a240,c0693460,...) at ipsec_process_done+0x120
esp_output_cb(c6f59e88) at esp_output_cb+0x1ed
crypto_done(c6f59e88,2,c6452000,c6f59e88,c6f4a264,...) at crypto_done+0xa0
swcr_process(0,c6f59e88,0,c04e7b7d,c0693460,...) at swcr_process+0x126
crypto_invoke(c6452000,c6f59e88,0) at crypto_invoke+0x9e
crypto_dispatch(c6f59e88,1a,c6f59e88,c6f5be38,c6f0d488,...) at crypto_dispatch+0x4a
esp_output(c64e4e00,c6d66780,0,14,9,...) at esp_output+0x4ee
ipsec4_process_packet(c64e4e00,c6d66780,0,0,c6786400,...) at ipsec4_process_packet+0x206
ip_output(c64e4e00,0,c6de9bb0,0,0,...) at ip_output+0x6df
in_gif_output(c6539000,2,c64e4e00,c6de9b84,c6de9b80,...) at in_gif_output+0x237
gif_output(c6539000,c6b95300,c6d73790,c6ba04a4,c6e1de00,...) at gif_output+0x20f
ip_output(c6b95300,0,e4f6caf8,0,0,...) at ip_output+0x8de
syncache_respond(c6d5eaf0,c6b95300) at syncache_respond+0x2a1
syncache_add(e4f6cbec,e4f6cc3c,c694a86c,e4f6cbe8,c6912b00,e4f6cc3c,c694a880,8,1) at syncache_add+0x37f
tcp_input(c6912b00,14,8,c065b728,0,...) at tcp_input+0x6dc
ip_input(c6912b00) at ip_input+0x729
netisr_processqueue(c06fa298) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0x8c
ithread_execute_handlers(c63ff430,c644a080) at ithread_execute_handlers+0xce
ithread_loop(c63b6b10,e4f6cd38,c06aafc0,0,c06514db,...) at ithread_loop+0x4f
fork_exit(c04dd70c,c63b6b10,e4f6cd38) at fork_exit+0x61
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe4f6cd6c, ebp = 0 ---

> the kernel with KDB, DDB, and BREAK_TO_DEBUGGER, and user a serial or 
> firewire console.  If the hang occurs, see if you can get into the debugger, 
> in which case the logged output from DDB for the following commands would be 
> very useful:
>
> show pcpu
> show allpcpu
> trace
> alltrace
> ps
> show locks
> show alllocks
> show lockedvnods
> show uma
> show malloc
>
> Please open a PR that describes your configuration, includes your kernel 
> config (since it seems quite customized), any loader.conf settings, a 
> detailed description of the problem, and the output.  I'd be quite interested

   Okay, I'll file a PR once I can collect more information with the serial
console(probably weekend).  For now our system administrator is pretty
nervous about my suggestion on turning debug.mpsafenet back to 1. ;)

> in know, once the machine is in a hung state, whether the numlock light goes 
> on and off when you hit the numlock key on the keyboard.

   The numlock light doesn't respond to the key when the machine is hanging;
hence Ctrl-Alt-Esc wouldn't break to debugger.

-- 
Thanks,

Tai-hwa Liang