RE: Lock order reversal in 5.2-CURRENT

From: Terrence Koeman <root_at_mediamonks.net>
Date: Wed, 11 Aug 2004 15:28:10 +0200
I think something else is wrong, as I get different lock order reversals and
some other errors that all lock up the box. Earlier I had a corrupted cc
binary after a buildworld.

Everything points to a hardware failure somewhere, but I had already switched
the hardware before this happened: I swapped RAID arrays between identical
machines, and the machine that -CURRENT now runs on was a production server
that ran 4.9/4.10-STABLE for months under heavy load without any problems
whatsoever.

The following is what I got today:

Second bad
/: bad dir ino 16110954 at offset 24: mangled entry
panic: ufs_dirbad: bad dir
KDB: stack backtrace:
kdb_backtrace(c05fb5b8,c0646f80,c060ae08,de301814,100) at kdb_backtrace+0x2e
panic(c060ae08,c1738200,f5d56a,18,c060adc2) at panic+0xb7
ufs_dirbad(c2180118,18,c060adc2,0,de301890) at ufs_dirbad+0x50
ufs_lookup(de301950,de30198c,c0501c99,de301950,de301bf0) at ufs_lookup+0x457
ufs_vnoperate(de301950,de301bf0,de301c04,c1c61420,c1aa0b00) at ufs_vnoperate+0x18
vfs_cache_lookup(de3019d0,de3019ec,c05072c2,de3019d0,20002) at vfs_cache_lookup+0xe9
ufs_vnoperate(de3019d0,20002,c1aa0b00,de3019d0,c1aa0b00) at ufs_vnoperate+0x18
lookup(de301bdc,0,c0602e3d,a4,c1aa0b00) at lookup+0x332
namei(de301bdc,c064ab40,246,9,c1aa0b00) at namei+0x2ae
vn_open_cred(de301bdc,de301cdc,1a4,c1c2e600,3) at vn_open_cred+0x24b
vn_open(de301bdc,de301cdc,1a4,3,c04d1ea0) at vn_open+0x33
kern_open(c1aa0b00,bfbfd440,0,1,1b6) at kern_open+0xf2
open(c1aa0b00,de301d14,c,c04d02b2,3) at open+0x30
syscall(2f,2f,2f,8084d7b,4) at syscall+0x2e0
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (5, FreeBSD ELF32, open), eip = 0x80662a7, esp = 0xbfbfd3ec, ebp = 0xbfbfd418 ---
KDB: enter: panic
[thread 100096]
Stopped at      kdb_enter+0x30: leave
db> 

panic: mtx_lock() of spin mutex ае @ /usr/src/sys/vm/vm_object.c:1587
KDB: stack backtrace:
kdb_backtrace(c05fb5b8,c0646f80,c05fa793,d4da6bb0,100) at kdb_backtrace+0x2e
panic(c05fa793,c1009ca0,c060c69a,633,c064c098) at panic+0xb7
_mtx_lock_flags(c100939c,0,c060c69a,633,c06729c0) at _mtx_lock_flags+0x69
vm_object_collapse(c1a0f000,0,c060bde0,900,c058e2fe) at vm_object_collapse+0x5b
vm_map_copy_entry(c154a940,c154bb90,c19b9618,c1aa0348,c04d2787) at vm_map_copy_entry+0x9f
vmspace_fork(c154a940,1,c060b912,26e,c1542aa0) at vmspace_fork+0x30f
vm_forkproc(c157ab00,c1a64000,c178c000,14,753) at vm_forkproc+0xee
fork1(c157ab00,14,0,d4da6cdc,c157ab00) at fork1+0xfd9
fork(c157ab00,d4da6d14,c04a8825,c1539e00,0) at fork+0x29
syscall(280b002f,284c002f,bfbf002f,1,2) at syscall+0x2e0
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (2, FreeBSD ELF32, fork), eip = 0x283dd42f, esp = 0xbfbfecdc, ebp = 0xbfbfed08 ---
KDB: enter: panic
[thread 100041]
Stopped at      kdb_enter+0x30: leave
db> 

lock order reversal
 1st 0xc0645aa0 sched lock (sched lock) @ /usr/src/sys/kern/subr_sleepqueue.c:623
 2nd 0xc06486ec sleepq chain (sleepq chain) @ /usr/src/sys/kern/subr_sleepqueue.c:223
KDB: stack backtrace:
kdb_backtrace(c05fe96f,c06486ec,c05fdc8a,c05fdc8a,c05fdc97) at kdb_backtrace+0x2e
witness_checkorder(c06486ec,9,c05fdc97,df,67e) at witness_checkorder+0x6a6
_mtx_lock_spin_flags(c06486ec,0,c05fdc97,df,0) at _mtx_lock_spin_flags+0x8d
sleepq_lookup(c0641d80,c05fe5e3,687,c0645aa0,d3bbc9ec) at sleepq_lookup+0x57
sleepq_broadcast(c0641d80,0,ffffffff,d3bbca14,c04b4152) at sleepq_broadcast+0x31
wakeup(c0641d80,1,c05fbc09,179,c1a632c0) at wakeup+0x21
setrunnable(c1a632c0,0,c05fdc97,26f,c066ef84) at setrunnable+0xb2
sleepq_resume_thread(c1a632c0,ffffffff,c05fdc97,31e,c2259540) at sleepq_resume_thread+0xa0
sleepq_remove(c1a632c0,c066ef84,c05fef7d,464,c2259540) at sleepq_remove+0x117
doselwakeup(c2259540,58,d3bbcaa8,c04f1771,c2259540) at doselwakeup+0x110
selwakeuppri(c2259540,58,c060126c,18e,c20f18c0) at selwakeuppri+0x18
sowakeup(c22594f0,c2259540,c0606e75,4fd,0) at sowakeup+0x41
tcp_input(c1915500,14,f,0,14) at tcp_input+0x1350
ip_input(c1915500,0,c0605194,1d0,c1791318) at ip_input+0x712
transmit_event(c1791300,0,c0605194,300,c0670300) at transmit_event+0x128
dummynet(0,0,c05fc4ec,fd,0) at dummynet+0x138
softclock(0,0,c05f8d63,263,c1545534) at softclock+0x20e
ithread_loop(c1539580,d3bbcd48,c05f8b5a,32b,0) at ithread_loop+0x172
fork_exit(c04944a0,c1539580,d3bbcd48) at fork_exit+0xc7
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd3bbcd7c, ebp = 0 ---
KDB: enter: witness_checkorder
[thread 100024]
Stopped at      kdb_enter+0x30: leave
db> 

Fatal trap 18: integer divide fault while in kernel mode
instruction pointer     = 0x8:0xc056b238
stack pointer           = 0x10:0xde08c91c
frame pointer           = 0x10:0xde08c980
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 14276 (httpd)
[thread 100058]
Stopped at      softdep_setup_freeblocks+0x408: divl    0xb8(%ecx),%eax
db>


Does anyone have any ideas?

-- 
Regards,
Terrence Koeman
 
MediaMonks B.V. (www.mediamonks.com)
Please quote all replies in correspondence.     

Received on Wed Aug 11 2004 - 11:28:10 UTC
