On Wed, 21 Jan 2004, Kris Kennaway wrote: > On Wed, Jan 21, 2004 at 10:21:38AM -0800, Kris Kennaway wrote: > > Well, I managed to panic bento in under 12 hours of use, after Joe had > > been using it for the past 2 months without problems :-) > > It panicked again a few minutes after rebooting: It got a page fault (probably a null pointer) in propagate_priority(). The rest is bugs in the debugging code. It's interesting that it managed to print messages about a bug in siocnputc() (actually mostly in witness) using siocnputc(). > lock order reversal > 1st 0xc06e6864 turnstile chain (turnstile chain) _at_ kern/subr_turnstile.c:233 The turnstile lock seems to be tc->tc_lock. sched_lock is also held in propagate_priority() and doesn't seem to cause any LORs or other problems with serial consoles, but holding it for calls to printf() at least used to be fatal for syscons consoles. > 2nd 0xc0717580 sio (sio) _at_ dev/sio/sio.c:3203 > Stack backtrace: > backtrace(c06923f4,c0717580,c06d2020,c06d2020,c06a0ede) at backtrace+0x17 The LOR is bogus. cnputc() must be callable at almost any time so that it can print debugging messages like this one, so acquiring sio_lock must work no matter what other locks are held. > witness_lock(c0717580,8,c06a0ede,c83,3f8) at witness_lock+0x672 > _mtx_lock_spin_flags(c0717580,0,c06a0ed5,c83,c6758aa8) at _mtx_lock_spin_flags+0xda > siocnputc(c06d21a0,6b,5,e1bffa1c,6b) at siocnputc+0x81 > cnputc(6b,c06e9688,1,c64f1640,c06a51d9) at cnputc+0x7a > putchar(6b,e1bffa1c,c06e97c8,c06e9688,c06e7c80) at putchar+0x6c > kvprintf(c06a51d8,c0529640,e1bffa1c,a,e1bffa3c) at kvprintf+0x8d I think the printf is about a trap with interrupts disabled. That shouldn't happen any more than the page fault, but it is certain to happen for page faults in propagate_priority() in -current, since sched_lock is held and spinlocks mask interrupts too much. > printf(c06a51d8,c,1,c068e464,14b) at printf+0x57 > trap(c06e0018,ffc00010,c06e0010,b4,c64f2740) at trap+0xd7 Got a page fault in propagate_priority(). > calltrap() at calltrap+0x5 > --- trap 0xc, eip = 0xc052e9f0, esp = 0xe1bffac0, ebp = 0xe1bffae4 --- > propagate_priority(c64f1640,0,c0691bdc,1dd,c06e7080) at propagate_priority+0x220 > turnstile_wait(0,c06e34e0,c64f13c0,1cc,c06e34e0) at turnstile_wait+0x33d > _mtx_lock_sleep(c06e34e0,0,c06929fb,2e7,c6846500) at _mtx_lock_sleep+0x125 > _mtx_lock_flags(c06e34e0,0,c06929fb,2e7,0) at _mtx_lock_flags+0x98 > kern_select(c64f1640,7,bfbfed40,0,0) at kern_select+0x47 > select(c64f1640,e1bffd14,c06a525a,3ee,5) at select+0x66 > syscall(2f,2f,2f,bfbfedc0,80810a0) at syscall+0x2c0 > Xint0x80_syscall() at Xint0x80_syscall+0x1d > --- syscall (93), eip = 0x28105b8f, esp = 0xbfbfed0c, ebp = 0xbfbfee58 --- > > and then page faulted and went into a hard lockup. Perhaps it recursed some more into the debugging code. I used the following in an old uncommitted version of sio to prevent sio locking going anywhere near witness even in the non-console case: %%% Index: sio.c =================================================================== RCS file: /home/ncvs/src/sys/dev/sio/sio.c,v retrieving revision 1.418 diff -u -2 -r1.418 sio.c --- sio.c 18 Jan 2004 12:26:33 -0000 1.418 +++ sio.c 19 Jan 2004 00:18:06 -0000 _at__at_ -522,7 +640,7 _at__at_ while (sio_inited != 2) if (atomic_cmpset_int(&sio_inited, 0, 1)) { + /* XXX might still need MTX_QUIET. */ mtx_init(&sio_lock, sio_driver_name, NULL, - (comconsole != -1) ? - MTX_SPIN | MTX_QUIET : MTX_SPIN); + MTX_SPIN | MTX_NOWITNESS); atomic_store_rel_int(&sio_inited, 2); } %%% The current version uses lower level locking so witnessing the locking is just unavailable. BruceReceived on Wed Jan 21 2004 - 22:17:39 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:39 UTC