Re: RELENG_5 : Page fault when running quagga ospf with IPv6

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Wed, 1 Sep 2004 16:03:59 -0400

On Wednesday 25 August 2004 05:57 pm, pg_at_imp.ch wrote:
> System with RELENG_5 (Aug 25)
>
> When starting /usr/local/sbin/ospf6d (from /usr/ports/net/quagga)
> I instantly get a kernel page fault. Reproducible. Trace:
>
> #0  doadump () at pcpu.h:159
> [...]
> #22 0x0000000c in ?? ()
> #23 0x00000000 in ?? ()
> #24 0xc055954e in propagate_priority (td=0xc1dcc420) at
> /usr/src/sys/kern/subr_turnstile.c:243
> #25 0xc0559ded in turnstile_wait (ts=0xc1dc5500, lock=0xc076b640,
> owner=0x0) at /usr/src/sys/kern/subr_turnstile.c:556
> #26 0xc0529ca7 in _mtx_lock_sleep (m=0xc076b640, td=0xc1d8e6e0, opts=0,
> file=0x0, line=0)
>      at /usr/src/sys/kern/kern_mutex.c:540
> #27 0xc051ac9f in ithread_loop (arg=0xc1d9e880) at
> /usr/src/sys/kern/kern_intr.c:545
> #28 0xc0519afd in fork_exit (callout=0xc051ab03 <ithread_loop>, arg=0x0,
> frame=0x0)
>      at /usr/src/sys/kern/kern_fork.c:820
> #29 0xc06bc8ac in fork_trampoline () at
> /usr/src/sys/i386/i386/exception.s:209
>
> Frame 24:
> 238                         ts->ts_lockobj->lo_name));
> 239
> 240                     /*
> 241                      * Pick up the lock that td is blocked on.
> 242                      */
> 243                     ts = td->td_blocked;
> 244                     MPASS(ts != NULL);
> 245                     tc = TC_LOOKUP(ts->ts_lockobj);
> 246                     mtx_lock_spin(&tc->tc_lock);
> 247
>
> td->td_blocked is NULL ...
>
> kgdb) print *td
> $2 = {td_proc = 0xc1ddc000, td_ksegrp = 0xc1d97000, td_plist = {tqe_next =
> 0x0, tqe_prev = 0xc1ddc010}, td_kglist = {
>      tqe_next = 0x0, tqe_prev = 0xc1d9701c}, td_slpq = {tqe_next = 0x0,
> tqe_prev = 0x0}, td_lockq = {tqe_next = 0x0,
>      tqe_prev = 0x0}, td_runq = {tqe_next = 0x0, tqe_prev = 0x0}, td_selq =
> {tqh_first = 0x0, tqh_last = 0x0},
>    td_sleepqueue = 0xc1dc7360, td_turnstile = 0xc1dc5540, td_tid = 100036,
> td_flags = 0, td_inhibitors = 16,
>    td_pflags = 0, td_last_kse = 0xc1d95690, td_kse = 0xc1d95690, td_dupfd =
> 0, td_wchan = 0x0, td_wmesg = 0x0,
>    td_lastcpu = 0 '\0', td_oncpu = 255 'ÿ', td_locks = 0, td_blocked = 0x0,
> td_ithd = 0xc1d8cc00, td_lockname = 0x0,
>    td_contested = {lh_first = 0xc2a72400}, td_sleeplocks = 0x0,
> td_intr_nesting_level = 0, td_pinned = 0,
>    td_mailbox = 0x0, td_ucred = 0xc1d7c200, td_standin = 0x0, td_prticks =
> 0, td_upcall = 0x0, td_sticks = 0,
>    td_uuticks = 0, td_usticks = 0, td_intrval = 0, td_oldsigmask = {__bits
> = {0, 0, 0, 0}}, td_sigmask = {__bits = {0,
>        0, 0, 0}}, td_siglist = {__bits = {0, 0, 0, 0}}, td_waitset = 0x0,
> td_umtx = {tqe_next = 0x0, tqe_prev = 0x0},
>    td_generation = 7664, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags =
> 0}, td_kflags = 0, td_xsig = 0,
>    td_profil_addr = 0, td_profil_ticks = 0, td_base_pri = 40 '(',
> td_priority = 24 '\030', td_pcb = 0xd55c7da0,
>    td_state = TDS_INHIBITED, td_retval = {0, 0}, td_slpcallout = {c_links =
> {sle = {sle_next = 0x0}, tqe = {
>          tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func
> = 0, c_flags = 8}, td_frame = 0xd55c7d48,
>    td_kstack_obj = 0xc143e210, td_kstack = 3579600896, td_kstack_pages = 2,
> td_altkstack_obj = 0x0, td_altkstack = 0,
>    td_altkstack_pages = 0, td_critnest = 1, td_md = {md_savecrit = 582},
> td_sched = 0xc1dcc57c}
>
> Any ideas ?

It seems the thread has the TDI_IWAIT inhibitor set (td_inhibitors = 16),
a state it should never be in by the time it gets to this point.  Perhaps
a lock was leaked from an interrupt handler.  Running with INVARIANTS +
WITNESS might help to catch this.  WITNESS especially might help, as it
would warn about which handler returns with a lock held, which lock it is,
and where the lock was acquired.
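
For anyone reading the dump by hand, here is a rough sketch of how
td_inhibitors = 16 maps to TDI_IWAIT.  The TDI_* values are an assumption,
written from memory of sys/proc.h in 5.x, so check them against your own
tree.  decode_inhibitors() is a hypothetical userland helper, not anything
that exists in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumed inhibitor bits (from memory of sys/proc.h in FreeBSD 5.x).
 * Verify against the headers of the kernel you are debugging.
 */
#define TDI_SUSPENDED 0x0001
#define TDI_SLEEPING  0x0002
#define TDI_SWAPPED   0x0004
#define TDI_LOCK      0x0008
#define TDI_IWAIT     0x0010

struct inhibitor_name {
	int bit;
	const char *name;
};

static const struct inhibitor_name inhibitor_names[] = {
	{ TDI_SUSPENDED, "TDI_SUSPENDED" },
	{ TDI_SLEEPING,  "TDI_SLEEPING" },
	{ TDI_SWAPPED,   "TDI_SWAPPED" },
	{ TDI_LOCK,      "TDI_LOCK" },
	{ TDI_IWAIT,     "TDI_IWAIT" },
};

/*
 * Hypothetical helper: write a space-separated list of the inhibitor
 * names set in 'inhibitors' into buf (at most len bytes, NUL-terminated).
 */
static void
decode_inhibitors(int inhibitors, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(inhibitor_names) / sizeof(inhibitor_names[0]); i++) {
		if ((inhibitors & inhibitor_names[i].bit) == 0)
			continue;
		/* Separate names with a single space. */
		if (buf[0] != '\0')
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, inhibitor_names[i].name, len - strlen(buf) - 1);
	}
}
```

With those assumed values, decode_inhibitors(16, buf, sizeof(buf)) leaves
"TDI_IWAIT" in buf, which matches td_state = TDS_INHIBITED together with
td_blocked = 0x0 in the dump: the thread is parked waiting for an
interrupt, not blocked on a lock.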

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Received on Wed Sep 01 2004 - 19:13:59 UTC
