syscons problem in ddb, if_afdata initialization (was: Re: 5.3-BETA3 , panic, probably IPv6+SMP+mpsafenet related)

From: Robert Watson <rwatson_at_freebsd.org> Date: Sat, 4 Sep 2004 13:51:06 -0400 (EDT) · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:10 UTC

The two sentence summary: this appears to involve three problems, one
known.  There's a lock order reversal involving TCP and the "uid/gid" 
layering violation in pf/ipfw, a new issue reported regarding if_afdata
dereferencing from nd6_slowtimo, and an issue involving syscons calling
into the tty code from it's low level DDB console API (should never
happen). 

On Sat, 4 Sep 2004, Jan Srzednicki wrote:

> I'm running a fresh (cvsupped just after the BETA3 commit in newvers.sh) 
> RELENG_5 system. It's a SMP with 2 processors, it's doing some
> home-routing with pf. pf and some netgraph stuff is compiled into
> kernel, otherwise it's mostly GENERIC. 
> 
> It seems to me that the panic is IPv6 and mpsafenet (and probably SMP) 
> related. It happens about an hour after a boot (happened twice, so it's
> repeatable). 

Saw your follow-up e-mail which suggests mpsafenet is not at cause.

> http://wrzask.pl/stuff/smieci/vmcore.2.dmesg

%  1st 0xc165bdec inp (tcpinp) _at_ /usr/src/sys/netinet/tcp_syncache.c:1161
%  2nd 0xc07cd240 pf task mtx (pf task mtx) _at_ /usr/src/sys/contrib/pf/net/pf.c:5828
% KDB: stack backtrace:
% witness_checkorder(c07cd240,9,c075e590,16c4,1) at witness_checkorder+0x5bb
% _mtx_lock_flags(c07cd240,0,c075e590,16c4,c07d70a0) at _mtx_lock_flags+0x54
% pf_test(2,c1379000,d51b09a4,c13f9840,c08090e0) at pf_test+0x87
% pf_check_out(0,d51b09a4,c1379000,2,0) at pf_check_out+0x47
% pfil_run_hooks(c08090e0,d51b0a30,c1379000,2,c08159c4) at pfil_run_hooks+0x14e
% ip_output(c1481b00,0,d51b09fc,0,0) at ip_output+0x3f2
% syncache_respond(d51b0ad0,c0809b80,1c,346,7d) at syncache_respond+0x2d0
% syncache_add(d51b0b80,d51b0bf0,c139d834,d51b0b7c,c138e100) at syncache_add+0x662
% tcp_input(c138e100,14,c1379000,1,c058b034) at tcp_input+0x2df7
% ip_input(c138e100,0,c077cdd8,e5,c08091a0) at ip_input+0xea
% netisr_processqueue(c07d70e0,0,c077cdd8,152,c12e0700) at netisr_processqueue+0x15
% swi_net(0,0,c0770f78,268,d51b0d10) at swi_net+0x97
% ithread_loop(c12b9b80,d51b0d48,c0770d4d,32c,0) at ithread_loop+0x1fd
% fork_exit(c057dab0,c12b9b80,d51b0d48) at fork_exit+0xa9
% fork_trampoline() at fork_trampoline+0x8

This one is a known lock order reversal and there's been some discussion
of how to fix it, but no good fix as yet.  However, in the general case it
should be relatively harmless, as it's a symptom of a potential deadlock
but not apparently easily triggered (as yet).

% --- trap 0x1, eip = 0, esp = 0xd51b0d7c, ebp = 0 ---
% lock order reversal
%  1st 0xc0808240 ifnet (ifnet) _at_ /usr/src/sys/netinet6/nd6.c:1777
%  2nd 0xc07d36c4 user map (user map) _at_ /usr/src/sys/vm/vm_map.c:2997
% KDB: stack backtrace:
% witness_checkorder(c07d36c4,9,c0787eb3,bb5,0) at witness_checkorder+0x5bb
% _sx_xlock(c07d36c4,c0787eb3,bb5,1000108,0) at _sx_xlock+0x5d
% vm_map_lookup(d57e3b5c,0,1,d57e3b60,d57e3b50) at vm_map_lookup+0x3a
% vm_fault(c07d3680,0,1,0,c12e8580) at vm_fault+0x7a
% trap_pfault(8,d57e3c50,c05bd344,c07e2aa0,8) at trap_pfault+0x13e
% trap(c0770018,10,c07e0010,3a00,c1405c00) at trap+0x363
% calltrap() at calltrap+0x5
% --- trap 0xc, eip = 0xc06677a6, esp = 0xd57e3c84, ebp = 0xd57e3ca0 ---

This is the real failure down here, the kernel page faulting; the lock
order reversal is because the VM code tries to process the page fault
running into lock order problems.

What follows though is a bit hard to read.

% nd6_slowtimo(0,8,c0774781,f7,57e49) at nd6_slowtimo+0x66
% softclock(0,0,c0770f78,268,d57e3d10) at softclock+0x12f
% ithread_loop(c12b9c80,d57e3d48,c0770d4d,32c,0) at ithread_loop+0x1fd
% fork_exit(c057dab0,c12b9c80,d57e3d48) at fork_exit+0xa9
% fork_trampoline() at fork_trampoline+0x8
% --- trap 0x1, eip = 0, esp = 0xd57e3d7c, ebp = 0 ---

It looks like this is a cascading problem that started with dereferencing
ND_IFINFO(ifp) in nd6_slowtimo().  if_afdata is initialized on demand by
if_attachdomain1().  It would be interesting to know the contents of *ifp
in the nd6_slowtimo() frame.  This might be a race/bug involving if_afdata
initialization...

% Fatal trap 12: page fault while in kernel mode
% cpuid = 0; apic id = 01
% fault virtual address	= 0x8
% fault code		= supervisor read, page not present
% instruction pointer	= 0x8:0xc06677a6
% stack pointer	        = 0x10:0xd57e3c84
% frame pointer	        = 0x10:0xd57e3ca0
% code segment		= base 0x0, limit 0xfffff, type 0x1b
% 			= DPL 0, pres 1, def32 1, gran 1
% processor eflags	= interrupt enabled, resume, IOPL = 0
% current process		= 36 (swi5: clock sio)
% panic: blockable sleep lock (sleep mutex) tty _at_ /usr/src/sys/kern/kern_event.c:1450
% cpuid = 0
% boot() called on cpu#0
% Uptime: 1h0m18s
% Dumping 127 MB
%  16 32 48 64 80 96 112

This one is pretty unfortunate, but hopefully easily fixable.  Here's my
interpretation:

% #0  doadump () at pcpu.h:159
% #1  0xc0595216 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:396
% #2  0xc0594ceb in panic (fmt=0xc0776fd5 "blockable sleep lock (%s) %s _at_ %s:%d")
%     at /usr/src/sys/kern/kern_shutdown.c:552
% #3  0xc05bd589 in witness_checkorder (lock=0xc13fd910, flags=9, 
%     file=0xc0770940 "/usr/src/sys/kern/kern_event.c", line=1450)
%     at /usr/src/sys/kern/subr_witness.c:703
% #4  0xc058b374 in _mtx_lock_flags (m=0xc13fd910, opts=0, 
%     file=0xc0770940 "/usr/src/sys/kern/kern_event.c", line=1450)
%     at /usr/src/sys/kern/kern_mutex.c:261
% #5  0xc0575559 in knote (list=0xc13fd898, hint=0, islocked=0)
%     at /usr/src/sys/kern/kern_event.c:1450
% #6  0xc05cb793 in ttwwakeup (tp=0xc13fd800) at /usr/src/sys/kern/tty.c:2394
% #7  0xc0715ef8 in scstart (tp=0xc13fd800)
%     at /usr/src/sys/dev/syscons/syscons.c:1367
% #8  0xc0716489 in scgetc (sc=0xc082ef60, flags=3)
%     at /usr/src/sys/dev/syscons/syscons.c:3209
% #9  0xc07167de in sccngetch (flags=3)
%     at /usr/src/sys/dev/syscons/syscons.c:1553
% #10 0xc05d0ad9 in cncheckc () at /usr/src/sys/kern/tty_cons.c:567
% #11 0xc05d0b15 in cngetc () at /usr/src/sys/kern/tty_cons.c:548
% #12 0xc04ab335 in db_readline (lstart=0xc07cea00 "", lsize=120)

The polled console support used by DDB has lead somehow to a call to
ttwwakeup(), which should never happen (TM) using the polled interface.
This is because we can't run all the TTY level code from the debugger,
only low level polled console operation.  I'm not sure who to point at for 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert_at_fledge.watson.org      Principal Research Scientist, McAfee Research