reproducible panic in netisr

From: Navdeep Parhar <np_at_FreeBSD.org>
Date: Tue, 4 Aug 2009 22:58:06 +0000
This occurs on today's HEAD + some unrelated patches.  That makes it
8.0BETA2+ code.  I haven't tried older builds.

The system panics every time I try to run netpipe on localhost:
# NPtcp &
# NPtcp -h localhost

Here are the details:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x318
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80575aea
stack pointer	        = 0x28:0xffffff803e1b65a0
frame pointer	        = 0x28:0xffffff803e1b65f0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= resume, IOPL = 0
current process		= 1093 (NPtcp)


(kgdb) where
#0  doadump () at pcpu.h:223
#1  0xffffffff801f424c in db_fncall (dummy1=Variable "dummy1" is not available.
) at /usr/src/sys/ddb/db_command.c:548
#2  0xffffffff801f4581 in db_command (last_cmdp=0xffffffff80c1e920, cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
#3  0xffffffff801f47c9 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#4  0xffffffff801f6657 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#5  0xffffffff805b2b22 in kdb_trap (type=12, code=0, tf=0xffffff803e1b64f0) at /usr/src/sys/kern/subr_kdb.c:534
#6  0xffffffff8085ba1e in trap_fatal (frame=0xffffff803e1b64f0, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:847
#7  0xffffffff8085c701 in trap (frame=0xffffff803e1b64f0) at /usr/src/sys/amd64/amd64/trap.c:345
#8  0xffffffff80843327 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#9  0xffffffff80575aea in _mtx_lock_sleep (m=0xffffffff8144d867, tid=18446742974247061280, opts=Variable "opts" is not available.
)
    at /usr/src/sys/kern/kern_mutex.c:407
#10 0xffffffff80575d51 in _mtx_lock_flags (m=0xffffffff8144d867, opts=0, 
    file=0xffffffff8096b355 "/usr/src/sys/net/netisr.c", line=830) at /usr/src/sys/kern/kern_mutex.c:203
#11 0xffffffff8063430d in netisr_queue_internal (proto=1, m=0xffffff0011b19d00, cpuid=Variable "cpuid" is not available.
)
    at /usr/src/sys/net/netisr.c:830
#12 0xffffffff806343e8 in netisr_queue_src (proto=1, source=Variable "source" is not available.
) at /usr/src/sys/net/netisr.c:860
#13 0xffffffff806306a4 in if_simloop (ifp=0xffffff0002705800, m=0xffffff0011b19d00, af=2, hlen=0)
    at /usr/src/sys/net/if_loop.c:368
#14 0xffffffff806307b9 in looutput (ifp=0xffffff0002705800, m=0xffffff0011b19d00, dst=0xffffff803e1b67a0, ro=Variable "ro" is not available.
)
    at /usr/src/sys/net/if_loop.c:265
#15 0xffffffff80691158 in ip_output (m=0xffffff0011b19d00, opt=Variable "opt" is not available.
) at /usr/src/sys/netinet/ip_output.c:618
#16 0xffffffff806f46d2 in tcp_output (tp=0xffffff00119db370) at /usr/src/sys/netinet/tcp_output.c:1187
#17 0xffffffff80700b08 in tcp_usr_send (so=0xffffff0011bbf7f8, flags=0, m=Variable "m" is not available.
) at tcp_offload.h:282
#18 0xffffffff805ea734 in sosend_generic (so=0xffffff0011bbf7f8, addr=0x0, uio=0xffffff803e1b6b00, 
    top=0xffffff000273ee00, control=0x0, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/uipc_socket.c:1259
#19 0xffffffff805ce1de in soo_write (fp=Variable "fp" is not available.
) at /usr/src/sys/kern/sys_socket.c:102
#20 0xffffffff805c783c in dofilewrite (td=0xffffff0002edc720, fd=3, fp=0xffffff0002b05550, auio=0xffffff803e1b6b00, 
    offset=Variable "offset" is not available.
) at file.h:239
#21 0xffffffff805c8cf5 in kern_writev (td=0xffffff0002edc720, fd=3, auio=0xffffff803e1b6b00)
    at /usr/src/sys/kern/sys_generic.c:446
#22 0xffffffff805c8de0 in write (td=Variable "td" is not available.
) at /usr/src/sys/kern/sys_generic.c:362
#23 0xffffffff8085bf5a in syscall (frame=0xffffff803e1b6c80) at /usr/src/sys/amd64/amd64/trap.c:984
#24 0xffffffff80843601 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373
#25 0x000000080073046c in ?? ()


(kgdb) f 9
#9  0xffffffff80575aea in _mtx_lock_sleep (m=0xffffffff8144d867, tid=18446742974247061280, opts=Variable "opts" is not available.
)
    at /usr/src/sys/kern/kern_mutex.c:407
407			owner = (struct thread *)(v & ~MTX_FLAGMASK);



(kgdb) info locals
ts = (struct turnstile *) 0xffffff0002edda80
v = 144
owner = (volatile struct thread *) 0x90
spin_cnt = 1
sleep_cnt = 0
sleep_time = 0


(kgdb) p *m
$18 = {
  lock_object = {
    lo_name = 0xffffffff8096b331 "netisr_mtx", 
    lo_flags = 16973824, 
    lo_data = 0, 
    lo_witness = 0xffffff800021f680
  }, 
  mtx_lock = 4
}


The fault address makes sense: TD_IS_RUNNING() reads owner->td_state, which
is at (0x90 + 0x288) = 0x318.  But I'm not sure why v, which was read earlier
on line 388 (v = m->mtx_lock), would ever be 0x90.  My reading is that it can
only be a tid, possibly ORed with some combination of the 3 least significant
bits (going by MTX_FLAGMASK), and 0x90 is not a plausible thread pointer.  At
the point of the dump, m->mtx_lock is 4 (MTX_UNOWNED).
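
For anyone who wants to check the arithmetic, here is a small userland sketch
of it.  The flag values are my assumption of what sys/sys/mutex.h defines on
this branch (they match MTX_UNOWNED == 4 as seen in the dump), the 0x288
offset of td_state is simply the one quoted above, and the helper names
(mtx_owner_of, td_state_addr, check_post_numbers) are made up for
illustration.  They are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag values (believed to match sys/sys/mutex.h; verify in your tree). */
#define MTX_RECURSED    0x00000001u
#define MTX_CONTESTED   0x00000002u
#define MTX_UNOWNED     0x00000004u
#define MTX_FLAGMASK    (MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED)

/* Offset of td_state in struct thread, as quoted in this message (0x288). */
#define TD_STATE_OFFSET 0x288u

/* Same masking as kern_mutex.c:407: drop the flag bits, keep the owner. */
uintptr_t
mtx_owner_of(uintptr_t v)
{
	return (v & ~(uintptr_t)MTX_FLAGMASK);
}

/* Address that a read of owner->td_state would touch. */
uintptr_t
td_state_addr(uintptr_t owner)
{
	return (owner + TD_STATE_OFFSET);
}

/* Replays the numbers from the kgdb session above. */
void
check_post_numbers(void)
{
	uintptr_t owner = mtx_owner_of(144);	/* v = 144 from "info locals" */

	assert(owner == 0x90);			/* matches owner in the dump */
	assert(td_state_addr(owner) == 0x318);	/* matches the fault address */
	assert(mtx_owner_of(MTX_UNOWNED) == 0);	/* mtx_lock = 4: no owner */
}
```

Calling check_post_numbers() passes silently, so the dump is at least
self-consistent: v = 0x90 has no flag bits set, it is taken as the owner
pointer as-is, and dereferencing td_state from it lands on 0x318.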

Regards,
Navdeep
Received on Tue Aug 04 2009 - 20:58:06 UTC
