Re: panic: vm_fault: fault on nofault entry

From: Glen Barber <gjb_at_FreeBSD.org>
Date: Mon, 10 Mar 2014 14:05:08 -0400
On Mon, Mar 10, 2014 at 11:51:15AM -0400, Glen Barber wrote:
> On Mon, Mar 10, 2014 at 05:46:06PM +0200, Konstantin Belousov wrote:
> > On Sun, Mar 09, 2014 at 02:16:57PM -0400, Glen Barber wrote:
> > > panic: vm_fault: fault on nofault entry, addr: fffffe03becbc000
> > 
> > I see, this panic is for access to the kernel map, not for the direct map.
> > I think that this is a race with other CPU unmapping some page in the
> > kernel map, which cannot be solved by access checks.
> > 
> > Please try the following.  I booted with the patch and checked that
> > kgdb /boot/kernel/kernel /dev/mem works, but did not try to reproduce
> > the issue.
> > 
> 
> Thank you for looking into this.  I will report back.
> 

The machine this was tested on panicked again, but a bit differently.

This is the kgdb session from this vmcore:


Script started on Mon Mar 10 17:58:33 2014
command: /bin/sh
# kgdb ./kernel.debug /var/crash/vmcore.last
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
Sleeping thread (tid 100702, pid 24712) owns a non-sleepable lock
KDB: stack backtrace of thread 100702:
sched_switch() at sched_switch+0x29e/frame 0xfffffe18390b8820
mi_switch() at mi_switch+0xe1/frame 0xfffffe18390b8860
sleepq_catch_signals() at sleepq_catch_signals+0xab/frame 0xfffffe18390b88e0
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfffffe18390b8910
_sleep() at _sleep+0x2a3/frame 0xfffffe18390b8990
pipe_read() at pipe_read+0x34a/frame 0xfffffe18390b89f0
dofileread() at dofileread+0x95/frame 0xfffffe18390b8a40
kern_readv() at kern_readv+0x68/frame 0xfffffe18390b8a90
sys_read() at sys_read+0x63/frame 0xfffffe18390b8ae0
amd64_syscall() at amd64_syscall+0x3fb/frame 0xfffffe18390b8bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe18390b8bf0
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800b8443a, rsp = 0x7fffffffac88, rbp = 0x7fffffffb500 ---
panic: sleeping thread
cpuid = 19
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe18392db010
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe18392db0c0
panic() at panic+0x155/frame 0xfffffe18392db140
propagate_priority() at propagate_priority+0x259/frame 0xfffffe18392db170
turnstile_wait() at turnstile_wait+0x3fe/frame 0xfffffe18392db1c0
__mtx_lock_sleep() at __mtx_lock_sleep+0x163/frame 0xfffffe18392db240
vm_map_lookup() at vm_map_lookup+0x38/frame 0xfffffe18392db2c0
vm_fault_hold() at vm_fault_hold+0xd1/frame 0xfffffe18392db510
vm_fault() at vm_fault+0x77/frame 0xfffffe18392db550
trap_pfault() at trap_pfault+0x199/frame 0xfffffe18392db5f0
trap() at trap+0x4a0/frame 0xfffffe18392db800
calltrap() at calltrap+0x8/frame 0xfffffe18392db800
--- trap 0xc, rip = 0xffffffff80d972cd, rsp = 0xfffffe18392db8c0, rbp = 0xfffffe18392db920 ---
copyin() at copyin+0x3d/frame 0xfffffe18392db920
pipe_write() at pipe_write+0x10ea/frame 0xfffffe18392db9f0
dofilewrite() at dofilewrite+0x87/frame 0xfffffe18392dba40
kern_writev() at kern_writev+0x68/frame 0xfffffe18392dba90
sys_write() at sys_write+0x63/frame 0xfffffe18392dbae0
amd64_syscall() at amd64_syscall+0x3fb/frame 0xfffffe18392dbbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe18392dbbf0
--- syscall (4, FreeBSD ELF64, sys_write), rip = 0x800b35afc, rsp = 0x7fffffffd3b8, rbp = 0x41 ---
KDB: enter: panic

Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
Loaded symbols for /boot/kernel/tmpfs.ko.symbols
Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
#0  doadump (textdump=-959294432) at pcpu.h:219
219		__asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=-959294432) at pcpu.h:219
#1  0xffffffff8034a175 in db_fncall (dummy1=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>)
    at /usr/src/sys/ddb/db_command.c:578
#2  0xffffffff80349e5d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:449
#3  0xffffffff80349bd4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502
#4  0xffffffff8034c630 in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:231
#5  0xffffffff80987329 in kdb_trap (type=3, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:656
#6  0xffffffff80d99059 in trap (frame=0xfffffe18392daff0) at /usr/src/sys/amd64/amd64/trap.c:571
#7  0xffffffff80d7dd22 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff80986a8e in kdb_enter (why=0xffffffff8100edaf "panic", msg=<value optimized out>) at cpufunc.h:63
#9  0xffffffff809462b5 in panic (fmt=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:752
#10 0xffffffff80999949 in propagate_priority (td=<value optimized out>) at /usr/src/sys/kern/subr_turnstile.c:226
#11 0xffffffff8099a3ce in turnstile_wait (ts=<value optimized out>, owner=<value optimized out>, queue=<value optimized out>) at /usr/src/sys/kern/subr_turnstile.c:742
#12 0xffffffff8092f923 in __mtx_lock_sleep (c=0xfffff800020000b8, tid=18446735278394692752, opts=<value optimized out>, file=0x80 <Address 0x80 out of bounds>, line=-16843009)
    at /usr/src/sys/kern/kern_mutex.c:508
#13 0xffffffff80c14138 in vm_map_lookup (var_map=0xfffffe18392db4a8, vaddr=18446741977052954624, fault_typea=2 '\002', out_entry=0xfffffe18392db4b0, object=0xfffffe18392db498, 
    pindex=0xfffffe18392db4a0) at /usr/src/sys/vm/vm_map.c:3843
#14 0xffffffff80c07a71 in vm_fault_hold (map=0xfffff80002000000, vaddr=18446741977052954624, fault_type=<value optimized out>, fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:255
#15 0xffffffff80c07957 in vm_fault (map=0xfffff80002000000, vaddr=<value optimized out>, fault_type=2 '\002', fault_flags=128) at /usr/src/sys/vm/vm_fault.c:217
#16 0xffffffff80d99849 in trap_pfault (frame=0xfffffe18392db810, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:767
#17 0xffffffff80d99070 in trap (frame=0xfffffe18392db810) at /usr/src/sys/amd64/amd64/trap.c:455
#18 0xffffffff80d7dd22 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#19 0xffffffff80d972cd in copyin () at /usr/src/sys/amd64/amd64/support.S:292
#20 0xffffffff8099bb5f in uiomove_faultflag (cp=<value optimized out>, n=<value optimized out>, uio=0xfffffe18392dbab0, nofault=<value optimized out>) at /usr/src/sys/kern/subr_uio.c:194
#21 0xffffffff809a53ba in pipe_write (fp=0xfffff80adc4e2640, uio=0xfffffe18392dbab0, active_cred=<value optimized out>, flags=8, td=0x0) at /usr/src/sys/kern/sys_pipe.c:1215
#22 0xffffffff809a1297 in dofilewrite (td=0xfffff8002e61d490, fd=1, fp=0xfffff80adc4e2640, auio=0xfffffe18392dbab0, offset=<value optimized out>, flags=0) at file.h:307
#23 0xffffffff809a0fc8 in kern_writev (td=0xfffff8002e61d490, fd=1, auio=0xfffffe18392dbab0) at /usr/src/sys/kern/sys_generic.c:467
#24 0xffffffff809a0f53 in sys_write (td=<value optimized out>, uap=<value optimized out>) at /usr/src/sys/kern/sys_generic.c:382
#25 0xffffffff80d9a0bb in amd64_syscall (td=0xfffff8002e61d490, traced=0) at subr_syscall.c:133
#26 0xffffffff80d7e00b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:390
#27 0x0000000800b35afc in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) frame 10
#10 0xffffffff80999949 in propagate_priority (td=<value optimized out>) at /usr/src/sys/kern/subr_turnstile.c:226
226				panic("sleeping thread");
(kgdb) l
221			if (TD_IS_SLEEPING(td)) {
222				printf(
223			"Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
224				    td->td_tid, td->td_proc->p_pid);
225				kdb_backtrace_thread(td);
226				panic("sleeping thread");
227			}
228	
229			/*
230			 * If this thread already has higher priority than the
(kgdb) tid 100702
[Switching to thread 624 (Thread 100702)]#0  sched_switch (td=0xfffff8001797a920, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1933
1933			cpuid = PCPU_GET(cpuid);
(kgdb) p cpuid
No symbol "cpuid" in current context.
(kgdb) quit
# exit
Script done on Mon Mar 10 17:59:07 2014
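
kgdb 6.1.1 prints several pointer-sized arguments in the backtrace above as
decimal integers, which makes them hard to compare with the hexadecimal
addresses in the other frames. The following is a minimal sketch (the helper
name to_unsigned_hex is made up for illustration, not part of kgdb) that
converts the values copied from frames #12 to #14. The observations in the
comments are readings of the numbers only, not conclusions confirmed in this
thread.

```python
def to_unsigned_hex(value, bits=64):
    """Render an integer as an unsigned two's-complement hex string."""
    return hex(value & ((1 << bits) - 1))


# Frame #12: the 'tid' argument of __mtx_lock_sleep, printed in decimal.
tid_arg = 18446735278394692752
# Frames #13/#14: the faulting 'vaddr', printed in decimal.
vaddr = 18446741977052954624
# Frame #12: the 'line' argument, printed as a signed 32-bit int.
line_arg = -16843009

print("tid arg :", to_unsigned_hex(tid_arg))       # same value as td in frames #22, #23, #25
print("vaddr   :", to_unsigned_hex(vaddr))
print("line arg:", to_unsigned_hex(line_arg, 32))

# Frame #12 lock address versus the map pointer in frames #14/#15.
lock_addr = 0xFFFFF800020000B8
map_addr = 0xFFFFF80002000000
lock_offset = lock_addr - map_addr
print("lock offset within map:", hex(lock_offset))
```

Read this way, the 'tid' argument is the address of the faulting thread
(td=0xfffff8002e61d490), and the contended lock address lies 0xb8 bytes past
the map pointer passed to vm_fault_hold.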

Glen


Received on Mon Mar 10 2014 - 17:05:12 UTC
