i386 kernel page fault in generic_bcopy() during shutdown

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Tue, 6 Feb 2007 16:02:51 -0800 (PST)
My Pentium-M laptop has consistently panicked during shutdown since I
updated kernel and world in early January.  It still has the problem
even after I updated the kernel and world a couple of days ago.  My Athlon
XP desktop machine does not exhibit this problem.  The kernel on the
affected machine is close to GENERIC, with SMP, apic, gif, faith, and
atapicd removed, and with atapicam added.

The page faults occur in a couple of different places.  I've seen them
in generic_bcopy() and pmap_allocpte().  Occasionally I see a double
fault.


kgdb seems to have trouble unwinding the stack from the last crash:

# kgdb /boot/kernel/kernel /var/crash/vmcore.6
kgdb: kvm_nlist(_stopped_cpus): 
kgdb: kvm_nlist(_stoppcbs): 
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xd6247d90
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc089d9c6
stack pointer           = 0x28:0xd4ff0bb8
frame pointer           = 0x28:0xd4ff0be4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1018 (shutdown)
Physical memory: 502 MB
Dumping 67 MB: 52 36 20 4

#0  doadump () at pcpu.h:166
166     pcpu.h: No such file or directory.
        in pcpu.h
#0  doadump () at pcpu.h:166
#1  0xc0475a57 in db_fncall (dummy1=-721483344, dummy2=0, dummy3=-1063115424, 
    dummy4=0xd4ff098c "@z\ufffd\ufffd") at /usr/src/sys/ddb/db_command.c:486
#2  0xc0475863 in db_command (last_cmdp=0xc09fb064, cmd_table=0x0)
    at /usr/src/sys/ddb/db_command.c:401
#3  0xc047591e in db_command_loop () at /usr/src/sys/ddb/db_command.c:453
#4  0xc0477569 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:222
#5  0xc06cabc9 in kdb_trap (type=12, code=0, tf=0x0)
    at /usr/src/sys/kern/subr_kdb.c:502
#6  0xc089feed in trap_fatal (frame=0xd4ff0b78, eva=3592715664)
    at /usr/src/sys/i386/i386/trap.c:859
#7  0xc089fc4f in trap_pfault (frame=0xd4ff0b78, usermode=0, eva=3592715664)
    at /usr/src/sys/i386/i386/trap.c:777
#8  0xc089f872 in trap (frame=0xd4ff0b78) at /usr/src/sys/i386/i386/trap.c:462
#9  0xc089009b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#10 0xd6247d90 in ?? ()
Previous frame inner to this frame (corrupt stack?)

According to the instruction pointer in the trap frame, this time the
fault occurred inside generic_bcopy().
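
The values in the backtrace can be cross-checked against the trap
message.  kgdb prints eva and the db_fncall() arguments in decimal
(sometimes signed), while the trap message prints addresses in hex.  A
minimal sketch of the conversion, assuming 32-bit i386 addresses; the
helper names kgdb_to_addr32() and format_addr32() are made up for
illustration and are not part of kgdb or the kernel:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Reinterpret a decimal value printed by kgdb (signed or unsigned)
 * as the 32-bit address it stands for on i386.  Converting to
 * uint32_t wraps negative values modulo 2^32.
 */
static uint32_t
kgdb_to_addr32(long long printed)
{
	return ((uint32_t)printed);
}

/* Format an address the way the trap message does, e.g. 0xd6247d90. */
static void
format_addr32(uint32_t addr, char *buf, size_t len)
{
	snprintf(buf, len, "0x%08" PRIx32, addr);
}
```

With these, eva=3592715664 from frames #6 and #7 converts to
0xd6247d90, which is the fault virtual address in the trap message, and
dummy1=-721483344 converts to 0xd4ff09b0, an address just below the
reported stack pointer.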


(kgdb) list *0xc089d9c6
0xc089d9c6 is at /usr/src/sys/i386/i386/support.s:490.
485             cmpl    %ecx,%eax                       /* overlapping && src < dst? */
486             jb      1f
487
488             shrl    $2,%ecx                         /* copy by 32-bit words */
489             cld                                     /* nope, copy forwards */
490             rep
491             movsl
492             movl    20(%esp),%ecx
493             andl    $3,%ecx                         /* any bytes left? */
494             rep
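
For readers less used to the string instructions: the forward path in
the listing shifts the byte count right by two and copies that many
32-bit words with rep movsl, then masks the count with 3 to pick up the
leftover bytes.  Since the fault code is "supervisor write" and the
instruction pointer is at the rep movsl, the faulting address is
presumably the destination of the copy.  Below is a rough C sketch of
that logic, not the kernel's implementation.  The function names are
hypothetical, and both the trailing byte copy and the overlap test are
assumed from the comments, because the listing stops at line 494 and
starts after the subtraction:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumed form of the "overlapping && src < dst?" test: true when dst
 * lies inside [src, src + len), in which case a forward copy would
 * clobber source bytes that have not been read yet.
 */
static int
overlaps_backward(const void *src, const void *dst, size_t len)
{
	return ((uintptr_t)dst - (uintptr_t)src < len);
}

/* Forward copy only: len / 4 words first, then len & 3 single bytes. */
static void
copy_forward_sketch(const void *src, void *dst, size_t len)
{
	const unsigned char *s = src;
	unsigned char *d = dst;
	size_t words = len >> 2;	/* shrl $2,%ecx */
	size_t tail = len & 3;		/* andl $3,%ecx */
	uint32_t w;

	while (words-- > 0) {		/* rep movsl */
		memcpy(&w, s, sizeof(w));
		memcpy(d, &w, sizeof(w));
		s += sizeof(w);
		d += sizeof(w);
	}
	while (tail-- > 0)		/* byte copy, assumed (not in listing) */
		*d++ = *s++;
}
```

An 11-byte copy, for example, moves two words and then three single
bytes.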


I just rebooted again and got this stack trace in DDB:

pmap_allocpte() at pmap_allocpte+0x2f
pmap_copy() at pmap_copy+0x1c5
vm_map_copy_entry() at vm_map_copy_entry+0x119
vmspace_fork() at vmspace_fork+0x1f8
vm_forkproc() at vm_forkproc+0xb3
fork1() at fork1+0xdc9
fork() at fork+0x18
syscall() at ...

The problem seems to consistently happen with a fork1() call on the
stack.

This is what kgdb reports for the second crash.

# kgdb /boot/kernel/kernel /var/crash/vmcore.7
kgdb: kvm_nlist(_stopped_cpus): 
kgdb: kvm_nlist(_stoppcbs): 
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex pmap r = 0 (0xc31131ac) locked @ /usr/src/sys/i386/i386/pmap.c:2773
exclusive sleep mutex pmap r = 0 (0xc29640a8) locked @ /usr/src/sys/i386/i386/pmap.c:2772
exclusive sleep mutex vm page queue mutex r = 0 (0xc0a7e61c) locked @ /usr/src/sys/i386/i386/pmap.c:2767
KDB: stack backtrace:
db_trace_self_wrapper(c092a31e) at db_trace_self_wrapper+0x25
kdb_backtrace(3,c295c000,c,d3ad2b1c,d3ad2b10,...) at kdb_backtrace+0x29
witness_warn(5,0,c094defe) at witness_warn+0x192
trap(d3ad2b1c) at trap+0xfb
calltrap() at calltrap+0x6
--- trap 0xd624f000, eip = 0, esp = 0x10212, ebp = 0xc31131ac ---
(null)(1430000,c0a34ac8,c2959360,0,d624f000,...) at 0
__func__.0(61727420,78302070,202c3731,20706965,2325203d,...) at 0xc094ad95


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xd624f080
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc089a513
stack pointer           = 0x28:0xd3ad2b5c
frame pointer           = 0x28:0xd3ad2b68
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (init)
Physical memory: 502 MB
Dumping 101 MB: 86 70 54 38 22 6

#0  doadump () at pcpu.h:166
166     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:166
#1  0xc0475a57 in db_fncall (dummy1=-743626368, dummy2=0, dummy3=-1063115424, 
    dummy4=0xd3ad295c "@z\ufffd\ufffd") at /usr/src/sys/ddb/db_command.c:486
#2  0xc0475863 in db_command (last_cmdp=0xc09fb064, cmd_table=0x0)
    at /usr/src/sys/ddb/db_command.c:401
#3  0xc047591e in db_command_loop () at /usr/src/sys/ddb/db_command.c:453
#4  0xc0477569 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:222
#5  0xc06cabc9 in kdb_trap (type=12, code=0, tf=0x0)
    at /usr/src/sys/kern/subr_kdb.c:502
#6  0xc089feed in trap_fatal (frame=0xd3ad2b1c, eva=3592745088)
    at /usr/src/sys/i386/i386/trap.c:859
#7  0xc089f59b in trap (frame=0xd3ad2b1c) at /usr/src/sys/i386/i386/trap.c:276
#8  0xc089009b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#9  0xd624f080 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) list *0xc089a513
0xc089a513 is in pmap_allocpte (/usr/src/sys/i386/i386/pmap.c:1401).
1396            ptepindex = va >> PDRSHIFT;
1397    retry:
1398            /*
1399             * Get the page directory entry
1400             */
1401            ptepa = pmap->pm_pdir[ptepindex];
1402
1403            /*
1404             * This supports switching from a 4MB page to a
1405             * normal 4K page.
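
If the faulting read really is the pm_pdir[ptepindex] load on line
1401, the fault address can be turned back into a page directory index
and from there into the range of virtual addresses being mapped.  This
is only a sketch under assumptions that the dump does not confirm: a
non-PAE i386 kernel (PDRSHIFT of 22, 4-byte page directory entries) and
a page-aligned pm_pdir.  The macro and function names are invented for
illustration:

```c
#include <stdint.h>

#define PDRSHIFT_ASSUMED	22	/* non-PAE i386: one PDE maps 4 MB */
#define PDE_SIZE_ASSUMED	4	/* assumed sizeof(pd_entry_t) */
#define PAGE_MASK_ASSUMED	0xfffu	/* 4 KB pages */

/* Same computation as line 1396: ptepindex = va >> PDRSHIFT. */
static uint32_t
pde_index(uint32_t va)
{
	return (va >> PDRSHIFT_ASSUMED);
}

/*
 * Recover the index from the address of the faulting pm_pdir[] load,
 * assuming pm_pdir starts on a page boundary.
 */
static uint32_t
pde_index_from_fault(uint32_t fault_va)
{
	return ((fault_va & PAGE_MASK_ASSUMED) / PDE_SIZE_ASSUMED);
}

/* Lowest virtual address covered by a given page directory entry. */
static uint32_t
pde_first_va(uint32_t index)
{
	return (index << PDRSHIFT_ASSUMED);
}
```

Under those assumptions, the fault address 0xd624f080 corresponds to
index 32, which covers virtual addresses starting at 0x08000000.  That
would suggest the pmap's page directory itself was unmapped, rather
than ptepindex being out of range.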
Received on Tue Feb 06 2007 - 23:25:50 UTC
