Re: Page fault in amd64 pmap_qremove from vm_thread_new()

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Tue, 13 Feb 2007 14:10:30 -0500

On Tue, Feb 13, 2007 at 09:02:23PM +0200, Kostik Belousov wrote:
> On Tue, Feb 13, 2007 at 01:53:12PM -0500, Kris Kennaway wrote:
> > I get this frequently when running stress2 on an 8-core amd64 system:
> > 
> > Fatal trap 12: page fault while in kernel mode
> > Fatal trap 12: page fault while in kernel mode
> > 
> > 
> > cpuid = 2;
> > 
> > 
> > apic id = 02
> > 
> > Fatal trap 12: page fault while in kernel mode
> > 
> > cpuid = 5; fault virtual address        = 0xffff807ffffff040
> > Fatal trap 12: page fault while in kernel mode
> > Fatal trap 12: page fault while in kernel mode
> > 
> > cpuid = 4; apic id = 05
> > apic id = 04
> > fault virtual address   = 0xffff807ffffff0e0
> > fault virtual address   = 0xffff807ffffff0b8
> > cpuid = 0; fault code           = supervisor write data, page not present
> > 
> > instruction pointer     = 0x8:0xffffffff803deedd
> > cpuid = 3; stack pointer                = 0x10:0xffffffffc7647720
> > fault code              = supervisor write data, page not present
> > 
> > instruction pointer     = 0x8:0xffffffff803deedd
> > apic id = 00
> > stack pointer           = 0x10:0xffffffffcfd7e720
> > fault code              = supervisor write data, page not present
> > frame pointer           = 0x10:0xffffffffc7647730
> > frame pointer           = 0x10:0xffffffffcfd7e730
> > Fatal trap 12: page fault while in kernel mode
> > 
> > cpuid = 6;
> > instruction pointer     = 0x8:0xffffffff803deedd
> > 
> > stack pointer           = 0x10:0xffffffffb2b93720
> > 
> > frame pointer           = 0x10:0xffffffffb2b93730
> > 
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > 
> >                         = DPL 0, pres 1, long 1, def32 0, gran 1
> > 
> > processor eflags        =
> > interrupt enabled,
> > resume, Fatal trap 12: page fault while in kernel mode
> > apic id = 06
> > cpuid = 7; fault virtual address        = 0xffff807ffffff108
> > apic id = 07
> > fault code              = supervisor write data, page not present
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > apic id = 03
> >                         = DPL 0, pres 1, long 1, def32 0, gran 1
> > fault virtual address   = 0xffff807ffffff068
> > IOPL = 0
> > fault code              = supervisor write data, page not present
> > fault virtual address   = 0xffff807ffffff018
> > instruction pointer     = 0x8:0xffffffff803deedd
> > instruction pointer     = 0x8:0xffffffff803deedd
> > Fatal trap 12: page fault while in kernel mode
> > stack pointer           = 0x10:0xffffffffbf901720
> > cpuid = 4; stack pointer                = 0x10:0xffffffffb1c11720
> > processor eflags        = frame pointer         = 0x10:0xffffffffb1c11730
> > interrupt enabled, resume, fault code           = supervisor write data, page not present
> > IOPL = 0
> > instruction pointer     = 0x8:0xffffffff803deedd
> > current process         = stack pointer         = 0x10:0xffffffffd5b25720
> > frame pointer           = 0x10:0xffffffffbf901730
> > frame pointer           = 0x10:0xffffffffd5b25730
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > current process         =                       = DPL 0, pres 1, long 1, def32 0, gran 1
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> > 18747 (thr2)
> > [thread pid 18747 tid 142909 ]
> > Stopped at      pmap_qremove+0x2d:      movq    $0,(%rcx,%rax,8)
> > db> wh
> > Tracing pid 18747 tid 142909 td 0xffffff0095710cd0
> > pmap_qremove() at pmap_qremove+0x2d
> > vm_thread_new() at vm_thread_new+0x8d
> > thread_init() at thread_init+0x16
> > slab_zalloc() at slab_zalloc+0x282
> > uma_zone_slab() at uma_zone_slab+0x1ae
> > uma_zalloc_bucket() at uma_zalloc_bucket+0x19d
> > uma_zalloc_arg() at uma_zalloc_arg+0x3a3
> > thread_alloc() at thread_alloc+0x1f
> > create_thread() at create_thread+0xc5
> > kern_thr_new() at kern_thr_new+0x75
> > thr_new() at thr_new+0x62
> > syscall() at syscall+0x310
> > Xfast_syscall() at Xfast_syscall+0xab
> > --- syscall (455, FreeBSD ELF64, thr_new), rip = 0x8007a1cac, rsp = 0x7fffffffdef8, rbp = 0 ---
> > db> show allpcpu
> > Current CPU: 2
> > 
> > cpuid        = 0
> > curthread    = 0xffffff00717e8290: pid 18944 "thr2"
> > curpcb       = 0xffffffffe2e33d50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9aa6520: pid 17 "idle: cpu0"
> > spin locks held:
> > 
> > cpuid        = 1
> > curthread    = 0xffffff0015e9d7b0: pid 18736 "thr2"
> > curpcb       = 0xffffffffbceefd50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9aa6290: pid 16 "idle: cpu1"
> > spin locks held:
> > exclusive spin mutex sio r = 0 (0xffffffff806bf3c0) locked @ dev/sio/sio.c:1390
> > 
> > cpuid        = 2
> > curthread    = 0xffffff0095710cd0: pid 18747 "thr2"
> > curpcb       = 0xffffffffcfd7ed50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9aa6000: pid 15 "idle: cpu2"
> > spin locks held:
> > 
> > cpuid        = 3
> > curthread    = 0xffffff00ad485290: pid 18743 "thr2"
> > curpcb       = 0xffffffffd5b25d50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9a63cd0: pid 14 "idle: cpu3"
> > spin locks held:
> > 
> > cpuid        = 4
> > curthread    = 0xffffff0098fc7000: pid 18942 "thr2"
> > curpcb       = 0xffffffffc77fad50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9a63000: pid 13 "idle: cpu4"
> > spin locks held:
> > exclusive spin mutex turnstile chain r = 0 (0xffffffff80613ed8) locked @ kern/subr_turnstile.c:489
> > 
> > cpuid        = 5
> > curthread    = 0xffffff00215b8cd0: pid 18708 "thr2"
> > curpcb       = 0xffffffffb2b93d50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9a8fcd0: pid 12 "idle: cpu5"
> > spin locks held:
> > 
> > cpuid        = 6
> > curthread    = 0xffffff005b72d520: pid 18718 "thr2"
> > curpcb       = 0xffffffffb1c11d50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9a8fa40: pid 11 "idle: cpu6"
> > spin locks held:
> > 
> > cpuid        = 7
> > curthread    = 0xffffff0078aae7b0: pid 18782 "thr2"
> > curpcb       = 0xffffffffbf901d50
> > fpcurthread  = none
> > idlethread   = 0xffffff00b9a8f7b0: pid 10 "idle: cpu7"
> > spin locks held:
> > 
> > For some reason ddb doesn't give sensible backtraces for the running threads:
> > 
> > db> wh 18944
> > Tracing pid 18944 tid 130433 td 0xffffff009daa7290
> > fork_trampoline() at fork_trampoline
> > db> wh 18736
> > Tracing pid 18736 tid 165977 td 0xffffff00632b2cd0
> > fork_trampoline() at fork_trampoline
> > db> wh 18747
> > Tracing pid 18747 tid 165890 td 0xffffff0037403000
> > fork_trampoline() at fork_trampoline
> > db> wh 18743
> > Tracing pid 18743 tid 165929 td 0xffffff004f59e000
> > fork_trampoline() at fork_trampoline
> > db> wh 18942
> > Tracing pid 18942 tid 130531 td 0xffffff000a166520
> > fork_trampoline() at fork_trampoline
> > db> wh 18708
> > Tracing pid 18708 tid 166269 td 0xffffff005c28a290
> > fork_trampoline() at fork_trampoline
> > db> wh 18718
> > Tracing pid 18718 tid 111088 td 0xffffff0081f51a40
> > fork_trampoline() at fork_trampoline
> > db> wh 18782
> > Tracing pid 18782 tid 166078 td 0xffffff0052b4c000
> > fork_trampoline() at fork_trampoline
> 
> Is the backtrace for the faulted thread always the same? And is this CURRENT?

This is CURRENT; I haven't tried to reproduce it on 6.x yet (but can do
so).

The trace through vm_thread_new() is always the same.

Kris


Received on Tue Feb 13 2007 - 18:10:32 UTC
