Re: Fatal Trap 12: Page fault while in kernel mode (racoon/amd64/5.3-RELEASE-p4)

From: Matthew Sullivan <matthew_at_uq.edu.au>
Date: Thu, 13 Jan 2005 13:59:38 +1000
First, if this is the incorrect mailing list for this type of post, 
please let me know, or I'll never be able to post to the correct location...

Matthew Sullivan wrote:

> I have recompiled the kernel with KDB and KDB_UNATTENDED however it 
> appears the machine is totally locked in this state - it will not 
> automatically reboot, it will not respond to [CTRL-ALT-ESC] it will 
> not even respond to [CTRL-ALT-DEL] - the only thing it will respond to 
> is "the big red button" ;-)
>
> I have told it to save cores, however so far no cores.  The crash is 
> reproducible every time.  The fault process is recorded as racoon.
>
> Any suggestions on solving or debugging this further would be greatly 
> appreciated.
>
> The machine is rack mounted and without a remote management card, so 
> getting the rest of the trace information is not going to be easy.
>
Further to my last message, I have finally located a null modem cable and 
got it installed... A little fiddling later and we have DDB taking over...

root@desperado:~# racoon


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x39
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xffffffff80307a70
stack pointer           = 0x10:0xffffffff94eb4860
frame pointer           = 0x10:0xffffffff94eb4960
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 454 (racoon)
[thread 100081]
Stopped at      keydb_newsecasvar+0x100:        decl    %ecx

db> where
keydb_newsecasvar() at keydb_newsecasvar+0x100
raw_usend() at raw_usend+0x60
key_send() at key_send+0xa
sosend() at sosend+0x626
kern_sendit() at kern_sendit+0x113
sendit() at sendit+0x5f
sendto() at sendto+0x4d
syscall() at syscall+0x50c
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (133, FreeBSD ELF64, sendto), rip = 0x800a63da8, rsp = 0x7fffffffec38, rbp = 0x2 ---
db> show reg
cs                 0x8
ss                0x10
rax         0xffffffff94eb4890
rcx         0xffffffff94eb493f
rdx         0xffffffff94eb4970
rbx         0xffffffff80307a6d  keydb_newsecasvar+0xfd
rsp         0xffffffff94eb4860
rbp         0xffffffff94eb4960
rsi              0x280
rdi         0xffffff001cce7c00
r8                0xa0
r9          0xffffff00151807b0
r10         0xffffffff80513980  key_usrreqs
r11         0xffffffff94eb4a10
r12               0x39
r13                  0
r14                  0
r15         0xffffff00164aa678
rip         0xffffffff80307a70  keydb_newsecasvar+0x100
rflags         0x10202
dr0                  0
dr1                  0
dr2                  0
dr3                  0
dr4         0xffff0ff0
dr5              0x400
dr6         0xffff0ff0
dr7              0x400
keydb_newsecasvar+0x100:        decl    %ecx
db> show all procs/m
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  454 ffffff001555e5d0 ffffffff94ec0000    0   453   454 0004002 [CPU 0] racoon
  453 ffffff001555e2e8 ffffffff94ebf000    0   443   453 0004002 [SLPQ wait 0xffffff001555e2e8][SLP] bash
  452 ffffff001ccc62e8 ffffffff93cd3000    0     1     1 0004000 [SLPQ siodcd 0xffffff000096dc00][SLP] getty
  451 ffffff001555e8b8 ffffffff94ec1000    0     1   451 0004002 [SLPQ ttyin 0xffffff0000956010][SLP] getty
  450 ffffff001ccc6000 ffffffff93cd2000    0     1   450 0004002 [SLPQ ttyin 0xffffff0000956410][SLP] getty
  449 ffffff001cce32e8 ffffffff93c92000    0     1   449 0004002 [SLPQ ttyin 0xffffff000096c010][SLP] getty
  448 ffffff00152f98b8 ffffffff94e80000    0     1   448 0004002 [SLPQ ttyin 0xffffff0000954410][SLP] getty
  447 ffffff001555eba0 ffffffff94ec2000    0     1   447 0004002 [SLPQ ttyin 0xffffff0000954810][SLP] getty
  446 ffffff00152f92e8 ffffffff94e7e000    0     1   446 0004002 [SLPQ ttyin 0xffffff000096d410][SLP] getty
  445 ffffff00152f95d0 ffffffff94e7f000    0     1   445 0004002 [SLPQ ttyin 0xffffff0000971c10][SLP] getty
  444 ffffff00152f9000 ffffffff94e7d000    0     1   444 0004002 [SLPQ ttyin 0xffffff000096c810][SLP] getty
  443 ffffff001ccc68b8 ffffffff93cd5000    0     1   443 0004102 [SLPQ wait 0xffffff001ccc68b8][SLP] login
  442 ffffff00152f9ba0 ffffffff94e81000    0   441    53 0004002 [SLPQ nanslp 0xffffffff8053ba40][SLP] sleep
  441 ffffff001cce35d0 ffffffff93c93000    0   438    53 0000002 [SLPQ wait 0xffffff001cce35d0][SLP] sh
  439 ffffff001555e000 ffffffff94e82000    0     1    53 0004002 [SLPQ piperd 0xffffff001722ab40][SLP] logger
  438 ffffff001ccc6ba0 ffffffff93cd6000    0     1    53 0000002 [SLPQ wait 0xffffff001ccc6ba0][SLP] sh
  403 ffffff00154a1000 ffffffff94ec3000    0     1   403 0000000 [SLPQ nanslp 0xffffffff8053ba40][SLP] cron
  390 ffffff001cda75d0 ffffffff93c00000   25     1   390 0000100 [SLPQ pause 0xffffff001cda7640][SLP] sendmail
  386 ffffff001cda7ba0 ffffffff93c02000    0     1   386 0000100 [SLPQ select 0xffffffff80542030][SLP] sendmail
  380 ffffff001cce38b8 ffffffff93cd0000    0     1   380 0000100 [SLPQ select 0xffffffff80542030][SLP] sshd
  273 ffffff001cce3ba0 ffffffff93cd1000    0     1   273 0000000 [SLPQ select 0xffffffff80542030][SLP] syslogd
  253 ffffff001cce3000 ffffffff93c91000    0     1   253 0000000 [SLPQ select 0xffffffff80542030][SLP] devd
  181 ffffff001cda78b8 ffffffff93c01000    0     1   181 0000000 [SLPQ pause 0xffffff001cda7928][SLP] adjkerntz
   52 ffffff001ccc65d0 ffffffff93cd4000    0     0     0 0000204 [SLPQ - 0xffffffff93cacbe4][SLP] schedcpu
   51 ffffff001cda2000 ffffffff93bb8000    0     0     0 0000204 [SLPQ syncer 0xffffffff8053b720][SLP] syncer
   50 ffffff001cda22e8 ffffffff93bb9000    0     0     0 0000204 [SLPQ vlruwt 0xffffff001cda22e8][SLP] vnlru
   49 ffffff001cda25d0 ffffffff93bba000    0     0     0 0000204 [SLPQ psleep 0xffffffff8054295c][SLP] bufdaemon
   48 ffffff001cda28b8 ffffffff93bbb000    0     0     0 000020c [SLPQ pgzero 0xffffffff80556dd4][SLP] pagezero
   47 ffffff001cda2ba0 ffffffff93bbc000    0     0     0 0000204 [SLPQ psleep 0xffffffff80556e3c][SLP] vmdaemon
   46 ffffff001cd83000 ffffffff93bbd000    0     0     0 0000204 [SLPQ psleep 0xffffffff80556dec][SLP] pagedaemon
   45 ffffff001cd832e8 ffffffff93bfa000    0     0     0 0000204 [IWAIT] swi0: sio
   44 ffffff001cd835d0 ffffffff93bfb000    0     0     0 0000204 [SLPQ - 0xffffff0000811848][SLP] fdc0
   43 ffffff001cd838b8 ffffffff93bfc000    0     0     0 0000204 [SLPQ tzpoll 0xffffffff8052e568][SLP] acpi_thermal
    9 ffffff001cd83ba0 ffffffff93bfd000    0     0     0 0000204 [SLPQ actask 0xffffffff8052e620][SLP] acpi_task2
    8 ffffff001cda7000 ffffffff93bfe000    0     0     0 0000204 [SLPQ actask 0xffffffff8052e620][SLP] acpi_task1
    7 ffffff001cd92000 ffffffff93b72000    0     0     0 0000204 [SLPQ actask 0xffffffff8052e620][SLP] acpi_task0
   42 ffffff001cd922e8 ffffffff93b73000    0     0     0 0000204 [IWAIT] swi6: task queue
   41 ffffff001cd925d0 ffffffff93b74000    0     0     0 0000204 [IWAIT] swi6:+
    6 ffffff001cd928b8 ffffffff93b75000    0     0     0 0000204 [SLPQ - 0xffffff0000835b80][SLP] thread taskq
   40 ffffff001cd92ba0 ffffffff93b76000    0     0     0 0000204 [IWAIT] swi6:+
    5 ffffff001cdb7000 ffffffff93bb3000    0     0     0 0000204 [SLPQ - 0xffffff0000835d00][SLP] kqueue taskq
   39 ffffff001cdb72e8 ffffffff93bb4000    0     0     0 0000204 [IWAIT] swi6: acpitaskq
   38 ffffff001cdb75d0 ffffffff93bb5000    0     0     0 0000204 [SLPQ - 0xffffffff8052eb00][SLP] yarrow
    4 ffffff001cdb78b8 ffffffff93bb6000    0     0     0 0000204 [SLPQ - 0xffffffff80532988][SLP] g_down
    3 ffffff001cdb7ba0 ffffffff93bb7000    0     0     0 0000204 [SLPQ - 0xffffffff80532980][SLP] g_up
    2 ffffff001cd8e2e8 ffffffff93b2d000    0     0     0 0000204 [SLPQ - 0xffffffff80532970][SLP] g_event
   37 ffffff001cd8e5d0 ffffffff93b2e000    0     0     0 0000204 [IWAIT] swi4: vm
   36 ffffff001cd8e8b8 ffffffff93b2f000    0     0     0 000020c [RUNQ] swi5: clock sio
   35 ffffff001cd8eba0 ffffffff93b30000    0     0     0 0000204 [IWAIT] swi1: net
   34 ffffff001cda5000 ffffffff93b6d000    0     0     0 0000204 [IWAIT] irq23:
   33 ffffff001cda52e8 ffffffff93b6e000    0     0     0 0000204 [IWAIT] irq22:
   32 ffffff001cda55d0 ffffffff93b6f000    0     0     0 0000204 [IWAIT] irq21:
   31 ffffff001cda58b8 ffffffff93b70000    0     0     0 0000204 [IWAIT] irq20:
   30 ffffff001cda5ba0 ffffffff93b71000    0     0     0 0000204 [RUNQ] irq19: sis0 sis1
   29 ffffff001cda38b8 ffffffff93b07000    0     0     0 0000204 [IWAIT] irq18:
   28 ffffff001cda3ba0 ffffffff93b08000    0     0     0 0000204 [IWAIT] irq17: atapci1
   27 ffffff001cd8c000 ffffffff93b09000    0     0     0 0000204 [IWAIT] irq16:
   26 ffffff001cd8c2e8 ffffffff93b28000    0     0     0 0000204 [IWAIT] irq15: ata1
   25 ffffff001cd8c5d0 ffffffff93b29000    0     0     0 0000204 [IWAIT] irq14: ata0
   24 ffffff001cd8c8b8 ffffffff93b2a000    0     0     0 0000204 [IWAIT] irq13:
   23 ffffff001cd8cba0 ffffffff93b2b000    0     0     0 0000204 [IWAIT] irq12:
   22 ffffff001cd8e000 ffffffff93b2c000    0     0     0 0000204 [IWAIT] irq11:
   21 ffffff001cddd2e8 ffffffff93ae2000    0     0     0 0000204 [IWAIT] irq10:
   20 ffffff001cddd5d0 ffffffff93ae3000    0     0     0 0000204 [IWAIT] irq9: acpi0
   19 ffffff001cddd8b8 ffffffff93b02000    0     0     0 0000204 [IWAIT] irq8: rtc
   18 ffffff001cdddba0 ffffffff93b03000    0     0     0 0000204 [IWAIT] irq7: ppc0
   17 ffffff001cda3000 ffffffff93b04000    0     0     0 0000204 [IWAIT] irq6: fdc0
   16 ffffff001cda32e8 ffffffff93b05000    0     0     0 0000204 [IWAIT] irq5:
   15 ffffff001cda35d0 ffffffff93b06000    0     0     0 0000204 [IWAIT] irq4: sio0
   14 ffffff001cdd6000 ffffffff93aa0000    0     0     0 0000204 [IWAIT] irq3: sio1
   13 ffffff001cdd62e8 ffffffff93add000    0     0     0 0000204 [IWAIT] irq0: clk
   12 ffffff001cdd65d0 ffffffff93ade000    0     0     0 0000204 [IWAIT] irq1: atkbd0
   11 ffffff001cdd68b8 ffffffff93adf000    0     0     0 000020c [Can run] idle
    1 ffffff001cdd6ba0 ffffffff93ae0000    0     0     1 0004200 [SLPQ wait 0xffffff001cdd6ba0][SLP] init
   10 ffffff001cddd000 ffffffff93ae1000    0     0     0 0000204 [SLPQ ktrace 0xffffffff80538370][SLP] ktrace
    0 ffffffff80532b00 ffffffff80679000    0     0     0 0000200 [SLPQ sched 0xffffffff80532b00][SLP] swapper


I'm going to have to put this machine into production within the next 7 
days, so any help would be really great; any extra info anyone 
requires is also available.  As I said in my last message, this is 100% 
reproducible.  Dumps are not available: calling panic will lock the 
system solid.  Calling boot(0) seems to work fine, though...
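
One detail in the output above that may be worth noting: the fault virtual address (0x39) is the same value that r12 holds in the register dump. Below is a minimal Python sketch for cross-checking a trap message against a "show reg" dump. The helper names (parse_fault_address, parse_registers, registers_matching) are hypothetical and written only for illustration, and the input is an excerpt of the DDB output from this report. A match like this may hint at which register held the bad pointer, but it is not proof of the cause.

```python
import re

# Excerpts copied from the DDB output in this report.
TRAP_TEXT = """\
fault virtual address   = 0x39
fault code              = supervisor write, page not present
"""

REG_TEXT = """\
rax         0xffffffff94eb4890
rcx         0xffffffff94eb493f
rdx         0xffffffff94eb4970
rbx         0xffffffff80307a6d  keydb_newsecasvar+0xfd
rsi              0x280
rdi         0xffffff001cce7c00
r8                0xa0
r12               0x39
r13                  0
r14                  0
"""


def parse_fault_address(text):
    """Return the fault virtual address as an int, or None if absent."""
    match = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    return int(match.group(1), 16) if match else None


def parse_registers(text):
    """Map register name -> value from 'name value [symbol]' lines."""
    regs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            # Base 0 accepts both "0x39" and a bare "0".
            regs[parts[0]] = int(parts[1], 0)
        except ValueError:
            continue  # skip lines that are not register entries
    return regs


def registers_matching(regs, address):
    """List the registers whose value equals the given address."""
    return sorted(name for name, value in regs.items() if value == address)


fault = parse_fault_address(TRAP_TEXT)
regs = parse_registers(REG_TEXT)
print(hex(fault), registers_matching(regs, fault))
```

With the excerpt above, the only register reported as matching the fault address is r12.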

Regards,

-- 
Matthew Sullivan
Specialist Systems Programmer
Information Technology Services
The University of Queensland


Received on Thu Jan 13 2005 - 03:00:10 UTC
