Re: panic: arithmetic trap in fpurstor() in sys/i386/isa/npx.c

From: Bruce Evans <bde_at_zeta.org.au>
Date: Fri, 20 Feb 2004 17:42:51 +1100 (EST)

On Thu, 19 Feb 2004, Eric van Gyzen wrote:

> I can reliably panic 5.2-RELEASE GENERIC running on three different AMD Athlon
> CPUs with:
>
>   # echo 'q()' | R --no-save
>
> R is ports/math/R-letter, and q() just tells R to quit.  This does not happen
> on an AthlonMP or P3 running the same kernel.  It did not happen on the same
> three Athlon machines while running 5.1-RELEASE.  Some simple gdb debugging
> follows.  If you need more info, please ask; I don't debug the kernel very
> often, so I'm not sure what to provide.  :-/

Try backing out rev.1.216 of vm_machdep.c.  I don't see exactly how this
commit could cause the problem, but it is the only related thing that has
changed since 5.1, and the first part of it has several bugs (it is a
layering violation and is missing explicit disabling of interrupts).

> panic: arithmetic trap
> ...
> (kgdb) list *0xc07e07b4
> 0xc07e07b4 is in fpurstor (/usr/src/sys/i386/isa/npx.c:986).
> [snip]
>
> (kgdb) list 976,987
> 976     static void
> 977     fpurstor(addr)
> 978             union savefpu *addr;
> 979     {
> 980
> 981     #ifdef CPU_ENABLE_SSE
> 982             if (cpu_fxsr)
> 983                     fxrstor(addr);
> 984             else
> 985     #endif
> 986                     frstor(addr);
> 987     }

frstor() can only cause an arithmetic trap on broken CPUs.  I doubt
that Athlons are that broken, so this trap is mysterious.  frstor()
doesn't even trap for plain i386's; it may cause a bogus IRQ13 which
the kernel has to be careful not to turn into an arithmetic trap.

Please report the value and contents of addr (about 108 bytes of it
in hex).
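
For reference, a rough sketch of where the 108 bytes come from, assuming
the 32-bit protected-mode FSAVE/FRSTOR layout (a 28-byte environment
followed by eight 10-byte registers).  The struct and function names
below are made up for illustration and are not the kernel's own
definitions; hex_dump() is only a hypothetical userland helper for
printing such an image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the x87 environment block (28 bytes). */
struct x87_env {
    uint32_t cw;      /* control word (low 16 bits significant) */
    uint32_t sw;      /* status word (low 16 bits significant) */
    uint32_t tw;      /* tag word (low 16 bits significant) */
    uint32_t fip;     /* last instruction pointer offset */
    uint16_t fcs;     /* last instruction pointer selector */
    uint16_t opcode;  /* last opcode bits */
    uint32_t foo;     /* last operand pointer offset */
    uint32_t fos;     /* last operand pointer selector */
};

/* Hypothetical mirror of the full image: environment plus ST(0)..ST(7). */
struct x87_image {
    struct x87_env env;   /* 28 bytes */
    uint8_t st[8][10];    /* 8 registers, 10 bytes each = 80 bytes */
};

static_assert(sizeof(struct x87_env) == 28, "environment should be 28 bytes");
static_assert(sizeof(struct x87_image) == 108, "image should be 108 bytes");

/* Print len bytes in hex, 16 per line; returns the number of lines printed. */
static size_t hex_dump(const void *buf, size_t len, FILE *out)
{
    const uint8_t *p = buf;
    size_t lines = 0;

    for (size_t i = 0; i < len; i++) {
        if (i % 16 == 0) {
            fprintf(out, "%s%04zx:", i ? "\n" : "", i);
            lines++;
        }
        fprintf(out, " %02x", p[i]);
    }
    if (len > 0)
        fputc('\n', out);
    return lines;
}
```

The first three words of such a dump are the control, status and tag
words, which are the interesting part here.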

> (kgdb) bt
> #0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
> #1  0xc0631967 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:372
> #2  0xc0631cde in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #3  0xc07db60c in trap_fatal (frame=0xd8a08c88, eva=0)
>     at /usr/src/sys/i386/i386/trap.c:821
> #4  0xc07db062 in trap (frame=
>       {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 0, tf_esi = 22,
>        tf_ebp = -660566840, tf_isp = -660566860, tf_ebx = 582, tf_edx = 0,
>        tf_ecx = 134996160, tf_eax = -660566560, tf_trapno = 6, tf_err = 0,
>        tf_eip = -1065482316, tf_cs = 8, tf_eflags = 65606,
>        tf_esp = -660566792, tf_ss = -1065482847})
>     at /usr/src/sys/i386/i386/trap.c:618
> #5  0xc07c8258 in calltrap () at {standard input}:94
> #6  0xc07e05a1 in npxdna () at /usr/src/sys/i386/isa/npx.c:840

Everything seems normal up to the trap.  Old versions of gdb don't
print the frame before calltrap(), but you found it anyway.  npxdna()
is supposed to just load the user npx context and return.  There may
be an unmasked arithmetic trap pending in the user context, but that
is rare too.  fpurstor() must not trap since otherwise it would be
impossible to load user npx contexts in the kernel without breaking
trap delivery timing.
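
One way to tell from a dumped context whether an unmasked exception is
pending is to compare the exception flags in the status word against
the mask bits in the control word.  This is a minimal sketch assuming
the standard x87 bit layout (flags and masks both in bits 0-5); the
function name is made up and this is not code from npx.c.

```c
#include <stdint.h>

/* Bits 0-5 of both words: invalid, denormal, zero-divide,
 * overflow, underflow, precision. */
#define X87_EXC_BITS 0x3fu

/*
 * Return the set of exceptions that are flagged in the status word (sw)
 * but not masked in the control word (cw).  Nonzero means an unmasked
 * exception is pending in the saved context.
 */
static unsigned x87_pending_unmasked(uint16_t cw, uint16_t sw)
{
    return (unsigned)sw & ~(unsigned)cw & X87_EXC_BITS;
}
```

With the usual default control word 0x037f all six exceptions are
masked, so the result is 0 whatever the status word holds.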

Bruce
Received on Thu Feb 19 2004 - 21:42:55 UTC
