Re: panic: arithmetic trap in fpurstor() in sys/i386/isa/npx.c

From: Eric van Gyzen <vangyzen_at_stat.duke.edu>
Date: Thu, 1 Jul 2004 11:21:38 -0400

Bruce et al.:

I apologize for reviving this old problem.  It became irrelevant to me for a 
few months, but now it's relevant again.

Backing out rev 1.216 of vm_machdep.c fixed the problem.  I can no longer 
panic these machines.

Would you still like to see the value and contents of union savefpu *addr?

Eric

Bruce Evans wrote:
> On Thu, 19 Feb 2004, Eric van Gyzen wrote:
> > I can reliably panic 5.2-RELEASE GENERIC running on three different AMD
> > Athlon CPUs with:
> >
> >   # echo 'q()' | R --no-save
> >
> > R is ports/math/R-letter, and q() just tells R to quit.  This does not
> > happen on an AthlonMP or P3 running the same kernel.  It did not happen
> > on the same three Athlon machines while running 5.1-RELEASE.  Some simple
> > gdb debugging follows.  If you need more info, please ask; I don't debug
> > the kernel very often, so I'm not sure what to provide.  :-/
>
> Try backing out rev.1.216 of vm_machdep.c.  I don't see exactly how this
> commit could cause the problem, but it is the only related thing that has
> changed since 5.1, and the first part of it has several bugs (it is a
> layering violation and is missing explicit disabling of interrupts).
>
> > panic: arithmetic trap
> > ...
> > (kgdb) list *0xc07e07b4
> > 0xc07e07b4 is in fpurstor (/usr/src/sys/i386/isa/npx.c:986).
> > [snip]
> >
> > (kgdb) list 976,987
> > 976     static void
> > 977     fpurstor(addr)
> > 978             union savefpu *addr;
> > 979     {
> > 980
> > 981     #ifdef CPU_ENABLE_SSE
> > 982             if (cpu_fxsr)
> > 983                     fxrstor(addr);
> > 984             else
> > 985     #endif
> > 986                     frstor(addr);
> > 987     }
>
> frstor() can only cause an arithmetic trap on broken CPUs.  I doubt
> that Athlons are that broken, so this trap is mysterious.  frstor()
> doesn't even trap for plain i386's; it may cause a bogus IRQ13 which
> the kernel has to be careful not to turn into an arithmetic trap.
>
> Please report the value and contents of addr (about 108 bytes of it
> in hex).
>
> > (kgdb) bt
> > #0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
> > #1  0xc0631967 in boot (howto=256)
> >     at /usr/src/sys/kern/kern_shutdown.c:372
> > #2  0xc0631cde in panic ()
> >     at /usr/src/sys/kern/kern_shutdown.c:550
> > #3  0xc07db60c in trap_fatal (frame=0xd8a08c88, eva=0)
> >     at /usr/src/sys/i386/i386/trap.c:821
> > #4  0xc07db062 in trap (frame=
> >       {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 0, tf_esi = 22,
> >        tf_ebp = -660566840, tf_isp = -660566860, tf_ebx = 582, tf_edx = 0,
> >        tf_ecx = 134996160, tf_eax = -660566560, tf_trapno = 6, tf_err = 0,
> >        tf_eip = -1065482316, tf_cs = 8, tf_eflags = 65606,
> >        tf_esp = -660566792, tf_ss = -1065482847})
> >     at /usr/src/sys/i386/i386/trap.c:618
> > #5  0xc07c8258 in calltrap () at {standard input}:94
> > #6  0xc07e05a1 in npxdna () at /usr/src/sys/i386/isa/npx.c:840
>
> Everything seems normal up to the trap.  Old versions of gdb don't
> print the frame before calltrap(), but you found it anyway.  npxdna()
> is supposed to just load the user npx context and return.  There may
> be an unmasked arithmetic trap pending in the user context, but that
> is rare too.  fpurstor() must not trap since otherwise it would be
> impossible to load user npx contexts in the kernel without breaking
> trap delivery timing.
>
> Bruce

-- 
Eric van Gyzen                        Sr. Systems Programmer
http://www.stat.duke.edu/~vangyzen/   ISDS, Duke University
Received on Thu Jul 01 2004 - 13:23:13 UTC
