Re: Optimization bug with floating-point?

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Thu, 14 Mar 2019 22:11:42 +0200
On Thu, Mar 14, 2019 at 12:59:14PM -0700, John Baldwin wrote:
> On 3/14/19 12:20 PM, Konstantin Belousov wrote:
> > On Fri, Mar 15, 2019 at 05:50:37AM +1100, Peter Jeremy wrote:
> >> On 2019-Mar-13 23:30:07 -0700, Steve Kargl <sgk_at_troutmask.apl.washington.edu> wrote:
> >>> AFAICT, all libm float routines need to be modified to conditionally
> >>> include ieeefp.h and call fpsetprec(FP_PD).  This will work around
> >>> issues in FP and libm.  FreeBSD needs to issue an erratum about
> >>> the numerical issues with clang.
> >>
> >> I vaguely recall looking into the x87 initialisation a long time ago
> >> and STR that the startup code (either crtX or in the kernel) does
> >> a fninit() to set the precision.  I don't recall exactly where.
> > At boot, a clean initial FPU state is stored in fpu_initialstate.
> > Then on the first FPU access from userspace (first for the given process
> > context), this saved state is copied into the hardware registers.  The
> > quirk is that for i386 binaries on amd64, we adjust the FPU control word
> > to what is expected by i386 binaries.
> > 
> >>
> >> IMO, calling fpsetprec() in every libm float function is overkill. It
> >> should be enough to fpsetprec() before main() and add a note in the
> >> man pages that libm is built to use the default FPU configuration and
> >> changing the configuration (precision or rounding) may result in larger
> >> errors.
> > Changing default precision in crt1 would break the ABI.
> 
> So what I don't understand then is what gcc is doing differently from clang
> in this case.  I assume neither GCC _nor_ clang is adjusting the FPU in
> compiler-generated code, and in fact, as Steve's earlier tests show, the
> precision is set to PD by default when a clang-built binary is run.

Precision control only affects elementary floating-point instructions.
Could this be the cause?

SDM vol 1 8.1.5.2 Precision Control Field
The precision-control bits only affect the results of the following
floating-point instructions: FADD, FADDP, FIADD, FSUB, FSUBP, FISUB,
FSUBR, FSUBRP, FISUBR, FMUL, FMULP, FIMUL, FDIV, FDIVP, FIDIV, FDIVR,
FDIVRP, FIDIVR, and FSQRT.
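
For reference, the precision-control field discussed above occupies bits 8-9
of the x87 control word.  The following is only a minimal sketch of decoding
and replacing that field on a saved control-word value; the helper names
x87_pc_bits() and x87_set_pc() are hypothetical and are not part of ieeefp.h
or libm.  It does not touch the hardware register, so it says nothing about
which instructions honor the setting.

```c
#include <stdint.h>

/* Precision-control (PC) field: bits 8-9 of the x87 control word. */
#define X87_PC_SHIFT	8
#define X87_PC_MASK	(0x3u << X87_PC_SHIFT)

/*
 * Hypothetical helper: significand width, in bits, selected by the PC
 * field of a control-word value.  Returns 0 for the reserved encoding 01b.
 */
static int
x87_pc_bits(uint16_t cw)
{
	switch ((cw & X87_PC_MASK) >> X87_PC_SHIFT) {
	case 0:
		return (24);	/* 00b: single precision */
	case 2:
		return (53);	/* 10b: double precision */
	case 3:
		return (64);	/* 11b: double extended precision */
	default:
		return (0);	/* 01b: reserved */
	}
}

/*
 * Hypothetical helper: return cw with its PC field replaced by pc (0..3).
 * All other control-word bits are left unchanged.
 */
static uint16_t
x87_set_pc(uint16_t cw, unsigned int pc)
{
	return ((uint16_t)((cw & ~X87_PC_MASK) |
	    ((pc & 0x3u) << X87_PC_SHIFT)));
}
```

With these helpers, the FNINIT default control word 0x037F decodes to 64 bits,
and a control word of 0x127F (the value I believe is used for i386 binaries;
treat that as an assumption to check against the machine headers) decodes to
53 bits.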
Received on Thu Mar 14 2019 - 19:11:50 UTC
