Re: signal 8 (floating point exception) upon resume

From: Bruce Evans <brde_at_optusnet.com.au>
Date: Tue, 11 Mar 2014 15:24:49 +1100 (EST)
On Mon, 10 Mar 2014, John Baldwin wrote:

> On Tuesday, March 04, 2014 4:50:01 pm Bruce Evans wrote:
>> On Tue, 4 Mar 2014, John Baldwin wrote:
>> % Index: i386/i386/swtch.s
>> % ===================================================================
>> % --- i386/i386/swtch.s	(revision 262711)
>> % +++ i386/i386/swtch.s	(working copy)
[...savectx()]
>> This function is mostly bogus (see old mails).
>
> I was going off of the commit logs for amd64 that removed this code as savectx()
> is not used for fork(), only for IPI_STOP and suspend/resume.
>
>> Without fxsave, npxsuspend() cannot be atomic without locking, since
>> fnsave destroys the state in the FPU and you either need a lock to
>> reload the old state atomically enough, or a lock to modify FPCURTHREAD
>> atomically enough.
>
> save_ctx() is now only called from IPI handlers or when doing suspend in
> which case we shouldn't have to worry about being preempted.

I don't understand the suspend part.  Is sufficient locking held througout
suspend/resume to prevent states changing after they have been saved here?

>> % _at__at_ -520,7 +490,16 _at__at_
>> %  	movl	%eax,%dr7
>> %
>> %  #ifdef DEV_NPX
>> % -	/* XXX FIX ME */
>> % +	/* Restore FPU state */
>>
>> Is the problem just this missing functionality?
>
> Possibly.

I now think it was just the clobbering of %cr0 so i386 never had the
problem.

> I think on amd64 there was also the desire to have the pcb
> state be meaningful in dumps (since we IPI_STOP before a dump).  OTOH,

It should also be meaningful in debuggers.  Hopefully stop IPIs put
it there form all stopped CPUs.  I think it remains in the FPU for
the running CPU.

> the current approach used by amd64 (and this patch for i386) is to not
> dirty fpcurthread's state during save_ctx(), but to instead leave
> fpcurthread alone and explicitly save whatever state the FPU is in
> in the PCB used for IPI_STOP or suspend.

Hmm, if kernel debuggers actually supported displaying the FPU state, then
they would prefer to find it in the PCB only (after debugger entry puts
it there), but this doesn't work in places like the dna trap handler.
Similarly for IPIs and suspend.  The dna trap handler would be broken
unless any saving in the PCB is undone when normal operation is resumed,
and it seems more difficult to undo it than to save specially so as not
to have anything to undo.  It is OK to save in the usual place in the PCB
so that debuggers can find it more easily (since that place is not used
in normal operation), but not to change the state in the CPU+FPU across
the operation.  Harmful state changes in the CPU+FPU include toggling
CR0_TS and implicit fninit.  For suspend/resume, we have no option but
to undo everything, since other things may clobber the state.

>
>> % _at__at_ -761,7 +761,34 _at__at_
>> %  	PCPU_SET(fpcurthread, NULL);
>> %  }
>> %
>> % +/*
>> % + * Unconditionally save the current co-processor state across suspend and
>> % + * resume.
>> % + */
>> %  void
>> % +npxsuspend(union safefpu *addr)
>> % +{
>> % +	register_t cr0;
>> % +
>> % +	if (!hw_float)
>> % +		return;
>> % +	cr0 = rcr(0);
>> % +	clts();
>> % +	fpusave(addr);
>> % +	load_cr(0, cr0);
>> % +}
>>
>> In the !fxsave case, this destroys the state in the npx, leaving
>> fpcurthread invalid.  It also does the save when the state in the
>> npx is inactive.  I think jkim intentionally this state so that
>> resume can load it unconditionally.  It must be arranged that there
>> are no interactions with fpcurthread.
>
> Given the single-threaded nature of suspend/resume and IPI_STOP /
> restart_cpus(), those requirements are met, so it should be safe
> to resume whatever state was in the FPU and leave fpcurthread
> unchanged.

Is the whole suspend/resume really locked?

>> This doesn't work so well
>> without fxsave.  When fpcurthread != NULL, reloading CR0 keeps
>> CR0_TS and thus ensures that inconsistent state lives for longer.
>> Things will only be OK if fpcurthread isn't changed until resume.
>
> After the save_ctx() the CPU is going to either resume without
> doing a resume_ctx (IPI_STOP case) leaving fpcurthread unchanged
> (so save_ctx() just grabbed a snapshot of the FPU state for
> debugging purposes) or the CPU is going to power off for suspend.

If it doesn't restore for IPI_STOP, then it will continue with the
state clobbered by fnsave in the !fxsr case.  That is rare but can
happen.  Most CPUs that have IPIs also have fxsr.  But on at least
i386, there is an option to disable fxsr.

> During resume it will invoke resume_ctx() which will restore the
> FPU state (whatever state it was in) and fpcurthread and only
> after those are true is the CPU able to run other threads which
> will modify or use the FPU state.
>
>> You can probably fix this by using the old code here.  The old code
>> doesn't need the hw_float test, since fpcurthread != NULL implies
>> hw_float != 0.
>>
>> Actually, I don't see any need to change anything on i386 -- after
>> storing the state for the thread, there should be no need to store it
>> anywhere else across suspend/resume.  We intentionally use this method
>> (even on amd64 IIRC), although it is suboptimal, to reduce complications
>> for context switchres and signal handling.  npxsave() takes an address,
>> but savectx() didn't abuse this to store directly in the special save
>> area.  It made npxsave() store in the pcb, and then copied to the special
>> area.

Now I see why this won't just work.  Even if the dna handler masked
interrupts in hardware, it could be interrupted by at least non-maskable
STOP IPIs.  These could put the state in the PCB but would have to
restore it.  Soft interrupt masking gives the same problem for maskable
IPIs (including suspend?)

> So I guess that is one option is to always clear fpcurthread during
> suspend and just do an fninit on resume.  However, I chose to match
> what amd64 does for now.  I did make one change locally which was to
> not bother saving the FPU state if fpcurthread was NULL during save_ctx,
> but to instead store a copy of 'npx_initial_state' in the pcb instead.
> This is then loaded into the FPU on resume.  Is that sufficient for
> the !fxsave case?

Saving in the PCB would mainly improving debugging support (you can
at least examine the memory there).

I think you can't trust fpcurthread at all without further complications.
It is locked only by critical_enter(), but that doesn't prevent any IPIs.
The further complications would be to be more careful with the ordering
of setting and clearing fpcurthread, or maybe use another variable to
track states for transitioning fpcurthread.  Consider this code:

% void
% npxsave(addr)
% 	union savefpu *addr;
% {
%

fpcurthread is now non-null, and remains non-null until the end of the
function.

% 	stop_emulating();

With another state variable we could hope to track getting to this state
(but CR0_TS tells us this better).

% 	fpusave(addr);

We cannot use fpcurthread to determine if the state in the FPU is valid,
since in the !fxsr case this function clobbers the state in the FPU.
In the presence of non-maskable interrupts, there is no way to change
fpcurthread or any other variable atomically with calling this function,
so it wouldn't help to change the ordering of clearing fpcurthread.

However, in the fxsr case, fpcurthread being non-null still tells us that
the state in the FPU is valid, since the operations are ordered on the
running CPU.

% 
% 	start_emulating();

As above for tracking this change.

% 	PCPU_SET(fpcurthread, NULL);

At this point, we can use fpcurthread to tell us that the state in the
PCB is valid, provided we do things in a suitable order elsewhere.
The order is to set fpcurthread before hacking on the PCB.  This order
is unnatural, since the non-null fpcurthread won't become valid until
the changed PCB is loaded, but it is the one already used in npxdna().
(That sets fpcurthread in advance of the load to try to work around
IRQ13 bugs that might have been just my FUD and are no longer relevant.)

% }

Suitable state variables can probably be arranged by keeping them in
the stored-to area.  I think fnsave is atomic, and always writes nonzero
somewhere.  You can clear this place in advance and detect how far the
save got by examining this place:

 	state = 0;
 	addr->magic = 0;
 	state = 1;
 	fpusave(addr);
 	state = 2;

state = 0 tells you that addr->magic is invalid and the save hasn't started.
state 1 tells you that addr->magic is valid.
state 1 is ambiguous about whether the save completed (and thus clobbered
   the FPU in the !fxsr case).  In state 1, examine addr->magic on the same
   CPU to determine if the save completed.
state = 2 tells you that the save completed and the save area can be trusted
   (not needed in this example, but used to limit the scope of the magic).

Bruce
Received on Tue Mar 11 2014 - 03:24:59 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:47 UTC