Re: panic on one cpu leaves others running...

From: Bernd Walter <ticso_at_cicely12.cicely.de>
Date: Thu, 8 Apr 2004 16:27:43 +0200

On Thu, Apr 08, 2004 at 09:44:41PM +1000, Peter Jeremy wrote:
> On Thu, Apr 08, 2004 at 03:25:08AM -0600, Scott Long wrote:
> >Peter Jeremy wrote:
> >>On Thu, Apr 08, 2004 at 12:13:39AM -0400, Robert Watson wrote:
> >>
> >>>Funky, eh?  I thought we used to have code to ipi the other cpu's and halt
> >>>them until the cpu in ddb was out again.  I guess I mis-remember, or that
> >>>code is broken...
> >>
> >>
> >>Look on it as a feature - most other Unices can't survive a panic.
> >>Being able to continue running in a degraded mode until a suitable
> >>maintenance window is available would be a real selling point in
> >>HA applications.  Even being able to shutdown cleanly would be
> >>better than coming to a screaming halt.  :-) (sort of).
> >
> >Not sure if you're joking or not here.
> 
> I was joking about the FreeBSD behaviour (hence the smiley) but serious
> about the (potential) benefits of being able to degrade rather than die.
> 
> >  A panic usually means that
> >something unrecoverable happened, and that continuing on is not safe.
> 
> I realise that.  Hence actually being able to continue after a panic
> would be extremely difficult to do safely.  (Probably not possible in
> general, though it might be in some special cases).

If it's safe to continue, then there's no need to panic at all.
Just stopping the faulting parts would be enough in that case.
That's the same as what happens on a disk failure - the processes that
have their binaries on it can't continue, but the rest of the system
still runs.
I would also find it great if a filesystem panic just took the given
filesystem down instead of the whole host.

-- 
B.Walter                   BWCT                http://www.bwct.de
ticso_at_bwct.de                                  info_at_bwct.de