On Thu, Apr 08, 2004 at 04:27:43PM +0200, Bernd Walter wrote:
>On Thu, Apr 08, 2004 at 09:44:41PM +1000, Peter Jeremy wrote:
>> > A panic usually means that
>> >something unrecoverable happened, and that continuing on is not safe.
>>
>> I realise that.  Hence actually being able to continue after a panic
>> would be extremely difficult to do safely.  (Probably not possible in
>> general, though it might be in some special cases).
>
>If it's safe to continue then there's no need to panic at all.
>Just stopping the faulting parts would be enough in that case.

Except FreeBSD (and most Unices) don't do this in general.  I was
thinking of hardware failures: if a CPU fails and it wasn't holding
any locks, then it would seem feasible to just abort the thread/process
that was using the CPU and limp along on the remaining CPU(s).

Likewise, an unrecoverable memory error in a clean page should (in most
cases) be recoverable by marking that page unusable and loading another
copy of the data into another page.  (Obviously this is problematic if
the page in question is part of the kernel VM subsystem or the device
driver for the relevant backing store.)  Even a dirty page may be
recoverable by aborting the affected process or treating the fault
similarly to an I/O error on a filesystem.

The marketing spin from at least one vendor suggests that their
high-end systems can manage this sort of fault recovery.  I'm not sure
whether this is an area that FreeBSD should aspire to.  I suspect that
the effort needed to implement and test this would not be justified by
the small size of the additional potential market.

Peter

Received on Thu Apr 08 2004 - 13:22:10 UTC
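The recovery cases described in the message can be laid out as a small decision table. The following is only an illustrative sketch in Python, not FreeBSD code: every name in it (`Fault`, `Action`, `FaultContext`, `choose_recovery`) is hypothetical, and it assumes that a fault touching the kernel VM subsystem or the backing-store driver still forces a panic whether the page is clean or dirty, which the message only states for clean pages.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Fault(Enum):
    CPU_FAILURE = auto()
    MEMORY_ERROR = auto()


class Action(Enum):
    PANIC = auto()          # nothing safe to do, stop the system
    ABORT_THREAD = auto()   # kill what ran on the failed CPU, keep going
    RELOAD_PAGE = auto()    # retire the page, re-read data from backing store
    ABORT_PROCESS = auto()  # handle like an I/O error for the page's owner


@dataclass(frozen=True)
class FaultContext:
    fault: Fault
    holding_locks: bool = False    # failed CPU held locks at the time
    page_dirty: bool = False       # faulty page has unwritten changes
    kernel_critical: bool = False  # page belongs to VM code or the
                                   # backing-store driver (assumed fatal)


def choose_recovery(ctx: FaultContext) -> Action:
    """Pick the least drastic action for a hardware fault (hypothetical policy)."""
    if ctx.fault is Fault.CPU_FAILURE:
        # Locks held by a dead CPU can never be released, so give up.
        return Action.PANIC if ctx.holding_locks else Action.ABORT_THREAD
    if ctx.fault is Fault.MEMORY_ERROR:
        if ctx.kernel_critical:
            return Action.PANIC
        # A clean page has a good copy elsewhere; a dirty page does not.
        return Action.ABORT_PROCESS if ctx.page_dirty else Action.RELOAD_PAGE
    return Action.PANIC  # unknown fault type: fall back to today's behaviour
```

For example, `choose_recovery(FaultContext(Fault.MEMORY_ERROR))` selects `Action.RELOAD_PAGE`, while the same fault with `kernel_critical=True` falls back to `Action.PANIC`. The hard part in practice would be filling in `FaultContext` reliably, which is where the implementation and testing effort mentioned above would go.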