On Tue, 2009-07-28 at 15:45 +0100, Anton Shterenlikht wrote:
> On Tue, Jul 28, 2009 at 02:22:50PM +0000, O. Hartmann wrote:
> > Anton Shterenlikht wrote:
> > > On Mon, Jul 27, 2009 at 10:04:28PM +0100, Anton Shterenlikht wrote:
> > >> On Mon, Jul 27, 2009 at 09:55:12PM +0200, O. Hartmann wrote:
> > >>> Kamigishi Rei wrote:
> > >>>> O. Hartmann wrote:
> > >>>>> I have the problem of crashing FreeBSD 8.0-BETA2/amd64 under load on
> > >>>>> all of our SMP boxes. Is there an issue known at the moment? If not, I
> > >>>>> will prepare the kernel for whitnessing and provide more informations,
> > >>>>> if you wish.
> > >>>> A quick question: what is in the crash message, i.e. the backtrace?
> > >>>> And what kind of crash is it - a panic() or a fatal trap?
> > >>> On the 8-core server box, I sometimes see :
> > >>>
> > >>> Fatal trap 12: page fault while in kernel mode
> > >>> fault code = supervisor read, page not present
> > >> Not sure if it's related, but on ia64 SMP (2 cpus) with 8.0-current and
> > >> later with 8.0-beta1 (I havent' built beta2 yet) I'm getting crashes
> > >> under load every so often. E.g buildworld -j8 is likely to crash the
> > >> box. No messages, just a sudden freeze, no backtrace or panic, and then reboot.
> > >>
> > >> If load is less heavy, e.g. fewer processes and some idle time, the
> > >> problem doesn't seem to appear.
> > >>
> > >> I'm happy to do any further testing, if suggested.
> > >
> > > my ia64 8.0-beta1 SMP box died again on
> > > make -j8 buildworld
> > > with no panic or log entries.
> > >
> > > Is it possible that some kernel variable needs to
> > > be increased? E.g. kern.maxproc, kern.maxfiles, etc.
> > > Or perhaps I'm talking complete rubbish..
> > >
> >
> > I suggest you try again with a UP kernel - a suggestion from a
> > kernel-nnob, sorry. My SMP boxes work now with UP-kernel, but they are
> > really slowish although they have modern Intel C2D/Penryn cores.
>
> I need SMP for OpenMP codes. It's a shame if SMP is buggy, but
> I guess all is down to small user base..
>

Before you go down that path, which, IMHO, is as counterproductive as it
is incorrect, could you, please, show the output of

sysctl debug | grep panic

and check whether output of

savecore -vC

makes sense to you.

-- 
Alexandre Kovalenko (Олександр Коваленко)

Received on Wed Jul 29 2009 - 20:35:25 UTC