Re: FreeBSD 8.0-BETA2/amd64 crashes on SMP under load

From: Marcel Moolenaar <xcllnt_at_mac.com>
Date: Tue, 28 Jul 2009 08:34:52 -0700

On Jul 28, 2009, at 7:45 AM, Anton Shterenlikht wrote:

> On Tue, Jul 28, 2009 at 02:22:50PM +0000, O. Hartmann wrote:
>> Anton Shterenlikht wrote:
>>> On Mon, Jul 27, 2009 at 10:04:28PM +0100, Anton Shterenlikht wrote:
>>>> On Mon, Jul 27, 2009 at 09:55:12PM +0200, O. Hartmann wrote:
>>>>> Kamigishi Rei wrote:
>>>>>> O. Hartmann wrote:
>>>>>>> I have the problem of crashing FreeBSD 8.0-BETA2/amd64 under load on
>>>>>>> all of our SMP boxes. Is there an issue known at the moment? If not, I
>>>>>>> will prepare the kernel for witnessing and provide more information,
>>>>>>> if you wish.
>>>>>> A quick question: what is in the crash message, i.e. the backtrace?
>>>>>> And what kind of crash is it - a panic() or a fatal trap?
>>>>> On the 8-core server box, I sometimes see:
>>>>>
>>>>> Fatal trap 12: page fault while in kernel mode
>>>>> fault code              = supervisor read, page not present
>>>> Not sure if it's related, but on ia64 SMP (2 CPUs) with 8.0-current and
>>>> later with 8.0-beta1 (I haven't built beta2 yet) I'm getting crashes
>>>> under load every so often. E.g. buildworld -j8 is likely to crash the
>>>> box. No messages, just a sudden freeze, no backtrace or panic,
>>>> and then reboot.
>>>>
>>>> If load is less heavy, e.g. fewer processes and some idle time, the
>>>> problem doesn't seem to appear.
>>>>
>>>> I'm happy to do any further testing, if suggested.
>>>
>>> my ia64 8.0-beta1 SMP box died again on
>>> make -j8 buildworld
>>> with no panic or log entries.
>>>
>>> Is it possible that some kernel variable needs to
>>> be increased? E.g. kern.maxproc, kern.maxfiles, etc.
>>> Or perhaps I'm talking complete rubbish..
>>>
>>
>> I suggest you try again with a UP kernel - a suggestion from a
>> kernel-noob, sorry. My SMP boxes work now with a UP kernel, but they are
>> really slowish although they have modern Intel C2D/Penryn cores.
>
> I need SMP for OpenMP codes. It's a shame if SMP is buggy, but
> I guess it's all down to the small user base.

I have no problems with SMP. If you don't have a panic, then
you may have a hardware problem. Check for MCA records.

-- 
Marcel Moolenaar
xcllnt_at_mac.com
Received on Tue Jul 28 2009 - 13:35:23 UTC
