Re: SMP opteron system freezes

From: Willem Jan Withagen <wjw_at_withagen.nl>
Date: Wed, 22 Oct 2008 15:31:51 +0200
Willem Jan Withagen wrote:
 > Willem Jan Withagen wrote:
 >> Steve Kargl wrote:
 >>> On Sat, Oct 18, 2008 at 12:20:49AM +0200, Willem Jan Withagen wrote:
 >>>> I'm sort of assuming that the bge0: timeouts and coalesced links
 >>>> are due to the freezing.
 >>>
 >>> Does the following help?
 >>
 >> Just a little...
 >> It now takes a little longer for the system to freeze, but eventually
 >> it will.
 >> The coalesced messages did not return.
 >>
 >> Just out of curiosity I also plugged in an fxp card.
 >> And there it takes even longer for the system to freeze, but in the
 >> end it does freeze.
 >>
 >> The "funny" part is that once in a while it can be revived by going
 >> into the kernel debugger and then just continuing.
 >> Sometimes a long wait (10 sec) will suffice, during which there is
 >> no keyboard response whatsoever.
 >> But in other instances the system is dead in the water, and only a
 >> hardware reset gets it back.
 >>
 >> Something I'm still wondering is whether this happens only with NFS
 >> traffic, or with all other types of network traffic too. But I haven't
 >> tested this.
 >
 > Well I tested something different.
 >
 > This is an (older) dual Opteron 244 system. So each chip has only one
 > core.
 > And I removed one of the processors...
 >
 > Guess what:
 >     It just runs without any problems as far as I could test.
 >
 > With 2 processors it is just enough to let init start all the NFS
 > related stuff in /etc/rc.d and lock up the system.
 >
 > So I guess we need to look at totally different things.
 > Given enough time, I'll check and see whether 7.x does run without
 > trouble.
 >
 > If somebody thinks this thread should go to amd64, just say so.
 > But I am running the i386 stuff.

Tested 7.1-PRERELEASE, and that seems to run with mount problems.
So my guess is that there is something in my hardware that is
either really weirdly broken, or there is some other problem that is
really bothering me.

So I'm in the process of getting the serial console working to capture
some of the tracebacks and other debugger output.

People wanting to compare dmesg.8 and dmesg.7, have a look at
www.tegenbosch28.nl:/FreeBSD/Toy

--WjW
Received on Wed Oct 22 2008 - 11:31:55 UTC