hang with raid, postgresql

From: Don Bowman <don_at_sandvine.com>
Date: Sun, 30 May 2004 15:52:04 -0400
I have a system with 2x 2.8GHz Xeon (P4) CPUs, an Intel E7501 chipset,
4GB of RAM, and an aac [Adaptec 2200S] RAID with 4 SCSI
disks. I have also tried asr (Adaptec 2015).
I have tried two different motherboards.
The only application the machine runs is PostgreSQL,
with about 30 databases and about 250GB of data.

I'm finding the machine locks up solid once a day
or so (sometimes more, sometimes less, with no pattern
in the time of day). I know it's not a hardware issue: it
is reliable with FreeBSD 4.7, and I've run through memory
tests, disk tests, etc.

There appears to be a correlation between
disk activity (PostgreSQL vacuum) and the lockup,
but I can't be sure.
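
One way to firm up that correlation would be a heartbeat log: a small
process appends a timestamp every few seconds, and after a hang the last
entry can be compared against the time vacuum was running. The following
is only a sketch; the function names, the log path and the 10-second
interval are all hypothetical and not part of the actual setup.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line to `path` and force it to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "a") as log:
        log.write(stamp + "\n")
        log.flush()
        # fsync so the entry is not left in a buffer if the machine hangs
        os.fsync(log.fileno())
    return stamp


def last_heartbeat(path):
    """Return the last recorded timestamp, or None if there is none."""
    try:
        with open(path) as log:
            lines = log.read().splitlines()
    except FileNotFoundError:
        return None
    return lines[-1] if lines else None


def run_heartbeat(path, interval=10):
    """Write a heartbeat every `interval` seconds until killed."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

Something like run_heartbeat("/var/tmp/heartbeat.log") would be left
running in the background; after a reboot, last_heartbeat() on the same
path gives the last time the machine was known to be alive. A write
still held in the RAID controller's own cache could be lost, so the
result is only accurate to within an interval or so.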

I've just reproduced it with a cvsup from HEAD today
[2004-05-30 12:00 EDT], so it's still present.
I've got a serial console and the break to
debugger (which works under normal circumstances).

In the lockup case, I cannot drop into ddb, and
no output appears anywhere. I have enabled
the following options, but they have no effect: no
messages come out (other than spurious LOR
reports).

options         ALT_BREAK_TO_DEBUGGER
options         DDB            
options         INVARIANTS     
options         INVARIANT_SUPPORT 
options         WITNESS
options         WITNESS_SKIPSPIN
options         MUTEX_DEBUG
options         DIAGNOSTIC

I've tried both with and without ACPI. The kernel
does not have PAE configured in.

The fact that I can't drop into the debugger
using the CR ~ ^B sequence when it's locked up
implies that it's no longer servicing the serial
interrupt.
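
For reference, the alternate break sequence enabled by
ALT_BREAK_TO_DEBUGGER is three bytes: carriage return, tilde, Ctrl-B
(0x02). Below is a minimal sketch of sending it from a script on the
machine at the other end of the serial line. The helper name is
hypothetical, and opening and configuring the serial device is left out
because the device name and line speed depend on the setup.

```python
import os

# Alternate break sequence: carriage return, tilde, Ctrl-B (0x02).
ALT_BREAK = b"\r~\x02"


def send_alt_break(fd):
    """Write the alternate break sequence to an already-open descriptor.

    Returns the number of bytes written.
    """
    return os.write(fd, ALT_BREAK)
```

The descriptor would come from opening the serial device for writing.
On a healthy system this drops to the debugger prompt; in the lockup
case described here, nothing comes back.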

Does anyone have any suggestions? PostgreSQL
makes use of disk, SysV semaphores, shared memory,
etc.

I don't have sound, VGA, X, ... any of the
'complicated' things; it's just a server.
There is no ATA.

I tried setting kern.smp.active to 0, but
it still locked up.

I'm looking for any suggestions. I have 
attached the config file from it if anyone
has any comments on that.

--don



Received on Sun May 30 2004 - 10:52:08 UTC
