5.2R: panic (syncer) on IBM x345 (SMP and Vinum)

From: Matti Saarinen <mjs_at_cc.tut.fi>
Date: Mon, 19 Jan 2004 12:36:10 +0200
I've been able to crash a server (a Usenet news server) running 5.2R.
The crash happens both with and without ACPI; the attached info is with
ACPI enabled. I would be very pleased if someone could tell me why the
box crashed and how to prevent it from happening. I have tried searching
the list archives and googling, without any positive result.

The hardware is an IBM x345 with two CPUs (Pentium 4), an internal LSI
SCSI/RAID controller and an external IBM SCSI controller (which is really
an Adaptec SCSI Card 29320LP). An IBM ESX400 disk array is connected
to the Adaptec controller. All the disks are U320 disks.

The root filesystem is mirrored with the LSI adapter (which only
supports mirroring two drives). There are three other mirrored
filesystems created with vinum. On all filesystems except root, I've
enabled soft updates. I've tested all the filesystems (the mirrored root,
the vinum mirrors and filesystems created on single disks) with bonnie++
and iozone, and the server has behaved well.
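
(For reference, soft updates were enabled on the non-root filesystems in
the usual way, i.e. roughly like this on each unmounted filesystem;
da16s1d is just one example:)

# tunefs -n enable /dev/da16s1d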

The disk layout is as follows (da0-da6 are connected to the Adaptec
controller and the rest to the LSI controller):

#df
Filesystem          512-blocks      Used     Avail Capacity  Mounted on
/dev/da16s1a           2025948    150456   1713420     8%    /
devfs                        2         2         0   100%    /dev
/dev/da16s1h          12172084    147124  11051196     1%    /home
/dev/da16s1g           4052060     17744   3710152     0%    /tmp
/dev/da16s1d          52808984   4025312  44558956     8%    /usr
/dev/da16s1e          40616796    172624  37194832     0%    /news
/dev/da16s1f          10154076     15368   9326384     0%    /var
/dev/vinum/news_db   138862504    284336 127469168     0%    /news/db
/dev/vinum/overview  138862504    168888 127584616     0%    /overview
/dev/vinum/fispool   138862504 125891528   1861976    99%    /cnfs/fispool
/dev/da2a            138862772 125891536   1862216    99%    /cnfs/altspool
/dev/da3a            138862772 125891528   1862224    99%    /cnfs/altspool/bin1
/dev/da4a            138862772 125891528   1862224    99%    /cnfs/altspool/bin2
/dev/da5a            138862772 125891528   1862224    99%    /cnfs/therest/1
/dev/da6a            138862772 126916072    837680    99%    /cnfs/therest/2
procfs                       8         8         0   100%    /proc
#vinum l
6 drives:
D vinumdrive3           State: up       /dev/da21a      A: 0/70006 MB (0%)
D vinumdrive2           State: up       /dev/da20a      A: 0/70006 MB (0%)
D vinumdrive1           State: up       /dev/da19a      A: 0/70006 MB (0%)
D vinumdrive0           State: up       /dev/da18a      A: 0/70006 MB (0%)
D vinumdrive5           State: up       /dev/da1a       A: 0/70006 MB (0%)
D vinumdrive4           State: up       /dev/da0a       A: 0/70006 MB (0%)

3 volumes:
V news_db               State: up       Plexes:       2 Size:         68 GB
V overview              State: up       Plexes:       2 Size:         68 GB
V fispool               State: up       Plexes:       2 Size:         68 GB

6 plexes:
P news_db.p0          C State: up       Subdisks:     1 Size:         68 GB
P news_db.p1          C State: up       Subdisks:     1 Size:         68 GB
P overview.p0         C State: up       Subdisks:     1 Size:         68 GB
P overview.p1         C State: up       Subdisks:     1 Size:         68 GB
P fispool.p0          C State: up       Subdisks:     1 Size:         68 GB
P fispool.p1          C State: up       Subdisks:     1 Size:         68 GB

6 subdisks:
S news_db.p0.s0         State: up       D: vinumdrive0  Size:         68 GB
S news_db.p1.s0         State: up       D: vinumdrive1  Size:         68 GB
S overview.p0.s0        State: up       D: vinumdrive2  Size:         68 GB
S overview.p1.s0        State: up       D: vinumdrive3  Size:         68 GB
S fispool.p0.s0         State: up       D: vinumdrive4  Size:         68 GB
S fispool.p1.s0         State: up       D: vinumdrive5  Size:         68 GB
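
For completeness, the vinum volumes above were created from a config along
these lines (this is a reconstruction from the listing, not the exact file
I used; a length of 0 means the rest of the drive):

  drive vinumdrive0 device /dev/da18a
  drive vinumdrive1 device /dev/da19a
  volume news_db
    plex org concat
      sd length 0 drive vinumdrive0
    plex org concat
      sd length 0 drive vinumdrive1

with analogous drive, volume and plex entries for overview (vinumdrive2/3)
and fispool (vinumdrive4/5).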

Now, I installed INN (CNFS + ovdb) and a test newsfeed which puts
stress mainly on /news/db, /overview and /cnfs/fispool. When I started
the feed to the server, everything worked fine for a couple of minutes
and then the box crashed. The logs show the following:

(da0:ahd0:0:0:0): Retrying Command
(da0:ahd0:0:0:0): Queue Full
(da0:ahd0:0:0:0): tagged openings now 128
(da0:ahd0:0:0:0): Retrying Command


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc07bcafe
stack pointer           = 0x10:0xe7b96784
frame pointer           = 0x10:0xe7b967c0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 79 (syncer)



Attached below are the verbose boot logs from the server and the
kernel debugger output.
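
(If it helps, I can also try to resolve the faulting instruction pointer
against the kernel once a crash dump is available; roughly like the
following, assuming dumpdev is set, savecore(8) has written a vmcore and a
matching debug kernel exists. The paths below are only illustrative.)

# gdb -k kernel.debug /var/crash/vmcore.0
(kgdb) list *0xc07bcafe        (map the fault IP to a file and line)
(kgdb) where                   (back trace of the thread that trapped)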





Cheers,

-- 
- Matti -
