vinum dangling vnode & GEOM stripe problems

From: Daniel Eriksson <daniel_k_eriksson_at_telia.com>
Date: Fri, 16 Jul 2004 00:55:44 +0200

Background:
I have a machine running a fairly recent CURRENT (2004.07.12.22.00.00) that
has a bunch of discs. 26 discs are hooked up through the onboard ATA and
SATA controllers (VIA KT-600), an Adaptec 29160 and two HighPoint RocketRAID
454 (4 channel HPT374 cards). The discs are used either as single discs or
combined into RAID-0 and RAID-1 arrays using the "old" vinum or ataraid.
There are 11 different physical file systems at this point, plus over 200
mount_nullfs rw mounts. I'm having some hardware problems relating to
interrupt routing, but the last kernel (see above) seems pretty stable. I
still get interrupt storms during device probing every now and then, though,
which worries me, but after tweaking the interrupt storm threshold
(hw.intr_storm_threshold=5000) this seems to be less of a problem.

Problem #1:
I usually boot the machine single-user so that I can verify all discs were
detected during boot (interrupt storms sometimes prevent proper probing,
forcing ataraid and/or vinum to mark arrays as crashed). After the latest
kernel upgrade and the threshold tweaking, the machine has been stable enough
that I dared to try a normal multi-user boot. However, it always ends up with
a vinum panic (dangling vnode, as reported by others). Unfortunately I don't
have enough swap for a coredump right now, so I cannot provide any additional
info (I hope to change that in the next couple of days).

Problem #2:
Yesterday I was playing around with the GEOM stripe module, and it messed
the machine up pretty badly when I tried to create a simple 2-disc stripe. It
made all the discs that were attached to ataraid arrays time out, which
resulted in the stripes being marked as crashed, and then it eventually
panicked in some vfs function.


As soon as I have added more swap I'll post a more detailed error report.
I just wanted to throw this out as it seems to be very easy for me to
reproduce the panics. Since the machine is rather sensitive to downtime I
usually go into panic mode myself when it crashes, so I'll blame the lack of
details on that. Once I get the kernel to save coredumps it should be easier
to provide accurate info (the kernel is already compiled with debug symbols
and KDB/DDB).

/Daniel Eriksson
Received on Thu Jul 15 2004 - 20:55:52 UTC
