Re: Filesystem wedges caused by r251446

From: Ian FREISLICH <ianf_at_clue.co.za>
Date: Sat, 13 Jul 2013 10:14:06 +0200

Konstantin Belousov wrote:
> On Fri, Jul 12, 2013 at 11:34:18PM +0200, Ian FREISLICH wrote:
> > (kgdb) print runningbufreq
> > $1 = 1
> > (kgdb) print runningbufspace
> > $2 = 0
> > (kgdb) print lorunningspace
> > $3 = 4587520
> > (kgdb) print hirunningspace
> > $4 = 4194304
> 
> This is extremely weird.  The hirunningspace is less than lorunningspace,
> am I right ?  This causes the runningbufspace machinery to never wake up

Yes.  This state of affairs doesn't happen on r251445, and further
testing on my side shows it doesn't happen on all of my amd64 servers.
It appears that this particular server type (Dell R200) running
amd64 with geom_mirror is affected.  I will have to test further
by destroying the mirror and removing geom_mirror from the kernel to
see if I can still reproduce the issue.  Perhaps r251446 exposes
insufficient locking on operations affecting these variables.
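
To illustrate why hirunningspace < lorunningspace could wedge a writer,
here is a minimal user-space sketch.  It is not the kernel code: the
struct, the function names and the exact wakeup rule (wake only when
completed I/O crosses the low watermark from above) are assumptions
made for illustration.  Only the variable names and values come from
the kgdb output above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; field names mirror the variables printed in kgdb. */
struct model {
	long runningbufspace;	/* bytes of write I/O in flight */
	long lorunningspace;	/* low watermark */
	long hirunningspace;	/* high watermark */
	bool waiter_asleep;	/* stands in for runningbufreq != 0 */
};

/* A writer starts I/O, then sleeps if the high watermark is exceeded. */
static void
start_write(struct model *m, long bytes)
{
	m->runningbufspace += bytes;
	if (m->runningbufspace > m->hirunningspace)
		m->waiter_asleep = true;
}

/*
 * I/O completes.  Assumed rule: wake the waiter only on the transition
 * from at-or-above the low watermark to at-or-below it.
 */
static void
finish_write(struct model *m, long bytes)
{
	long before = m->runningbufspace;

	assert(bytes <= before);
	m->runningbufspace = before - bytes;
	if (before < m->lorunningspace)
		return;		/* already below low: no wakeup issued */
	if (m->runningbufspace > m->lorunningspace)
		return;		/* still above low: no wakeup issued */
	m->waiter_asleep = false;
}

/* Returns true if the waiter is still asleep once all I/O has drained. */
static bool
wedges(long lo, long hi, long io_bytes)
{
	struct model m = { 0, lo, hi, false };

	start_write(&m, io_bytes);
	finish_write(&m, io_bytes);
	return (m.waiter_asleep);
}
```

With lo = 4587520 and hi = 4194304, any amount of in-flight I/O between
the two values puts the writer to sleep and no completion ever crosses
the low watermark from above, so the model ends with runningbufspace
at 0 and the waiter still asleep, which matches the dump.  With
hi = 6881280 the same sequence does not wedge.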

FWIW, I cannot reproduce the problem if the mirror is rebuilding.

> I just verified on the 4G VM on amd64, my numbers are 4587520 for lo
> and 6881280 for hi.  Verify your tuning and kernel options, which you
> should have provided with the original report, I think.

Sorry about that (and I'm relieved :).  I had originally compiled with
CPUTYPE?=opteron, which is incorrect for this CPU.  However, the
problem persists with CPUTYPE?=core2, but I'm not sure how much of
a difference this makes with clang.  Also, I have another affected
host that's compiled with gcc and the correct CPUTYPE, so I doubt
it's the compiler.

I've attached make.conf, the kernel config and dmesg.boot.  You'll
notice it's r251446M; the M is a result of your patch.

Ian

-- 
Ian Freislich


Received on Sat Jul 13 2013 - 06:14:24 UTC
