Re: umtx/libthr SMP fixes.

From: Bryan Liesner <bleez_at_verizon.net>
Date: Wed, 4 Jun 2003 02:39:30 -0400 (EDT)
On Tue, 3 Jun 2003, Robert Watson wrote:

>
> On Tue, 3 Jun 2003, Bryan Liesner wrote:
>
> > Actually, no it doesn't.  I was able to use kern_umtx v 1.3 only if I
> > removed atapicam from my kernel config.  These patches (now committed?)
> > panic the system whether I use atapicam or not.  With kern_umtx v1.2
> > there is no panic at all, with or without atapicam.
> >
> > Actually, I think it's cam in general that's causing the panic with
> > these changes.
>
> Bizarre.  Sounds like an errant pointer in some other code, and it's just
> a matter of the memory layout as to what gets stepped on.  Alternatively,
> it might be affected by the insertion of the MTX sysinit event.  Perhaps
> that revision rearranges memory a bit.

Even more bizarre.  I have cvsupped to the latest source, built a
kernel with DDB and it won't panic.  Without DDB, it panics.  But the
behavior has changed a bit.  It now panics _without_ atapicam in the
build, at boot time.  With atapicam, it panics and dumps core if I do
an init 6.  Savecore refuses to grab the dump:

gravy savecore: first and last dump headers disagree on /dev/ad0s1b
gravy savecore: unsaved dumps found but not saved

I cleared the dump and tried again with the same results.

If I reboot with the USB drive mounted, it will panic on the init 6;
with the drive unmounted, it reboots without trouble.

If anyone has hints on grabbing a dump without savecore complaining,
please let me know.  I don't have anything specific to report yet; when
I have time tomorrow I'll try to get more information out.

>
> Anyhow, here are some things you might consider, since this whole thing is
> so odd.  Try merging the addition of the struct mtx declaration from 1.3
> into 1.2 and see if you get the same panic.  If you don't, try merging the
> MTX_SYSINIT line and see if that triggers the panic.  The other changes
> probably wouldn't cause disruptive memory rearrangement, so see what
> happens.  If the panics appear with the addition of the variable, it
> probably is a memory stepping thing and a bug in some other piece of code
> (unfortunately, probably hard to track down).  If it's the addition of the
> initializer, it's a different class of problem.

Right now I'm at rev 1.4 of kern_umtx...  I'll try reverting and
testing this, time permitting...


> I have to admit that I'm also fairly baffled: my current reading of the
> change suggests there won't be a specific bug in umtx, rather, the
> triggering of symptoms from another bug, but I guess we can only find out
> with a bit of experimentation.  You might also find the problem
> "disappears" if you remove INVARIANTS, although given that you can
> reproduce this nicely, I'm reluctant to have you do that for fear the bug
> will get away and not get fixed.

INVARIANTS wasn't in the picture to begin with.  If I put it in, it
will probably disappear, as with using DDB.  The code has changed
sufficiently now that I can't reproduce the original panic that's in
the PR, but it's still panicking...


-- 
=============================================================
= Bryan D. Liesner            LeezSoft Communications, Inc. =
=                             A subsidiary of LeezSoft Inc. =
= bleez_at_verizon.net           Home of the Gipper            =
=============================================================
Received on Tue Jun 03 2003 - 21:39:34 UTC
