Re: PLEASE TEST: IPI deadlock avoidance patch

From: Doug White <dwhite_at_gumbysoft.com>
Date: Thu, 26 Aug 2004 11:18:34 -0700 (PDT)
On Thu, 26 Aug 2004, Craig Boston wrote:

> On Sun, Aug 22, 2004 at 12:05:39PM -0700, Doug White wrote:
> > If you have a reasonably fast i386 or amd64 multiprocessor and/or
> > hyperthreading machine and are experiencing reproducible hangs during -j
> > buildworlds and other highly parallel operations, please try this patch:
>
> Just a follow-up to my off-list message and another data point, with
> this patch I no longer get deadlocks, however I now get random data
> corruption.

Okay, for those of you experiencing the data corruption issue, I need to
know the following:

. cvsup date & time for the affected kernel(s)
. branch you're tracking
. revision of src/sys/kern/kern_lock.c - I'm checking for a specific set
  of commits here
. reproduction case - applications involved and detailed description of
  the operation(s) involved.
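
One possible way to collect the kern_lock.c revision and the kernel build
date is sketched below. It assumes the sources live under /usr/src (adjust
the path if yours differ), and the show_rev helper is a hypothetical name
used only for illustration:

```sh
# Print the $FreeBSD$ version tag (revision and commit date) of a source file.
show_rev() {
    grep -F '$FreeBSD' "$1"
}

src=/usr/src/sys/kern/kern_lock.c   # assumed default source location
if [ -r "$src" ]; then
    show_rev "$src"
fi

# Version string of the running kernel, which includes its build date.
uname -v
```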

It would also be nice if you could set up a serial console and attempt to
break into the debugger with an NMI, if your system is so equipped. You'll
want to set these sysctls beforehand:

machdep.panic_on_nmi=0
debug.kdb.stop_cpus=0

That should prevent the usual suspects from disrupting your entry to ddb.
This usually works for me for getting into ddb in the IPI deadlock
situation.
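
As a minimal sketch, assuming a root shell, the two sysctls above can be
applied at runtime with sysctl(8):

```sh
# Run as root; these take effect immediately but do not survive a reboot.
sysctl machdep.panic_on_nmi=0
sysctl debug.kdb.stop_cpus=0
```

Adding the same two name=value lines to /etc/sysctl.conf should make the
settings persist across reboots.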

If you are tracking RELENG_5, be aware the patch is NOT committed there,
and cvsup will happily obliterate the changed files on next run. So be
sure to reapply the patch after cvsup until the patch is merged, which
should be Real Soon Now.

> Disabling the second processor or falling back to an older kernel (one
> from before the IPI hangs started) both fix the problem.

My guess here is that another change, masked until now by the IPI
problems, is causing this, and getting SMP usable again has brought
it into the light.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite_at_gumbysoft.com          |  www.FreeBSD.org
Received on Thu Aug 26 2004 - 16:18:34 UTC
