Re: Native preemption is the culprit [was Re: today's CURRENT lockups]

From: Ariff Abdullah <skywizard_at_MyBSD.org.my>
Date: Sat, 10 Jul 2004 15:06:20 +0800
On Sat, 10 Jul 2004 01:18:06 -0400 (EDT)
Robert Watson <rwatson_at_freebsd.org> wrote:
> 
> On Fri, 9 Jul 2004, Robert Watson wrote:
> 
> > I'm now experiencing extremely hard hangs in the following
> > configurations:
> > 
> >   SMP kernel running SCHED_ULE with hyperthreads
> >   SMP kernel running SCHED_4BSD with hyperthreads
> > 
> > To generate the load, I'm using the "supersmack" benchmark with the
> > select-key.smack query set with 30 client workers and 10,000
> > transactions. I am able to reliably hang the system with one or
> > two runs.
> > 
> > By disabling the "#define PREEMPTION" entry in param.h with
> > SCHED_4BSD, I'm able to complete the benchmark several times in a
> > row without apparent problems. However, I'll leave it running for
> > a few more hours and see if I didn't just "get lucky".  I'll then
> > try SCHED_ULE w/o PREEMPTION. 
> > 
> > By "extremely hard" I mean that I am unable to break into the
> > debugger using a serial break on the serial console.  I have not
> > yet been able to run the test on a system with easily accessible
> > NMI but will attempt to do so in the next few days.
> > 
> > I'll give UP a spin with various combinations next. 
> 
> FYI, UP+SCHED_ULE with PREEMPTION hung within three seconds of
> starting the benchmark.  Without PREEMPTION it seems to run fine.
> 
> So it looks like either PREEMPTION has a problem, or it's triggering
> an existing problem elsewhere.  If it's only one problem, it seems
> not to depend on either SMP/UP or the scheduler choice.  If it's
> multiple problems, who knows :-).  As the MySQL test relies on
> threading, we could be looking at an edge case involving threading
> and scheduling/preemption-- the other reports I've seen mention
> X11/KDE, which would also involve threading.  On the other hand, it
> could just be load.  Tomorrow I'll load up a box with non-threaded
> apps and see what happens.
>

I suspect a bad interaction between threaded apps and the current
native preemption: either the preemption itself, or threads. Running
a current kernel without any threaded apps turns up nothing suspicious.
Once threaded apps are started, it's like sending the entire system to
death row.

I'm reverting the following files to their pre-July 2 revisions to
achieve solid stability:

 sys/sys/interrupt.h          - v1.27
 sys/kern/kern_intr.c         - v1.110
 sys/i386/i386/intr_machdep.c - v1.6
 sys/kern/sched_ule.c         - v1.109
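
For anyone who wants to try the same revert, here is a rough sketch that
prints one "cvs update -r" command per file from the list above. It
assumes a CVS checkout of the source tree and that the commands are run
from its top level (for example /usr/src); the names REVERTS and
revert_commands are made up for illustration and are not part of any
tool.

```python
# Files and revisions copied from the list above.
REVERTS = [
    ("sys/sys/interrupt.h", "1.27"),
    ("sys/kern/kern_intr.c", "1.110"),
    ("sys/i386/i386/intr_machdep.c", "1.6"),
    ("sys/kern/sched_ule.c", "1.109"),
]


def revert_commands(reverts):
    """Return one 'cvs update -r REV PATH' command line per (path, rev) pair."""
    return [f"cvs update -r {rev} {path}" for path, rev in reverts]


if __name__ == "__main__":
    # Assumption: the printed commands are run from the top of the checkout.
    for cmd in revert_commands(REVERTS):
        print(cmd)
```

Note that "cvs update -r" leaves a sticky tag on each file, so later
updates will not move them forward; "cvs update -A" on the same files
clears it again once a fix is committed.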

CPU: AMD Duron(tm)  (1800.08-MHz 686-class CPU)
Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,
   PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
AMD Features=0xc0400000<AMIE,DSP,3DNow!>
--

Ariff Abdullah
MyBSD

http://www.MyBSD.org.my (IPv6/IPv4)
http://staff.MyBSD.org.my (IPv6/IPv4)
http://tomoyo.MyBSD.org.my (IPv6/IPv4)
Received on Sat Jul 10 2004 - 05:06:23 UTC
