4BSD instability

From: Jeff Roberson <jroberson_at_chesapeake.net>
Date: Wed, 15 Dec 2004 15:57:58 -0500 (EST)
On Tue, 14 Dec 2004, Tony Arcieri wrote:

> On 2004-12-13 17:26:10 Scott Long wrote:
>
> > RELENG_5 is the stable branch.  If quality testing goes into ULE in HEAD
> > and it's shown to be as stable as 4BSD then we can consider it for
> > RELENG_5 in the future.  Given the incredible problems that we had in
> > the scheduler leading up to 5.3, I'm not excited about quickly merging
> > these things.
>
> I have FreeBSD 5.3 installed on a dual amd64 colo server of mine and have
> been experiencing severe issues with the system and the 4BSD scheduler under
> heavy MySQL load.  Originally with 5.3-RELEASE these appeared to be kernel
> crashes/deadlocks, but unfortunately I never had a dump device configured
> when I was running 5.3-RELEASE and so I don't have a core file to be examined.
>
> However, I've been checking out the sys/ sources from RELENG_5 fairly frequently
> and still experience severe issues with the 4BSD scheduler when the system
> is under heavy database load.  Namely, while the kernel appears to remain
> running and the system continues to respond to pings, all other network
> services cease to function.  New TCP connections are accepted, but the
> services don't respond, and existing connections time out.
>

I have cc'd two developers who work quite a lot on scheduler-related
things.  I think it's very important that we discover the source of your
instability.  Is your machine available to reproduce this scenario and
gather debugging information?  If not, can you provide us with steps needed
to reproduce this ourselves?  Can you describe your environment in more
detail?  What software are you running?  Is it threaded?  How much memory
do you have?

I'm very pleased that ULE is working well for you, but 4BSD stability is
very important.  I am actually leaving the country tomorrow, so I'm hoping
John and/or Julian will pick up this thread and help you debug.

Cheers,
Jeff

> I have found this does NOT occur when the ULE scheduler is used.  I have
> (perhaps foolishly) attempted to copy the minimum necessary files to run the
> ULE scheduler from the -CURRENT branch and merge them myself into the 5-STABLE
> sources, which I believe are sched_ule.c and kern_switch.c, and have modified
> the proc_fini() function in kern_proc.c to panic if invoked (since according to
> the comments, UMA should ensure that proc_fini is never called, correct?).  If
> these are all the changes that are needed to import the ULE scheduler, then why
> continue to include the broken ULE scheduler with an #error tag rather than
> importing the minimum sources required for the ULE scheduler to work and leave
> it off per default?
>
> I, for one, am experiencing better system stability with ULE than with 4BSD.
> If anyone cares to examine my system I can provide shell access.
>
> Tony Arcieri
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
>
Received on Wed Dec 15 2004 - 19:58:01 UTC
