Re: r244036 kernel hangs under load.

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Mon, 10 Dec 2012 13:38:21 -0500 (EST)

Adrian Chadd wrote:
> .. what was the previous kernel version?
> 
Hopefully Tim has it narrowed down more, but I don't see
the hangs on a Sept. 7 kernel from head and I do see them
on a Dec. 3 kernel from head. (Don't know the exact rNNNNNN.)

It seems to predate my commit (r244008), which was my first
concern.

I use old single core i386 hardware and can fairly reliably
reproduce it by doing a kernel build and a "svn checkout"
concurrently. No NFS activity. These are running on a local
disk (UFS/FFS). (The kernel I reproduce it on is built via
GENERIC for i386. If you want me to start a "binary search"
for which rNNNNNN, I can do that, but it will take a while.:-)
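
For anyone wanting to picture the "binary search" mentioned above, here
is a minimal sketch in Python. It is only an illustration: the function
name, the revision numbers in the usage example, and the is_bad()
callback (standing in for "build that revision, boot it, and see if it
hangs under load") are hypothetical placeholders, not the actual
revisions or procedure involved here.

```python
def first_bad_revision(good, bad, is_bad):
    """Bisect the revision range (good, bad] for the first hanging revision.

    Assumptions (hypothetical, for illustration only):
      - `good` is a revision known NOT to hang,
      - `bad` is a later revision known to hang,
      - once the hang is introduced, every later revision also hangs,
      - is_bad(rev) builds/boots/tests `rev` and returns True if it hangs.

    Returns a tuple (first_bad_rev, number_of_kernels_tested).
    """
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2   # revision halfway between the two
        tests += 1
        if is_bad(mid):
            bad = mid             # hang is at or before mid
        else:
            good = mid            # hang was introduced after mid
    return bad, tests


if __name__ == "__main__":
    # Made-up numbers: pretend the hang first appeared in revision 243123.
    rev, tests = first_bad_revision(240000, 244000, lambda r: r >= 243123)
    print("first bad revision: r%d after %d test builds" % (rev, tests))
```

Each call to is_bad() corresponds to a full kernel build plus an attempt
to reproduce the hang, so the number of builds grows with the base-2
logarithm of the revision range, which is why it would take a while on
slow hardware.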

I can get out into DDB, but I'll admit I don't know enough
about it to know where to look;-)
Here's some lines from "db> ps", in case they give someone
useful information. (I can leave this box sitting in DDB for
the rest of to-day, in case someone can suggest what I should
look for on it.)

Just snippets...
   Ss pause     adjkerntz
   DL sdflush  [softdepflush]
   RL            [syncer]
   DL vlruwt   [vnlru]
   DL psleep   [bufdaemon]
   RL          [pagezero]
   DL psleep   [vmdaemon]
   DL psleep   [pagedaemon]
   DL ccb_scan [xpt_thrd]
   DL waiting_ [sctp_iterator]
   DL ctl_work [ctl_thrd]
   DL cooling  [acpi_cooling0]
   DL tzpoll   [acpi_thermal]
   DL (threaded) [usb]
   ...
   DL -        [yarrow]
   DL (threaded) [geom]
   D  -         [g_down]
   D  -         [g_up]
   D  -         [g_event]
   RL   (threaded) [intr]
   I            [irq15: ata1]
   ...
   Run CPU0    [swi6: Giant taskq]
--> does this one indicate the CPU is actually running this?
   (After a "db> cont", waiting a while, then <ctrl><alt><esc> and
    "db> ps" again, it is still the same.)
   I            [swi4: clock]
   I            [swi1: netisr 0]
   I            [swi3: vm]
   RL           [idle: cpu0]
   SLs wait     [init]
   DL  audit_wo [audit]
   DLs (threaded) [kernel]
   D  -         [deadlkres]
   ...
   D   sched    [swapper]

I have no idea if this "ps" output helps, unless it indicates
that it is looping on the Giant taskq?

As I said, I can leave it in "db" for to-day, if anyone wants
me to do anything in the debugger and I can probably reproduce
it, if someone wants stuff tried later.

rick


> 
> 
> adrian
> 
> 
> On 9 December 2012 22:08, Tim Kientzle <kientzle_at_freebsd.org> wrote:
> > I haven't found any useful clues yet, but thought I'd ask if anyone
> > else
> > was seeing hangs in a recent kernel.
> >
> > I just upgraded to r244036 using a straight GENERIC i386 kernel.
> > (Straight buildworld/buildkernel, no local changes, /etc/src.conf
> > doesn't
> > exist, /etc/make.conf just has PERL_VERSION defined.)
> >
> > When I try to cross build an ARM world on the resulting system,
> > the entire system hangs hard after about 30 minutes: No network,
> > no keyboard response, no nothing.
> >
> > Don't know if it's relevant, but the system is using NFS pretty
> > heavily (Parallels VM mounting NFS from Mac OS 10.7 host.)
> >
> > I'll try to get some more details ...
> >
> > Tim
> >
> > _______________________________________________
> > freebsd-current_at_freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to
> > "freebsd-current-unsubscribe_at_freebsd.org"
Received on Mon Dec 10 2012 - 17:38:29 UTC
