On Tue, Dec 11, 2012 at 9:55 PM, Rick Macklem <rmacklem_at_uoguelph.ca> wrote:
> Konstantin Belousov wrote:
>> On Mon, Dec 10, 2012 at 07:11:59PM -0500, Rick Macklem wrote:
>> > Konstantin Belousov wrote:
>> > > On Mon, Dec 10, 2012 at 01:38:21PM -0500, Rick Macklem wrote:
>> > > > Adrian Chadd wrote:
>> > > > > .. what was the previous kernel version?
>> > > > >
>> > > > Hopefully Tim has it narrowed down more, but I don't see
>> > > > the hangs on a Sept. 7 kernel from head and I do see them
>> > > > on a Dec. 3 kernel from head. (Don't know the exact rNNNNNN.)
>> > > >
>> > > > It seems to predate my commit (r244008), which was my first
>> > > > concern.
>> > > >
>> > > > I use old single core i386 hardware and can fairly reliably
>> > > > reproduce it by doing a kernel build and a "svn checkout"
>> > > > concurrently. No NFS activity. These are running on a local
>> > > > disk (UFS/FFS). (The kernel I reproduce it on is built via
>> > > > GENERIC for i386. If you want me to start a "binary search"
>> > > > for which rNNNNNN, I can do that, but it will take a while.:-)
>> > > >
>> > > > I can get out into DDB, but I'll admit I don't know enough
>> > > > about it to know where to look;-)
>> > > > Here's some lines from "db> ps", in case they give someone
>> > > > useful information. (I can leave this box sitting in DB for
>> > > > the rest of to-day, in case someone can suggest what I should
>> > > > look for on it.)
>> > > >
>> > > > Just snippets...
>> > > > Ss pause adjkerntz
>> > > > DL sdflush [softdepflush]
>> > > > RL [syncer]
>> > > > DL vlruwt [vnlru]
>> > > > DL psleep [bufdaemon]
>> > > > RL [pagezero]
>> > > > DL psleep [vmdaemon]
>> > > > DL psleep [pagedaemon]
>> > > > DL ccb_scan [xpt_thrd]
>> > > > DL waiting_ [sctp_iterator]
>> > > > DL ctl_work [ctl_thrd]
>> > > > DL cooling [acpi_cooling0]
>> > > > DL tzpoll [acpi_thermal]
>> > > > DL (threaded) [usb]
>> > > > ...
>> > > > DL - [yarrow]
>> > > > DL (threaded) [geom]
>> > > > D - [g_down]
>> > > > D - [g_up]
>> > > > D - [g_event]
>> > > > RL (threaded) [intr]
>> > > > I [irq15: ata1]
>> > > > ...
>> > > > Run CPU0 [swi6: Giant taskq]
>> > > > --> does this one indicate the CPU is actually running this?
>> > > > (after a db> cont, wait a while <ctrl><alt><esc> db> ps
>> > > > it is still the same)
>> > > > I [swi4: clock]
>> > > > I [swi1: netisr 0]
>> > > > I [swi3: vm]
>> > > > RL [idle: cpu0]
>> > > > SLs wait [init]
>> > > > DL audit_wo [audit]
>> > > > DLs (threaded) [kernel]
>> > > > D - [deadlkres]
>> > > > ...
>> > > > D sched [swapper]
>> > > >
>> > > > I have no idea if this "ps" output helps, unless it indicates
>> > > > that it is looping on the Giant taskq?
>> > > Might be. You could do 'bt <pid>' for the process to see where it
>> > > loops.
>> > > Another good set of hints is at
>> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>> >
>> > Kostik, you must be clairvoyant;-)
>> >
>> > When I did "show alllocks", I found that the syncer process held
>> > - exclusive sleep mutex mount mtx locked @ kern/vfs_subr.c:4720
>> > - exclusive lockmgr syncer locked @ kern/vfs_subr.c:1780
>> > The trace for this process goes like:
>> > spinlock_exit
>> > mtx_unlock_spin_flags
>> > kern_yield
>> > _mnt_vnode_next_active
>> > vnode_next_active
>> > vfs_msync()
>> >
>> > So, it seems like your r244095 commit might have fixed this?
>> > (I'm not good at this stuff, but from your description, it looks
>> > like it did the kern_yield() with the mutex held and "maybe"
>> > got into trouble trying to acquire Giant?)
>> >
>> > Anyhow, I'm going to test a kernel with r244095 in it and see
>> > if I can still reproduce the hang.
>> > (There wasn't much else in the "show alllocks", except a
>> > process that held the exclusive vnode interlock mutex plus
>> > a ufs vnode lock, but it's just doing a witness_unlock.)
>> There must be a thread blocked for the mount interlock for the loop
>> in the mnt_vnode_next_active to cause livelock.
>>
> Yes. I am getting hangs with the -current kernel and they seem
> easier for me to reproduce.

Can you report the svn rev number the kernel is built from?

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein

Received on Tue Dec 11 2012 - 20:57:46 UTC