On May 27, 2009, at 07:58 PM, Artem Belevich wrote:
> Hi,
>
> While recent ZFS improvements got rid of random hangs I used to see,
> there's still one problem that I keep running into -- panic in ZFS
> under heavy load. I can reproduce it by doing a build with -j16 in a
> jail running i386 binaries on -CURRENT/amd64 running on a box with
> quad-core CPU. It takes a while to reproduce, but it usually shows up
> within couple of hours.
>
> Sleeping thread (tid 100606, pid 32147) owns a non-sleepable lock
> sched_switch() at sched_switch+0xed
> mi_switch() at mi_switch+0x16f
> sleepq_wait() at sleepq_wait+0x42
> _sx_xlock_hard() at _sx_xlock_hard+0x1f0
> _sx_xlock() at _sx_xlock+0x4e
> rrw_exit() at rrw_exit+0x1d
> zfs_freebsd_getattr() at zfs_freebsd_getattr+0x2be
> VOP_GETATTR_APV() at VOP_GETATTR_APV+0x44
> filt_vfsread() at filt_vfsread+0x51
> knote() at knote+0xc2
> VOP_WRITE_APV() at VOP_WRITE_APV+0x11f
> vn_write() at vn_write+0x279
> dofilewrite() at dofilewrite+0x85
> kern_writev() at kern_writev+0x60
> write() at write+0x54
> ia32_syscall() at ia32_syscall+0x236
> Xint0x80_syscall() at Xint0x80_syscall+0x85
> --- syscall (4, FreeBSD ELF32, write), rip = 0x78162153, rsp =
> 0xffff945c, rbp = 0xffff9478 ---
>
> It appears that locking within ZFS conflicts with vnode locking. The
> back-trace is always the same.
>
> For now, I've applied following patch to disable the panic, but it
> would be good if someone familiar with VFS locking in FreeBSD could
> take a look.
> If you need any additional info, let me know.
>
> Thanks,
> --Artem
>
> diff -r 930d975c8103 src/sys/kern/subr_turnstile.c
> --- a/sys/kern/subr_turnstile.c Fri Dec 05 16:12:43 2008 -0800
> +++ b/sys/kern/subr_turnstile.c Fri Dec 12 14:31:16 2008 -0800
> @@ -219,7 +219,10 @@
>  #ifdef DDB
>                  db_trace_thread(td, -1);
>  #endif
> -                panic("sleeping thread");
> +                /* Don't propagate priority to a sleeping thread. */
> +                thread_unlock(td);
> +                return;
> +                // panic("sleeping thread");
>          }
>
>          /*

Anyone have any updates on this? I just got a "sleeping thread" panic
in ZFS after doing a zfs rollback. Unfortunately, "panic" in the
debugger resulted in "dump device too small" (despite being RAM-sized)
so I don't have a BT...

However the BT I got in the debugger was *not* the same as yours.
There was no _sx_xlock in it, but that's pretty much all I know about
it. :(

Regards,
Thomas

Received on Thu Jun 18 2009 - 09:50:08 UTC