On Dec 31, 2009, at 5:49 AM, John Baldwin wrote:

> On Wednesday 30 December 2009 4:55:44 pm Marcel Moolenaar wrote:
>> All,
>>
>> We still have a ZFS-triggerable panic. The conditions under which
>> the panic happens are "simple":
>>
>> 1. Create a mount-point /dos, and mount an MS-DOS file system there.
>> 2. Create directory /dos/zfs.
>> 3. Make /boot/zfs a symlink to /dos/zfs.
>> 4. Create or import a pool, like "zpool import tank".
>>
>> ZFS will create/update the zpool cache (/boot/zfs/zpool.cache)
>> and when done exits the zfskern/solthread thread, at which time
>> the panic happens:
>>
>> panic: mutex Giant owned at /tank/usr/src/sys/kern/kern_thread.c:357
>> cpuid = 0
>> KDB: enter: panic
>> [thread pid 8 tid 100147 ]
>> Stopped at kdb_enter+0x92: [I2] addl r14=0xffffffffffe1f3f0,gp ;;
>> db> show alllocks
>> Process 8 (zfskern) thread 0xe000000010df4a20 (100147)
>> exclusive sleep mutex process lock (process lock) r = 0 (0xe000000010407660) locked @ /tank/usr/src/sys/kern/kern_kthread.c:326
>> exclusive sleep mutex Giant (Giant) r = 1 (0xe0000000048f8da8) locked @ /tank/usr/src/sys/kern/vfs_lookup.c:755
>>
>> It looks to me like this is a bug in vfs_lookup.c, but I'm not
>> savvy enough to know that for sure or to fix it quickly myself.
>> Help is welcome, because this particular bug hits ia64 hard:
>> /boot is a symlink to /efi/boot, where /efi is a msdosfs mount point.
>
> Can you get a stack trace? The bug is probably that ZFS isn't properly
> honoring NDHASGIANT() someplace. Hmm, it certainly doesn't honor it
> in lookupnameat(). You could maybe have it unlock Giant there, but I
> believe that would result in ZFS not acquiring Giant for any vnode
> operations on a vnode returned from a !MPSAFE filesystem.
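If I understand the pattern you're describing, it's roughly the one
below. This is just a sketch to check my understanding -- the function
name is made up, but MPSAFE, NDHASGIANT() and VFS_UNLOCK_GIANT() are
the real 8-CURRENT interfaces from sys/namei.h and sys/mount.h:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mount.h>
#include <sys/namei.h>
#include <sys/vnode.h>

/*
 * Sketch: an MPSAFE caller of namei().  When the covered filesystem
 * is !MPSAFE, namei() acquires Giant itself and records that in the
 * nameidata, so the caller must drop it again via VFS_UNLOCK_GIANT().
 */
static int
lookup_sketch(struct thread *td, const char *path)
{
	struct nameidata nd;
	int error, vfslocked;

	NDINIT(&nd, LOOKUP, MPSAFE | FOLLOW, UIO_SYSSPACE, path, td);
	if ((error = namei(&nd)) != 0)
		return (error);
	/* Did namei() take Giant on our behalf? */
	vfslocked = NDHASGIANT(&nd);
	NDFREE(&nd, NDF_ONLY_PNBUF);

	/* ... use nd.ni_vp; Giant is held here iff vfslocked != 0 ... */

	vrele(nd.ni_vp);
	/* Drop Giant only if namei() acquired it, keeping lock/unlock balanced. */
	VFS_UNLOCK_GIANT(vfslocked);
	return (0);
}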
The backtrace is rather useless:

# zpool import tank
panic: mutex Giant owned at /tank/usr/src/sys/kern/kern_thread.c:357
cpuid = 1
KDB: enter: panic
[thread pid 8 tid 100105 ]
Stopped at kdb_enter+0x92: [I2] addl r14=0xffffffffffe1fab8,gp ;;
db> bt
Tracing pid 8 tid 100105 td 0xe0000000109e1560
kdb_enter(0xe0000000047984c0, 0xe0000000047984c0, 0xe00000000439bb70, 0x793) at kdb_enter+0x92
panic(0xe000000004796058, 0xe000000004796728, 0xe000000004799b48, 0x165) at panic+0x2f0
_mtx_assert(0xe000000004911828, 0x0, 0xe000000004799b48, 0x165) at _mtx_assert+0x200
thread_exit(0xe000000004799b48, 0x0, 0xe000000004793480, 0xe0000000109e1560) at thread_exit+0x70
kthread_exit(0xe000000004793480, 0xe000000010407568, 0xe000000004c7bb80, 0x58f) at kthread_exit+0xd0
spa_async_thread(0xe000000010e59000, 0x1, 0xe000000004791aa8, 0x343) at spa_async_thread+0x1a0
fork_exit(0xe00000000485f130, 0xe000000010e59000, 0xa000000034b61550) at fork_exit+0x110
enter_userland() at enter_userland

I traced the locks (with a tweak to get Giant included) and it looks
like vfs_lookup.c is fine: there are as many unlocks of Giant as there
are locks. Unfortunately, the default trace buffer size doesn't capture
the problem entirely, though it may show a little bit already:

:
882 (0xe0000000109e1560:cpu1): _mtx_unlock_sleep: 0xe000000004911828 unrecurse
881 (0xe0000000109e1560:cpu1): UNLOCK (sleep mutex) Giant 0xe000000004911828 r = 2 at /tank/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/sys/vnode.h:266
:
861 (0xe0000000109e1560:cpu1): vfs_ref: mp 0xe0000000108e0ed8
860 (0xe0000000109e1560:cpu1): LOCK (sleep mutex) Giant 0xe000000004911828 r = 2 at /tank/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/sys/vnode.h:264
859 (0xe0000000109e1560:cpu1): _mtx_lock_sleep: 0xe000000004911828 recursing
858 (0xe0000000109e1560:cpu1): _mtx_unlock_sleep: 0xe000000004911828 unrecurse
857 (0xe0000000109e1560:cpu1): UNLOCK (sleep mutex) Giant 0xe000000004911828 r = 2 at /tank/usr/src/sys/kern/vfs_syscalls.c:3675
856 (0xe0000000109e1560:cpu1): _mtx_unlock_sleep: 0xe000000004911828 unrecurse
855 (0xe0000000109e1560:cpu1): UNLOCK (sleep mutex) Giant 0xe000000004911828 r = 3 at /tank/usr/src/sys/kern/vfs_syscalls.c:3674
:
(some sleep operations that result in repeated unlocks and locks of Giant)
:

When the trace starts, Giant has been locked 4 times (recursively), and
the only unmatched unlocks are the ones in vfs_syscalls.c shown above.
Thus: we're still missing 2 unlocks of Giant somewhere, and the location
of the matching locks is not known. I'll redo the experiment with a
128K-entry trace buffer or so and see what comes up...

BTW: Xin LI gave me a patch with 2 missing unlocks of Giant in zfs_dir.c.
It seems ZFS is rather sloppy WRT Giant :-/
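For what it's worth, the bug class that patch fixes is, schematically,
an error path that returns early with Giant still held. A made-up
illustration, not the actual zfs_dir.c change (buggy_op() and
some_vop() are hypothetical; the Giant macros are the real ones):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mount.h>
#include <sys/vnode.h>

static int some_vop(struct vnode *vp);	/* hypothetical operation */

/*
 * Illustration only: VFS_LOCK_GIANT() recurses on Giant when the
 * vnode's filesystem is !MPSAFE, so every return path needs a
 * matching VFS_UNLOCK_GIANT().
 */
static int
buggy_op(struct vnode *vp)
{
	int error, vfslocked;

	vfslocked = VFS_LOCK_GIANT(vp->v_mount);
	error = some_vop(vp);
	if (error != 0)
		return (error);		/* BUG: leaks one Giant recursion */
	VFS_UNLOCK_GIANT(vfslocked);
	return (0);
}

A couple of leaks like that along the way and the zfskern thread
reaches kthread_exit() with Giant still held, which is exactly what
the MA_NOTOWNED assertion on Giant in thread_exit() trips over.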
FYI,

-- 
Marcel Moolenaar
xcllnt@mac.com