Re: panic: sx_xlock() of destroyed sx @ /zoo/kris/src8/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535

From: Pawel Jakub Dawidek <pjd_at_FreeBSD.org>
Date: Sat, 12 Sep 2009 21:11:58 +0200

On Sun, Sep 06, 2009 at 08:32:00PM +0100, Kris Kennaway wrote:
> 9.0 doing I/O to a zfs:
> 
> panic: sx_xlock() of destroyed sx @ 
> /zoo/kris/src8/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:535
> db> wh
> Tracing pid 14 tid 100047 td 0xffffff000357c720
> kdb_enter() at kdb_enter+0x3d
> panic() at panic+0x17b
> _sx_xlock() at _sx_xlock+0xe9
> zfs_range_unlock() at zfs_range_unlock+0x38
> zfs_get_data() at zfs_get_data+0xd7
> zil_commit() at zil_commit+0x532
> zfs_sync() at zfs_sync+0xa6
> sync_fsync() at sync_fsync+0x13a
> VOP_FSYNC_APV() at VOP_FSYNC_APV+0xb7
> sync_vnode() at sync_vnode+0x157
> sched_sync() at sched_sync+0x1d1
> fork_exit() at fork_exit+0x12a
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff8125da0d30, rbp = 0 ---
> 
> This was essentially just doing make world + cvs update + tar creation 
> in a loop and failed after about a week.

Ok, here is a patch to try:

	http://people.freebsd.org/~pjd/patches/zfs_races.patch

Actually, I think I'll just commit it, as I was able to reproduce and
understand your problem.

The patch fixes three races:
- The check for having lost the race in zfs_zget() wasn't tight enough.
  This was found by Jaakko Heinonen, and part of the patch is based on
  his work.
- There was a race where rollback could be called between
  zfs_freebsd_reclaim() and zfs_reclaim_complete().
- There was a race where forced unmount could be called between
  zfs_freebsd_reclaim() and zfs_reclaim_complete().
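
The last two races have the same shape: a teardown operation (rollback or
forced unmount) slips in between the two halves of reclaim. The following is
only a user-space sketch of that shape and of one generic way to close such a
window. It is not taken from the patch, and every name in it (toy_fs,
toy_reclaim_begin(), toy_reclaim_complete(), toy_try_teardown()) is
hypothetical. The idea is to count reclaims that have started but not
finished, and to make teardown refuse to run while that count is nonzero.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model, not ZFS code: a filesystem tracking in-flight reclaims. */
struct toy_fs {
    pthread_mutex_t lock;    /* protects the two fields below */
    int reclaims_in_flight;  /* reclaims between "begin" and "complete" */
    bool torn_down;          /* set once rollback/unmount has run */
};

void toy_fs_init(struct toy_fs *fs)
{
    pthread_mutex_init(&fs->lock, NULL);
    fs->reclaims_in_flight = 0;
    fs->torn_down = false;
}

/* First half of reclaim. Fails if teardown has already happened. */
bool toy_reclaim_begin(struct toy_fs *fs)
{
    bool ok;

    pthread_mutex_lock(&fs->lock);
    ok = !fs->torn_down;
    if (ok)
        fs->reclaims_in_flight++;
    pthread_mutex_unlock(&fs->lock);
    return ok;
}

/* Second half of reclaim. Only call after a successful begin. */
void toy_reclaim_complete(struct toy_fs *fs)
{
    pthread_mutex_lock(&fs->lock);
    fs->reclaims_in_flight--;
    pthread_mutex_unlock(&fs->lock);
}

/* Teardown refuses to run while any reclaim is between its two halves. */
bool toy_try_teardown(struct toy_fs *fs)
{
    bool ok;

    pthread_mutex_lock(&fs->lock);
    ok = (fs->reclaims_in_flight == 0 && !fs->torn_down);
    if (ok)
        fs->torn_down = true;
    pthread_mutex_unlock(&fs->lock);
    return ok;
}
```

In this model, a teardown attempted after toy_reclaim_begin() but before
toy_reclaim_complete() is rejected, so the second half never sees state that
has already been destroyed. A real kernel fix would more likely block than
fail, and may use a different mechanism entirely.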

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd_at_FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!