Re: ZFS panic with concurrent recv and read-heavy workload

From: Nathaniel W Filardo <nwf@cs.jhu.edu>
Date: Fri, 3 Jun 2011 03:03:56 -0400
I just got this on another machine; no heavy workload was needed, just booting
and starting some jails.  Perhaps of interest: both this machine and the one
that triggered the panic below are SMP V240s with 1.5 GHz CPUs (though I will
confess that the machine in the original report may have had bad RAM).  I
have run a UP 1.2 GHz V240 for months and never seen this panic.

This time the kernel is
> FreeBSD 9.0-CURRENT #9: Fri Jun  3 02:32:13 EDT 2011
csup'd immediately before building.  The full panic this time is
> panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4659
>
> cpuid = 1
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _sx_assert() at _sx_assert+0xc4
> _sx_xunlock() at _sx_xunlock+0x98
> l2arc_feed_thread() at l2arc_feed_thread+0xeac
> fork_exit() at fork_exit+0x9c
> fork_trampoline() at fork_trampoline+0x8
>
> SC Alert: SC Request to send Break to host.
> KDB: enter: Line break on console
> [ thread pid 27 tid 100121 ]
> Stopped at      kdb_enter+0x80: ta              %xcc, 1
> db> reset
> ttiimmeeoouutt  sshhuuttttiinngg  ddoowwnn  CCPPUUss..

Half of the memory in this machine is new (well, it came with the machine) and
half is from the aforementioned UP V240, which seemed to work fine (I was
attempting an upgrade when this happened); none of it (or indeed any of the
hardware, save the disk controller and disks) is common between this machine
and the one reporting below.

Thoughts?  Any help would be greatly appreciated.
Thanks.
--nwf;

On Wed, Apr 06, 2011 at 04:00:43AM -0400, Nathaniel W Filardo wrote:
>[...]
> panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
>
> cpuid = 1
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _sx_assert() at _sx_assert+0xc4
> _sx_xunlock() at _sx_xunlock+0x98
> arc_evict() at arc_evict+0x614
> arc_get_data_buf() at arc_get_data_buf+0x360
> arc_buf_alloc() at arc_buf_alloc+0x94
> dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
> dmu_write() at dmu_write+0xec
> dmu_recv_stream() at dmu_recv_stream+0x8a8
> zfs_ioc_recv() at zfs_ioc_recv+0x354
> zfsdev_ioctl() at zfsdev_ioctl+0xe0
> devfs_ioctl_f() at devfs_ioctl_f+0xe8
> kern_ioctl() at kern_ioctl+0x294
> ioctl() at ioctl+0x198
> syscallenter() at syscallenter+0x270
> syscall() at syscall+0x74
> -- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
> userland() at 0x40e72cc8
> user trace: trap %o7=0x40c13e24
> pc 0x40e72cc8, sp 0x7fdffff4641
> pc 0x40c158f4, sp 0x7fdffff4721
> pc 0x40c1e878, sp 0x7fdffff47f1
> pc 0x40c1ce54, sp 0x7fdffff8b01
> pc 0x40c1dbe0, sp 0x7fdffff9431
> pc 0x40c1f718, sp 0x7fdffffd741
> pc 0x10731c, sp 0x7fdffffd831
> pc 0x10c90c, sp 0x7fdffffd8f1
> pc 0x103ef0, sp 0x7fdffffe1d1
> pc 0x4021aff4, sp 0x7fdffffe291
> done
>[...]

Received on Fri Jun 03 2011 - 05:26:53 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:14 UTC