ZFS panic with concurrent recv and read-heavy workload

From: Nathaniel W Filardo <nwf@cs.jhu.edu>
Date: Wed, 6 Apr 2011 04:00:43 -0400
When racing two workloads, one doing
>  zfs recv -v -d testpool
and the other
>  find /testpool -type f -print0 | xargs -0 sha1
I can (seemingly reliably) trigger this panic:

panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869

cpuid = 1
KDB: stack backtrace:
panic() at panic+0x1c8
_sx_assert() at _sx_assert+0xc4
_sx_xunlock() at _sx_xunlock+0x98
arc_evict() at arc_evict+0x614
arc_get_data_buf() at arc_get_data_buf+0x360
arc_buf_alloc() at arc_buf_alloc+0x94
dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
dmu_write() at dmu_write+0xec
dmu_recv_stream() at dmu_recv_stream+0x8a8
zfs_ioc_recv() at zfs_ioc_recv+0x354
zfsdev_ioctl() at zfsdev_ioctl+0xe0
devfs_ioctl_f() at devfs_ioctl_f+0xe8
kern_ioctl() at kern_ioctl+0x294
ioctl() at ioctl+0x198
syscallenter() at syscallenter+0x270
syscall() at syscall+0x74
-- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
userland() at 0x40e72cc8
user trace: trap %o7=0x40c13e24
pc 0x40e72cc8, sp 0x7fdffff4641
pc 0x40c158f4, sp 0x7fdffff4721
pc 0x40c1e878, sp 0x7fdffff47f1
pc 0x40c1ce54, sp 0x7fdffff8b01
pc 0x40c1dbe0, sp 0x7fdffff9431
pc 0x40c1f718, sp 0x7fdffffd741
pc 0x10731c, sp 0x7fdffffd831
pc 0x10c90c, sp 0x7fdffffd8f1
pc 0x103ef0, sp 0x7fdffffe1d1
pc 0x4021aff4, sp 0x7fdffffe291
done

The machine is a freshly installed and built 2-way SMP sparc64 box, running
today's -CURRENT with
http://people.freebsd.org/~mm/patches/zfs/zfs_ioctl_compat_bugfix.patch
applied.  Of note, it has only 1 GB of RAM, so kmem_max <= 512M.

Thoughts?  More information?  Thanks in advance.
--nwf;

Received on Wed Apr 06 2011 - 06:13:29 UTC
