Re: zfs recv hangs in kmem arena

From: Xin Li <delphij_at_delphij.net>
Date: Thu, 16 Oct 2014 09:12:31 -0700

On 10/16/14 4:25 AM, James R. Van Artsdalen wrote:
> The zfs recv / kmem arena hang happens with -CURRENT as well as 
> 10-STABLE, on two different systems, with 16GB or 32GB of RAM,
> from memstick or normal multi-user environments.
> 
> Hangs usually seem to happen 1TB to 3TB in, but last night one run
> hung after only 4.35MB.
> 
> On 9/26/2014 1:42 AM, James R. Van Artsdalen wrote:
>> FreeBSD BLACKIE.housenet.jrv 10.1-BETA2 FreeBSD 10.1-BETA2 #2
>> r272070M: Wed Sep 24 17:36:56 CDT 2014 
>> james_at_BLACKIE.housenet.jrv:/usr/obj/usr/src/sys/GENERIC  amd64
>> 
>> With current STABLE10 I am unable to replicate a ZFS pool using
>> zfs send/recv without zfs hanging in state "kmem arena", within
>> the first 4TB or so (of a 23TB Pool).
>> 
>> The most recent attempt used this command line
>> 
>> SUPERTEX:/root# zfs send -R BIGTEX/UNIX@syssnap | ssh BLACKIE zfs recv -duvF BIGTOX
>> 
>> though local replications fail in kmem arena too.
>> 
>> The two machines I've been attempting this on have 16GB and 32GB
>> of RAM each and are otherwise idle.
>> 
>> Any suggestions on how to get around, or investigate, "kmem
>> arena"?
>> 
>> # top
>> last pid:  3272;  load averages:  0.22,  0.22,  0.23  up 0+08:25:02  01:32:07
>> 34 processes:  1 running, 33 sleeping
>> CPU:  0.0% user,  0.0% nice,  0.1% system,  0.0% interrupt, 99.9% idle
>> Mem: 21M Active, 82M Inact, 15G Wired, 28M Cache, 450M Free
>> ARC: 12G Total, 24M MFU, 12G MRU, 23M Anon, 216M Header, 47M Other
>> Swap: 16G Total, 16G Free
>>
>>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>>  1173 root          1  52    0 86476K  7780K select  0 124:33   0.00% sshd
>>  1176 root          1  46    0 87276K 47732K kmem a  3  48:36   0.00% zfs
>>   968 root         32  20    0 12344K  1888K rpcsvc  0   0:13   0.00% nfsd
>>  1009 root          1  20    0 25452K  2864K select  3   0:01   0.00% ntpd
>> ...

What does procstat -kk 1176 (or the PID of your 'zfs' process that is
stuck in that state) say?
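
If the PID has changed since the top output quoted above, something
like the following might help locate it. This is only a rough sketch:
the helper name kmem_arena_pids is made up for illustration, and it
assumes the process lines have the same column layout as the quoted
top output, where the wait message "kmem arena" is truncated to
"kmem a" and therefore spans fields 8 and 9.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): read top(1) process lines on
# stdin and print the PID of every process whose STATE column is "kmem a".
kmem_arena_pids() {
    awk '$1 ~ /^[0-9]+$/ && $8 == "kmem" && $9 == "a" { print $1 }'
}

# Example using two process lines copied from the quoted top output.
# Prints: 1176
printf '%s\n' \
    ' 1173 root          1  52    0 86476K  7780K select  0 124:33   0.00% sshd' \
    ' 1176 root          1  46    0 87276K 47732K kmem a  3  48:36   0.00% zfs' |
    kmem_arena_pids
```

The resulting PID can then be passed to procstat -kk by hand, as in
"procstat -kk 1176" for the zfs process shown above.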

Cheers,

Received on Thu Oct 16 2014 - 14:12:35 UTC
