Re: r273165. ZFS ARC: possible memory leak to Inact

From: Allan Jude <allanjude_at_freebsd.org>
Date: Tue, 04 Nov 2014 12:22:29 -0500

On 11/04/2014 08:22, Dmitriy Makarov wrote:
> ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
> 
> UMA Kegs:               384,      0,     210,      10,     216,   0,   0
> UMA Zones:             2176,      0,     210,       0,     216,   0,   0
> UMA Slabs:               80,      0, 2921231, 1024519,133906002,   0,   0
> UMA RCntSlabs:           88,      0,    8442,    1863,  771451,   0,   0
> UMA Hash:               256,      0,       2,      28,      79,   0,   0
> 4 Bucket:                32,      0,    5698,   16052,424047094,   0,   0
> 6 Bucket:                48,      0,     220,    8993,77454827,   0,   0
> 8 Bucket:                64,      0,     260,    6808,56285069,  15,   0
> 12 Bucket:               96,      0,     302,    2568,42712743, 192,   0
> 16 Bucket:              128,      0,    1445,    1903,86971183,   0,   0
> 32 Bucket:              256,      0,     610,    2870,96758244, 215,   0
> 64 Bucket:              512,      0,    1611,    1117,55896361,77166469,   0
> 128 Bucket:            1024,      0,     413,     635,99338830,104451029,   0
> 256 Bucket:            2048,      0,    1100,     222,164776092,24917372,   0
> vmem btag:               56,      0, 1889493,  502639,30117503,16948,   0
> VM OBJECT:              256,      0,  970434,  174126,1080667061,   0,   0
> RADIX NODE:             144,      0, 2792188,  882809,1489929489,   0,   0
> MAP:                    240,      0,       3,      61,       3,   0,   0
> KMAP ENTRY:             128,      0,      13,     173,      37,   0,   0
> MAP ENTRY:              128,      0,   82182,   11624,3990141990,   0,   0
> VMSPACE:                496,      0,     615,     761,41838231,   0,   0
> fakepg:                 104,      0,       0,       0,       0,   0,   0
> mt_zone:              16400,      0,     261,       0,     267,   0,   0
> 16:                      16,      0, 3650397, 6166213,6132198534,   0,   0
> 32:                      32,      0, 1118176,  259824,9115561085,   0,   0
> 64:                      64,      0,14496058,14945820,11266627738,   0,   0
> 128:                    128,      0, 1337428,  319398,15463968444,   0,   0
> 256:                    256,      0, 1103937,  258183,8392009677,   0,   0
> 512:                    512,      0,    1714,     470,7174436957,   0,   0
> 1024:                  1024,      0,   29033,     347,131133987,   0,   0
> 2048:                  2048,      0,     869,     275,1001770010,   0,   0
> 4096:                  4096,      0,  730319,    3013,332721996,   0,   0
> 8192:                  8192,      0,      47,      11,  487154,   0,   0
> 16384:                16384,      0,      65,       5,    1788,   0,   0
> 32768:                32768,      0,      54,      13,  103482,   0,   0
> 65536:                65536,      0,     627,       8, 8172809,   0,   0
> SLEEPQUEUE:              80,      0,    1954,    1053,    2812,   0,   0
> 64 pcpu:                  8,      0,     558,     594,     793,   0,   0
> Files:                   80,      0,   16221,    2579,1549799224,   0,   0
> TURNSTILE:              136,      0,    1954,     506,    2812,   0,   0
> rl_entry:                40,      0,    1114,    2186,    1114,   0,   0
> umtx pi:                 96,      0,       0,       0,       0,   0,   0
> MAC labels:              40,      0,       0,       0,       0,   0,   0
> PROC:                  1208,      0,     635,     514,41838196,   0,   0
> THREAD:                1168,      0,    1840,     113,   12778,   0,   0
> cpuset:                  96,      0,     705,     361,    1490,   0,   0
> audit_record:          1248,      0,       0,       0,       0,   0,   0
> sendfile_sync:          128,      0,       0,       0,       0,   0,   0
> mbuf_packet:            256, 46137345,    8199,    5074,15123806588,   0,   0
> mbuf:                   256, 46137345,   25761,   13076,21621129305,   0,   0
> mbuf_cluster:          2048, 7208960,   13273,     315, 2905465,   0,   0
> mbuf_jumbo_page:       4096, 3604480,     786,     862,628074105,   0,   0
> mbuf_jumbo_9k:         9216, 1067994,       0,       0,       0,   0,   0
> mbuf_jumbo_16k:       16384, 600746,       0,       0,       0,   0,   0
> mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0
> g_bio:                  248,      0,      36,    2348,2894002696,   0,   0
> DMAR_MAP_ENTRY:         120,      0,       0,       0,       0,   0,   0
> ttyinq:                 160,      0,     180,     195,    4560,   0,   0
> ttyoutq:                256,      0,      95,     190,    2364,   0,   0
> FPU_save_area:          832,      0,       0,       0,       0,   0,   0
> taskq_zone:              48,      0,       0,    4814,108670448,   0,   0
> VNODE:                  472,      0, 1115838,  293402,379118791,   0,   0
> VNODEPOLL:              112,      0,       0,       0,      12,   0,   0
> BUF TRIE:               144,      0,      96,  105852, 5530345,   0,   0
> S VFS Cache:            108,      0,  995997,  161558,523325155,   0,   0
> STS VFS Cache:          148,      0,       0,       0,       0,   0,   0
> L VFS Cache:            328,      0,      25,     443,39533826,   0,   0
> LTS VFS Cache:          368,      0,       0,       0,       0,   0,   0
> NAMEI:                 1024,      0,       4,     208,3917615385,   0,   0
> range_seg_cache:         64,      0, 2036778,  121876,1194538610,   0,   0
> zio_cache:              920,      0,      65,   15323,15366038685,   0,   0
> zio_link_cache:          48,      0,      30,   16321,12086373533,   0,   0
> zio_buf_512:            512,      0, 2713231, 2767361,807591166,   0,   0
> zio_data_buf_512:       512,      0,     481,     655,196012401,   0,   0
> zio_buf_1024:          1024,      0,    7131,    1893,34360002,   0,   0
> zio_data_buf_1024:     1024,      0,     449,     335,13698525,   0,   0
> zio_buf_1536:          1536,      0,    2478,     560,21617894,   0,   0
> zio_data_buf_1536:     1536,      0,     821,     433,17033305,   0,   0
> zio_buf_2048:          2048,      0,    1867,     373,24528179,   0,   0
> zio_data_buf_2048:     2048,      0,     710,     348,18500686,   0,   0
> zio_buf_2560:          2560,      0,    1362,      38,13483571,   0,   0
> zio_data_buf_2560:     2560,      0,     946,      47,12074257,   0,   0
> zio_buf_3072:          3072,      0,     978,      43,20528564,   0,   0
> zio_data_buf_3072:     3072,      0,     716,      57,10665806,   0,   0
> zio_buf_3584:          3584,      0,     768,      23,15883624,   0,   0
> zio_data_buf_3584:     3584,      0,     867,       7, 9497134,   0,   0
> zio_buf_4096:          4096,      0,    9982,     772,154583770,   0,   0
> zio_data_buf_4096:     4096,      0,     851,      12, 8770997,   0,   0
> zio_buf_5120:          5120,      0,     904,      24,15481475,   0,   0
> zio_data_buf_5120:     5120,      0,    1615,      19,22450665,   0,   0
> zio_buf_6144:          6144,      0,     715,      23,18561260,   0,   0
> zio_data_buf_6144:     6144,      0,    1536,       1,12377616,   0,   0
> zio_buf_7168:          7168,      0,     600,      25,22583123,   0,   0
> zio_data_buf_7168:     7168,      0,    1789,      62,10888039,   0,   0
> zio_buf_8192:          8192,      0,     527,      28,21084452,   0,   0
> zio_data_buf_8192:     8192,      0,    1123,      35,11257788,   0,   0
> zio_buf_10240:        10240,      0,     891,      40,23445358,   0,   0
> zio_data_buf_10240:   10240,      0,    2757,      10,31594664,   0,   0
> zio_buf_12288:        12288,      0,     793,      44,32778601,   0,   0
> zio_data_buf_12288:   12288,      0,    2983,      19,33810459,   0,   0
> zio_buf_14336:        14336,      0,     680,      22,22955621,   0,   0
> zio_data_buf_14336:   14336,      0,    2837,       7,31231322,   0,   0
> zio_buf_16384:        16384,      0, 1174235,    5515,423668480,   0,   0
> zio_data_buf_16384:   16384,      0,   12197,       2,23870379,   0,   0
> zio_buf_20480:        20480,      0,    1234,      42,28438855,   0,   0
> zio_data_buf_20480:   20480,      0,    3349,      10,39049709,   0,   0
> zio_buf_24576:        24576,      0,    1039,      35,23663028,   0,   0
> zio_data_buf_24576:   24576,      0,    2515,      12,32477737,   0,   0
> zio_buf_28672:        28672,      0,     872,      47,17630224,   0,   0
> zio_data_buf_28672:   28672,      0,    1746,      11,24870056,   0,   0
> zio_buf_32768:        32768,      0,     847,      29,18368605,   0,   0
> zio_data_buf_32768:   32768,      0,    1637,      11,20784299,   0,   0
> zio_buf_36864:        36864,      0,     797,      22,16120701,   0,   0
> zio_data_buf_36864:   36864,      0,    2136,      65,19999849,   0,   0
> zio_buf_40960:        40960,      0,     707,      40,14881217,   0,   0
> zio_data_buf_40960:   40960,      0,    1242,      66,18085181,   0,   0
> zio_buf_45056:        45056,      0,     718,      43,13708380,   0,   0
> zio_data_buf_45056:   45056,      0,     993,      41,13875971,   0,   0
> zio_buf_49152:        49152,      0,     569,      43,15518175,   0,   0
> zio_data_buf_49152:   49152,      0,     929,      32,12006369,   0,   0
> zio_buf_53248:        53248,      0,     594,      25,14752074,   0,   0
> zio_data_buf_53248:   53248,      0,     889,      30,11159838,   0,   0
> zio_buf_57344:        57344,      0,     536,      46,16314266,   0,   0
> zio_data_buf_57344:   57344,      0,    1105,      12,10210025,   0,   0
> zio_buf_61440:        61440,      0,     527,      43,14355397,   0,   0
> zio_data_buf_61440:   61440,      0,     738,      10, 9080556,   0,   0
> zio_buf_65536:        65536,      0,     447,      44,13264282,   0,   0
> zio_data_buf_65536:   65536,      0,     723,      16, 8855438,   0,   0
> zio_buf_69632:        69632,      0,     434,      35,10357799,   0,   0
> zio_data_buf_69632:   69632,      0,     675,      44, 8017072,   0,   0
> zio_buf_73728:        73728,      0,     441,      24, 9784965,   0,   0
> zio_data_buf_73728:   73728,      0,     650,      35, 7370868,   0,   0
> zio_buf_77824:        77824,      0,     448,      26, 9643063,   0,   0
> zio_data_buf_77824:   77824,      0,     802,      34, 7733636,   0,   0
> zio_buf_81920:        81920,      0,     393,      48, 8958739,   0,   0
> zio_data_buf_81920:   81920,      0,     671,      10, 6437432,   0,   0
> zio_buf_86016:        86016,      0,     397,      24, 8406339,   0,   0
> zio_data_buf_86016:   86016,      0,     458,      14, 5752942,   0,   0
> zio_buf_90112:        90112,      0,     337,      19, 9427445,   0,   0
> zio_data_buf_90112:   90112,      0,     629,      14, 6209404,   0,   0
> zio_buf_94208:        94208,      0,     342,      18, 9703869,   0,   0
> zio_data_buf_94208:   94208,      0,     471,      32, 5147136,   0,   0
> zio_buf_98304:        98304,      0,     335,      22,11366122,   0,   0
> zio_data_buf_98304:   98304,      0,     813,      13, 5071769,   0,   0
> zio_buf_102400:      102400,      0,     318,      35,10730116,   0,   0
> zio_data_buf_102400: 102400,      0,     494,      15, 5120409,   0,   0
> zio_buf_106496:      106496,      0,     295,      25,11494927,   0,   0
> zio_data_buf_106496: 106496,      0,     441,      12, 4628043,   0,   0
> zio_buf_110592:      110592,      0,     277,      36,12261799,   0,   0
> zio_data_buf_110592: 110592,      0,     996,       8, 4655911,   0,   0
> zio_buf_114688:      114688,      0,     248,      28,13187629,   0,   0
> zio_data_buf_114688: 114688,      0,     367,      26, 4356168,   0,   0
> zio_buf_118784:      118784,      0,     248,      25,11526765,   0,   0
> zio_data_buf_118784: 118784,      0,     457,      16, 3997133,   0,   0
> zio_buf_122880:      122880,      0,     221,      18,13138310,   0,   0
> zio_data_buf_122880: 122880,      0,     440,      16, 4127363,   0,   0
> zio_buf_126976:      126976,      0,     225,      22,21080594,   0,   0
> zio_data_buf_126976: 126976,      0,     332,      23, 3611080,   0,   0
> zio_buf_131072:      131072,      0,     236,     768,260386880,   0,   0
> zio_data_buf_131072: 131072,      0,  235926,      17,201706301,   0,   0
> lz4_ctx:              16384,      0,       0,      22,870339248,   0,   0
> sa_cache:                80,      0, 1114682,  301918,377799679,   0,   0
> dnode_t:                752,      0, 4591384, 1276221,343600652,   0,   0
> dmu_buf_impl_t:         232,      0, 4193283, 4522906,1613603616,   0,   0
> arc_buf_hdr_t:          216,      0, 3636990, 1135188,1255686550,   0,   0
> arc_buf_t:               72,      0, 1517802,  983818,1342208723,   0,   0
> zil_lwb_cache:          192,      0,      59,    1301,28828585,   0,   0
> zfs_znode_cache:        368,      0, 1114682,  297778,377799679,   0,   0
> procdesc:               128,      0,       0,       0,       3,   0,   0
> pipe:                   744,      0,       8,     197,30953268,   0,   0
> Mountpoints:            816,      0,      13,      82,      13,   0,   0
> ksiginfo:               112,      0,    1138,    2362, 1449794,   0,   0
> itimer:                 352,      0,       0,     264,    4107,   0,   0
> pf mtags:                40,      0,       0,       0,       0,   0,   0
> pf states:              296, 500006,     275,     427, 2506195,   0,   0
> pf state keys:           88,      0,     378,    1602, 2878928,   0,   0
> pf source nodes:        136, 500018,       0,       0,       0,   0,   0
> pf table entries:       160, 200000,      17,      33,      34,   0,   0
> pf table counters:       64,      0,       0,       0,       0,   0,   0
> pf frags:                80,      0,       0,       0,       0,   0,   0
> pf frag entries:         32,  40000,       0,       0,       0,   0,   0
> pf state scrubs:         40,      0,       0,       0,       0,   0,   0
> KNOTE:                  128,      0,   13343,    1568,2119230288,   0,   0
> socket:                 728, 4192760,   31124,    1581,260689101,   0,   0
> ipq:                     56, 225283,       0,       0,       0,   0,   0
> udp_inpcb:              400, 4192760,      46,     484,18539506,   0,   0
> udpcb:                   24, 4192869,      46,    4296,18539506,   0,   0
> tcp_inpcb:              400, 4192760,   42550,    1050,241905139,   0,   0
> tcpcb:                 1032, 4192761,   14734,     830,241905139,   0,   0
> tcptw:                   80,  27800,   27800,       0,100020089,89206796,   0
> syncache:               168,  15364,       0,     805,137341445,   0,   0
> hostcache:              136,  15370,      57,     233,     759,   0,   0
> sackhole:                32,      0,       0,    3125,   19180,   0,   0
> sctp_ep:               1400, 4192760,       0,       0,       0,   0,   0
> sctp_asoc:             2408,  40000,       0,       0,       0,   0,   0
> sctp_laddr:              48,  80012,       0,       0,       3,   0,   0
> sctp_raddr:             720,  80000,       0,       0,       0,   0,   0
> sctp_chunk:             136, 400026,       0,       0,       0,   0,   0
> sctp_readq:             104, 400026,       0,       0,       0,   0,   0
> sctp_stream_msg_out:    104, 400026,       0,       0,       0,   0,   0
> sctp_asconf:             40, 400000,       0,       0,       0,   0,   0
> sctp_asconf_ack:         48, 400060,       0,       0,       0,   0,   0
> udplite_inpcb:          400, 4192760,       0,       0,       0,   0,   0
> ripcb:                  400, 4192760,       0,      60,       6,   0,   0
> unpcb:                  240, 4192768,    1166,    1074,  244448,   0,   0
> rtentry:                200,      0,       8,      92,       8,   0,   0
> selfd:                   56,      0,    2339,    3270,6167642044,   0,   0
> SWAPMETA:               288, 16336788,       0,       0,       0,   0,   0
> FFS inode:              168,      0,    1032,    1084, 1308978,   0,   0
> FFS1 dinode:            128,      0,       0,       0,       0,   0,   0
> FFS2 dinode:            256,      0,    1032,    1098, 1308978,   0,   0
> NCLNODE:                528,      0,       0,       0,       0,   0,   0
> 
> These are the statistics after the script helped to reclaim memory.
> 
> Here are the top statistics:
> 
> Mem: 19G Active, 20G Inact, 81G Wired, 59M Cache, 3308M Buf, 4918M Free
> ARC: 66G Total, 6926M MFU, 54G MRU, 8069K Anon, 899M Header, 5129M Other
> 
> 
> 
> Steven Hartland wrote
>> This is likely spikes in uma zones used by ARC.
>>
>> The VM doesn't ever clean uma zones unless it hits a low memory 
>> condition, which explains why your little script helps.
>>
>> Check the output of vmstat -z to confirm.
>>
>> On 04/11/2014 11:47, Dmitriy Makarov wrote:
>>> Hi Current,
>>>
>>> It seems like there is a constant flow (leak) of memory from ARC to Inact
>>> in FreeBSD 11.0-CURRENT #0 r273165.
>>>
>>> Normally, our system (FreeBSD 11.0-CURRENT #5 r260625) keeps ARC size
>>> very close to vfs.zfs.arc_max:
>>>
>>> Mem: 16G Active, 324M Inact, 105G Wired, 1612M Cache, 3308M Buf, 1094M Free
>>> ARC: 88G Total, 2100M MFU, 78G MRU, 39M Anon, 2283M Header, 6162M Other
>>>
>>>
>>> But after an upgrade to FreeBSD 11.0-CURRENT #0 r273165 we observe
>>> an enormous amount of Inact memory in top:
>>>
>>> Mem: 21G Active, 45G Inact, 56G Wired, 357M Cache, 3308M Buf, 1654M Free
>>> ARC: 42G Total, 6025M MFU, 30G MRU, 30M Anon, 819M Header, 5214M Other
>>>
>>> The funny thing is that when we manually allocate and release memory,
>>> using a simple Python script:
>>>
>>> #!/usr/local/bin/python2.7
>>>
>>> import sys
>>> import time
>>>
>>> if len(sys.argv) != 2:
>>>      print "usage: fillmem <number-of-megabytes>"
>>>      sys.exit()
>>>
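>>> count = int(sys.argv[1])
>>>
>>> # One tuple of 131072 references (8 bytes each on 64-bit), about 1 MiB.
>>> megabyte = (0,) * (1024 * 1024 / 8)
>>>
>>> # Repeating the tuple allocates roughly `count` MiB in one block.
>>> data = megabyte * count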
>>>
>>> as:
>>>
>>> # ./simple_script 10000
>>>
>>> all those allocated megabytes 'migrate' from Inact to Free, and afterwards
>>> they are 'eaten' by ARC with no problem, until Inact slowly grows back to
>>> the size it had before we ran the script.
>>>
>>> Our current workaround is to invoke this Python script periodically from
>>> cron. This is an ugly workaround and we really don't like running it in
>>> production.
>>>
>>>
>>> To answer possible questions about ARC efficiency:
>>> Cache efficiency drops dramatically with every GiB pushed off the ARC.
>>>
>>> Before upgrade:
>>>      Cache Hit Ratio:                99.38%
>>>
>>> After upgrade:
>>>      Cache Hit Ratio:                81.95%
>>>
>>> We believe that ARC misbehaves and we ask your assistance.
>>>
>>>
>>> ----------------------------------
>>>
>>> Some values from configs.
>>>
>>> HW: 128GB RAM, LSI HBA controller with 36 disks (stripe of mirrors).
>>>
>>> top output:
>>>
>>> In /boot/loader.conf :
>>> vm.kmem_size="110G"
>>> vfs.zfs.arc_max="90G"
>>> vfs.zfs.arc_min="42G"
>>> vfs.zfs.txg.timeout="10"
>>>
>>> -----------------------------------
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Dmitriy
>>> _______________________________________________
>>> freebsd-current_at_freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"

Justin Gibbs and I were helping George from Voxer look at the same issue
they are having. They had ~169GB in inact, and only ~60GB being used for
ARC.
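
For comparison, the gap between Inact and ARC can be quantified from the
top(1) lines quoted above. This is only a rough Python 3 sketch and not
something from the original reports: the helper names parse_top_line and
arc_shortfall are made up for illustration, and it assumes top's K/M/G
suffixes are powers of 1024.

```python
import re

# Assumption: top(1) size suffixes are binary (1K = 1024 bytes).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_top_line(line):
    """Parse a top(1) 'Mem:' or 'ARC:' line into {label: bytes}."""
    # Everything after the first colon is a list of "<number><unit> <label>".
    _, _, body = line.partition(":")
    return {
        m.group(3): int(m.group(1)) * UNITS[m.group(2)]
        for m in re.finditer(r"(\d+)([KMGT])\s+(\w+)", body)
    }


def arc_shortfall(arc_line, arc_max_bytes):
    """Bytes by which the ARC total is below the configured arc_max."""
    return max(0, arc_max_bytes - parse_top_line(arc_line)["Total"])


if __name__ == "__main__":
    # Post-upgrade lines from Dmitriy's report; 90G is his vfs.zfs.arc_max.
    mem = "Mem: 21G Active, 45G Inact, 56G Wired, 357M Cache, 3308M Buf, 1654M Free"
    arc = "ARC: 42G Total, 6025M MFU, 30G MRU, 30M Anon, 819M Header, 5214M Other"
    gib = 1024**3
    print("Inact (GiB):", parse_top_line(mem)["Inact"] / gib)
    print("ARC shortfall (GiB):", arc_shortfall(arc, 90 * gib) / gib)
```

With Dmitriy's post-upgrade numbers this gives an ARC shortfall of 48G next
to 45G of Inact, which is at least consistent with the memory missing from
ARC sitting in Inact.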

Are there any further debugging steps we can recommend to him to help
investigate this?
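
One low-tech step, following Steven's suggestion to check vmstat -z, is to
rank the UMA zones by how much memory their cached free items hold. The
Python 3 sketch below is only an illustration under stated assumptions: the
function names are hypothetical, and SIZE * FREE is an approximation that
ignores slab and per-CPU bucket overhead.

```python
def parse_vmstat_z(text):
    """Parse `vmstat -z` output into a list of per-zone dicts."""
    zones = []
    for raw in text.splitlines():
        # Drop mail quote markers such as "> " before the zone name.
        line = raw.lstrip("> ")
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header line or wrapped continuation
        fields = [f.strip() for f in rest.split(",") if f.strip()]
        # Need at least SIZE, LIMIT, USED, FREE as plain integers.
        if len(fields) < 4 or not all(f.isdigit() for f in fields[:4]):
            continue
        size, _limit, used, free = (int(f) for f in fields[:4])
        zones.append({
            "name": name.strip(),
            "used_bytes": size * used,
            "free_bytes": size * free,  # approximate memory cached as free
        })
    return zones


def top_free(zones, n=5):
    """Return the n zones with the most bytes in free (cached) items."""
    return sorted(zones, key=lambda z: z["free_bytes"], reverse=True)[:n]


# A few rows taken from the vmstat -z output quoted earlier in this thread.
SAMPLE = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
64:                      64,      0,14496058,14945820,11266627738,   0,   0
zio_buf_512:            512,      0, 2713231, 2767361,807591166,   0,   0
dmu_buf_impl_t:         232,      0, 4193283, 4522906,1613603616,   0,   0
zio_buf_16384:        16384,      0, 1174235,    5515,423668480,   0,   0
"""

if __name__ == "__main__":
    for z in top_free(parse_vmstat_z(SAMPLE)):
        print("%-20s %8.1f MiB free" % (z["name"], z["free_bytes"] / 2**20))
```

Running it against full vmstat -z captures taken before and after the
fillmem script would show which zones actually give memory back.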

-- 
Allan Jude
Received on Tue Nov 04 2014 - 16:22:31 UTC