Scrub incredibly slow with 13.0-RC3 (as well as RC1 & 2)

From: Mathieu Chouquet-Stringer <me+freebsd_at_mathieu.digital>
Date: Mon, 22 Mar 2021 11:17:42 +0100
Hello,

I'm running vanilla 13.0-RC3 at the moment, but I experienced the same
issue with RC1 and RC2.

This FreeBSD instance runs as a guest under Linux KVM, with 16 threads
and 32GB of RAM:
hw.model: Intel Xeon E3-12xx v2 (Ivy Bridge, IBRS)
hw.machine: amd64
hw.ncpu: 16
hw.physmem: 34318278656
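
Those values come straight from sysctl; for reference:

    sysctl hw.model hw.machine hw.ncpu hw.physmem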

I have two zpools: one holds the root filesystems, the other is my
storage array.

NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP HEALTH  ALTROOT
data     29T  24.5T  4.54T        -         -    11%    84%  1.00x ONLINE  -
zroot  11.9G  8.76G  3.18G        -         -    62%    73%  1.00x ONLINE  -

zroot is on the vtbd0 device, backed by an SSD, and I used to be able
to scrub it at SATA2 bandwidth limits, close to 300MB/s. Now when I
scrub it, I get half that at best.
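
Nothing fancy on my side, just a plain scrub while watching the rates,
roughly:

    zpool scrub zroot
    zpool status zroot    # scanned/issued rates
    gstat -p              # per-device throughput while it runs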

But that's not the worst part: zroot is really small, so it's still
quick enough. The worst part is my data zpool. It's a raidz2 array with
8 devices, a log mirror on SSDs and 2 cache devices on SSD. The
SAS/SATA controllers (2 x SAS3008 in IT mode) are passed through to
FreeBSD with VFIO PCI, so it has full control of them.
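
The guest really does own the HBAs; they show up as ordinary PCI
devices and (assuming the usual mpr(4) driver for SAS3 HBAs) everything
behind them is visible with:

    pciconf -lv           # both SAS3008 controllers listed with their driver
    camcontrol devlist    # all the da devices hanging off them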

  pool: data
 state: ONLINE
  scan: scrub in progress since Sun Mar 21 23:47:40 2021
        2.82T scanned at 77.8M/s, 2.24T issued at 62.0M/s, 24.5T total
        0B repaired, 9.17% done, 4 days 08:23:32 to go
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da6     ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da10    ONLINE       0     0     0
        logs
          mirror-1  ONLINE       0     0     0
            da0p1   ONLINE       0     0     0
            da5p1   ONLINE       0     0     0
        cache
          da0p2     ONLINE       0     0     0
          da5p2     ONLINE       0     0     0
        spares
          da8       AVAIL

I used to be able to scrub the whole pool in a day or so but not
anymore. I started scrubbing it almost 11 hours ago and I've only done
10% of it.
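
The math matches the ETA above: (24.5T - 2.24T) left to issue at
62.0M/s is roughly 23.3 million MiB / 62 MiB/s, about 376,000 s, i.e.
about 4 days 8 hours, whereas scrubbing the whole 24.5T in about a day
means the pool used to sustain roughly 300MB/s overall.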

iostat shows devices aren't that busy:

                        extended device statistics  
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b  
vtbd0          9       2    264.1     14.3     0     0     2     0    0   0 
da0            0      71     10.9   9106.4     0     0     0     0    0   3 
da1         1443      29   8632.4   1095.7     0     0    79     0    0  34 
da2         1442      29   8631.3   1095.7     0     0    82     0    0  31 
da3         1434      28   8637.7   1096.1     0     0    84     0    1  36 
da4         1434      29   8631.8   1095.7     0     0    79     0    0  33 
da5            0      71     11.7   9130.4     0     0     0     0    0   3 
da6         1438      29   8634.1   1095.6     0     0    82     0    0  31 
da7         1441      29   8636.2   1095.6     0     0    84     0    1  33 
da8            0       0      0.0      0.0     1     2     0     1    0   0 
da9         1441      29   8630.8   1095.8     0     0    81     0    0  32 
da10        1438      29   8633.7   1095.5     0     0    82     0    0  32 
pass1          0       0      0.0      0.0     0     0     0     0    0   0 
pass2          0       0      0.0      0.0    43    17    36    41    0   0 
pass3          0       0      0.0      0.0    43    15    35    40    0   0 
pass4          0       0      0.0      0.0    44    12    42    43    0   0 
pass5          0       0      0.0      0.0    45    13    32    41    0   0 
pass6          0       0      0.0      0.0     0     0     0     0    0   0 
pass7          0       0      0.0      0.0    47    23    44    46    0   0 
pass8          0       0      0.0      0.0    44    15    35    42    0   0 
pass9          0       0      0.0      0.0    33     0   198    71    0   0 
pass10         0       0      0.0      0.0    43    16    33    40    0   0 
pass11         0       0      0.0      0.0    41     9    32    39    0   0 
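
For reference, that's a single interval from something like:

    iostat -x -w 10

so the numbers are averaged over the sampling window.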

Here's what I have in my sysctl.conf:
vfs.zfs.l2arc_write_max=314572800
vfs.zfs.min_auto_ashift=12
vfs.zfs.vdev.async_read_max_active=10

I tried playing with scan_mem_lim_fact (I went through the whole
OpenZFS tuning documentation yesterday), but it didn't make any
difference:
vfs.zfs.scan_mem_lim_fact=5
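
As far as I can tell these tunables can be changed on the fly, so I've
been testing values without rebooting and checking they took, e.g.:

    sysctl vfs.zfs.scan_mem_lim_fact      # read the current value
    sysctl vfs.zfs.scan_mem_lim_fact=5    # set it live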

With or without the vfs.zfs.scan_mem_lim_fact change, I often see one,
and sometimes two, threads pegged at 100% CPU in kernel mode...

    1[||         6.8%]    5[||         8.8%]     9[|          4.7%]   13[||         9.4%]
    2[||         8.3%]    6[|          4.2%]    10[||         8.9%]   14[|          5.7%]
    3[||         8.4%]    7[|||||||||100.0%]    11[|||       14.1%]   15[|          5.2%]
    4[|          5.7%]    8[||         9.4%]    12[||        13.0%]   16[|          4.2%]
  Mem[||||||||||||||||||||||||||9.33G/32.0G]   Tasks: 45, 0 thr; 2 running
  Swp[|                          130M/8.00G]   Load average: 2.51 2.63 2.46
                                               Uptime: 10:47:36

  PID USER      PRI  NI  VIRT   RES S CPU% MEM%   TIME+  Command
    0 root      -16   0     0 12976 S 22.5  0.0  9h45:36 kernel
    1 root       24   0 11824    32 S  0.0  0.0  0:00.36 /sbin/init
 3453 root       20   0 17668  2224 S  0.0  0.0  0:00.48 /usr/local/sbin/smartd -A /usr/local/
 3974 root       52   0 12892    48 S  0.0  0.0  0:00.00 /usr/libexec/getty Pc ttyv0
 4973 root       52   0 12892    48 S  0.0  0.0  0:00.00 /usr/libexec/getty Pc ttyv1
 5618 root       52   0 12892    48 S  0.0  0.0  0:00.00 /usr/libexec/getty Pc ttyv2
 5974 root       52   0 12892    48 S  0.0  0.0  0:00.00 /usr/libexec/getty Pc ttyv3
 6320 root       52   0 12892    48 S  0.0  0.0  0:00.00 /usr/libexec/getty Pc ttyv4
 6386 root       20   0 39188  2932 S  0.0  0.0  0:00.65 /usr/local/sbin/nmbd --daemon --confi
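
If it helps, I can try to pin down which kernel threads are actually
spinning and grab their stacks, something along the lines of:

    top -SHz          # show individual kernel threads, skip the idle threads
    procstat -kk 0    # kernel stack traces for the kernel's threads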

Any idea how I could help debug that?

Cheers,
-- 
Mathieu Chouquet-Stringer                             me+freebsd_at_mathieu.digital
            The sun itself sees not till heaven clears.
	             -- William Shakespeare --