Re: ZFS: I/O error - blocks larger than 16777216 are not supported

From: KIRIYAMA Kazuhiko <kiri_at_kx.openedu.org>
Date: Fri, 29 Jun 2018 11:47:13 +0900
At Tue, 26 Jun 2018 09:48:10 +0300,
Toomas Soome wrote:
> 
> 
> 
> > On 26 Jun 2018, at 05:08, KIRIYAMA Kazuhiko <kiri_at_kx.openedu.org> wrote:
> > 
> > At Thu, 21 Jun 2018 10:48:28 +0300,
> > Toomas Soome wrote:
> >> 
> >> 
> >> 
> >>> On 21 Jun 2018, at 09:00, KIRIYAMA Kazuhiko <kiri_at_kx.openedu.org> wrote:
> >>> 
> >>> At Wed, 20 Jun 2018 23:34:48 -0400,
> >>> Allan Jude wrote:
> >>>> 
> >>>> On 2018-06-20 21:36, KIRIYAMA Kazuhiko wrote:
> >>>>> Hi all,
> >>>>> 
> >>>>> I previously reported a ZFS boot failure [1], and found
> >>>>> that this issue arises from the RAID configuration [2]. So I
> >>>>> rebuilt with RAID5 and re-installed 12.0-CURRENT
> >>>>> (r333982). But it failed to boot with:
> >>>>> 
> >>>>> ZFS: i/o error - all block copies unavailable
> >>>>> ZFS: can't read MOS of pool zroot
> >>>>> gptzfsboot: failed to mount default pool zroot
> >>>>> 
> >>>>> FreeBSD/x86 boot
> >>>>> ZFS: I/O error - blocks larger than 16777216 are not supported
> >>>>> ZFS: can't find dataset u
> >>>>> Default: zroot/<0x0>:
> >>>>> 
> >>>>> In this case, the reason is "blocks larger than 16777216 are
> >>>>> not supported" and I guess this means datasets that have a
> >>>>> recordsize greater than 8GB are NOT supported by the
> >>>>> FreeBSD boot loader (zpool-features(7)). Is that true?
> >>>>> 
> >>>>> My zpool features are as follows:
> >>>>> 
> >>>>> # kldload zfs
> >>>>> # zpool import 
> >>>>>  pool: zroot
> >>>>>    id: 13407092850382881815
> >>>>> state: ONLINE
> >>>>> status: The pool was last accessed by another system.
> >>>>> action: The pool can be imported using its name or numeric identifier and
> >>>>>       the '-f' flag.
> >>>>>  see: http://illumos.org/msg/ZFS-8000-EY
> >>>>> config:
> >>>>> 
> >>>>>       zroot       ONLINE
> >>>>>         mfid0p3   ONLINE
> >>>>> # zpool import -fR /mnt zroot
> >>>>> # zpool list
> >>>>> NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> >>>>> zroot  19.9T   129G  19.7T         -     0%     0%  1.00x  ONLINE  /mnt
> >>>>> # zpool get all zroot
> >>>>> NAME   PROPERTY                                  VALUE                                     SOURCE
> >>>>> zroot  size                                      19.9T                                     -
> >>>>> zroot  capacity                                  0%                                        -
> >>>>> zroot  altroot                                   /mnt                                      local
> >>>>> zroot  health                                    ONLINE                                    -
> >>>>> zroot  guid                                      13407092850382881815                      default
> >>>>> zroot  version                                   -                                         default
> >>>>> zroot  bootfs                                    zroot/ROOT/default                        local
> >>>>> zroot  delegation                                on                                        default
> >>>>> zroot  autoreplace                               off                                       default
> >>>>> zroot  cachefile                                 none                                      local
> >>>>> zroot  failmode                                  wait                                      default
> >>>>> zroot  listsnapshots                             off                                       default
> >>>>> zroot  autoexpand                                off                                       default
> >>>>> zroot  dedupditto                                0                                         default
> >>>>> zroot  dedupratio                                1.00x                                     -
> >>>>> zroot  free                                      19.7T                                     -
> >>>>> zroot  allocated                                 129G                                      -
> >>>>> zroot  readonly                                  off                                       -
> >>>>> zroot  comment                                   -                                         default
> >>>>> zroot  expandsize                                -                                         -
> >>>>> zroot  freeing                                   0                                         default
> >>>>> zroot  fragmentation                             0%                                        -
> >>>>> zroot  leaked                                    0                                         default
> >>>>> zroot  feature_at_async_destroy                     enabled                                   local
> >>>>> zroot  feature_at_empty_bpobj                       active                                    local
> >>>>> zroot  feature_at_lz4_compress                      active                                    local
> >>>>> zroot  feature_at_multi_vdev_crash_dump             enabled                                   local
> >>>>> zroot  feature_at_spacemap_histogram                active                                    local
> >>>>> zroot  feature_at_enabled_txg                       active                                    local
> >>>>> zroot  feature_at_hole_birth                        active                                    local
> >>>>> zroot  feature_at_extensible_dataset                enabled                                   local
> >>>>> zroot  feature_at_embedded_data                     active                                    local
> >>>>> zroot  feature_at_bookmarks                         enabled                                   local
> >>>>> zroot  feature_at_filesystem_limits                 enabled                                   local
> >>>>> zroot  feature_at_large_blocks                      enabled                                   local
> >>>>> zroot  feature_at_sha512                            enabled                                   local
> >>>>> zroot  feature_at_skein                             enabled                                   local
> >>>>> zroot  unsupported_at_com.delphix:device_removal    inactive                                  local
> >>>>> zroot  unsupported_at_com.delphix:obsolete_counts   inactive                                  local
> >>>>> zroot  unsupported_at_com.delphix:zpool_checkpoint  inactive                                  local
> >>>>> # 
> >>>>> 
> >>>>> Regards
> >>>>> 
> >>>>> [1] https://lists.freebsd.org/pipermail/freebsd-current/2018-March/068886.html
> >>>>> [2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=151910
> >>>>> 
> >>>>> ---
> >>>>> KIRIYAMA Kazuhiko
> >>>>> _______________________________________________
> >>>>> freebsd-current_at_freebsd.org mailing list
> >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> >>>>> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
> >>>>> 
> >>>> 
> >>>> I am guessing it means something is corrupt, as 16MB is the maximum size
> >>>> of a record in ZFS. Also, the 'large_blocks' feature is 'enabled', not
> >>>> 'active', so this suggests you do not have any records larger than 128KB
> >>>> on your pool.
> >>> 
> >>> As I mentioned above, [2] says that ZFS on RAID disks has
> >>> serious bugs, except on mirrors. Anyway, I have given up on using
> >>> ZFS on RAID{5,6}* until Bug 151910 [2] is fixed.
> >>> 
> >> 
> >> If you boot from a USB stick (or CD), press Esc at the boot loader menu and enter lsdev -v. What sector and disk sizes are reported?
> > 
> > OK lsdev -v
> > disk devices:
> >    disk0:   BIOS drive C (31588352 X 512)
> >      disk0p1: FreeBSD boot        512KB
> >      disk0p2: FreeBSD UFS         13GB
> >      disk0p3: FreeBSD swap        771MB
> >    disk1:   BIOS drive D (4294967295 X 512)
> >      disk0p1: FreeBSD boot        512KB
> >      disk0p2: FreeBSD swap        128GB
> >      disk0p3: FreeBSD ZFS          19TB
> > OK
> > 
> > Does this mean the whole disk size that I can use is
> > 2TB (4294967295 X 512)? 
> 
> 
> Yes, or to be exact, that is the disk size reported by INT13; and since, as shown below, you get the same value from UEFI, the limit seems to be set by the RAID controller itself. In this case the best way to address the issue is to create one smaller LUN for the boot disk (zroot pool) and a larger one for data. Or of course you can have a separate FreeBSD ZFS partition for zroot, just make sure it fits inside the first 2TB.
> 
> Of course there may be an option for a RAID firmware update, or configuration settings for the LUN, or using JBOD mode (if supported by the card). JBOD would be best because in the current setup the pool is vulnerable to silent data corruption (checksum errors) and has no way to recover (this is the reason why RAID setups are not preferred with ZFS).

My RAID card is an AVAGO MegaRAID (SAS-MFI BIOS Version
6.36.00.0), and I found that it supports JBOD mode. So I
switched from RAID mode to JBOD mode and set up each disk as
JBOD. After rebooting, 'lsdev -v' at the loader prompt showed
each disk recognized as a single device 'mfidX'
(X=0,1,...,11). I then re-installed with ZFS RAIDZ-3 and
UEFI boot. The result is fine!

Each disk was recognized at just under 2TB, with a FreeBSD
ZFS partition on it, and I built a zpool (zroot) as raidz3
from those disks:

OK lsdev -v
  PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,80)
    disk0:    3907029168 X 512 blocks
      disk0p1: EFI                 200MB
      disk0p2: FreeBSD swap        8192MBB
      disk0p3: FreeBSD ZFS         1854GBB
  PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,81)
    disk1:    3907029168 X 512 blocks
      disk1p1: EFI                 200MB
      disk1p2: FreeBSD swap        8192MBB
      disk1p3: FreeBSD ZFS         1854GBB
  PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,82)
    disk2:    3907029168 X 512 blocks
      disk2p1: EFI                 200MB
      disk2p2: FreeBSD swap        8192MBB
      disk2p3: FreeBSD ZFS         1854GBB
        :
  PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,8B)
    disk11:    3907029168 X 512 blocks
      disk11p1: EFI                200MB
      disk11p2: FreeBSD swap       8192MBB
      disk11p3: FreeBSD ZFS        1854GBB
net devices:
zfs devices:
  pool: zroot
bootfs: zroot/ROOT/default
config:

        NAME STATE
        zroot ONLINE
          raidz3 ONLINE
            mfid0p3 ONLINE
            mfid1p3 ONLINE
            mfid2p3 ONLINE
            mfid3p3 ONLINE
            mfid4p3 ONLINE
            mfid5p3 ONLINE
            mfid6p3 ONLINE
            mfid7p3 ONLINE
            mfid8p3 ONLINE
            mfid9p3 ONLINE
            mfid10p3 ONLINE
            mfid11p3 ONLINE
OK
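
As a quick sanity check of these numbers, here is a minimal
sketch in Python. The helper names are my own (hypothetical);
the block counts are copied from the 'lsdev -v' output in this
thread. It converts the reported block counts into sizes and
shows that the old RAID5 value is exactly the largest 32-bit
number, which suggests a capped size rather than the real one.

```python
SECTOR_SIZE = 512  # bytes per block, as printed by lsdev -v


def disk_bytes(sectors, sector_size=SECTOR_SIZE):
    """Size in bytes of a disk with the given number of sectors."""
    return sectors * sector_size


def to_tib(nbytes):
    """Convert bytes to binary terabytes (TiB)."""
    return nbytes / 2**40


raid_lun_sectors = 4294967295   # old RAID5 LUN as seen by the loader
jbod_disk_sectors = 3907029168  # one JBOD disk as seen by the loader

# Is the old value simply the 32-bit maximum?
print(raid_lun_sectors == 2**32 - 1)
print(f"RAID LUN : {to_tib(disk_bytes(raid_lun_sectors)):.2f} TiB")
print(f"JBOD disk: {to_tib(disk_bytes(jbod_disk_sectors)):.2f} TiB")
```

With these inputs the old LUN comes out at just under 2 TiB
and each JBOD disk at about 1.8 TiB.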

The resulting ZFS setup on FreeBSD 12.0-CURRENT (r335317)
is as follows:

# gpart show mfid0
=>        40  3907029088  mfid0  GPT  (1.8T)
          40      409600      1  efi  (200M)
      409640        2008         - free -  (1.0M)
      411648    16777216      2  freebsd-swap  (8.0G)
    17188864  3889840128      3  freebsd-zfs  (1.8T)
  3907028992         136         - free -  (68K)
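
Toomas advised keeping the zroot partition inside the part of
the disk the loader can address. A small sketch of that check
follows. It assumes the limit is the 4294967295-block count
the loader reported for the old RAID5 LUN; the function name
is hypothetical. The start and size values are taken from the
'gpart show' output above and from the older RAID5 layout
quoted further down in this mail.

```python
LIMIT_SECTORS = 4294967295  # block count reported for the old RAID5 LUN


def fits_below_limit(start, size, limit=LIMIT_SECTORS):
    """True if the partition [start, start + size) ends at or below limit."""
    return start + size <= limit


# (start, size) in 512-byte sectors of the freebsd-zfs partition
jbod_zfs = (17188864, 3889840128)      # mfid0 in JBOD mode
raid5_zfs = (268847104, 42696552448)   # mfid0 as the old 20T RAID5 LUN

print("JBOD layout fits :", fits_below_limit(*jbod_zfs))
print("RAID5 layout fits:", fits_below_limit(*raid5_zfs))
```

The JBOD partition ends at sector 3907028992, below the
limit, while the old RAID5 partition extends far beyond it.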

# zpool status
  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          raidz3-0    ONLINE       0     0     0
            mfid0p3   ONLINE       0     0     0
            mfid1p3   ONLINE       0     0     0
            mfid2p3   ONLINE       0     0     0
            mfid3p3   ONLINE       0     0     0
            mfid4p3   ONLINE       0     0     0
            mfid5p3   ONLINE       0     0     0
            mfid6p3   ONLINE       0     0     0
            mfid7p3   ONLINE       0     0     0
            mfid8p3   ONLINE       0     0     0
            mfid9p3   ONLINE       0     0     0
            mfid10p3  ONLINE       0     0     0
            mfid11p3  ONLINE       0     0     0

errors: No known data errors
# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot  21.6T  2.55G  21.6T        -         -     0%     0%  1.00x  ONLINE  -
# zpool get all zroot
NAME   PROPERTY                       VALUE                          SOURCE
zroot  size                           21.6T                          -
zroot  capacity                       0%                             -
zroot  altroot                        -                              default
zroot  health                         ONLINE                         -
zroot  guid                           2002381236893751526            default
zroot  version                        -                              default
zroot  bootfs                         zroot/ROOT/default             local
zroot  delegation                     on                             default
zroot  autoreplace                    off                            default
zroot  cachefile                      -                              default
zroot  failmode                       wait                           default
zroot  listsnapshots                  off                            default
zroot  autoexpand                     off                            default
zroot  dedupditto                     0                              default
zroot  dedupratio                     1.00x                          -
zroot  free                           21.6T                          -
zroot  allocated                      2.55G                          -
zroot  readonly                       off                            -
zroot  comment                        -                              default
zroot  expandsize                     -                              -
zroot  freeing                        0                              default
zroot  fragmentation                  0%                             -
zroot  leaked                         0                              default
zroot  bootsize                       -                              default
zroot  checkpoint                     -                              -
zroot  feature_at_async_destroy          enabled                        local
zroot  feature_at_empty_bpobj            active                         local
zroot  feature_at_lz4_compress           active                         local
zroot  feature_at_multi_vdev_crash_dump  enabled                        local
zroot  feature_at_spacemap_histogram     active                         local
zroot  feature_at_enabled_txg            active                         local
zroot  feature_at_hole_birth             active                         local
zroot  feature_at_extensible_dataset     enabled                        local
zroot  feature_at_embedded_data          active                         local
zroot  feature_at_bookmarks              enabled                        local
zroot  feature_at_filesystem_limits      enabled                        local
zroot  feature_at_large_blocks           enabled                        local
zroot  feature_at_sha512                 enabled                        local
zroot  feature_at_skein                  enabled                        local
zroot  feature_at_device_removal         enabled                        local
zroot  feature_at_obsolete_counts        enabled                        local
zroot  feature_at_zpool_checkpoint       enabled                        local
# uname -a
FreeBSD vm.openedu.org 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r335317: Mon Jun 18 16:21:17 UTC 2018     root_at_releng3.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
# df -h
Filesystem            Size    Used   Avail Capacity  Mounted on
zroot/ROOT/default     15T    191M     15T     0%    /
devfs                 1.0K    1.0K      0B   100%    /dev
zroot/.dake            15T    256K     15T     0%    /.dake
zroot/ds               15T    279K     15T     0%    /ds
zroot/ds/backup        15T    256K     15T     0%    /ds/backup
zroot/ds/distfiles     15T    256K     15T     0%    /ds/distfiles
zroot/ds/obj           15T    256K     15T     0%    /ds/obj
zroot/ds/packages      15T    256K     15T     0%    /ds/packages
zroot/ds/ports         15T    256K     15T     0%    /ds/ports
zroot/ds/src           15T    256K     15T     0%    /ds/src
zroot/tmp              15T    302K     15T     0%    /tmp
zroot/usr              15T    1.6G     15T     0%    /usr
zroot/usr/home         15T    372K     15T     0%    /usr/home
zroot/usr/local        15T    256K     15T     0%    /usr/local
zroot/var              15T    395K     15T     0%    /var
zroot/var/audit        15T    256K     15T     0%    /var/audit
zroot/var/crash        15T    256K     15T     0%    /var/crash
zroot/var/db           15T    9.2M     15T     0%    /var/db
zroot/var/empty        15T    256K     15T     0%    /var/empty
zroot/var/log          15T    337K     15T     0%    /var/log
zroot/var/mail         15T    256K     15T     0%    /var/mail
zroot/var/ports        15T    256K     15T     0%    /var/ports
zroot/var/run          15T    442K     15T     0%    /var/run
zroot/var/tmp          15T    256K     15T     0%    /var/tmp
zroot/vm               15T    256K     15T     0%    /vm
zroot                  15T    256K     15T     0%    /zroot
# 
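
Note that 'zpool list' shows 21.6T while 'df -h' shows 15T.
As far as I understand, the former counts raw space including
parity. A rough estimate, assuming 12 equal partitions and 3
parity disks' worth of space for raidz3 (variable names are my
own), gives an upper bound on the usable space. ZFS reports
somewhat less, which this sketch does not model.

```python
SECTOR_SIZE = 512
ZFS_PART_SECTORS = 3889840128  # freebsd-zfs partition size from gpart show
DISKS = 12                     # mfid0p3 .. mfid11p3
PARITY = 3                     # raidz3


def tib(nbytes):
    """Convert bytes to binary terabytes (TiB)."""
    return nbytes / 2**40


# Raw capacity of all member partitions, parity included.
raw = DISKS * ZFS_PART_SECTORS * SECTOR_SIZE

# Upper bound on data capacity: the share left after parity.
data_upper_bound = raw * (DISKS - PARITY) // DISKS

print(f"raw pool size    : {tib(raw):.1f} TiB")
print(f"data upper bound : {tib(data_upper_bound):.1f} TiB")
```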

Thanks for the kind advice!

> 
> rgds,
> toomas
> 
> > 
> > 
> >> 
> >> The issue [2] is a mix of ancient FreeBSD (v8.1 is mentioned there) and RAID LUNs with a 512B sector size and 15TB (!) total size. Are you really sure your BIOS can actually address a 15TB LUN (with 512B sector size)? Note that the problem with large disks can stay hidden until the pool has filled up enough that essential files are stored above the limit, meaning that you may have a "perfectly working" setup until, at some point after the next update, it suddenly stops working.
> >> 
> > 
> > I see why I could use it for a while.
> > 
> >> Note that for the boot loader we have only INT13h in the BIOS version, and it really is limited. The UEFI version uses the EFI_BLOCK_IO API, which usually can handle large sectors and disk sizes better.
> > 
> > I re-installed the machine with UEFI boot:
> > 
> > # gpart show mfid0
> > =>         40  42965401520  mfid0  GPT  (20T)
> >           40       409600      1  efi  (200M)
> >       409640         2008         - free -  (1.0M)
> >       411648    268435456      2  freebsd-swap  (128G)
> >    268847104  42696552448      3  freebsd-zfs  (20T)
> >  42965399552         2008         - free -  (1.0M)
> > 
> > # uname -a
> > FreeBSD vm.openedu.org 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r335317: Mon Jun 18 16:21:17 UTC 2018     root_at_releng3.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
> > # zpool get all zroot
> > NAME   PROPERTY                       VALUE                          SOURCE
> > zroot  size                           19.9T                          -
> > zroot  capacity                       0%                             -
> > zroot  altroot                        -                              default
> > zroot  health                         ONLINE                         -
> > zroot  guid                           11079446129259852576           default
> > zroot  version                        -                              default
> > zroot  bootfs                         zroot/ROOT/default             local
> > zroot  delegation                     on                             default
> > zroot  autoreplace                    off                            default
> > zroot  cachefile                      -                              default
> > zroot  failmode                       wait                           default
> > zroot  listsnapshots                  off                            default
> > zroot  autoexpand                     off                            default
> > zroot  dedupditto                     0                              default
> > zroot  dedupratio                     1.00x                          -
> > zroot  free                           19.9T                          -
> > zroot  allocated                      1.67G                          -
> > zroot  readonly                       off                            -
> > zroot  comment                        -                              default
> > zroot  expandsize                     -                              -
> > zroot  freeing                        0                              default
> > zroot  fragmentation                  0%                             -
> > zroot  leaked                         0                              default
> > zroot  bootsize                       -                              default
> > zroot  checkpoint                     -                              -
> > zroot  feature_at_async_destroy          enabled                        local
> > zroot  feature_at_empty_bpobj            active                         local
> > zroot  feature_at_lz4_compress           active                         local
> > zroot  feature_at_multi_vdev_crash_dump  enabled                        local
> > zroot  feature_at_spacemap_histogram     active                         local
> > zroot  feature_at_enabled_txg            active                         local
> > zroot  feature_at_hole_birth             active                         local
> > zroot  feature_at_extensible_dataset     enabled                        local
> > zroot  feature_at_embedded_data          active                         local
> > zroot  feature_at_bookmarks              enabled                        local
> > zroot  feature_at_filesystem_limits      enabled                        local
> > zroot  feature_at_large_blocks           enabled                        local
> > zroot  feature_at_sha512                 enabled                        local
> > zroot  feature_at_skein                  enabled                        local
> > zroot  feature_at_device_removal         enabled                        local
> > zroot  feature_at_obsolete_counts        enabled                        local
> > zroot  feature_at_zpool_checkpoint       enabled                        local
> > # 
> > 
> > and checked 'lsdev -v' at loader prompt:
> > 
> > OK lsdev -v
> >  PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,80)
> >    disk0:    4294967295 X 512 blocks
> >      disk0p1: EFI                 200MB
> >      disk0p2: FreeBSD swap        128GB
> >      disk0p2: FreeBSD ZFS         19TB
> > net devices:
> > zfs devices:
> >  pool: zroot
> > bootfs: zroot/ROOT/default
> > config:
> > 
> >        NAME STATE
> >        zroot ONLINE
> >          mfid0p3 ONLINE
> > OK
> > 
> > but the disk size (4294967295 X 512) is still unchanged. Or
> > does this mean 4294967295 X 512 X 512 bytes?
> > 
> >> 
> >> rgds,
> >> toomas
> >> 
> > 
> > Regards
> > 
> > ---
> > KIRIYAMA Kazuhiko
> 
> 

---
KIRIYAMA Kazuhiko
Received on Fri Jun 29 2018 - 00:47:25 UTC
