Re: ZFS i/o error in recent 12.0

From: Markus Wild <fbsd-lists_at_dudes.ch>
Date: Wed, 21 Mar 2018 10:28:48 +0100

Hello Thomas,

> > I had faced the exact same issue on a HP Microserver G8 with 8TB disks and a 16TB zpool on FreeBSD 11 about a year
> > ago.  
> I will ask you the same question as I asked the OP:
> 
> Has this pool had new vdevs added to it since the server was installed?

No. This is a microserver with only 4 trays (not even hot-plug). It was originally set up using the FreeBSD
installer. I had to apply the btx loader fix that retries a failed read (a separate patch at the time, I don't know
whether it's included by default now) to get around BIOS bugs on that server, but after that the server booted
fine. It was only after a bit of use and a kernel update that things went south. I tried many different things at
the time, but the only approach that worked for me was to steal 2 of the 4 swap partitions, which I had initially
placed on every disk, and build a mirrored boot zpool from those. The loader had no problem loading the kernel from
that pool, and once the kernel took over, it had no problem using the original root pool (which the boot loader
wasn't able to find/load). Hence my conclusion that the 2nd stage boot loader has a problem (probably due to yet
another BIOS bug on that server) loading blocks beyond a certain limit, which could be 2TB or 4TB.
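
To make the 2TB guess concrete: one well-known source of a limit at that size is 32-bit sector addressing with
512-byte sectors. I have not verified that this is what the BIOS on that server actually does, so treat the
following as a sketch under that assumption. The function names are made up for illustration.

```python
# Sketch only: assumes the firmware addresses sectors with a fixed-width LBA.
# Nothing here is taken from the actual loader or BIOS code.

SECTOR_SIZE = 512  # bytes per sector as seen by the BIOS (assumption)


def max_addressable_bytes(lba_bits, sector_size=SECTOR_SIZE):
    """Bytes reachable when sector numbers are limited to lba_bits bits."""
    return (1 << lba_bits) * sector_size


def read_is_reachable(byte_offset, length, lba_bits=32, sector_size=SECTOR_SIZE):
    """True if the whole read [byte_offset, byte_offset + length) stays below the limit."""
    return byte_offset + length <= max_addressable_bytes(lba_bits, sector_size)


TIB = 1024 ** 4

# With 32-bit LBAs and 512-byte sectors the limit is exactly 2 TiB.
limit = max_addressable_bytes(32)

# A kernel block rewritten 3 TiB into an 8TB disk would be out of reach,
# while a block near the start of the disk would still load.
print(limit // TIB, read_is_reachable(3 * TIB, 4096), read_is_reachable(0, 4096))
```

This would fit what I saw: a small boot pool near the start of the disk stays entirely below any such limit, so the
loader can always read it, no matter where later kernel updates land inside it.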

> What does a "zpool status" look like when the pool is imported?

$ zpool status
  pool: zboot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed Mar 21 03:58:36 2018
config:

        NAME               STATE     READ WRITE CKSUM
        zboot              ONLINE       0     0     0
          mirror-0         ONLINE       0     0     0
            gpt/zfs-boot0  ONLINE       0     0     0
            gpt/zfs-boot1  ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 6h49m with 0 errors on Sat Mar 10 10:17:49 2018
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            gpt/zfs0  ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            gpt/zfs2  ONLINE       0     0     0
            gpt/zfs3  ONLINE       0     0     0

errors: No known data errors

Please note: this server is now in use at a customer, and it's working fine with this workaround. I only brought it
up to offer a possible explanation for the problem the original poster observed: it _might_ have nothing to do
with a newer version of the current kernel, but rather be due to the updated kernel being written to a new location
on disk which the boot loader can't read properly.

Cheers,
Markus
Received on Wed Mar 21 2018 - 08:29:01 UTC
