Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"

From: Robert Noland <rnoland_at_FreeBSD.org>
Date: Sat, 24 Oct 2009 22:45:52 -0500
On Sat, 2009-10-24 at 19:44 +0200, Radek Valášek wrote:
> Robert Noland wrote:
> > On Thu, 2009-10-15 at 21:37 +0200, Radek Valášek wrote:
> >   
> >> Robert Noland wrote:
> >>     
> >>> On Thu, 2009-10-15 at 14:08 +0200, Radek Valášek wrote:
> >>>   
> >>>       
> >>>> Hi,
> >>>>
> >>>> I want to ask if there is something new in adding support to 
> >>>> gptzfsboot/zfsboot for reading gang-blocks?
> >>>>    
> >>>>         
> >
> > I think that the gang block patch will work, though I still haven't gotten
> > it tested.  However, I'm fairly confident that the issue is not gang
> > block related.  Right now, I have set up a disk like this:
> >
> > =>        34  1953525101  ada1  GPT  (932G)
> >           34         128     1  freebsd-boot  (64K)
> >          162     8388608     2  freebsd-swap  (4.0G)
> >      8388770   648019968     3  freebsd-zfs  (309G)
> >    656408738   648019968     4  freebsd-zfs  (309G)
> >   1304428706   648019968     5  freebsd-zfs  (309G)
> >   1952448674     1076461        - free -  (526M)
> >
> > Note that this is not a raidz pool right now.  It is just 3 toplevel
> > partitions set up as a single pool.  I finally have this configuration
> > working reliably.  At least in this case, the issue is due to all of the
> > partitions not being probed during early boot and so not being added to
> > the list of vdevs for the pool.  When zio_read finds a dva that points
> > to a device it doesn't know about, it gives up and whines.
> >
> > Can you detail for me how you have everything configured, so that I can
> > try to replicate it?  gpart show, zpool status and zpool get all <pool>
> > would be good.  I'm not sure that I have enough spare disks lying around
> > to do this properly, but maybe I can use virtual disks or something.
> >
> > robert.
> >
> >   
> 
> Sorry for taking so long to respond. Here are the details you asked for:
> 
> # gpart show
> =>        34  1953525101  ad6  GPT  (932G)
>           34         128    1  freebsd-boot  (64K)
>          162  1953524973    2  freebsd-zfs  (932G)
> 
> =>        34  1953525101  ad8  GPT  (932G)
>           34         128    1  freebsd-boot  (64K)
>          162  1953524973    2  freebsd-zfs  (932G)
> 
> =>        34  1953525101  ad10  GPT  (932G)
>           34         128     1  freebsd-boot  (64K)
>          162  1953524973     2  freebsd-zfs  (932G)
> 
> =>        34  1953525101  ad12  GPT  (932G)
>           34         128     1  freebsd-boot  (64K)
>          162  1953524973     2  freebsd-zfs  (932G)
> 
> # zpool status
>   pool: z
>  state: ONLINE
>  scrub: none requested
> config:
> 
>     NAME        STATE     READ WRITE CKSUM
>     z           ONLINE       0     0     0
>       raidz1    ONLINE       0     0     0
>         ad6p2   ONLINE       0     0     0
>         ad8p2   ONLINE       0     0     0
>         ad10p2  ONLINE       0     0     0
>         ad12p2  ONLINE       0     0     0
> 
> errors: No known data errors
> 
> # zpool get all z
> NAME  PROPERTY       VALUE       SOURCE
> z     size           3.62T       -
> z     used           4.62G       -
> z     available      3.62T       -
> z     capacity       0%          -
> z     altroot        -           default
> z     health         ONLINE      -
> z     guid           17857007133862981114  -
> z     version        13          default
> z     bootfs         z/system    local
> z     delegation     on          default
> z     autoreplace    off         default
> z     cachefile      -           default
> z     failmode       wait        default
> z     listsnapshots  off         default
> 
> I've tested your patches, but it seems that you're right and it's not a 
> gang-related issue. I was able to observe the following on a fully 
> functional zfs pool (system compiled with your patches):
> 
> 1. If I overwrite the file /boot/loader.conf (with a copy of itself, or 
> when upgrading kernel/world), the next reboot produces these messages:
> 
> BTX loader 1.00  BTX version is 1.02
> Consoles: internal video/keyboard
> BIOS drive C: is disk0
> BIOS drive D: is disk1
> BIOS drive E: is disk2
> BIOS drive F: is disk3
> BIOS 627kB/3405248kB available memory
> 
> FreeBSD/i386 bootstrap loader, Revision 1.1
> (root_at_ztest, Thu Oct 22 22:27:22 CEST 2009)
> Loading /boot/defaults/loader.conf
> ZFS: i/o error - all block copies unavailable
> Warning: error reading file /boot/loader.conf
> 
> After that I'm still able to boot the system, but I must set the boot 
> variables from loader.conf by hand.
> 
> 2. If I then overwrite the file /boot/loader (with a copy of itself, or when 
> upgrading kernel/world), the reboot produces these messages:
> 
> BTX loader 1.00  BTX version is 1.02
> Consoles: internal video/keyboard
> BIOS drive C: is disk0
> BIOS drive D: is disk1
> BIOS drive E: is disk2
> BIOS drive F: is disk3
> BIOS 627kB/3405248kB available memory
> 
> FreeBSD/i386 bootstrap loader, Revision 1.1
> (root_at_ztest, Thu Oct 22 22:27:22 CEST 2009)
> Loading /boot/defaults/loader.conf
> ZFS: i/o error - all block copies unavailable
> Warning: error reading file /boot/loader.conf
> ZFS: i/o error - all block copies unavailable
> ZFS: i/o error - all block copies unavailable
> ZFS: i/o error - all block copies unavailable
> ZFS: i/o error - all block copies unavailable
> ZFS: i/o error - all block copies unavailable
> ZFS: i/o error - all block copies unavailable
> ZFS: i/o error - all block copies unavailable
> ZFS: i/o error - all block copies unavailable
> Unable to load a kernel!
> 
> After that I'm no longer able to boot the system from the zfs pool.
> 
> Hope you have some ideas...

Ok, can you retest with -CURRENT?  I just committed some fixes on
Friday.  I'm having real difficulty in reproducing these issues.  Most
of the problems that I've run into so far had to do with the system not
knowing about all of the vdevs when it wanted to read something.  In
your case, it looks like you are making it to boot3 and it appears to be
seeing all 4 of your disks.  Right now, I'm trying to track down
an issue where the MOS can't be read, which basically means that we have
screwed up the root block pointer somehow.  I haven't been able to
reproduce that issue in qemu.  I have been able to reproduce it with
VirtualBox, but it is really time-consuming to work in vbox since
I have to reconvert all of the disk images every time I make a change.
I'm actually a bit concerned that it hinges on how many drives are
visible to the bios at various points in time.

robert.
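
For readers following along, the "system not knowing about all of the
vdevs" failure can be illustrated with a toy model.  This is only a
sketch in Python, not the actual boot code: the names Dva, Vdev and
read_block are made up for illustration, and the real loader works on
on-disk block pointers rather than Python objects.  It assumes a block
carries one or more copies, each naming a top-level vdev id and an
offset, and that the reader gives up once no copy points at a vdev it
has probed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Dva:
    """Hypothetical stand-in for one block copy: (top-level vdev id, offset)."""
    vdev_id: int
    offset: int


class Vdev:
    """Hypothetical top-level vdev backed by an in-memory byte string."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    def read(self, offset: int, size: int) -> bytes:
        return self.data[offset:offset + size]


def read_block(known_vdevs: Dict[int, Vdev], dvas: List[Dva], size: int) -> bytes:
    """Try each copy in turn; skip copies on vdevs that were never probed."""
    for dva in dvas:
        vdev = known_vdevs.get(dva.vdev_id)
        if vdev is None:
            # The partition holding this copy was not found during early
            # boot, so this copy cannot be read.
            continue
        return vdev.read(dva.offset, size)
    # No copy was reachable: mirror the message seen on the console.
    raise IOError("ZFS: i/o error - all block copies unavailable")


if __name__ == "__main__":
    # Three top-level vdevs, as in the three-partition test pool.
    pool = {0: Vdev(b"aaaa"), 1: Vdev(b"bbbb"), 2: Vdev(b"cccc")}
    # Pretend early boot only probed the first two partitions.
    probed = {0: pool[0], 1: pool[1]}
    try:
        read_block(probed, [Dva(2, 0)], 4)
    except IOError as err:
        print(err)
```

In this model a block whose only copy sits on the unprobed vdev fails,
while a block with a second copy on a probed vdev still reads fine, which
would be consistent with some files loading and others not.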

> vaLin
> 
> >>> Ok, I can't figure out any way to test this... beyond the fact that it
> >>> builds and doesn't break my currently working setup.  Can you give this
> >>> a try?  It should still report if it finds gang blocks, but hopefully
> >>> now will read them as well.
> >>>
> >>> robert.
> >>>
> >>>   
> >>>       
> >> Big thanks for the patches Robert, I will definitely test them as soon 
> >> as possible (tomorrow) and report the results to the list right away. I can 
> >> reproduce this issue pretty much at any time (tested about 30 times with 
> >> the same result), so don't worry about breaking the boot, I'm prepared 
> >> for it...
> >>
> >> vaLin
> >>     
> >>>>  From Sun's docs:
> >>>>
> >>>> Gang blocks
> >>>>
> >>>> When there is not enough contiguous space to write a complete block, the ZIO
> >>>> pipeline will break the I/O up into smaller 'gang blocks' which can later be
> >>>> assembled transparently to appear as complete blocks.
> >>>>
> >>>> Everything works fine for me until I rewrite kernel/world after upgrading 
> >>>> the system to the latest sources (releng_8). After that I am no longer able 
> >>>> to boot from the zfs raidz1 pool, and I get the following messages:
> >>>>
> >>>> > ZFS: i/o error - all block copies unavailable
> >>>> > ZFS: can't read MOS
> >>>> > ZFS: unexpected object set type lld
> >>>> > ZFS: unexpected object set type lld
> >>>> >
> >>>> > FreeBSD/i386 boot
> >>>> > Default: z:/boot/kernel/kernel
> >>>> > boot:
> >>>> > ZFS: unexpected object set type lld
> >>>> >
> >>>> > FreeBSD/i386 boot
> >>>> > Default: tank:/boot/kernel/kernel
> >>>> > boot:
> >>>>
> >>>> I presume it's the same issue as the one discussed on the current mailing 
> >>>> list in June 2009:
> >>>> http://lists.freebsd.org/pipermail/freebsd-current/2009-June/008589.html
> >>>>
> >>>> Any success in that matter?
> >>>>
> >>>> Thnx for answer.
> >>>>
> >>>> vaLin
> >>>>     
> >>>>         
> 
-- 
Robert Noland <rnoland_at_FreeBSD.org>
FreeBSD
Received on Sun Oct 25 2009 - 02:46:03 UTC