zfsboot, MBR/partition table based setup, zfs loader issues

From: Benjamin Close <Benjamin.Close_at_clearchain.com>
Date: Fri, 08 May 2009 12:45:21 +0930
Hi Folks,
     I've been trying to set up zfs on root without a ufs partition, not 
using gpt but a standard mbr/partition table.
After digging through the code I found that having it on a bsd partition 
(a, b, d, e, f, etc.) is impossible, though having it directly on a slice 
(an mbr partition such as ad4s2) should be possible.

I've got most of the way but now have hit a point where there's 
something not quite working and I don't know how to debug it further.
The setup is:
     amd64
     ad4s1 - winxp
     ad4s2 - zpool:data

I'm using the -current snapshot fixit cdrom from 200902
I've successfully installed boot1/2 using:

	# dd if=/mnt2/boot/zfsboot of=/dev/da0s1 count=1
	# dd if=/mnt2/boot/zfsboot of=/dev/da0s1 skip=1 seek=1024

and created a pool using:

	# cd /mnt2/boot/kernel
	# kldload ./opensolaris.ko
	# kldload ./zfs.ko
	# zpool create data /dev/ad4s2
	# cp -a /dist/boot /data/boot

Most of the setup is working.

Boot2 is starting the loader but the loader is unable to see any zfs 
pools (even though it's built with LOADER_WITH_ZFS on another box).
I've been trying to narrow down the issue with the loader. Turns out the 
loader gets the correct guid for the pool from boot2 but when loader 
reprobes the bios disks via:

zfsimpl.c:837:zio_read_phys: rc=vdev->v_read(vdev,vdev->vread_priv,offset,buf,psize)
zfsimpl.c:644:vdev_probe   : zio_read_phys(&vtmp,&bp,vdev_label,off)
zfs.c    :438:zfs_dev_init : vdev_probe(vdev_read, (void *)(uintptr_t)fd, 0)
main.c   :167:main         : devsw[i]->dv_init()

the resultant read succeeds, but the following call to zio_checksum_error 
reports that the checksum is wrong. I've destroyed the pool and recreated 
it just in case the checksum itself was bad.
However, I suspect the loader is reading the wrong data from the disk. 
vdev->v_read is the function:

     vdev_read(vdev_t *vdev, void *priv, off_t offset, void *buf, size_t size)

being called with:

     vdev_read(
         vdev   = temp struct built in vdev_probe,
         priv   = fd 0, opened in vdev_probe for ad4s2,
         offset = 0x4000, from offsetof(vdev_label_t, vl_vdev_phys) in vdev_probe,
         buf    = vdev_label (which = zscratch),
         size   = 113800, i.e. sizeof(vdev_phys_t), via get/set BP_PSIZE in vdev_probe)
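
For reference, the 0x4000 offset can be reproduced with a small stand-alone 
sketch of the label layout. The region sizes below (8 KiB pad, 8 KiB boot 
header, 112 KiB vdev_phys, 128 KiB uberblock ring) are assumed from the ZFS 
on-disk format rather than copied from the loader source, and all sketch_ 
names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed region sizes of one vdev label (not taken from the loader source). */
#define SKETCH_PAD_SIZE       (8u << 10)    /* blank space at the start   */
#define SKETCH_BOOT_HDR_SIZE  (8u << 10)    /* boot header                */
#define SKETCH_PHYS_SIZE      (112u << 10)  /* vdev_phys (nvlist + tail)  */
#define SKETCH_UBER_RING_SIZE (128u << 10)  /* uberblock ring             */

/* Hypothetical stand-in for vdev_label_t; only the region sizes matter. */
typedef struct sketch_vdev_label {
    char pad[SKETCH_PAD_SIZE];
    char boot_header[SKETCH_BOOT_HDR_SIZE];
    char vdev_phys[SKETCH_PHYS_SIZE];
    char uberblock_ring[SKETCH_UBER_RING_SIZE];
} sketch_vdev_label_t;

/* Byte offset of the vdev_phys region from the start of the label. */
static size_t sketch_vdev_phys_offset(void)
{
    /* The four regions should add up to one 256 KiB label. */
    assert(sizeof(sketch_vdev_label_t) == (256u << 10));
    return offsetof(sketch_vdev_label_t, vdev_phys);
}

/* Size in bytes of the vdev_phys region. */
static size_t sketch_vdev_phys_size(void)
{
    return sizeof(((sketch_vdev_label_t *)0)->vdev_phys);
}
```

With these assumed sizes, sketch_vdev_phys_offset() gives 0x4000 and 
sketch_vdev_phys_size() gives 0x1c000 (114688 bytes).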

The read is successful. What I question is the following:
     o Why do all devices return fd=0 on an open call? Is this because 
it's loader code and there's no unique numbering? If so, do offsets have 
to be applied for slices, i.e. is it 0x4000 + some slice offset?
     o Does the read (based on the asm in zfsldr.S) actually have valid 
data at offset 0x4000? The comment in the file seems to indicate data is 
only valid above 0x8000, or am I mixing up memory and disk addressing?
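
To make the first question concrete, this is the arithmetic I have in mind 
for turning a slice-relative byte offset into an absolute disk sector. It is 
only a sketch: sketch_slice_to_lba is a made-up helper, 512-byte sectors are 
assumed, and whether the loader's disk layer already adds the slice start is 
exactly what I don't know.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u  /* assumption: 512-byte sectors */

/*
 * Hypothetical helper: convert a byte offset within a slice into an
 * absolute LBA on the whole disk, given the slice's starting LBA from
 * the MBR partition table.
 */
static uint64_t sketch_slice_to_lba(uint64_t slice_start_lba,
                                    uint64_t byte_offset)
{
    /* Disk reads are sector-granular, so the offset must be aligned. */
    assert(byte_offset % SKETCH_SECTOR_SIZE == 0);
    return slice_start_lba + byte_offset / SKETCH_SECTOR_SIZE;
}
```

For example, with a slice starting at a hypothetical LBA 63, offset 0x4000 
is 32 sectors in, so the read would have to start at absolute sector 95. If 
the slice start is not applied, the same read would hit sector 32 of the 
whole disk instead.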

/*
  * Ok, we have a slice and drive in %dx now, so use that to locate and
  * load boot2.  %si references the start of the slice we are looking
  * for, so go ahead and load up the 64 sectors starting at sector 1024
  * (i.e. after the two vdev labels).  We don't have do anything fancy
  * here to allow for an extra copy of boot1 and a partition table
  * (compare to this section of the UFS bootstrap) so we just load it
  * all at 0x8000. The first part of boot2 is BTX, which wants to run
  * at 0x9000. The boot2.bin binary starts right after the end of BTX,
  * so we have to figure out where the start of it is and then move the
  * binary to 0xc000. After we have moved the client, we relocate BTX
  * itself to 0x9000 - doing it in this order means that none of the
  * memcpy regions overlap which would corrupt the copy.  Normally, BTX
  * clients start at MEM_USR, or 0xa000, but when we use btxld to
  * create boot2, we use an entry point of 0x2000.  That entry point is
  * relative to MEM_USR; thus boot2.bin starts at 0xc000.
  */
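
Working through the numbers in that comment, the addresses it mentions 
(0x8000, 0x9000, 0xc000) look like memory addresses, while sector 1024 is 
the disk location. A small sketch of the arithmetic, using only values from 
the comment plus an assumed 512-byte sector and an assumed 256 KiB vdev 
label size (the sketch_ names are made up):

```c
#include <assert.h>
#include <stdint.h>

enum {
    SKETCH_SECTOR_SIZE  = 512,        /* assumption: 512-byte sectors      */
    SKETCH_LABEL_SIZE   = 256 << 10,  /* assumption: 256 KiB per label     */
    SKETCH_BOOT2_SECTOR = 1024,       /* comment: boot2 is at sector 1024  */
    SKETCH_BOOT2_NSECT  = 64,         /* comment: 64 sectors are loaded    */
    SKETCH_LOAD_ADDR    = 0x8000,     /* comment: loaded at 0x8000 (RAM)   */
    SKETCH_MEM_USR      = 0xa000,     /* comment: usual BTX client base    */
    SKETCH_ENTRY        = 0x2000      /* comment: btxld entry point        */
};

/* Byte offset of boot2 on disk, relative to the start of the slice. */
static uint32_t sketch_boot2_disk_offset(void)
{
    return (uint32_t)SKETCH_BOOT2_SECTOR * SKETCH_SECTOR_SIZE;
}

/* First memory address past the 64 sectors loaded at 0x8000. */
static uint32_t sketch_boot2_load_end(void)
{
    return SKETCH_LOAD_ADDR +
        (uint32_t)SKETCH_BOOT2_NSECT * SKETCH_SECTOR_SIZE;
}

/* Memory address where boot2.bin ends up: MEM_USR plus the entry point. */
static uint32_t sketch_boot2_client_addr(void)
{
    return SKETCH_MEM_USR + SKETCH_ENTRY;
}
```

Under those assumptions the disk offset of boot2 is 512 KiB, i.e. exactly 
two labels in, the loaded image occupies memory 0x8000 to 0x10000, and the 
client address works out to 0xc000 as the comment says. None of these is 
the 0x4000 disk offset used by vdev_read.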

Any help would be appreciated in getting this working as having FBSD 
boot off zfs natively is a huge win.

Cheers,
     Benjamin
Received on Fri May 08 2009 - 01:50:57 UTC
