Re: Booting UEFI ZFS is broken on arm64

From: Warner Losh <imp_at_bsdimp.com>
Date: Fri, 1 Dec 2017 14:57:35 -0700
On Fri, Dec 1, 2017 at 2:55 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
wrote:

> On Fri, Dec 01, 2017 at 02:53:53PM -0700, Warner Losh wrote:
> > On Fri, Dec 1, 2017 at 2:49 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
> > wrote:
> >
> > > On Wed, Nov 29, 2017 at 07:31:17PM -0700, Warner Losh wrote:
> > > > On Wed, Nov 29, 2017 at 5:54 PM, Warner Losh <imp_at_bsdimp.com> wrote:
> > > >
> > > > >
> > > > >
> > > > > > On Wed, Nov 29, 2017 at 5:43 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
> > > > > > wrote:
> > > > >
> > > > >> On Wed, Nov 29, 2017 at 05:42:52PM -0700, Warner Losh wrote:
> > > > > >> > On Wed, Nov 29, 2017 at 5:34 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
> > > > > >> > wrote:
> > > > >> >
> > > > >> > > On Wed, Nov 29, 2017 at 05:33:46PM -0700, Warner Losh wrote:
> > > > > >> > > > On Wed, Nov 29, 2017 at 5:21 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
> > > > > >> > > > wrote:
> > > > >> > > >
> > > > > >> > > > > It appears that in the latest FreeBSD 12-CURRENT/arm64 snapshot,
> > > > > >> > > > > booting UEFI GPT ZFS on my OverDrive 1000 is broken. It boots up to
> > > > > >> > > > > this line:
> > > > >> > > > >
> > > > >> > > > > Using DTB provided by EFI at 0x801fe00000.
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > Which snapshot is that? Boot1 was broken until recently.
> > > > >> > >
> > > > > >> > > FreeBSD-12.0-CURRENT-arm64-aarch64-20171121-r326056-memstick.img
> > > > >> > >
> > > > > >> > > It also happens on latest HEAD, so it would appear to still be broken.
> > > > >> >
> > > > >> >
> > > > > >> > Is this boot1.efi producing the output, or loader.efi? I'm guessing the
> > > > > >> > latter, but wanted to make sure. If so, then we're past the point where
> > > > > >> > boot1.efi would have failed (besides, it was fixed before that snapshot).
> > > > >>
> > > > >> With DEBUG turned on for stand/fdt:
> > > > >>
> > > > >> Booting [/boot/kernel/kernel]...
> > > > >> fdt_copy(): fdt_copy va 0x01208000
> > > > >> fdt_setup_fdtp(): fdt_setup_fdtp()
> > > > >> fdt_load_dtb_addr(): fdt_load_dtb_addr(0x801fe00000)
> > > > >> Using DTB provided by EFI at 0x801fe00000.
> > > > >> Loaded the platform dtb: 0x81f56f1630.
> > > > >> fdt_fixup(): fdt_fixup()
> > > > >>
> > > > >> ^ hangs after that message
> > > > >
> > > > >
> > > > > > That doesn't sound like anything I've changed, but it could well be... I
> > > > > > think to find this breakage, you may need to bisect backwards along stand /
> > > > > > sys/boot until we find the spot where it broke.
> > > > >
> > > >
> > > > > There have been several conversations on IRC about how others are hitting a
> > > > > scheduler bug, at least on x86. hps' fix seems to do the trick for their
> > > > > issues.
> > > >
> > > > Author: hselasky <hselasky_at_ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f>
> > > > Date:   Wed Nov 29 23:28:40 2017 +0000
> > > >
> > > > >     The sched_add() function is not only used when the thread is initially
> > > > >     started, but also by the turnstiles to mark a thread as runnable for
> > > > >     all locks, for instance sleepqueues do:
> > > > >     setrunnable()->sched_wakeup()->sched_add()
> > > > >
> > > > >     In r326218 code was added to allow booting from non-zero CPU numbers
> > > > >     by setting the ts_cpu field inside the ULE scheduler's sched_add()
> > > > >     function. This had an undesired side-effect that prior sched_pin() and
> > > > >     sched_bind() calls got disregarded. This patch fixes the
> > > > >     initialization of the ts_cpu field for the ULE scheduler to only
> > > > >     happen once when the initial thread is constructed during system
> > > > >     init. Forking will then later on ensure that a valid ts_cpu value gets
> > > > >     copied to all children.
> > > >
> > > >     Reviewed by:    jhb, kib
> > > >     Discussed with: nwhitehorn
> > > >     MFC after:      1 month
> > > >     Differential revision:  https://reviews.freebsd.org/D13298
> > > >     Sponsored by:   Mellanox Technologies
> > > >
> > > >
> > > > >     git-svn-id: svn+ssh://svn.freebsd.org/base/head_at_326376 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
> > > >
> > > > > is the fix.... But the bug it fixes post-dates the snapshot, so maybe this
> > > > > isn't the same thing...
> > >
> > > Definitely is not the same thing. I've so far got it printf'd to where
> > > the uefi loader jumps into the kernel's entry point. So the loader
> > > itself might be fine. Something in the kernel, then, is going funky.
> > >
> > > Booting in verbose mode does not provide any additional input.
> > >
> > > Here's the output I get (some of the output is from printf's I've
> > > done):
> > >
> > > FreeBSD/arm64 EFI loader, Revision 1.1
> > > (Wed Nov 29 21:51:14 EST 2017 shawn_at_hbsd-dev-laptop)
> > > EFI boot environment
> > > Loading /boot/defaults/loader.conf
> > > /boot/kernel/kernel text=0x7e0a78 data=0xaad80+0x443f62
> > > syms=[0x8+0x10ec78+0x8+0x1021d4]
> > > /boot/entropy size=0x1000
> > > /boot/kernel/zfs.ko text=0x99070 text=0x130390 data=0x21ff8+0x9ef98
> > > syms=[0x8+0x22c68+0x8+0x1b99b]
> > > /boot/kernel/opensolaris.ko text=0x1330 text=0xd00 data=0x10160+0x125d0
> > > syms=[0x8+0xff0+0x8+0x8d8]
> > >
> > > Hit [Enter] to boot immediately, or any other key for command prompt.
> > > Booting [/boot/kernel/kernel]...
> > > Using DTB provided by EFI at 0x801fe00000.
> > > fdt_copy returned. dtb_size is 9060.
> > > bi_load finished. err: 0
> > > dev_cleanup finished
> > > About to call into the entry point at 0x81ee601000
> > >
> >
> > You might try booting the same kernel off a small UFS partition. There's a
> > tiny chance that the loader didn't load it right, but more likely the
> > kernel is borked. Maybe DTB issues? Maybe something else... A quick test
> > like that would remove ZFS from the equation, even if it's just a USB
> > stick...
>
> UFS works fine and dandy. It's ZFS that's b0rked.


OK. Let me know what you find...  I assume the entry point matches with
what you've loaded?
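
As a rough illustration of that check (a sketch only, not something taken
from the loader sources): the snippet below parses the text= and data= sizes
from the loader line quoted above and tests whether the printed entry point
falls inside the text segment. The load base is NOT in the output above, so
ASSUMED_BASE is a purely hypothetical value, and the helper names
parse_sizes() and entry_in_text() are made up for this example.

```python
import re


def parse_sizes(line):
    """Parse 'text=0x..' and 'data=0x..+0x..' sizes from one loader output line."""
    sizes = {}
    for key, value in re.findall(r"(text|data)=([0-9a-fx+]+)", line):
        # A field may hold several sizes joined by '+', e.g. data=0xaad80+0x443f62
        sizes.setdefault(key, []).extend(int(v, 16) for v in value.split("+"))
    return sizes


def entry_in_text(entry, load_base, text_size):
    """True if entry lies in the half-open range [load_base, load_base + text_size)."""
    return load_base <= entry < load_base + text_size


# Loader line and entry point as quoted earlier in this thread.
line = "/boot/kernel/kernel text=0x7e0a78 data=0xaad80+0x443f62"
ENTRY = 0x81EE601000

# ASSUMPTION: the real load base is not shown in the output; this is a guess.
ASSUMED_BASE = 0x81EE600000

text_size = parse_sizes(line)["text"][0]
print(entry_in_text(ENTRY, ASSUMED_BASE, text_size))
```

To use it for real, swap in the base address the loader actually reports for
the kernel. If the check fails with the real base, that would point back at
the loader rather than the kernel.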

Warner
Received on Fri Dec 01 2017 - 20:57:36 UTC
