Re: Booting UEFI ZFS is broken on arm64

From: Warner Losh <imp_at_bsdimp.com>
Date: Wed, 29 Nov 2017 19:31:17 -0700
On Wed, Nov 29, 2017 at 5:54 PM, Warner Losh <imp_at_bsdimp.com> wrote:

>
>
> On Wed, Nov 29, 2017 at 5:43 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
> wrote:
>
>> On Wed, Nov 29, 2017 at 05:42:52PM -0700, Warner Losh wrote:
>> > On Wed, Nov 29, 2017 at 5:34 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
>> > wrote:
>> >
>> > > On Wed, Nov 29, 2017 at 05:33:46PM -0700, Warner Losh wrote:
>> > > > On Wed, Nov 29, 2017 at 5:21 PM, Shawn Webb <shawn.webb_at_hardenedbsd.org>
>> > > > wrote:
>> > > >
>> > > > > It appears that in the latest FreeBSD 12-CURRENT/arm64 snapshot,
>> > > > > booting UEFI GPT ZFS on my OverDrive 1000 is broken. It boots up to
>> > > > > this line:
>> > > > >
>> > > > > Using DTB provided by EFI at 0x801fe00000.
>> > > >
>> > > >
>> > > > Which snapshot is that? Boot1 was broken until recently.
>> > >
>> > > FreeBSD-12.0-CURRENT-arm64-aarch64-20171121-r326056-memstick.img
>> > >
>> > > It also happens on latest HEAD, so it would appear to still be broken.
>> >
>> >
>> > Is this boot1.efi producing the output, or loader.efi? I'm guessing the
>> > latter, but wanted to make sure. If so, then we're past the point where
>> > boot1.efi would have failed (besides, it was fixed before that snapshot).
>>
>> With DEBUG turned on for stand/fdt:
>>
>> Booting [/boot/kernel/kernel]...
>> fdt_copy(): fdt_copy va 0x01208000
>> fdt_setup_fdtp(): fdt_setup_fdtp()
>> fdt_load_dtb_addr(): fdt_load_dtb_addr(0x801fe00000)
>> Using DTB provided by EFI at 0x801fe00000.
>> Loaded the platform dtb: 0x81f56f1630.
>> fdt_fixup(): fdt_fixup()
>>
>> ^ hangs after that message
>
>
> That doesn't sound like anything I've changed, but it could well be... I
> think to find this breakage, you may need to bisect backwards along stand /
> sys/boot until we find the spot where it broke.
>

There have been several conversations on IRC about others hitting a
scheduler bug, at least on x86. hps' fix seems to do the trick for their
issues.

Author: hselasky <hselasky_at_ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f>
Date:   Wed Nov 29 23:28:40 2017 +0000

    The sched_add() function is not only used when the thread is initially
    started, but also by the turnstiles to mark a thread as runnable for
    all locks, for instance sleepqueues do:
    setrunnable()->sched_wakeup()->sched_add()

    In r326218 code was added to allow booting from non-zero CPU numbers
    by setting the ts_cpu field inside the ULE scheduler's sched_add()
    function. This had an undesired side-effect that prior sched_pin() and
    sched_bind() calls got disregarded. This patch fixes the
    initialization of the ts_cpu field for the ULE scheduler to only
    happen once when the initial thread is constructed during system
    init. Forking will then later on ensure that a valid ts_cpu value gets
    copied to all children.

    Reviewed by:    jhb, kib
    Discussed with: nwhitehorn
    MFC after:      1 month
    Differential revision:  https://reviews.freebsd.org/D13298
    Sponsored by:   Mellanox Technologies


    git-svn-id: svn+ssh://svn.freebsd.org/base/head_at_326376 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f

is the fix. But the bug it fixes was introduced in r326218, which postdates
the r326056 snapshot, so this may not be the same thing.
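
For anyone trying to picture the side effect that commit message describes,
here is a rough toy model in C. It is only a sketch, not the kernel code:
struct toy_thread and every toy_* function are hypothetical names made up
for illustration, and the assumption that the problem path overwrote ts_cpu
with the current CPU on each call is a simplification of what the log says.

```c
#include <stdbool.h>

/* Toy stand-in for per-thread scheduler state; NOT the kernel's structure. */
struct toy_thread {
    int  ts_cpu;  /* CPU the thread is assigned to run on */
    bool bound;   /* set once a bind/pin-style request has been made */
};

/* Models one-time setup when the initial thread is constructed. */
static void toy_thread_init(struct toy_thread *td, int bootcpu)
{
    td->ts_cpu = bootcpu;
    td->bound = false;
}

/* Models a sched_bind()-style request: choose a CPU and remember it. */
static void toy_bind(struct toy_thread *td, int cpu)
{
    td->ts_cpu = cpu;
    td->bound = true;
}

/*
 * Models the problem behaviour (assumed): every add, including wakeups,
 * rewrites ts_cpu, so an earlier bind is silently lost.
 */
static int toy_sched_add_resetting(struct toy_thread *td, int curcpu)
{
    td->ts_cpu = curcpu;
    return td->ts_cpu;
}

/*
 * Models the fixed behaviour: ts_cpu was set once at init, so the add
 * path leaves it alone and any earlier bind survives.
 */
static int toy_sched_add_preserving(struct toy_thread *td, int curcpu)
{
    (void)curcpu;  /* deliberately unused in the fixed path */
    return td->ts_cpu;
}
```

In this model, a thread bound to CPU 1 and then woken from CPU 3 stays on
CPU 1 with the preserving variant but ends up on CPU 3 with the resetting
one, which is the "prior sched_pin() and sched_bind() calls got disregarded"
effect in miniature.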

Warner
Received on Thu Nov 30 2017 - 01:31:19 UTC
