Re: head -r339076's boot loader fails to boot threadripper 1950X system (BTX halted); an earlier version works [ WITHOUT_ZFS= fixes it ]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sun, 21 Oct 2018 17:11:50 -0700
[Building and installing based on WITHOUT_ZFS= allows the
resulting loader to work correctly on the 1950X.]

On 2018-Oct-21, at 12:05 AM, Mark Millard <marklmi_at_yahoo.com> wrote:

> On 2018-Oct-20, at 10:32 PM, Warner Losh <imp at bsdimp.com> wrote:
> 
>> On Sat, Oct 20, 2018 at 11:04 PM Mark Millard <marklmi at yahoo.com> wrote:
>> [I found what change led to the 1950X boot crashing
>> with BTX halted.]
>> 
>>> On 2018-Oct-20, at 12:44 PM, Mark Millard <marklmi at yahoo.com> wrote:
>>> 
>>>> [Adding some vintage information for a loader
>>>> that allowed a native boot.]
>>>> 
>>>> On 2018-Oct-20, at 4:00 AM, Mark Millard <marklmi at yahoo.com> wrote:
>>>> 
>>>>> I attempted to jump from head -r334014 to -r339076
>>>>> on a threadripper 1950X board and the native
>>>>> FreeBSD boot failed very early. (Hyper-V use of
>>>>> the same media did not have this issue.)
>>>>> 
>>>>> But copying over an older /boot/loader from another
>>>>> storage device with a FreeBSD head version that has
>>>>> not been updated yet got past the problem being
>>>>> reported here. (For other reasons, the kernel has
>>>>> been moved back to -r338804 --and with that,
>>>>> and the older /boot/loader, the 1950X native-boots
>>>>> FreeBSD all the way just fine.)
>>>> 
>>>> I found one /boot/loader.old that was dated
>>>> in the updated file system as 2018-May-20,
>>>> instead of 2018-Apr-03 from the older file
>>>> system. May 20 would apparently mean a little
>>>> below -r334014 . It native-booted okay, as did
>>>> the April one.
>>>> 
>>>> [I do not know how to inspect a /boot/loader*
>>>> to find out what -r?????? it is from.]
>>>> 
>>>> Unfortunately, I had done more than one -r339076
>>>> install from -r334014 before rebooting, so no
>>>> -r334014 loaders were still present: the other
>>>> *.old files were from a few minutes before the
>>>> ones I had the boot problem with.
>>>> 
>>>> I might be able to extract loaders from various:
>>>> 
>>>> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/base.txz
>>>> 
>>>> materials and try substituting them in order to
>>>> narrow the range for works -> fails. If I can,
>>>> this likely would take a fair amount of time in
>>>> my context.
>>>> 
>>>> Other notes:
>>>> 
>>>> It turns out that only Hyper-V based use needed
>>>> a -r334804 kernel: Native booting with the older
>>>> loaders and newer kernels works fine.
>>>> 
>>>> Windows 10 Pro 64bit also has no problems
>>>> booting and operating the machine.
>>>> 
>>>> The native-boot problem does seem to be FreeBSD
>>>> loader-vintage specific.
>>>> 
>>>>> For the BTX failure the display ends up with
>>>>> (hand transcribed, ". . ." for an omission):
>>>>> 
>>>>> BTX loader 1.00 BTX version is 1.02
>>>>> Console: internal video/keyboard
>>>>> BIOS drive C: is disk0
>>>>> . . .
>>>>> BIOS drive P: is disk13
>>>>> -
>>>>> int=00000000  err=00000000  efl=00010246  eip=000096fd
>>>>> eax=74d48000  ebx=74d4e5e0  ecx=00000011  edx=00000000
>>>>> esi=74d4e380  edi=74d4e5b0  ebp=00091da0  esp=00091d60
>>>>> cs=002b  ds=0033  es=0033    fs=0033  gs=0033  ss=0033
>>>>> cs:eip=66 f7 77 04 0f b7 c0 89-44 24 0c 89 5c 24 04 8b
>>>>>     45 08 89 04 24 83 64 24-10 00 c7 44 24 08 01 00
>>>>> ss:esp=00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
>>>>>     00 00 00 00 00 00 00 00-f0 1d 89 00 00 00 00 00
>>>>> BTX halted
>>>> 
>>>> I've no clue what of that output might be loader vintage
>>>> specific. It might not be of use without knowing the
>>>> exact build of the loader.
>>>> 
>>>>> The board is a GIGABYTE X399 AORUS Gaming 7 (rev 1.0).
>>>>> It has 96 GiBytes of ECC RAM, just 6 DIMMs installed.
>>>> 
>>>> For reference for the board's BIOS:
>>>> 
>>>> Version: F11e
>>>> Dated: 2018-Sep-17
>>>> Description: Update AGESA 1.1.0.1a
>>> 
>>> Using:
>>> 
>>> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/base.txz
>>> 
>>> materials I found that:
>>> 
>>> -r336492: worked (loader vs. zfsloader: not linked)
>>> (no more amd64 builds until . . .)
>>> -r336538: failed (loader vs. zfsloader: linked)
>>> 
>>> (Later ones that I tried also failed.)
>>> 
>>> Looks like this broke for booting the 1950X 
>>> system in question when the following was
>>> checked in:
>>> 
>>> Author: imp
>>> Date: Fri Jul 20 05:17:37 2018
>>> New Revision: 336532
>>> URL: https://svnweb.freebsd.org/changeset/base/336532
>>> 
>>> Log:
>>>  Collapse zfsloader functionality back down into loader.
>>> 
>> Yea, this shouldn't matter. It worked on all the systems I tried it on.
>> 
>> So my first question: is this a ZFS system? Second, does it also have UFS? If yes to both, which one do you want it to boot off of?
> 
> No zfs in use at all. It has been years since
> I experimented with ZFS and reverted back to
> UFS.
> 
> # gpart show -l
> =>       40  937703008  da0  GPT  (447G)
>         40       1024    1  FBSDFSSDboot  (512K)
>       1064  746586112    2  FBSDFSSDroot  (356G)
>  746587176   31457280    3  FBSDFSSDswap  (15G)
>  778044456  159383552    4  FBSDFSSDswap2  (76G)
>  937428008     275040       - free -  (134M)
> . . .
> 
> Doing:
> 
> gpart bootcode -p /boot/gptboot -i 1 da0
> 
> and then trying a modern /boot/loader
> did not change anything: still "BTX halted"
> for a native boot. (No problem under Hyper-V.)
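
For reference, the loader substitution test quoted above can be
sketched roughly as below. The URL pattern is the one quoted
earlier. The snapshot_url name, the /tmp scratch paths, and the
./boot/loader member path inside base.txz are assumptions on my
part and have not been checked against every snapshot.

```shell
#!/bin/sh
# Sketch only: build the snapshot URL for a given head revision.
# The pattern comes from the artifact.ci.freebsd.org URL quoted above;
# snapshot_url is a hypothetical helper name.
snapshot_url() {
    # $1: svn revision number without the leading "r", e.g. 336492
    printf 'https://artifact.ci.freebsd.org/snapshot/head/r%s/amd64/amd64/base.txz\n' "$1"
}

# Example use (commented out: needs network access and root):
#   fetch -o /tmp/base.txz "$(snapshot_url 336492)"
#   mkdir -p /tmp/loader-test
#   tar -xf /tmp/base.txz -C /tmp/loader-test ./boot/loader  # member path assumed
#   cp /boot/loader /boot/loader.saved                      # keep a known-good copy
#   cp /tmp/loader-test/boot/loader /boot/loader
```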

I added WITHOUT_ZFS= to my equivalents of src.conf
files for targeting amd64, built, and installed.
The result native-boots just fine.
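
In case it helps anyone repeat this, a minimal sketch of adding
the knob follows. I use my own equivalents of src.conf, so the
/etc/src.conf default and the add_without_zfs name here are
assumptions; the rebuild and install steps are whatever one
normally uses.

```shell
#!/bin/sh
# Sketch only: append WITHOUT_ZFS= to a src.conf-style file once.
# add_without_zfs is a hypothetical helper; the default path is an assumption.
add_without_zfs() {
    conf="${1:-/etc/src.conf}"
    # Skip the append if a line already starts with the knob.
    if ! grep -q '^WITHOUT_ZFS=' "$conf" 2>/dev/null; then
        echo 'WITHOUT_ZFS=' >> "$conf"
    fi
}

# Example use, followed by the usual rebuild and install of world:
#   add_without_zfs /etc/src.conf
```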

The crash is somehow specific to loader code
tied to LOADER_ZFS_SUPPORT being defined.
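
Looking again at the register dump quoted above: int=00000000
is the x86 divide-error vector, and the first bytes at cs:eip
(66 f7 77 04) decode as a 16-bit div with a memory operand at
4(%edi). With edx=00000000 in the dump, that would suggest a
zero divisor rather than a quotient overflow, though I have not
traced which loader variable is involved. The small sketch
below just splits the ModRM byte; decode_modrm is a
hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: split an x86 ModRM byte (given as two hex digits)
# into its mod, reg, and rm fields. decode_modrm is a hypothetical name.
decode_modrm() {
    b=$((0x$1))
    mod=$(( (b >> 6) & 3 ))   # addressing mode: 1 means base register + disp8
    reg=$(( (b >> 3) & 7 ))   # for opcode f7 this selects the operation (6 = div)
    rm=$(( b & 7 ))           # base register number (7 = edi in 32-bit addressing)
    echo "mod=$mod reg=$reg rm=$rm"
}

decode_modrm 77    # the byte after "66 f7" in the dump; prints mod=1 reg=6 rm=7
```

The leading 66 is the operand-size prefix (16-bit here) and the
trailing 04 is the 8-bit displacement.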

Of course, this leaves me unable to native-boot an
official, modern, unmodified build on the 1950X
machine.

While I do not actively use ZFS these days, I'd
always left it built and installed in case I
decided to do something with it at some point.
I do not normally try to minimize configurations.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Sun Oct 21 2018 - 22:12:02 UTC
