Re: [UEFI] Boot issues on some UEFI implementations

From: Toomas Soome <tsoome_at_me.com>
Date: Wed, 25 Jul 2018 11:46:07 +0300
> On 25 Jul 2018, at 10:59, O. Hartmann <ohartmann_at_walstatt.org> wrote:
> 
> On Tue, 24 Jul 2018 08:53:36 +0300
> Toomas Soome <tsoome_at_me.com> wrote:
> 
> 
> Hello  Toomas Soome,
> 
> I CC Allan Jude since I discovered something  weird today regarding the UEFI
> boot capabilities of USB flash devices and SSDs. See below.
> 
>>> On 24 Jul 2018, at 08:16, O. Hartmann <ohartmann_at_walstatt.org> wrote:
>>> 
>>> On Mon, 23 Jul 2018 10:56:04 +0300
>>> Toomas Soome <tsoome_at_me.com> wrote:
>>> 
>>>>> On 23 Jul 2018, at 10:27, O. Hartmann <ohartmann_at_walstatt.org> wrote:
>>>>> 
>>>>> On Fri, 13 Jul 2018 18:44:23 +0300
>>>>> Toomas Soome <tsoome_at_me.com> wrote:
>>>>> 
>>>>>>> On 13 Jul 2018, at 17:44, O. Hartmann <o.hartmann_at_walstatt.org> wrote:
>>>>>>> 
>>>>>>> On Fri, 13 Jul 2018 14:26:51 +0300
>>>>>>> Toomas Soome <tsoome_at_me.com> wrote:
>>>>>>>>> On 13 Jul 2018, at 14:00, O. Hartmann <ohartmann_at_walstatt.org> wrote:
>>>>>>>>> 
>>>>>>>>> The problem is somewhat weird. I face UEFI boot problems on GPT
>>>>>>>>> drives where the first partition begins at block 40 of the hdd/ssd.
>>>>>>>>> 
>>>>>>>>> I have two hosts in private use based on
>>>>>>>>> outdated ASRock Z77-Pro4-M and Z77-Pro4 mainboards (IvyBridge, Socket
>>>>>>>>> LGA1155). Both boards are equipped with the latest officially available
>>>>>>>>> AMI firmware revision, dating to 2013. This is for the Z77-Pro4-M
>>>>>>>>> revision 2.0 (2013/7/23) and for the Z77 Pro4 revision 1.8
>>>>>>>>> (2013/7/17). For both boards a BETA revision for the Spectre/Meltdown
>>>>>>>>> mitigation is available, but I didn't test that. But please read on.
>>>>>>>>> 
>>>>>>>>> The third box on which I noticed this problem is a brand new Fujitsu Esprimo
>>>>>>>>> Q956, also AMI firmware, at V5.0.0.11 R 1.26.0 for 3413-A1x, date
>>>>>>>>> 05/25/2018 (or 20180525).
>>>>>>>>> 
>>>>>>>>> Installing on any kind of HDD or SSD manually or via bsdinstall the OS
>>>>>>>>> using UEFI-only boot method on a GPT partitioned device fails. The
>>>>>>>>> ASRock boards jump immediately into the firmware, the Fujitsu offers
>>>>>>>>> some kind of CPU/Memory/HDD test facility.
>>>>>>>>> 
>>>>>>>>> If on both types of vendor boards CSM is disabled and UEFI-only boot is
>>>>>>>>> selected, the MBR partitioned FreeBSD installation USB flash device
>>>>>>>>> does boot in UEFI! I guess I can assume this when the well known
>>>>>>>>> clumsy 80x25 char console suddenly gets bright and shiny with a much
>>>>>>>>> higher resolution, as long as the GPU supports EFI GOP. Looking with gpart
>>>>>>>>> at the USB flash drives reveals that the EFI partition starts at
>>>>>>>>> block 1 of the device and the device has an MBR layout. I haven't
>>>>>>>>> found a way to force the GPT scheme, when initialised via gpart, to
>>>>>>>>> let the partitions start at block 1. This might be naive thinking,
>>>>>>>>> so please be patient with me.
>>>>>>>>> 
>>>>>>>>> I do not know whether this is a well-known issue. On the ASRock
>>>>>>>>> boards, I tried years ago some Linux (Red Hat and SUSE) with UEFI and
>>>>>>>>> that worked - FreeBSD did not. I gave up on it at that time. Now, having
>>>>>>>>> the very same issues with a new Fujitsu system, leaves me with the
>>>>>>>>> impression that FreeBSD's UEFI implementation might have problems I'm
>>>>>>>>> not aware of.
>>>>>>>>> 
>>>>>>>>> Can someone shed some light onto this? 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> The first thing to check is if the secure boot is disabled. We do not
>>>>>>>> support secure boot at all at this time.      
>>>>>>> 
>>>>>>> Secure boot is in every scenario disabled!
>>>>>>> 
>>>>>>>> 
>>>>>>>> To find out whether the EFI or the BIOS loader is running, you can
>>>>>>>> check the console variable (it can be efi, vidconsole or comconsole)
>>>>>>>> or, better yet, see whether efi-version is set (show efi-version). If
>>>>>>>> efi-version is not set, the BIOS loader is running. Another indirect
>>>>>>>> way is lsdev -v: if device paths are present, it is UEFI :)
>>>>>>> 
>>>>>>> What are you talking about?
>>>>>>> What is the point of entry - running system, loader?
>>>>>>> 
>>>>>>> sysctl machdep.bootmethod: BIOS
>>>>>>> 
>>>>>>> This makes me quite sure that the system has booted via BIOS - as I'm
>>>>>>> sure since I've checked that many times. UEFI doesn't work on those
>>>>>>> systems with FreeBSD. I'm not sure anymore, but I tried also Windows 7
>>>>>>> on those mainboards booting via UEFI - and I might recall that they
>>>>>>> failed also. I also recall that there were issues with earlier UEFI
>>>>>>> versions regarding booting only Windows 8/8.1 - and nothing else, but
>>>>>>> the fact that Linux worked confuses me a bit.
>>>>>>> 
>>>>>>> If this ASRock crap (never ever again this brand!) doesn't work at all -
>>>>>>> who cares, I intend to purchase new server grade hardware. But the more
>>>>>>> puzzling issue is with the Fujitsu, which I consider serious and from
>>>>>>> the behaviour the Fujitsu failure looks exactly like the ASRock -
>>>>>>> Windows 7 works, RedHat 7.5 works (I assume I can trust the Firmware
>>>>>>> settings when I disable CSM support, that the Firmware will only start an
>>>>>>> EFI/UEFI capable loader? Or is there a hidden override somewhere to be
>>>>>>> expected?). Also on ASRock disabling CSM should ensure not booting a
>>>>>>> dual-bootstrap-capable system. This said, on the recent Fujitsu, it
>>>>>>> seems to boil down to a FreeBSD UEFI-firmware interaction problem,
>>>>>>> while the ASRock is still under suspicion to be broken by design.     
>>>>>>>> 
>>>>>>>> GPT partitions can never start from disk absolute sector 1; this is
>>>>>>>> because at sector 0 there is MBR (for compatibility), sector 1 is the GPT
>>>>>>>> header and then sectors 2-33 have GPT partition table entries, so the
>>>>>>>> first possible data sector is 34 (absolute 34). That's assuming 512B
>>>>>>>> sectors. For details see UEFI 2.7 Chapter 5.3.1 page 131.      
>>>>>>> 
>>>>>>> Thanks for the explanation. That implies the installer did right, gpart
>>>>>>> did also right and therefore there must be an issue with the stuff
>>>>>>> located within the EFI partition?     
>>>>>> 
>>>>>> Ok, so, it is not about UEFI bootcode but BIOS, and if we reach BIOS
>>>>>> loader at all or not - that is, if the BIOS bootstrap is actually caring
>>>>>> to read the MBR code and start it, since once the MBR code is started,
>>>>>> it is all about our code.    
>>>>> 
>>>>> I'm getting confused a bit here. Do you mean by "BIOS" the CSM? or do you
>>>>> mean that specific portion of the UEFI firmware, which looks for the
>>>>> proper UEFI partition?
>>>>> 
>>>> 
>>>> BIOS as either native or CSM. Note that from boot code point of view the
>>>> CSM boot *is* BIOS boot, we have no access to UEFI features.
>>>> 
>>>>> 
>>>>> The boxes in question, most notably the more recent Fujitsu Esprimo Q956,
>>>>> refuse booting UEFI, even if properly setup (in terms of what FreeBSD
>>>>> provides on recent CURRENT) is applied and CSM is switched off in the
>>>>> firmware. Again: GPT partition scheme.
>>>>> 
>>>>> The system boots properly if a second partition of type "freebsd-boot" is
>>>>> applied and bootcode is properly applied via "gpart bootcode -b /boot/pmbr
>>>>> -p /boot/gptboot -i 2 ada0" (ada0 is the device).     
>>>>>> 
>>>>>> btw, you can try to validate the installed boot blocks by using recent
>>>>>> enough loader (usb or iso) and then you can use from OK prompt:    
>>>>> 
>>>>> lsdev provides me with the following information (CSM enabled):
>>>>> 
>>>>> OK lsdev
>>>>> disk devices:
>>>>> 	disk0:		BIOS DRIVE C ...
>>>>> 
>>>>> 		disk0p1:	EFI
>>>>> 		disk0p2:	FreeBSD BOOT
>>>>> 		disk0p3:	FreeBSD SWAP
>>>>> 		disk0p4:	FreeBSD ZFS
>>>>> zfs devices:
>>>>> 	zfs:zroot
>>>>> 
>>>>> OK chain disk0
>>>>> open failed     (likewise for disk0p{1-4})
>>>>> 
>>>>> OK chain zroot
>>>>> failed to read disk (just for completeness)    
>>>> 
>>>> 
>>>> chain command does use only device name (such as disk0: or disk0p2: ), but
>>>> not zfs pool as device.  I just found I haven’t ported the code to read the
>>>> file.  
>>> 
>>> ??
>>> 
>>>> 
>>>> the point for chain command test is to see if the normal read and execute
>>>> would work, so in your case please try:
>>>> 
>>>> chain disk0:  
>>> 
>>> As stated above, I did so, and the result is also mentioned above, I always
>>> get "open failed".
>>> This is the same for 
>>> 
>>> chain disk0
>>> chain disk0p1
>>> chain disk0p2
>>> chain disk0p3
>>> chain disk0p4
>>> 
>>> as already said. CSM is enabled in this case.  
>> 
>> sigh… chain command does take device as argument, device must always end with
>> colon…. in this case, the devil is in details:) as I wrote above, the command
>> should be:
>> 
>> chain disk0:
>> 
>> The disk0p1: etc will only work when partition boot code was installed (which
>> you most likely do not have - the only possible candidate could be FreeBSD
>> ZFS partition).
> 
> The command "chain disk0:" works as expected (CSM enabled, GPT partition
> scheme, but with PMBR bootblock installed and freebsd-boot partition containing
> gptzfsboot installed).
> 
> 
>> 
>>> 
>>>> 
>>>> to read pmbr (512B) and execute it. The expected outcome would be that pmbr
>>>> boot code would browse the GPT, read stage1 from disk0p2: and execute it;
>>>> stage1 would detect FreeBSD ZFS from disk0p4: and load and
>>>> execute /boot/loader. If that will happen, it means the boot code in our
>>>> stages is just fine, but the bios (CSM) does not load pmbr….  if that's
>>>> true, it would mean that you either need to use UEFI boot or need to have
>>>> some hack to fool the BIOS or just not use GPT on that machine with CSM.  
>>> 
>>> To make it clear here: The only way to boot this box is using CSM (as it is
>>> the same with the ASRock boards mentioned earlier). But my intention is to
>>> disable CSM and use a GPT/UEFI environment only! And GPT/UEFI doesn't work
>>> with FreeBSD, neither with 12-CURRENT, nor 11.2-RELENG.
>>> 
>>> It would be nice if this could be fixed. I'm more interested in the fix on
>>> the recent Fujitsu device than the outdated ASRock crap, but if the fix for
>>> the Fujitsu Firmware could fix older issues as a byproduct, I'd appreciate
>>> that.
>>> 
>>> Kind regards,
>>> 
>> 
>> ok, somehow I have lost that part of the discussion. Well, you wrote that the
>> UEFI boot fails when the first partition starts from sector 40 - does that mean
>> it boots when the partition starts from some other sector? I think
>> there is something else going on.
> 
> Well, I simply try to describe what I "see" to make things unambiguous. I'm
> not familiar with the deeper insights of disk layouts on a binary level. So,
> you explained to me the reason why the ESP (EFI partition) starts at block 40. I
> compared that to the FreeBSD USB flash image FreeBSD provides, but this is
> another story since the image uses MBR scheme as I figured out.
> 
> 
>> 
>> What you can do is to see if that firmware will offer you EFI shell option,
>> from there you can try to start the bootx64.efi manually and see what error
>> you will get. However, the number 1 cause for failing to start the bootloader
>> in UEFI is secure boot - we do not support it and secure boot must be
>> switched off. 
>> 
>> However, they seem to claim "The Secure Boot option is available in the
>> UEFI/BIOS of most if not all ASRock boards. It is disabled by default."
>> 
>> I still suggest double checking whether that's really the case. Also, if the
>> bootx64.efi start will fail and no messages are appearing on screen, then
>> either there is something in firmware logs or you could get them from trying
>> to start bootx64.efi from UEFI shell.
> 
> Since I have had this problem since 2014 and try from time to time, be assured
> that I tried every possible permutation of all reasonable options, even the
> nonsensical ones, to get rid of that problem.
> 
> I never had any problems with any other UEFI capable server/workstation
> firmware booting FreeBSD in UEFI-native mode (GPT partition scheme, CSM
> disabled) - until now, when I ran into this Fujitsu ESPRIMO Q956 with
> the most recent firmware (as of last week, week 29 of 2018) having the very same
> problems.
> 
> 
> 
> I figured out something strange on the Fujitsu - and that is the same with the
> ASRock boards.
> 
> We/I prepare some USB flash drives to boot a NanoBSD for a very small
> appliance, but nevertheless, the USB flash device is booted on Fujitsu servers
> with UEFI-only configurations. I assume at this point that disabling CSM on the
> most recent Fujitsu firmwares on reasonably "new" hardware (not older than
> three years) will disable any(!) legacy BIOS capabilities. The same is assumed
> for the Fujitsu ESPRIMO Q956. I cannot speak for the ASRock Z77 Pro4/M boards
> mentioned above/earlier, they are from 2012/2013 and "quite old".
> 
> The NanoBSD image of ours doesn't have a "freebsd-boot" partition. The
> partition scheme of the flash device is GPT. The layout looks like this:
> 
> gpart show -l da4
> =>      40  15425456  da4  GPT  (7.4G)
>        40      2000    1  efiboot0  (1.0M)
>      2040   1453584    3  disk1a  (710M)
>   1455624      4096    5  disk3  (2.0M)
>   1459720  13965776       - free -  (6.7G)
> 
> I created the flash with md, gpart and dd straightforward, efiboot0 is the ESP
> partition and its format/content is created via dd if=/boot/boot1.efifat
> of=/dev/da4p1 - I presume this is very simple.
> 
> This USB flash device boots(!) successfully (UEFI!) on both the ASRock boards
> and the Esprimo Q956!
> 
> But any SSD prepared the same way doesn't. Why? 
> 
> On the ASRock, I recall having fiddled around for a while with an HDD
> containing Windows 7/SP1 and FreeBSD. Windows 7 booted, FreeBSD - I can't
> remember.
> 
> Lacking proper hardware, I'm unable to check whether USB-attached HDD or
> SSD will boot or HDD will boot (just in case the local SATA has problems
> booting UEFI and USB not).
> 
> Kind regards,
> 
> Oliver 
> 

Um, well. I think the suggestion to try FAT32 is still a good one. This is because it is known that some vendors do not support booting FAT12/FAT16 from HDD (the likely reason is that the UEFI specification does not say which FAT must be supported, and only hints at FAT12/FAT16 in the context of removable devices).
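
To see which FAT variant an ESP (or an image such as boot1.efifat) actually carries, one can read the BIOS parameter block from its first sector. What follows is only a minimal Python sketch, not anything from the FreeBSD tree; the function name fat_type is made up for illustration. It relies on the rule from Microsoft's FAT specification that the variant is decided purely by the cluster count: below 4085 is FAT12, below 65525 is FAT16, anything else is FAT32.

```python
import struct


def fat_type(boot_sector: bytes) -> str:
    """Classify a FAT volume as FAT12/FAT16/FAT32 from its first sector."""
    if len(boot_sector) < 512:
        raise ValueError("need at least the first 512 bytes of the volume")

    # BIOS parameter block fields, little-endian, at byte offsets 11..23.
    (bytes_per_sec, sec_per_clus, rsvd_sec_cnt, num_fats,
     root_ent_cnt, tot_sec16, _media, fat_sz16) = struct.unpack_from(
        "<HBHBHHBH", boot_sector, 11)
    # 32-bit total sector count (offset 32) and FAT32-only FAT size (offset 36).
    tot_sec32, fat_sz32 = struct.unpack_from("<II", boot_sector, 32)

    if bytes_per_sec == 0 or sec_per_clus == 0:
        raise ValueError("not a FAT boot sector")

    # The 16-bit fields are zero when the value lives in the 32-bit field.
    fat_sz = fat_sz16 if fat_sz16 != 0 else fat_sz32
    tot_sec = tot_sec16 if tot_sec16 != 0 else tot_sec32

    # Fixed root directory (FAT12/FAT16 only; root_ent_cnt is 0 on FAT32).
    root_dir_sectors = (root_ent_cnt * 32 + bytes_per_sec - 1) // bytes_per_sec
    data_sectors = tot_sec - (rsvd_sec_cnt + num_fats * fat_sz + root_dir_sectors)
    clusters = data_sectors // sec_per_clus

    if clusters < 4085:
        return "FAT12"
    if clusters < 65525:
        return "FAT16"
    return "FAT32"
```

It could be fed the first 512 bytes of the ESP, for example read from /dev/da4p1 as root or from a copy of the partition made with dd.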

There are other possible causes too, for example: https://ubuntuforums.org/showthread.php?t=2147295 

Also about the ESP sizes: https://www.ctrl.blog/entry/esp-size-guide

"The UEFI System Partition should be at least 260 MiB (273 MB) to ensure its properly formatted with FAT32 so that you avoid UEFI implementation compatibility issues. (If you do have incompatible hardware that requires FAT16-formatting, then I suggest you move aside the files on the UEFI System Partition, convert the partition to FAT16, and copy the files back over to it.)"

So, as you see, even just telling “use FAT32” is not universal medicine, but I suspect it is still more universal than using FAT12/FAT16:)
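
The size figure in that quote can be checked with a little arithmetic. This is a rough Python sketch, assuming the FAT specification's minimum of 65525 clusters for a volume to count as FAT32; the names are illustrative only. It ignores the reserved sectors and the FATs themselves, so real minimums are slightly larger.

```python
MIB = 1024 * 1024
MIN_FAT32_CLUSTERS = 65525  # fewer clusters than this means FAT16 or FAT12


def min_fat32_data_bytes(cluster_size: int) -> int:
    """Smallest data area, in bytes, that still yields a FAT32 cluster count."""
    return MIN_FAT32_CLUSTERS * cluster_size


# Compare the smallest common cluster size with 4 KiB clusters.
for cluster_size in (512, 4096):
    size = min_fat32_data_bytes(cluster_size)
    print(f"{cluster_size:5d}-byte clusters: {size} bytes = {size / MIB:.2f} MiB")

# The quoted "260 MiB (273 MB)": MiB are powers of two, MB powers of ten.
print(f"260 MiB = {260 * MIB / 1_000_000:.1f} MB")
```

With 4096-byte clusters the data area alone comes to just under 256 MiB, which is presumably where the 260 MiB recommendation comes from. By the same arithmetic, a 1.0M ESP of 2000 sectors cannot hold a FAT32 file system at all.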

Just to be clear, there is *no* standard size rule for ESP, there are only suggestions from vendors… 

Yes, this all means that if the default installer's setup does not work, manual work is needed to identify why, and the findings should be reported so that the installer (and possibly other parts of the system) can be adjusted. Since all of this is vendor specific, it has to be handled case by case.

rgds,
toomas
Received on Wed Jul 25 2018 - 06:46:20 UTC
