Re: head -r538966 on OrangePi+ 2ed: boot loader crashes when USB drive is present at power-on/boot: its a misaligned access by code from -r354746

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 19 Mar 2020 01:20:02 -0700
[I built with -DDEBUG -DDISK_DEBUG -DPART_DEBUG and show
with some extra output as well. It shows that a misaligned
access causes the crash. The access in question is from
head -r354746 code additions.]

On 2020-Mar-18, at 13:36, Mark Millard <marklmi at yahoo.com> wrote:

> Without a USB drive present at power-on or
> boot, the OPi+2e boots fine. (The USB drives
> involved have a partition holding a ufs file
> system and a partition holding a swap/aging
> space.)
> 
> In all cases below, /boot/ is from the microsd
> card. But the intended configuration is for
> vfs.root.mountfrom to be used to direct the
> stages after kernel-booting to the USB drive.
> 
> The output sequence related to the crash when
> the USB drive is present looks like:
> 
> QUOTE
> . . .
> END QUOTE

I'm replacing the original quote with
better information.

First I quote the definition of dos_partition
for reference:

struct dos_partition {
        unsigned char   dp_flag;        /* bootstrap flags */
        unsigned char   dp_shd;         /* starting head */
        unsigned char   dp_ssect;       /* starting sector */
        unsigned char   dp_scyl;        /* starting cylinder */
        unsigned char   dp_typ;         /* partition type */
        unsigned char   dp_ehd;         /* end head */
        unsigned char   dp_esect;       /* end sector */
        unsigned char   dp_ecyl;        /* end cylinder */
        uint32_t        dp_start;       /* absolute starting sector number */
        uint32_t        dp_size;        /* partition size in sectors */
};

Note that access to dp_start or dp_size
requires address%4==0 alignment but the
other fields do not. This is important for
&dp[i].dp_start reported in the below (and
then dp[i].dp_start is accessed in the code).
I also had it report dp, which shows that
dp has dp%4!=0 as well.



Here is with -DDEBUG and -DDISK_DEBUG -DPART_DEBUG
in place for the code (with some extra debug output
added):

FreeBSD/armv7 U-Boot loader, Revision 1.3

signature:
  version       = 1
  checksum      = 0x98de198b
  sc entry      = 0xbdf8cb01

addresses info:
 _etext (sdata) = 0x4204fdb4
 _edata         = 0x4205ebb8
 __sbss_start   = 0x4205ec28
 __sbss_end     = 0x4205ec28
 __sbss_start   = 0x4205ec28
 _end           = 0x42062aa0
 syscall entry  = 0xbdf8cb01
DRAM: 2048MB
Number of U-Boot devices: 2
U-Boot env: loaderdev not set, will probe all devices.
stor_init(): storage devices found: 2
Found U-Boot device: disk
  Probing all <unknown> devices...
  Checking unit=0 slice=<auto> partition=<auto>...disk_open: disk0s0: unit 0, slice 0, partition -2 => 0x420631c0
stor_readdev(): reading blk=0 size=1 _at_ 0x420633c0
stor_readdev(): reading blk=64 size=1 _at_ 0x420636c0
stor_readdev(): reading blk=1 size=1 _at_ 0x420636c0
ptable_open: BEFORE NDOSPART loop #1
ptable_open: dp=0x4206357e
ptable_open: IN NDOSPART loop #1 after 1st if
ptable_open: IN NDOSPART loop #1: &dp[i].dp_start=0x42063586
data abort
pc : [<42009350>]          lr : [<42009348>]
reloc pc : [<ce07d350>]    lr : [<ce07d348>]
sp : b9f649e8  ip : 4205e200     fp : b9f64a18
r10: 42063640  r9 : b9f64a50     r8 : 420633c0
r7 : 420636c0  r6 : 4200818c     r5 : 4205cd95  r4 : 42063586
r3 : bdf8cdf7  r2 : 0000000a     r1 : 01c28000  r0 : 000000ee
Flags: nZCv  IRQs off  FIQs off  Mode SVC_32
Code: e08f0000 eb0114af e5d801c2 e35000ee (05940000) 
Resetting CPU ...

resetting ...


My extra messages are from:

# svnlite diff /usr/src/stand/common/part.c
Index: /usr/src/stand/common/part.c
===================================================================
--- /usr/src/stand/common/part.c	(revision 358966)
+++ /usr/src/stand/common/part.c	(working copy)
_at__at_ -715,18 +715,26 _at__at_
 	 * start sector 1. After DOSPTYP_PMBR, there may be other partitions.
 	 * UEFI compliant PMBR has no other partitions.
 	 */
+	DPRINTF("BEFORE NDOSPART loop #1");
+	DPRINTF("dp=%p", (void*)dp);
 	for (i = 0; i < NDOSPART; i++) {
 		if (dp[i].dp_flag != 0 && dp[i].dp_flag != 0x80) {
 			DPRINTF("invalid partition flag %x", dp[i].dp_flag);
 			goto out;
 		}
+		DPRINTF("IN NDOSPART loop #1 after 1st if");
+		DPRINTF("IN NDOSPART loop #1: &dp[i].dp_start=%p", (void*)&dp[i].dp_start);
 #ifdef LOADER_GPT_SUPPORT
 		if (dp[i].dp_typ == DOSPTYP_PMBR && dp[i].dp_start == 1) {
+			DPRINTF("BEFORE table->type assignment for PMBR detected");
+			DPRINTF("table=%p for PMBR detected", (void*)table);
 			table->type = PTABLE_GPT;
 			DPRINTF("PMBR detected");
 		}
+		DPRINTF("IN NDOSPART loop #1 after 2nd if");
 #endif
 	}
+	DPRINTF("AFTER NDOSPART loop #1");
 #ifdef LOADER_GPT_SUPPORT
 	if (table->type == PTABLE_GPT) {
 		table = ptable_gptread(table, dev, dread);


Note that "IN NDOSPART loop #1 after 2nd if" is
never reported. Nor is "BEFORE table->type
assignment for PMBR detected". The condition in
the 2nd if never completes its evaluation because
of the misaligned access involved in dp[i].dp_start
being evaluated.


> Stopping it earlier and using:
> 
> setenv loaderdev mmc 0
> boot
> 
> avoids the problem because it avoids "probing" the
> USB drive at the stage indicated above.

Yep: The above loop is then not evaluated.

> But the
> boot then actually uses:
> 
> vfs.root.mountfrom="ufs:/dev/gpt/BPIM3root"
> 
> in the /boot/loader.conf on the microsd card.
> ufs:/dev/gpt/BPIM3root is a reference to the ufs
> partition on the USB media. (With the mountfrom
> commented out the microsd card is bootable by
> itself.) So later the USB drive is put to use
> successfully when the initial probing is avoided.
> 
> Looking in the /usr/src/stand/uboot/common/main.c
> code shows:
> 
> static int
> probe_disks(int devidx, int load_type, int load_unit, int load_slice,
>    int load_partition)
> {
> . . .
>        if (load_unit == -1) {
>                printf("  Probing all %s devices...\n", device_typename(load_type));
>                /* Try each disk of given type in succession until one works. */
>                for (unit = 0; unit < UB_MAX_DEV; unit++) {
>                        currdev.dd.d_unit = uboot_diskgetunit(load_type, unit);
>                        if (currdev.dd.d_unit == -1)
>                                break;
>                        print_disk_probe_info();
>                        open_result = devsw[devidx]->dv_open(&f, &currdev);
>                        if (open_result == 0) {
>                                printf(" good.\n");
>                                return (0);
>                        }
>                        printf("\n");
>                }
>                return (-1);
>        }
> . . .
> }
> 
> So it appears that the crash is during the code involved
> for the line:
> 
> open_result = devsw[devidx]->dv_open(&f, &currdev);
> 
> Note that the boot attempt reported "Probing all <unknown>
> devices..." so device_typename(load_type) came up with
> "<unknown>". (I've no clue if that is significant to the
> issue or not.)


Looks like the below paragraph was junk. Sorry.

> It appeared that /usr/src/stand/usb/storage/umass_loader.c
> and its umass_disk_open and umass_disk_open_sub might
> be involved and then code from the likes of:
> /usr/src/sys/dev/usb/ --such as from usb_msctest.c for
> usb_msc_read_capacity. (I stopped looking around
> there: well outside areas I know how to interpret.)


> For reference, I used the OPi+2e u-boot material from
> my poudriere-devel based port builds:
> 
> # ls -ldT /usr/local/share/u-boot/u-boot-orangepi-plus-2e/*
> -rw-r--r--  1 root  wheel     503 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/README
> -rw-r--r--  1 root  wheel     199 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/boot.scr
> -rw-r--r--  1 root  wheel      66 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/metadata
> -rw-r--r--  1 root  wheel  471250 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/u-boot-sunxi-with-spl.bin
> 
> 
> 
> (The USB drives are USB SSDs.)




===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Thu Mar 19 2020 - 07:20:10 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:23 UTC