Re: head -r538966 on OrangePi+ 2ed: boot loader crashes when USB drive is present at power-on/boot: its a misaligned access by code from -r354746

From: Toomas Soome <tsoome_at_me.com>
Date: Thu, 19 Mar 2020 11:00:42 +0200
I think this should fix it. We need to create copy of dos partition array, so we will get proper alignment. 


tsoome_at_freebsd-2:/usr/src % svn diff stand/common/part.c
Index: stand/common/part.c
===================================================================
--- stand/common/part.c	(revision 359099)
+++ stand/common/part.c	(working copy)
_at__at_ -654,6 +654,7 _at__at_
 	int has_ext;
 #endif
 	table = NULL;
+	dp = NULL;
 	buf = malloc(sectorsize);
 	if (buf == NULL)
 		return (NULL);
_at__at_ -708,7 +709,11 _at__at_
 		goto out;
 	}
 	/* Check that we have PMBR. Also do some validation. */
-	dp = (struct dos_partition *)(buf + DOSPARTOFF);
+	dp = malloc(NDOSPART * sizeof(struct dos_partition));
+	if (dp == NULL)
+		goto out;
+	bcopy(buf + DOSPARTOFF, dp, NDOSPART * sizeof(struct dos_partition));
+
 	/*
 	 * In mac we can have PMBR partition in hybrid MBR;
 	 * that is, MBR partition which has DOSPTYP_PMBR entry defined as
_at__at_ -770,6 +775,7 _at__at_
 #endif /* LOADER_MBR_SUPPORT */
 #endif /* LOADER_MBR_SUPPORT || LOADER_GPT_SUPPORT */
 out:
+	free(dp);
 	free(buf);
 	return (table);
 }
tsoome_at_freebsd-2:/usr/src %


> On 19. Mar 2020, at 10:20, Mark Millard <marklmi_at_yahoo.com> wrote:
> 
> [I built with -DDEBUG -DDISK_DEBUG -DPART_DEBUG and show
> with some extra output as well. It shows that a misaligned
> access causes the crash. The access in question is from
> head -r354746 code additions.]
> 
> On 2020-Mar-18, at 13:36, Mark Millard <marklmi at yahoo.com> wrote:
> 
>> Without a USB drive present at power-on or
>> boot, the OPi+2e boots fine. (The USB drives
>> involved have a partition holding a ufs file
>> system and a partition holding a swap/aging
>> space.)
>> 
>> In all cases below, /boot/ is from the microsd
>> card. But the intended configuration is for
>> vfs.root.mountfrom to be used to direct the
>> stages after kernel-booting to the USB drive.
>> 
>> The output sequence related to the crash when
>> the USB drive is present looks like:
>> 
>> QUOTE
>> . . .
>> END QUOTE
> 
> I'm replacing the original quote with
> better information.
> 
> First I quote the definition of dos_partition
> for reference:
> 
> struct dos_partition {
>        unsigned char   dp_flag;        /* bootstrap flags */
>        unsigned char   dp_shd;         /* starting head */
>        unsigned char   dp_ssect;       /* starting sector */
>        unsigned char   dp_scyl;        /* starting cylinder */
>        unsigned char   dp_typ;         /* partition type */
>        unsigned char   dp_ehd;         /* end head */
>        unsigned char   dp_esect;       /* end sector */
>        unsigned char   dp_ecyl;        /* end cylinder */
>        uint32_t        dp_start;       /* absolute starting sector number */
>        uint32_t        dp_size;        /* partition size in sectors */
> };
> 
> Note that access to dp_start or dp_size
> requires address%4==0 alignment but the
> other fields do not. This is important for
> &dp[i].dp_start reported in the below (and
> then dp[i].dp_start is accessed in the code).
> I also had it report dp, which shows that
> dp has dp%4!=0 as well.
> 
> 
> 
> Here is with -DDEBUG and -DDISK_DEBUG -DPART_DEBUG
> in place for the code (with some extra debug output
> added):
> 
> FreeBSD/armv7 U-Boot loader, Revision 1.3
> 
> signature:
>  version       = 1
>  checksum      = 0x98de198b
>  sc entry      = 0xbdf8cb01
> 
> addresses info:
> _etext (sdata) = 0x4204fdb4
> _edata         = 0x4205ebb8
> __sbss_start   = 0x4205ec28
> __sbss_end     = 0x4205ec28
> __sbss_start   = 0x4205ec28
> _end           = 0x42062aa0
> syscall entry  = 0xbdf8cb01
> DRAM: 2048MB
> Number of U-Boot devices: 2
> U-Boot env: loaderdev not set, will probe all devices.
> stor_init(): storage devices found: 2
> Found U-Boot device: disk
>  Probing all <unknown> devices...
>  Checking unit=0 slice=<auto> partition=<auto>...disk_open: disk0s0: unit 0, slice 0, partition -2 => 0x420631c0
> stor_readdev(): reading blk=0 size=1 _at_ 0x420633c0
> stor_readdev(): reading blk=64 size=1 _at_ 0x420636c0
> stor_readdev(): reading blk=1 size=1 _at_ 0x420636c0
> ptable_open: BEFORE NDOSPART loop #1
> ptable_open: dp=0x4206357e
> ptable_open: IN NDOSPART loop #1 after 1st if
> ptable_open: IN NDOSPART loop #1: &dp[i].dp_start=0x42063586
> data abort
> pc : [<42009350>]          lr : [<42009348>]
> reloc pc : [<ce07d350>]    lr : [<ce07d348>]
> sp : b9f649e8  ip : 4205e200     fp : b9f64a18
> r10: 42063640  r9 : b9f64a50     r8 : 420633c0
> r7 : 420636c0  r6 : 4200818c     r5 : 4205cd95  r4 : 42063586
> r3 : bdf8cdf7  r2 : 0000000a     r1 : 01c28000  r0 : 000000ee
> Flags: nZCv  IRQs off  FIQs off  Mode SVC_32
> Code: e08f0000 eb0114af e5d801c2 e35000ee (05940000) 
> Resetting CPU ...
> 
> resetting ...
> 
> 
> My extra messages are from:
> 
> # svnlite diff /usr/src/stand/common/part.c
> Index: /usr/src/stand/common/part.c
> ===================================================================
> --- /usr/src/stand/common/part.c	(revision 358966)
> +++ /usr/src/stand/common/part.c	(working copy)
> _at__at_ -715,18 +715,26 _at__at_
> 	 * start sector 1. After DOSPTYP_PMBR, there may be other partitions.
> 	 * UEFI compliant PMBR has no other partitions.
> 	 */
> +	DPRINTF("BEFORE NDOSPART loop #1");
> +	DPRINTF("dp=%p", (void*)dp);
> 	for (i = 0; i < NDOSPART; i++) {
> 		if (dp[i].dp_flag != 0 && dp[i].dp_flag != 0x80) {
> 			DPRINTF("invalid partition flag %x", dp[i].dp_flag);
> 			goto out;
> 		}
> +		DPRINTF("IN NDOSPART loop #1 after 1st if");
> +		DPRINTF("IN NDOSPART loop #1: &dp[i].dp_start=%p", (void*)&dp[i].dp_start);
> #ifdef LOADER_GPT_SUPPORT
> 		if (dp[i].dp_typ == DOSPTYP_PMBR && dp[i].dp_start == 1) {
> +			DPRINTF("BEFORE table->type assignment for PMBR detected");
> +			DPRINTF("table=%p for PMBR detected", (void*)table);
> 			table->type = PTABLE_GPT;
> 			DPRINTF("PMBR detected");
> 		}
> +		DPRINTF("IN NDOSPART loop #1 after 2nd if");
> #endif
> 	}
> +	DPRINTF("AFTER NDOSPART loop #1");
> #ifdef LOADER_GPT_SUPPORT
> 	if (table->type == PTABLE_GPT) {
> 		table = ptable_gptread(table, dev, dread);
> 
> 
> Note that "IN NDOSPART loop #1 after 2nd if" is
> never reported. Nor is "BEFORE table->type
> assignment for PMBR detected". The condition in
> the 2nd if never completes its evaluation because
> of the misaligned access involved in dp[i].dp_start
> being evaluated.
> 
> 
>> Stopping it earlier and using:
>> 
>> setenv loaderdev mmc 0
>> boot
>> 
>> avoids the problem because it avoids "probing" the
>> USB drive at the stage indicated above.
> 
> Yep: The above loop is then not evaluated.
> 
>> But the
>> boot then actually uses:
>> 
>> vfs.root.mountfrom="ufs:/dev/gpt/BPIM3root"
>> 
>> in the /boot/loader.conf on the microsd card.
>> ufs:/dev/gpt/BPIM3root is a reference to the ufs
>> partition on the USB media. (With the mountfrom
>> commented out the microsd card is bootable by
>> itself.) So later the USB drive is put to use
>> successfully when the initial probing is avoided.
>> 
>> Looking in the /usr/src/stand/uboot/common/main.c
>> code shows:
>> 
>> static int
>> probe_disks(int devidx, int load_type, int load_unit, int load_slice,
>>   int load_partition)
>> {
>> . . .
>>       if (load_unit == -1) {
>>               printf("  Probing all %s devices...\n", device_typename(load_type));
>>               /* Try each disk of given type in succession until one works. */
>>               for (unit = 0; unit < UB_MAX_DEV; unit++) {
>>                       currdev.dd.d_unit = uboot_diskgetunit(load_type, unit);
>>                       if (currdev.dd.d_unit == -1)
>>                               break;
>>                       print_disk_probe_info();
>>                       open_result = devsw[devidx]->dv_open(&f, &currdev);
>>                       if (open_result == 0) {
>>                               printf(" good.\n");
>>                               return (0);
>>                       }
>>                       printf("\n");
>>               }
>>               return (-1);
>>       }
>> . . .
>> }
>> 
>> So it appears that the crash is during the code involved
>> for the line:
>> 
>> open_result = devsw[devidx]->dv_open(&f, &currdev);
>> 
>> Note that the boot attempt reported "Probing all <unknown>
>> devices..." so device_typename(load_type) came up with
>> "<unknown>". (I've no clue if that is significant to the
>> issue or not.)
> 
> 
> Looks like the below paragraph was junk. Sorry.
> 
>> It appeared that /usr/src/stand/usb/storage/umass_loader.c
>> and its umass_disk_open and umass_disk_open_sub might
>> be involved and then code from the likes of:
>> /usr/src/sys/dev/usb/ --such as from usb_msctest.c for
>> usb_msc_read_capacity. (I stopped looking around
>> there: well outside areas I know how to interpret.)
> 
> 
>> For reference, I used the OPi+2e u-boot material from
>> my poudriere-devel based port builds:
>> 
>> # ls -ldT /usr/local/share/u-boot/u-boot-orangepi-plus-2e/*
>> -rw-r--r--  1 root  wheel     503 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/README
>> -rw-r--r--  1 root  wheel     199 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/boot.scr
>> -rw-r--r--  1 root  wheel      66 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/metadata
>> -rw-r--r--  1 root  wheel  471250 Oct 26 19:12:16 2019 /usr/local/share/u-boot/u-boot-orangepi-plus-2e/u-boot-sunxi-with-spl.bin
>> 
>> 
>> 
>> (The USB drives are USB SSDs.)
> 
> 
> 
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
> 
Received on Thu Mar 19 2020 - 08:00:50 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:23 UTC