Re: Fresh installed Freebsd 9 don't boot from hd

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Sun, 23 Oct 2011 20:57:59 +0300

on 23/10/2011 18:27 Dennis Koegel said the following:
> On Fri, Oct 21, 2011 at 04:33:38PM -0400, John Baldwin wrote:
>> Working offline with Dennis, we found that changing the CFLAGS in 
>> sys/boot/i386/gptboot/Makefile from "-O1" to "-Os -mrtd" (partially reverting 
>> an earlier commit) fixed gptboot.  The next test for someone to do would be to 
>> try just adding "-mrtd" and leaving "-O1" as-is to see if that fixes it.
> 
> More test results:
> 
> gcc -Os -fno-guess-branch-probability -fomit-frame-pointer -fno-unit-at-a-time \
> 	-mno-align-long-strings -mrtd [from before r225530]: Boots OK
> gcc -Os -mrtd: Boots OK
> gcc -O1 -mrtd: Fails
> gcc -O1: Fails
> gcc -O0: Fails
> gcc -Os: Boots OK
> 
> clang -O1: Fails
> clang -Os: Fails
> clang -Oz: Fails
> 
> I've put some printf()s into gpt{,boot}.c to trace where the reboot is
> triggered. It appears to be in drvsize() (called from gptread()). OTOH
> the debug output may have changed where the problem occurs, I don't
> know about that.
> 
> With 9.0R drawing near, CFLAGS should be s/-O1/-Os/, until we can figure
> out what happens. But as for why gcc's magic -Os is required and clang's
> output doesn't work at all, I'm clueless.

Thank you for your very valuable analysis!
I looked at the difference between the assembly code of the drvsize function as
produced by gcc -Os and by gcc -O1.  One thing that was immediately obvious is
that gcc places the params array and the sectors variable in a different order
depending on the options.  One idea is that if the BIOS actually writes beyond
the end of the array, then in one case it could be harmless (it overwrites the
sectors variable), but in the other case it could be more harmful.
I found a document suggesting that a BIOS may write more bytes to the array
than its current size of 0x42:
http://www.t13.org/documents/UploadedDocuments/docs2008/e08134r1-BIOS_Enhanced_Disk_Drive_Services_4.0.pdf

Of course, the size of the array is passed to the BIOS at the start of the array,
so a _non-buggy_ BIOS should not write beyond the array, but we live in an
imperfect world.
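
One way to check for this kind of overrun, sketched here purely as an illustration, is to place a guard area right behind the buffer and count how many guard bytes change. Everything named below (guarded_params, fake_bios_fill, overrun_bytes, the guard size and fill byte) is hypothetical and not part of the boot code; fake_bios_fill merely stands in for firmware that ignores the declared size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DECLARED_SIZE 0x42	/* buffer size announced to the firmware */
#define GUARD_SIZE    16	/* hypothetical guard area behind the buffer */
#define GUARD_BYTE    0xA5	/* hypothetical fill pattern for the guard */

/* Parameter buffer with a guard area directly behind it. */
struct guarded_params {
	unsigned char params[DECLARED_SIZE];
	unsigned char guard[GUARD_SIZE];
};

/*
 * Stand-in for a firmware call: writes 'written' zero bytes starting at
 * the beginning of the buffer, regardless of the declared size.
 */
static void
fake_bios_fill(struct guarded_params *g, size_t written)
{
	assert(written <= sizeof(*g));
	memset(g, 0, written);
}

/* Returns how many guard bytes were clobbered by the simulated call. */
static size_t
overrun_bytes(size_t written)
{
	struct guarded_params g;
	size_t i, n = 0;

	memset(g.guard, GUARD_BYTE, sizeof(g.guard));
	fake_bios_fill(&g, written);
	for (i = 0; i < sizeof(g.guard); i++)
		if (g.guard[i] != GUARD_BYTE)
			n++;
	return (n);
}
```

In this simulation a write of 0x42 bytes leaves the guard intact, while a write of 0x4A bytes clobbers 8 guard bytes. On real hardware the same guard check would be run after the actual BIOS call.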

Could you please test this hypothesis by trying the following patch?

diff --git a/sys/boot/i386/common/drv.c b/sys/boot/i386/common/drv.c
index 11f6628..5996a80 100644
--- a/sys/boot/i386/common/drv.c
+++ b/sys/boot/i386/common/drv.c
@@ -37,10 +37,10 @@ __FBSDID("$FreeBSD$");
 uint64_t
 drvsize(struct dsk *dskp)
 {
-	unsigned char params[0x42];
+	unsigned char params[0x4A];
 	uint64_t sectors;

-	*(uint32_t *)params = sizeof(params);
+	*(uint16_t *)params = sizeof(params);

 	v86.ctl = V86_FLAGS;
 	v86.addr = 0x13;



P.S. The assembly diff I referred to above:
--- drvsize.Os.S	2011-10-23 20:17:56.871996966 +0300
+++ drvsize.O1.S	2011-10-23 20:18:27.430995560 +0300
@@ -4,8 +4,8 @@
 	pushl	%ebp
 	movl	%esp, %ebp
 	subl	$76, %esp
-	leal	-74(%ebp), %ecx
-	movl	$66, -74(%ebp)
+	leal	-66(%ebp), %ecx
+	movl	$66, -66(%ebp)
 	movl	$262144, __v86
 	movl	$19, __v86+4
 	movl	$18432, __v86+24
@@ -28,20 +28,20 @@
 	pushl	%eax
 	pushl	$.LC4
 	call	printf
-	xorl	%eax, %eax
-	xorl	%edx, %edx
-	popl	%ecx
-	popl	%ecx
+	movl	$0, %eax
+	movl	$0, %edx
+	addl	$8, %esp
 	jmp	.L16
+	.p2align 2,,3
 .L14:
 	pushl	$8
-	leal	-58(%ebp), %eax
+	leal	-50(%ebp), %eax
 	pushl	%eax
-	leal	-8(%ebp), %eax
+	leal	-76(%ebp), %eax
 	pushl	%eax
 	call	memcpy
-	movl	-8(%ebp), %eax
-	movl	-4(%ebp), %edx
+	movl	-76(%ebp), %eax
+	movl	-72(%ebp), %edx
 	addl	$12, %esp
 .L16:
 	leave
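
The offsets in this listing can be turned into a small worked example of what an overrun would hit in each build. This is only a sketch under stated assumptions: the overrun length of 0x4A bytes is hypothetical (chosen to match the enlarged buffer in the patch), the helper names are made up, and the saved %ebp and return address are taken to sit at 0(%ebp) and 4(%ebp), as is usual after the pushl %ebp / movl %esp, %ebp prologue.

```c
/* Half-open byte range [start, end), offsets relative to %ebp. */
struct range {
	int start;
	int end;
};

/* Saved %ebp at 0..3 and return address at 4..7. */
static const struct range SAVED_FRAME = { 0, 8 };
/* Location of 'sectors' in each build, read off the listing. */
static const struct range SECTORS_OS = { -8, 0 };
static const struct range SECTORS_O1 = { -76, -68 };

/* Number of bytes two ranges have in common. */
static int
overlap(struct range a, struct range b)
{
	int lo = a.start > b.start ? a.start : b.start;
	int hi = a.end < b.end ? a.end : b.end;

	return (hi > lo ? hi - lo : 0);
}

/* Bytes written past a 0x42-byte params array located at params_off. */
static struct range
overrun(int params_off, int written)
{
	struct range r = { params_off + 0x42, params_off + written };

	return (r);
}
```

With -Os, params starts at -74(%ebp), so overrun(-74, 0x4A) covers -8..-1 and lands entirely on sectors. With -O1, params starts at -66(%ebp), so overrun(-66, 0x4A) covers 0..7, which is the saved %ebp and the return address. That would be consistent with the reboot seen in the -O1 build, provided the BIOS really does write that many bytes.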

-- 
Andriy Gapon
Received on Sun Oct 23 2011 - 15:58:05 UTC