Re: gptboot rewrite, bootonce, etc.

From: Julian Elischer <julian_at_freebsd.org>
Date: Fri, 17 Sep 2010 21:05:40 -0700

On 9/17/10 4:45 PM, Pawel Jakub Dawidek wrote:
> Hi.
>
> My company needed functionality similar to nextboot(8), but at the
> boot loader level, so that we can have two partitions to boot from: one
> known to be good and the other used for upgrades. We upgrade by
> dd(1)ing an entire partition image onto the unused partition and marking
> it as try-to-boot-from-it-but-only-once. We then reboot, and if we fail
> to boot from the new partition, we fall back to the old, good partition.
> If we succeed, on the other hand, we mark the new partition as our boot
> partition and mark the other one as unused.
>
> Well, how hard can it be?
>
> After around two weeks of work, I ended up rewriting gptboot in large
> parts, reorganizing a lot of code, improving and extending gpart a bit
> and implementing the desired functionality.
>
> Here is the patch for review and test:
>
> 	http://people.freebsd.org/~pjd/patches/gptboot.patch
>
> The list of changes:
>
> - Split code shared by almost any boot loader into separate files and
>    clean up most layering violations:
>
> 	sys/boot/i386/common/rbx.h:
>
> 		RBX_* defines
> 		OPT_SET()
> 		OPT_CHECK()
>
> 	sys/boot/common/util.[ch]:
>
> 		memcpy()
> 		memset()
> 		memcmp()
> 		bcpy()
> 		bzero()
> 		bcmp()
> 		strcmp()
> 		strncmp() [new]
> 		strcpy()
> 		strcat()
> 		strchr()
> 		strlen()
> 		printf()
>
> 	sys/boot/i386/common/cons.[ch]:
>
> 		ioctrl
> 		putc()
> 		xputc()
> 		putchar()
> 		getc()
> 		xgetc()
> 		keyhit() [now takes number of seconds as an argument]
> 		getstr()
>
> 	sys/boot/i386/common/drv.[ch]:
>
> 		struct dsk
> 		drvread()
> 		drvwrite() [new]
> 		drvsize() [new]
>
> 	sys/boot/common/crc32.[ch] [new]
>
> 	sys/boot/common/gpt.[ch] [new]
>
> - Teach gptboot and gptzfsboot about new files. I haven't touched the
>    rest, but there is still a lot of code duplication to be removed.
>
> - Implement full GPT support. Currently we just read the primary header
>    and partition table and don't care about checksums, etc. With the
>    patch we verify the checksums of the primary header and primary
>    partition table, and if there is a problem we fall back to the backup
>    header and backup partition table.
>
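
A minimal sketch of the primary-then-backup fallback described above,
covering the header check only (the patch also verifies the partition
table checksum). This is not code from the patch: the names
sketch_crc32(), gpt_hdr_valid() and gpt_pick_header() are hypothetical,
a 512-byte sector is assumed, and the field offsets are the ones the
UEFI specification gives for the GPT header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SECTOR_SIZE	512	/* assumed sector size */
#define	GPT_HDR_MINSZ	92	/* GPT header size per the UEFI spec */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), the CRC used by GPT. */
static uint32_t
sketch_crc32(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = 0xFFFFFFFFu;
	int i;

	while (len-- > 0) {
		crc ^= *p++;
		for (i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return (crc ^ 0xFFFFFFFFu);
}

/* Decode a little-endian 32-bit field. */
static uint32_t
le32(const uint8_t *p)
{
	return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Return nonzero if the sector holds a GPT header with a matching CRC. */
int
gpt_hdr_valid(const uint8_t *sector)
{
	uint8_t tmp[SECTOR_SIZE];
	uint32_t size, stored;

	if (memcmp(sector, "EFI PART", 8) != 0)
		return (0);
	size = le32(sector + 12);		/* header size field */
	if (size < GPT_HDR_MINSZ || size > SECTOR_SIZE)
		return (0);
	stored = le32(sector + 16);		/* header CRC field */
	memcpy(tmp, sector, size);
	memset(tmp + 16, 0, 4);			/* CRC is computed with the field zeroed */
	return (sketch_crc32(tmp, size) == stored);
}

/* Prefer the primary header, fall back to the backup, else NULL. */
const uint8_t *
gpt_pick_header(const uint8_t *primary, const uint8_t *backup)
{
	if (gpt_hdr_valid(primary))
		return (primary);
	if (gpt_hdr_valid(backup))
		return (backup);
	return (NULL);
}
```

A caller would read LBA 1 and the last LBA of the disk into two buffers
and pass both to gpt_pick_header(); a NULL result means neither copy can
be trusted.
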
> - Clean up most messages to use the name of the boot program as a prefix,
>    so in case of an error we know where the error comes from, e.g.:
>
> 	gptboot: unable to read primary GPT header
>
> - If we can't boot, print the boot prompt only once and not every five
>    seconds.
>
> - Introduce three new GPT attributes:
>
> 	bootme - this is bootable partition
> 	bootonce - try to boot from this partition only once
> 	bootfailed - we failed to boot from this partition
>
> - Extend gpart to allow manipulating the new attributes:
>
> 	gpart set -a bootme -i 3 ada0
> 	gpart set -a bootonce -i 4 ada0
> 	gpart unset -a bootfailed -i 2 ada0
>
>    Note that setting the 'bootonce' attribute automatically sets the
>    'bootme' attribute.
>
> - Change boot order of gptboot to the following:
>
> 	1. Try to boot from all the partitions that have both 'bootme'
> 	   and 'bootonce' attributes one by one.
> 	2. Try to boot from all the partitions that have only 'bootme'
> 	   attribute one by one.
> 	3. If there are no partitions with 'bootme' attribute, boot from
> 	   the first UFS partition.
>
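
The boot order above can be sketched as a function that lists partition
indexes in the order they would be tried. This is an illustration under
assumptions, not the patch's code: struct part, boot_order() and the
ATTR_* bit values are placeholders (the real attributes are bits in the
64-bit GPT entry attribute field), and the sketch does not check the
partition type in passes 1 and 2.

```c
#include <stddef.h>

/* Placeholder bit values for illustration only. */
#define	ATTR_BOOTME	0x1u
#define	ATTR_BOOTONCE	0x2u
#define	ATTR_BOOTFAILED	0x4u

struct part {
	unsigned int	attrs;	/* ATTR_* flags */
	int		is_ufs;	/* nonzero for a UFS partition */
};

/*
 * Fill order[] (room for n entries) with the indexes to try, in order,
 * and return how many were stored.
 */
size_t
boot_order(const struct part *p, size_t n, size_t *order)
{
	const unsigned int both = ATTR_BOOTME | ATTR_BOOTONCE;
	size_t cnt = 0, i;

	/* 1. Partitions with both 'bootme' and 'bootonce'. */
	for (i = 0; i < n; i++)
		if ((p[i].attrs & both) == both)
			order[cnt++] = i;
	/* 2. Partitions with only 'bootme'. */
	for (i = 0; i < n; i++)
		if ((p[i].attrs & both) == ATTR_BOOTME)
			order[cnt++] = i;
	/* 3. No 'bootme' partition at all: first UFS partition. */
	if (cnt == 0) {
		for (i = 0; i < n; i++) {
			if (p[i].is_ufs) {
				order[cnt++] = i;
				break;
			}
		}
	}
	return (cnt);
}
```
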
> - The 'bootonce' functionality is implemented in the following way:
>
> 	1. Walk through all the partitions and when 'bootonce'
> 	   attribute is found without 'bootme' attribute, remove
> 	   'bootonce' attribute and set 'bootfailed' attribute.
> 	   'bootonce' attribute alone means that we tried to boot from
> 	   this partition, but boot failed after leaving gptboot and
> 	   machine was restarted.
> 	2. Find partition with both 'bootme' and 'bootonce' attributes.
> 	3. Remove 'bootme' attribute.
> 	4. Try to execute /boot/loader or /boot/kernel/kernel from that
> 	   partition. If that succeeds, we stop here.
> 	5. If execution failed, remove 'bootonce' and set 'bootfailed'.
> 	6. Go to 2.
>
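
The six steps above amount to a small state machine over the attribute
bits. Below is a rough in-memory simulation of it, not the gptboot code:
bootonce_pass() and the ATTR_* values are hypothetical, and boot_ok[]
stands in for whether executing the loader or kernel from a partition
would succeed. In the real boot loader each attribute change has to be
written to disk before the loader is executed, and a successful
execution never returns.

```c
#include <stddef.h>

/* Placeholder bit values for illustration only. */
#define	ATTR_BOOTME	0x1u
#define	ATTR_BOOTONCE	0x2u
#define	ATTR_BOOTFAILED	0x4u

struct part {
	unsigned int	attrs;	/* ATTR_* flags */
};

/*
 * Run one 'bootonce' pass over p[0..n-1]. boot_ok[i] is the simulated
 * outcome of trying to boot from partition i. Returns the index that
 * booted, or -1 if no 'bootonce' partition could be booted.
 */
int
bootonce_pass(struct part *p, size_t n, const int *boot_ok)
{
	const unsigned int both = ATTR_BOOTME | ATTR_BOOTONCE;
	size_t i;

	/* Step 1: 'bootonce' alone means an earlier attempt died later on. */
	for (i = 0; i < n; i++) {
		if ((p[i].attrs & both) == ATTR_BOOTONCE) {
			p[i].attrs &= ~ATTR_BOOTONCE;
			p[i].attrs |= ATTR_BOOTFAILED;
		}
	}
	/* Steps 2-6: try each partition that has both attributes. */
	for (i = 0; i < n; i++) {
		if ((p[i].attrs & both) != both)
			continue;
		p[i].attrs &= ~ATTR_BOOTME;		/* step 3 */
		if (boot_ok[i])				/* step 4 */
			return ((int)i);
		p[i].attrs &= ~ATTR_BOOTONCE;		/* step 5 */
		p[i].attrs |= ATTR_BOOTFAILED;
	}
	return (-1);
}
```

After a successful pass the booted partition is left with 'bootonce'
only, which is the state the rc.d script looks for to log the success.
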
>     If the whole boot succeeds, a new /etc/rc.d/gptboot script logs all
>     partitions that we failed to boot from (the ones with the
>     'bootfailed' attribute) and removes this attribute. It also finds
>     the partition with the 'bootonce' attribute, which is the partition
>     we booted from successfully, logs the success, and removes the
>     attribute.
>
>     All the GPT updates we do here go to both the primary and backup GPT
>     if they are valid. We don't touch headers or partition tables when
>     the checksum doesn't match.
>
> Any comments or suggestions? Be aware that at this point I'm so full of
> boot loaders that I'm not looking for much more work in this area, so
> small tweaks are fine, but bigger things will have to wait until I can
> sleep at night again. Well, there is still dedup support that waits to
> be implemented in gptzfsboot...


nextboot USED to work at the bootloader level, but it got
broken^H^H^H^H^H^H^H changed by someone several years ago.  Ironport
still use the old bootblock for that reason.

It used to store the string for boot1 to use in the second block of the
disk, and boot0 would read it and write it back disabled using a BIOS
command, so that the boot after that would not do it again if it failed.
boot0 then passed it to boot1 on the stack to use.

I did have a version that kept the boot string in a special partition
(of 1 block).

Obviously what you are doing is much more fancy.
Received on Sat Sep 18 2010 - 02:16:03 UTC
