Re: r253070 and "disappearing" zpool

From: Johannes Totz <johannes_at_jo-t.de>
Date: Thu, 25 Jul 2013 16:33:58 +0100

On 24/07/2013 12:47, Andriy Gapon wrote:
> on 22/07/2013 23:38 Pawel Jakub Dawidek said the following:
>> On Mon, Jul 22, 2013 at 10:29:40AM +0300, Andriy Gapon wrote:
>>> I think that this setup (on ZFS level) is quite untypical, although not
>>> impossible on FreeBSD (and perhaps only FreeBSD).
>>> It's untypical because you have separate boot pool (where loader, loader.conf
>>> and kernel are taken from) and root pool (where "/" is mounted from).
>>
>> As I said elsewhere, it is pretty typical when full disk encryption is
>> used.
>
> I am judging by the number of reports / amount of feedback so far.

I'm using a similar configuration too, where I have a USB stick with 
unencrypted kernel and /boot bits which load a GELI keyfile (from its 
own pool zboot), and then the rest of the system starts up from the 
fully encrypted HDD (from another pool zsystem, so boot and rootfs are 
on different pools).

I'm not sure I understand the problem though. What exactly "broke" after
your commit? Is it only that the pool containing the bits that would
normally go to /boot is no longer imported automatically, while the rest
keeps working (i.e. the /boot symlink points to nowhere)? Or does booting
itself somehow fail?

>
>> The /boot/ has to be unencrypted and can be stored on eg. USB
>> pendrive which is never left unattended, unlike laptop which can be left
>> in eg. a hotel room, but with entire disk encrypted.
>
> As we discussed elsewhere, there are many options of configuring full disk
> encryption.  Including decisions whether root filesystem should be separate from
> boot filesystem, choice of filesystem type for boot fs, ways of tying various
> pieces together, and many more.
>
> I do not believe that my change is incompatible with full disk encryption in
> general.
>
>>> So, I see three ways of resolving the problem that my changes caused for your
>>> configuration.
>>>
>>> 1.  [the easiest] Put zpool.cache loading instructions that used to be in
>>> defaults/loader.conf into your loader.conf.  This way everything should work as
>>> before -- zpool.cache would be loaded from your boot pool.
>>>
>>> 2. Somehow (I don't want to go into any technical details here) arrange that
>>> your root pool has /boot/zfs/zpool.cache that describes your boot pool.  This is
>>> probably hard given that your /boot is a symlink at the moment.  This probably
>>> would be easier to achieve if zpool.cache lived in /etc/zfs.
>>>
>>> 3. [my favorite]  Remove an artificial difference between your boot and root
>>> pools, so that they are a single root+boot pool (as zfs gods intended).  As far
>>> as I understand your setup, you use GELI to protect some sensitive data.
>>> Apparently your kernel is not sensitive data, so I wonder if your /bin/sh or
>>> /sbin/init are really sensitive either.
>>> So perhaps you can arrange your unencrypted pool to hold all of the base system
>>> (boot + root) and put all your truly sensitive filesystems (like e.g. /home or
>>> /var/data or /opt/xyz) onto your encrypted pool.
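
For reference, option 1 would look roughly like the fragment below in
/boot/loader.conf on the boot pool. I am going from memory of what
defaults/loader.conf contained before r253070, so treat the exact
variable names and the cache path as an assumption and check them
against the diff of that commit before relying on this.

```shell
# /boot/loader.conf (on the boot pool)
# Assumed to match the entries r253070 removed from defaults/loader.conf.
# Tells the loader to preload zpool.cache so the kernel can auto-import
# the pools it lists, including the separate boot pool.
zpool_cache_load="YES"
zpool_cache_type="/boot/zfs/zpool.cache"
zpool_cache_name="/boot/zfs/zpool.cache"
```

With this in place the cache file is again taken from wherever the
loader reads /boot, i.e. the boot pool rather than the root pool.
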
>>
>> If all you care about is laptop being stolen, then that would work.
>>
>> If you however want to be protected from someone replacing your /sbin/init
>> with something evil then you use encryption or even better integrity
>> verification also supported by GELI.
>
> There are different ways to ensure that.  Including storing cryptographic
> checksums in a safe place or keeping init in the same place where kernel is
> kept.  And probably many more.
>
>> Remember, tools not policies.
>
> I am not trying to enforce any policy on end-users here.
>
>> There is also option number 4 - backing out your commit.
>
> That's definitely an option.  I'll discuss it a few lines below.
>
>> When I saw your commit removing those entries from defaults/loader.conf,
>> I thought it is fine, as we now don't require zpool.cache to import the
>> root pool, which was, BTW, very nice and handy improvement. Now that we
>> know it breaks existing installations I'd prefer the commit to be backed
>> out.
>
> "breaks" sounds dramatic, but let's take a step back and see what exactly is broken.
> The system in question still can boot without a problem, it is fully usable and
> it is possible to change its configuration without any hassle.  The only thing
> that changed is that its boot pool is not imported automatically.
> Let's also recall that the system was not created / configured by any of the
> existing official or semi-official tools and thus it does not represent any
> recommended way of setting up such systems.  Glen configured it this way, but it
> doesn't mean that that is the way.
>
> I think that there are many ways of changing the configuration of that system
> to make it behave as before again.
> Three I mentioned already.  Another is to add rc script to import the boot pool,
> given that it is a special, designated pool.  Yet another is to place
> zpool.cache onto the root pool and use nullfs (instead of a symlink) to make
> /boot be from the boot pool but /boot/zfs be from the root pool.
>
>> This is because apart from breaking some existing installations it
>> doesn't gain us anything.
>
> I think I addressed the "breaking" part, as to the gains - a few lines below.
>
>>> So I understand that my change causes a problem for a setup like yours, but I
>>> believe that the change is correct.
>>
>> The change is clearly incorrect or incomplete as it breaks existing
>> installations and doesn't allow for full disk encryption configuration
>> on ZFS-only systems.
>
> I think I addressed the breaking part and also addressed your overly general
> statement about full disk encryption.  So I don't think that my change is
> "clearly incorrect", otherwise that would be clear even to me.
>
>> BTW. If moving zpool.cache to /etc/zfs/ will work for both cases that's
>> fine by me, although the migration might be tricky.
>
> Yes, it's the migration that's scary to me too.
>
>
> Now, about the postponed points.
> I will reproduce a section from my email that you've snipped.
>
>>> P.S.
>>> ZFS/FreeBSD boot process is extremely flexible.  For example zfsboot can take
>>> zfsloader from pool1/fsA, zfsloader can boot kernel from pool2/fsB and kernel
>>> can mount / from pool3/fsC.  Of these 3 filesystems from where should
>>> zpool.cache be taken?
>>> My firm opinion is that it should be taken from / (pool3/fsC in the example
>>> above).  Because it is the root filesystem that defines what a system is going
>>> to do ultimately: what daemons are started, with what configurations, etc.
>>> And thus it should also determine what pools to auto-import.
>>> We can say that zpool.cache is analogous to /etc/fstab in this respect.
>
> So do you or do you not agree with my reasoning about from where zpool.cache
> should be taken?
> If you do not, then please explain why.
> If you do, then please explain how this would be compatible with the old way of
> loading zpool.cache.
>
> I think that ensuring that zpool.cache is always loaded from a root filesystem
> is the gain from my change.
>