Re: ZFS: amd64, devd, root file system.

From: Stefan Esser <se_at_FreeBSD.org>
Date: Sun, 15 Apr 2007 00:19:15 +0200
Pawel Jakub Dawidek wrote:
> On Sat, Apr 14, 2007 at 10:03:12PM +0200, Stefan Esser wrote:
>> Pawel Jakub Dawidek wrote:
>>> On Sat, Apr 14, 2007 at 11:21:37AM +0200, Stefan Esser wrote:
>>>> It is amazingly simple to get a test setup going and it worked fine
>>>> in my initial simple test cases. But now I've run into problems that
>>>> probably are not technical but caused by a lack of understanding ...
> This is not the first report that it doesn't work as it should. One was
> that /boot/defaults/loader.conf wasn't fresh enough and was missing:
>> Hi Pawel,
>>
>> thanks for the reply, I got it working with some effort, see below ...
>>
>>> zpool_cache_load="YES"
>> This is apparently implied by zfs_load="YES" and redundant.
> 
> No, it isn't. It is absolutely necessary.
> 
>>> zpool_cache_type="/boot/zfs/zpool.cache"
>>> zpool_cache_name="/boot/zfs/zpool.cache"
>> These are defined in /boot/defaults/loader.conf ...
> 
> zpool_cache_load="YES" should be as well. I hope you didn't change it.
> You should not need to touch /boot/defaults/loader.conf.

Yes, it is there; I did not change it but missed it when I looked up
the other default values. (All files in /{boot,etc}/defaults/ are
unmodified and from a very recent -current.)
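
For reference, the loader configuration I ended up with boils down to
something like the following in /boot/loader.conf (the cache type/name
are left at the defaults; the vfs.root.mountfrom line only reflects my
understanding of how to skip the mountroot prompt, so take it with a
grain of salt):

	zfs_load="YES"
	zpool_cache_load="YES"
	# already set in /boot/defaults/loader.conf:
	# zpool_cache_type="/boot/zfs/zpool.cache"
	# zpool_cache_name="/boot/zfs/zpool.cache"
	vfs.root.mountfrom="zfs:test"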

>> This could be fixed by exporting and then importing the pool (with -f).
>> Thereafter the pool could be mounted and I could manually set up the
>> complete file system hierarchy. I verified that "/boot/zfs/zpool.cache"
>> was updated during the import (written to the boot partition), but the
>> next reboot failed again and I again got the same error status as shown
>> above.
> 
> Are you sure the updated file is the same file which is loaded on boot?

Yes, definitely, literally tens of times (I checked the modification
date and verified that the readable parts matched what I changed, e.g.
that the pool name and underlying device were there). One of the problems
that I encountered was that mounting /boot (in my recovery root) R/W
meant that I could not later mount it within the ZFS root (after the
export/import of the pool). But I got around this problem and I'm quite
sure that /boot/zfs/zpool.cache was written to and that this file was
loaded with the kernel and the zfs module.
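
For the record, the sequence I used from the UFS recovery root was
roughly the following (a sketch from memory, not a tested recipe, and
assuming ad0s1a is the boot partition as in my layout below):

	mount /dev/ad0s1a /boot		# boot partition, mounted R/W
	zpool export test
	zpool import -f test		# rewrites /boot/zfs/zpool.cache
	ls -l /boot/zfs/zpool.cache	# check the modification time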

>> I made an attempt to fix it by creating another pool on a swap partition
>> (unused during the tests) on my "normal" UFS system disk (which I
>> had made an IDE slave for these tests). After copying the necessary
>> files over to the newly created "test2" pool on the SWAP partition I
>> got a system that mounted "zfs:test2" and that just worked ...
>>
>> Not working:    zpool create test  ad0s2
>> Working:        zpool create test2 ad1s1b
>>
>> (I.e. "test2" could be mounted automatically, while "test" required me
>> to boot with a UFS root and to export/import the pool before it could
>> be manually mounted.)
>>
>> Well, after some more testing I destroyed the pool "test" and created
>> it on "ad0s2c" instead of "ad0s2", and voila, I had my problem solved.
>>
>> It appears that a zpool can be manually mounted if it resides on ad0s2,
>> but in order to make the kernel accept it during boot, it must be in a
>> BSD partition. Does that make sense? (I did not want to try again with
>> another pool in a slice, since I did not want to give up what I had just
>> achieved with so much effort ;-)
> 
> I'm sorry, but it doesn't make sense at all:) All GEOM providers should
> be equal for ZFS, no matter if this is disk, slice, partition, mirror,
> encrypted provider or anything else.

Yes, I had read this and for that reason never bothered to change the
underlying device. But when I just created a pool in ad1s1b I could
enter zfs:test2 at the prompt (after zfs:test had not been found). This
was absolutely reproducible (with the boot then failing on test2 since
it did not contain a /dev for the devfs mount in the beginning). This
made me suspicious and I prepared a ZFS root in test2. That worked, but
since it was on the wrong disk (one to be removed from the system), I
destroyed the pool "test" and created it again on ad0s2. It failed with
identical problems (the pool could be mounted manually, but not during
the automatic root mount step while booting). Then I made another
attempt, destroyed test, created it on ad0s2c, and found that it worked
without problems thereafter.
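
In other words, what finally worked was roughly the following (the
mkdir path assumes the pool's default /test mountpoint, and test/usr
is just one of the file systems I created):

	zpool destroy test
	zpool create test ad0s2c
	zfs create test/usr		# and the other file systems
	mkdir /test/dev			# mount point for devfs during boot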

>>>> Do I need fstab entries for ZFS file systems (e.g. "test/usr")
>>>> or does ZFS mount them automatically when the pool "test" is mounted?
>>> They are mounted via the rc.d/zfs script.
>> Oh well, I should have looked there instead of asking ;-)
>> Hmmm, I assume that "zfs mount -a" will ignore file systems that are
>> marked as "legacy", and those will instead mounted together with other
>> local file systems?
> 
> That's right.
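
Just to make sure I understand the "legacy" case correctly: for a file
system like test/usr one would then do something like the following and
mount it from fstab like any other local file system (the exact fstab
fields are my guess), with zfs_enable="YES" in /etc/rc.conf so that
rc.d/zfs takes care of the non-legacy file systems:

	zfs set mountpoint=legacy test/usr

and in /etc/fstab:

	test/usr	/usr	zfs	rw	0	0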
> 
>>> For now only the exports file. zpool.cache used to be there as well, but we
>>> need it in /boot/zfs/ to be able to have root-on-ZFS.
>> Yes, I see. It might be useful to make zpool.cache available in /etc/zfs
>> via a symlink, but this might also cause confusion or inconsistencies
>> and I see good reasons to maintain that file in /boot/zfs.
> 
> Hmm? What do you need zpool.cache in /etc/zfs/ for? This file is for ZFS
> internal use only; I see no reason to symlink it or do anything with it.
> ZFS should work in such a way that you shouldn't even know it exists, and
> we should be moving in that direction:)

No, I do not need it there ... but /etc/zfs appears to be the logical
place (if the file were not required for booting). I assume that
zpool.cache preserves the pool information across reboots and is not
specific to the ZFS root configuration (and thus could live in /etc/zfs,
if support for a ZFS root were not desired).

>> Ok, it took me quite a few hours to get ZFS installed the way I wanted
>> it, and it seems that ad0s2 and ad0s2c are quite different with regard
>> to their suitability to hold ZFS pools. Was this to be expected?
>>
>> Or is the diagnosis wrong and something else is responsible that it
>> works for me, now?
> 
> I don't know what the reason was, but ad0s2/ad0s2c should make no
> difference...

Well, it did in my case, but I'm not going to try it with ad0s2 again
right now (ENOTIME). I cannot explain it, but ad0s2c made it work for
me instantly, while a pool on ad0s2 always resulted in the FAULTED state
of the pool because the device ad0s2 could not be opened (see the output
from "zpool status" in my previous mail).

My disk drive is partitioned this way:

ad0
	ad0s1 (type 165)
		ad0s1a	256MB
		ad0s1b	768MB
		ad0s1c	1024MB (overlapping a and b)
	ad0s2 (type 165)
		ad0s2c	300GB (rest of disk)
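
The "c" partition in ad0s2 was created with bsdlabel, which by
convention makes "c" cover the whole slice; roughly:

	bsdlabel -w ad0s2	# write a standard label to the slice
	bsdlabel ad0s2		# shows "c" spanning the entire slice
	zpool create test ad0s2c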

Regards, STefan