Re: HEADSUP: GEOM label may be broken [Was Re: svn commit: r361838 - in head/sys/geom: . label]

From: Conrad Meyer <cem_at_freebsd.org>
Date: Sat, 6 Jun 2020 09:28:14 -0700
# geli attach /dev/gpt/testingeli
Enter passphrase:
GEOM_ELI: Device md0s1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: software
GEOM_LABEL[2]: Tasting md0s1.eli.
# ls -l /dev/gpt/testingeli*
lrwxr-xr-x 1 root wheel  8 Jun  6 09:22 /dev/gpt/testingeli -> ../md0s1
lrwxr-xr-x 1 root wheel 12 Jun  6 09:24 /dev/gpt/testingeli.eli -> ../md0s1.eli

https://reviews.freebsd.org/D25168

On Sat, Jun 6, 2020 at 8:52 AM Conrad Meyer <cem_at_freebsd.org> wrote:
>
> Hi Xin Li,
>
> Thank you for the report and diagnosis.  Sorry for the breakage.
>
> I have reverted the change and hope to address the issues you have identified.
>
> There seem to be two issues identified:
> 1. When ZFS attaches to a partition of a disk, the /dev/diskid label disappears.
> 2. geli(8) attach of symlink labels doesn't work
>
> (1) is interesting, as it does not seem to happen to UFS or swap consumers:
>
> # mount
> /dev/gpt/rootfs on / (ufs, local, noatime, soft-updates)
> # swapctl -l
> Device:       1024-blocks     Used:
> /dev/vtbd0p2    1048576         0
> # ls -l /dev/diskid/DISK-BHYVE-2613-8EFD-BAF4 /dev/gpt/{rootfs,swapfs}
> lrwxr-xr-x 1 root wheel  8 May 22 20:18
> /dev/diskid/DISK-BHYVE-2613-8EFD-BAF4 -> ../vtbd0
> lrwxr-xr-x 1 root wheel 10 May 22 20:18 /dev/gpt/rootfs -> ../vtbd0p3
> lrwxr-xr-x 1 root wheel 10 May 22 20:18 /dev/gpt/swapfs -> ../vtbd0p2
>
> I wonder what is different in the ZFS case here.
>
> (2)
>
> # ls -l /dev/gpt/testingeli
> lrwxr-xr-x 1 root wheel 8 Jun  6 08:33 /dev/gpt/testingeli -> ../md0s1
> # geli init /dev/gpt/testingeli
> Enter new passphrase:
> Reenter new passphrase:
> ...
> # geli attach /dev/gpt/testingeli
> Enter passphrase:
> geli: Provider gpt/testingeli is invalid.
> geli: There was an error with at least one provider.
>
> So at least the latter problem is straightforward to resolve.  I have
> a patch I'm testing locally now, and will upload to phabricator
> shortly.
>
> Best regards,
> Conrad
>
>
> On Sat, Jun 6, 2020 at 1:17 AM Xin Li <delphij_at_delphij.net> wrote:
> >
> > I just spent quite some time to revive my laptop.  TL;DR: if you are
> > using /dev/diskid or /dev/gptid labels and GELI, please wait until
> > things settled.
> >
> > ===
> >
> > On 6/5/20 9:12 AM, Conrad Meyer wrote:
> > > Author: cem
> > > Date: Fri Jun  5 16:12:21 2020
> > > New Revision: 361838
> > > URL: https://svnweb.freebsd.org/changeset/base/361838
> > >
> > > Log:
> > >   geom_label: Use provider aliasing to alias upstream geoms
> > >
> > >   For synthetic aliases (just pseudonyms inferred from metadata like GPT or
> > >   UFS labels, GPT UUIDs, etc), use the GEOM provider aliasing system to create
> > >   a symlink to the real device instead of creating an independent device.
> > >   This makes it more clear which labels and devices correspond, and we can
> > >   safely have multiple labels to a single device accessed at once.
> > >
> > >   The confusingly named geom_label on-disk construct continues to behave
> > >   identically to how it did before.
> > >
> > >   This requires teaching GEOM's provider aliasing about the possibility
> > >   that aliases might be added later in time, and GEOM's devfs interaction
> > >   layer not to worry about existing aliases during retaste.
> > >
> > >   Discussed with:     imp
> > >   Relnotes:   sure, if we don't end up reverting it
> >
> > This would break (and the effect can be quite persistent and hard to
> > repair, see explanation below) existing configuration as some GEOM
> > classes are not converted to support the new way of device
> > representation, so I'd like to request this change be reverted for now
> > until these are fixed.
> >
> > Consider the following configuration, where one have a hard drive
> > partitioned with GPT, like:
> >
> > =>        40  1953525088  ada1  GPT  (932G)
> >           40      262144     1  efi  (128M)
> >       262184     8388608     2  freebsd-zfs  (4.0G)
> >      8650792    67108864     3  freebsd-swap  (32G)
> >     75759656  1877765472     4  freebsd-zfs  (895G)
> >
> > Now, the first ZFS pool is created as root pool.  ZFS gets an exclusive
> > hold of 'ada1p2' despite the pool is carefully created to use
> > /dev/diskid/<DISKSERIAL>p2 instead of ada1p2.
> >
> > ZFS writes the new device path to vdev label.  For ZFS, this doesn't
> > matter much as it always checks the label.
> >
> > However, this will prevent GEOM from properly creating
> > /dev/diskid/<DISKSERIAL>.
> >
> > In order to prevent accidentally writing data to wrong disk, for "raw"
> > disk partitions it's usually a good idea to reference them with labels,
> > either by /dev/diskid or /dev/gptid.  With /dev/ada1p2 exclusively
> > accessed by ZFS, the /dev/diskid/<DISKSERIAL> representing the disk is
> > now gone.
> >
> > And to make the situation even worse, simply changing the partition
> > reference to the corresponding /dev/gptid/<uuid> doesn't really work
> > either, when one is encrypting partitions individually.  In the example
> > above, adap4 is an encrypted partition and with the alias change, one
> > can no longer "geli attach" them via /dev/gptid/<uuid>.  They can still
> > be attached by the "canonical" path (/dev/ada1p4), but for the swap
> > partition that would completely defeat the purpose of using label (to
> > prevent accidentally writing to the wrong disk).
> >
> > For my case, reverting to an older kernel is not sufficient to fix the
> > configuration, because ZFS is recording the "canonical" device path
> > (/dev/ada1p2) and /dev/diskid label for this disk disappeared somewhat
> > permanently (I would have to find a USB drive somewhere to fix the root
> > pool to use the "right" device path).
> >
> > >   Differential Revision:      https://reviews.freebsd.org/D24968
> > [...]
> > > Modified: head/sys/geom/label/g_label.c
> > > ==============================================================================
> > > --- head/sys/geom/label/g_label.c     Fri Jun  5 16:05:09 2020        (r361837)
> > > +++ head/sys/geom/label/g_label.c     Fri Jun  5 16:12:21 2020        (r361838)
> > > _at__at_ -344,18 +345,16 _at__at_ g_label_taste(struct g_class *mp, struct g_provider *p
> > >  {
> > >       struct g_label_metadata md;
> > >       struct g_consumer *cp;
> > > +     struct g_class *clsp;
> > >       struct g_geom *gp;
> > >       int i;
> > > +     bool changed;
> > >
> > >       g_trace(G_T_TOPOLOGY, "%s(%s, %s)", __func__, mp->name, pp->name);
> > >       g_topology_assert();
> > >
> > >       G_LABEL_DEBUG(2, "Tasting %s.", pp->name);
> > >
> > > -     /* Skip providers that are already open for writing. */
> > > -     if (pp->acw > 0)
> > > -             return (NULL);
> > > -
> > >       if (strcmp(pp->geom->class->name, mp->name) == 0)
> > >               return (NULL);
> > >
> > > _at__at_ -391,9 +390,16 _at__at_ g_label_taste(struct g_class *mp, struct g_provider *p
> > >               if (md.md_provsize != pp->mediasize)
> > >                       break;
> > >
> > > +             /* Skip providers that are already open for writing. */
> > > +             if (pp->acw > 0) {
> > > +                     g_access(cp, -1, 0, 0);
> > > +                     goto end;
> > > +             }
> > > +
> >
> > (Is this still necessary when the eventual provider would be the real one?)
> >
> > I think symlink aliasing is a the right direction but we need to be
> > really careful to not break existing and legitimate usage.  Since this
> > also breaks GELI when using with labels, I'd like to request that this
> > change be backed out for now until the consumers are correctly fixed.
> >
> > Cheers,
> >
Received on Sat Jun 06 2020 - 14:28:28 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:24 UTC