Re: fsid change of ZFS?

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Wed, 24 Aug 2011 09:36:37 -0400 (EDT)
Kostik Belousov wrote:
> On Wed, Aug 24, 2011 at 09:34:58PM +0900, Hiroki Sato wrote:
> > Kostik Belousov <kostikbel_at_gmail.com> wrote
> >   in <20110824082119.GJ17489_at_deviant.kiev.zoral.com.ua>:
> >
> > ko> On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek
> > wrote:
> > ko> > On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
> > ko> > > Pawel Jakub Dawidek wrote:
> > ko> > > > On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem
> > wrote:
> > ko> > > > > Ok, I'll admit I wasn't very fond of a fixed table that
> > would
> > ko> > > > > inevitably
> > ko> > > > > get out of date someday, either.
> > ko> > > > >
> > ko> > > > > I didn't think hashing for the cases not in the table
> > was worth the
> > ko> > > > > effort,
> > ko> > > > > but doing a hash instead of a table seems reasonable.
> > ko> > > > >
> > ko> > > > > I see that ZFS only uses the low order 8 bits, so I'll
> > try and come
> > ko> > > > > up
> > ko> > > > > with an 8bit hash solution and will post a patch for
> > testing/review
> > ko> > > > > soon.
> > ko> > > > >
> > ko> > > > > I don't think the vfs_sysctl() is that great a concern,
> > given that
> > ko> > > > > it
> > ko> > > > > appears to be deprecated already anyhow. (With an 8bit
> > hash,
> > ko> > > > > vfs_typenum
> > ko> > > > > won't be that sparse.) I'll also make sure that whatever
> > hash I use
> > ko> > > > > doesn't collide for the current list of file names
> > (although I will
> > ko> > > > > include
> > ko> > > > > code that handles a collision in the patch).
> > ko> > > >
> > ko> > > > Sounds great. Thanks!
> > ko> > > >
> > ko> > > Here's the patch. (Hiroki could you please test this,
> > thanks, rick.)
> > ko> > > ps: If the white space gets trashed, the same patch is at:
> > ko> > > http://people.freebsd.org/~rmacklem/fsid.patch
> > ko> >
> > ko> > The patch is fine by me. Thanks, Rick!
> > ko>
> > ko> Sorry, I am late.
> > ko>
> > ko> It seems that the probability of the collisions for the hash is
> > quite high.
> > ko> Due to the fixup procedure, the resulting typenum will depend on
> > the order
> > ko> of the module initialization, isn't it ? IMO, it makes the patch
> > goal not
> > ko> met.
> >
> >  I tried the following two experiments (the complete results are
> >  attached) to confirm the probability:
> >
> >  1. [fsidhash1.txt]
> > 	well-known vfc_name and the names "[a-z]fs" (# of names is 36)
> > 	with no fix-up recalculation.
> >
> >  2. [fsidhash2.txt]
> > 	well-known vfc_name and the names "[a-z][a-z]fs" (# of names is
> > 	710)
> > 	with no fix-up recalculation.
> >
> >  There is no collision in the case 1. And when [a-z][a-z]fs are
> >  included the average number of the collided names in the same hash
> >  value is 4.43 (i.e. 160 different hash values are generated, the
> >  theoretical best number is (710 entries / 256 buckets) = 2.77).
> >
> >  At least, vfc_names we currently have in our kernel code have no
> >  collision, fortunately. As you noticed "[a-z][a-z]fs" is an
> >  impractical data set and these results cannot explain the
> >  characteristics for all possible and practical vfc_names, so
> >  whether
> >  this hash is reasonable or not depends on how we think of them.
> >  Comments or other better idea?
> 
> Might be, add a new byte field to struct vfsconf, containing the
> suggested
> byte to be used by fsid. Divide the namespace into two half by the
> highest
> bit 7. Use the byte values with the highest bit clear for statically
> assignment for in-tree modules. Kernel will still check for the
> uniquiness
> on initialization.
> 
Well, doesn't this result in the same issue as the fixed table?
In other words, the developer has to supply the "suggested byte" for
fsid and make sure that it doesn't conflict with other "suggested byte"
values or suffer the same consequence as forgetting to update the fixed
table. (ie. It just puts the fixed value in a different place, from what
I see, for in-tree modules. Also, with a fixed table, they are all in
one place, so it's easy to choose a non-colliding value?)

> Reserve the value '0' to mean that kernel must assign the next unused
> value >= 128. This is done to support out-of-tree modules.
> 
> Might be, also reserve the value 0 to mean 'kernel
> Main drawback is that is sounds complicated.
Received on Wed Aug 24 2011 - 11:36:38 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:17 UTC