Re: ACL issue (Was Re: HEADS UP: ZFSv28 is in!)

From: Jeremy Chadwick <freebsd_at_jdc.parodius.com>
Date: Sun, 6 Mar 2011 08:23:42 -0800

On Sun, Mar 06, 2011 at 11:06:09AM -0500, Steve Wills wrote:
> On 03/06/11 10:37, Jeremy Chadwick wrote:
> > 
> > At first glance it looks like acl_set_fd_np(3) isn't working on an
> > md-backed filesystem; specifically, it's returning EOPNOTSUPP.  You
> > should be able to reproduce the problem by doing a setfacl on something
> > in /tmp/foobar.
> > 
> > Looking through src/bin/cp/utils.c, this is the code:
> > 
> > 420         if (acl_set_fd_np(dest_fd, acl, acl_type) < 0) {
> > 421                 warn("failed to set acl entries for %s", to.p_path);
> > 422                 acl_free(acl);
> > 423                 return (1);
> > 424         }
> > 
> > EOPNOTSUPP for acl_set_fd_np(3) is defined as:
> > 
> >      [EOPNOTSUPP]       The file system does not support ACL retrieval.
> > 
> > This would be referring to the destination filesystem.
> > 
> > Looking through the md(4) source for references to EOPNOTSUPP, we do
> > find some references:
> > 
> > $ egrep -n -r "EOPNOTSUPP|ENOTSUP" /usr/src/sys/dev/md
> > /usr/src/sys/dev/md/md.c:423:           return (EOPNOTSUPP);
> > /usr/src/sys/dev/md/md.c:475:                   error = EOPNOTSUPP;
> > /usr/src/sys/dev/md/md.c:523:           return (EOPNOTSUPP);
> > /usr/src/sys/dev/md/md.c:601:           return (EOPNOTSUPP);
> > /usr/src/sys/dev/md/md.c:731:                           error = EOPNOTSUPP;
> > 
> > Line 423 is within mdstart_malloc(), and it returns EOPNOTSUPP on any
> > BIO operation other than READ/WRITE/DELETE.  Line 475 is a continuation
> > of that.
> > 
> > Line 523 is within mdstart_vnode(), behaving effectively the same as
> > line 423.  Line 601 is within mdstart_swap(), behaving effectively the
> > same as line 423.
> > 
> > Line 731 is within md_kthread(), and is only reached for BIO_GETATTR
> > requests: it returns EOPNOTSUPP when the requested attribute isn't one
> > the driver handles.  This would not be an "ACL attribute" thing, but
> > rather getting attributes of the backing device itself.  The code
> > hints at that:
> > 
> >  722                 if (bp->bio_cmd == BIO_GETATTR) {
> >  723                         if ((sc->fwsectors && sc->fwheads &&
> >  724                             (g_handleattr_int(bp, "GEOM::fwsectors",
> >  725                             sc->fwsectors) ||
> >  726                             g_handleattr_int(bp, "GEOM::fwheads",
> >  727                             sc->fwheads))) ||
> >  728                             g_handleattr_int(bp, "GEOM::candelete", 1))
> >  729                                 error = -1;
> >  730                         else
> >  731                                 error = EOPNOTSUPP;
> >  732                 } else {
> 
> Thanks for the investigation! So this seems to be a bug in md? That's
> too bad, I was enjoying using it to make my tinderbox builds faster.

Sorry, I should have been more clear -- my investigation wasn't to
determine if the issue you're reporting was a bug or not, but more along
the lines of "hmm, where is userland getting EOPNOTSUPP from in the
kernel in this situation?"  It could be that some piece hasn't been
implemented somewhere yet (more an "incomplete" than a bug :-) ).
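
For anyone following along, the pattern at md.c line 423 boils down to
roughly the following.  This is a standalone userland sketch, not the
kernel code: the enum and the function name are made up for
illustration, and only the READ/WRITE/DELETE-or-EOPNOTSUPP shape comes
from what I described above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's BIO_* command values. */
enum sketch_bio_cmd {
	SKETCH_BIO_READ,
	SKETCH_BIO_WRITE,
	SKETCH_BIO_DELETE,
	SKETCH_BIO_GETATTR,
	SKETCH_BIO_FLUSH
};

/*
 * Hypothetical dispatch helper: returns 0 for the commands the backend
 * handles and EOPNOTSUPP for everything else.
 */
int
sketch_mdstart(enum sketch_bio_cmd cmd)
{
	switch (cmd) {
	case SKETCH_BIO_READ:
	case SKETCH_BIO_WRITE:
	case SKETCH_BIO_DELETE:
		return (0);		/* the real code would do the I/O here */
	default:
		return (EOPNOTSUPP);	/* anything else is rejected */
	}
}
```

Note that everything going through a path like this is a block-level
command.  Nothing ACL-specific shows up at this layer, which is part of
why I doubt these particular EOPNOTSUPP returns are what userland is
seeing.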

I tend to trace source the way I did above in hopes that someone (kernel
dev, etc.) will chime in and go "Oh, yes, THAT... let me tell you about
that!"  It's also for educational purposes; I figure sharing the innards
along with some simple descriptions might help people feel more
comfortable (vs. thinking everything is a black box; don't let the magic
smoke out!).  Sometimes digging through the code helps.

> > This leaves me with some ideas; just tossing them out here...
> > 
> > 1. Maybe/somehow this is caused by swap being used as the backing
> >    type/store for md(4)?  Try using "mdconfig -t malloc -o reserve"
> >    instead, temporarily anyway.
> 
> Seems to be the same.

I'm not too surprised, but at least that rules out the choice of
backing store (swap vs. malloc) being somehow responsible.

I'm not a user of ACLs myself, but Robert Watson might know what's up
with this, or where to go looking.  I've CC'd him here.

> > 2. Are you absolutely 100% sure the kernel you're using was built
> >    with "options UFS_ACL" defined in it?  Doing a "strings -a
> >    /boot/kernel/kernel | grep UFS_ACL" should suffice.
> > 
> 
> Yep, it does:
> 
> % strings -a /boot/kernel/kernel | grep UFS_ACL
> options UFS_ACL
> 
> (My kernel config is just "include GENERIC" then a bunch of "nooptions"
> for KDB, DDB, GDB, INVARIANTS, WITNESS, etc.)

Cool, good to rule out the obvious.  Thanks.

The only other thing I can think of off the top of my head would be to
"ktrace -t+ -i" the cp -p, then provide the output of "kdump -s -t+"
afterwards.  I wouldn't go down that road quite yet (it may not even
help determine what's going on); maybe wait for Robert to take a look
first.
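
Whichever way the failure gets traced, the thing to pin down is whether
the errno coming back really means "not supported" as opposed to
something like a permissions problem.  A rough sketch of that check
follows.  Both helper names are hypothetical (they are not in cp or
libc), and the message format just mimics the warn() call quoted from
utils.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: nonzero if err means "operation not supported". */
int
err_is_unsupported(int err)
{
	/*
	 * ENOTSUP and EOPNOTSUPP can share a value on some systems, so
	 * compare with || rather than using two case labels in a switch.
	 */
	return (err == EOPNOTSUPP || err == ENOTSUP);
}

/* Hypothetical helper: build a cp-style diagnostic for a failed ACL set. */
int
describe_acl_failure(char *buf, size_t len, const char *path, int err)
{
	return (snprintf(buf, len, "failed to set acl entries for %s: %s%s",
	    path, strerror(err),
	    err_is_unsupported(err) ? " (no ACL support at destination?)" : ""));
}
```

Calling describe_acl_failure(buf, sizeof(buf), path, errno) right after
a failed acl_set_fd_np(3) would flag the "unsupported" case explicitly
and leave other errors alone.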

-- 
| Jeremy Chadwick                                   jdc_at_parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
Received on Sun Mar 06 2011 - 15:23:47 UTC