Re: files disappearing from ls on NFS

From: Hartmut Brandt <hartmut.brandt_at_dlr.de>
Date: Tue, 7 May 2013 09:12:20 +0200
On Mon, 6 May 2013, Rick Macklem wrote:

RM>Hartmut Brandt wrote:
RM>> Hi Rick,
RM>> 
RM>> the patch doesn't help. So how can I help to fix that? Of course, I
RM>> can use the work-around with oldnfs, but ...
RM>> 
RM>Well, I plan on going through the readdir code and seeing if I can spot
RM>a case that would break for small RPC replies. If I can find something,
RM>I'll email you a patch for testing. (I can't seem to reproduce the problem
RM>here.)
RM>
RM>The mysterious part for me is why it has shown up recently, because there
RM>hasn't been any recent change committed that seems like it could cause this.
RM>(Maybe it is just a co-incidence that it showed up recently and the bug has
RM> been there all along?)
RM>
RM>I'll admit my worst fear is that it is somehow caused by the switch to clang
RM>for certain arches. If that is the case, it could take a long time to isolate.

I'm quite sure that I built the system with clang already in February.
But in March or so a new clang version was committed.

harti

RM>> -----Original Message-----
RM>> From: Rick Macklem [mailto:rmacklem_at_uoguelph.ca]
RM>> Sent: Saturday, May 04, 2013 11:33 PM
RM>> To: Brandt, Hartmut
RM>> Cc: current_at_freebsd.org; Andrzej Tobola
RM>> Subject: Re: files disappearing from ls on NFS
RM>> 
RM>> Hartmut Brandt wrote:
RM>> > On Fri, 3 May 2013, Rick Macklem wrote:
RM>> >
RM>> > RM>Ok, if you succeed in isolating the commit, that would be great.
RM>> >
RM>> > Hmm. I'm somewhat stuck. clang from yesterday can't compile clang
RM>> > from a month ago...
RM>> >
RM>> > harti
RM>> >
RM>> Oh well. You could try this patch (which is the one to fix readdir for
RM>> union mounts), since I can see that VOP_VPTOCNP() will also be broken
RM>> without it. (I can't see how that would break "ls", but it breaks
RM>> __getcwd() and friends, so maybe it can affect "ls" somehow?)
RM>> 
RM>> It's a cut/paste under windows, so I'm afraid the whitespace will be
RM>> messed up, but it's pretty simple to apply by hand.
RM>> 
RM>> Index: nfs_clvnops.c
RM>> ===================================================================
RM>> --- nfs_clvnops.c (revision 249568)
RM>> +++ nfs_clvnops.c (working copy)
RM>> @@ -2221,6 +2221,7 @@
RM>> !NFS_TIMESPEC_COMPARE(&np->n_mtime, &vattr.va_mtime)) {
RM>> mtx_unlock(&np->n_mtx);
RM>> NFSINCRGLOBAL(newnfsstats.direofcache_hits);
RM>> + *ap->a_eofflag = 1;
RM>> return (0);
RM>> } else
RM>> mtx_unlock(&np->n_mtx);
RM>> @@ -2233,8 +2234,10 @@
RM>> tresid = uio->uio_resid;
RM>> error = ncl_bioread(vp, uio, 0, ap->a_cred);
RM>> 
RM>> - if (!error && uio->uio_resid == tresid)
RM>> + if (!error && uio->uio_resid == tresid) {
RM>> NFSINCRGLOBAL(newnfsstats.direofcache_misses);
RM>> + *ap->a_eofflag = 1;
RM>> + }
RM>> return (error);
RM>> }
RM>> 
RM>> I haven't yet succeeded in reproducing the problem, but will be poking
RM>> at it some more, rick
RM>> 
RM>> > RM>
RM>> > RM>rick
RM>> > RM>
RM>> > RM>> harti
RM>> > RM>>
RM>> > RM>> On Fri, 3 May 2013, Rick Macklem wrote:
RM>> > RM>>
RM>> > RM>> RM>Hartmut Brandt wrote:
RM>> > RM>> RM>> Hi,
RM>> > RM>> RM>>
RM>> > RM>> RM>> I've updated one of my -current machines this week (previous
RM>> > RM>> RM>> update was in February). Now I see a strange effect (it seems
RM>> > RM>> RM>> only on NFS mounts): ls or even echo * will list only some files
RM>> > RM>> RM>> (strangely enough, the first files from the normal,
RM>> > RM>> RM>> alphabetically ordered list). If I change something in the
RM>> > RM>> RM>> directory (delete a file or create a new one), the complete
RM>> > RM>> RM>> listing will appear for some time, but after some time (seconds
RM>> > RM>> RM>> to a minute or so) only part of the files is listed again.
RM>> > RM>> RM>>
RM>> > RM>> RM>> A ktrace on ls /usr/src/lib/libc/gen shows that getdirentries is
RM>> > RM>> RM>> called only once (returning 4096). For a full listing,
RM>> > RM>> RM>> getdirentries is called 5 times, with the last call returning 0.
RM>> > RM>> RM>>
RM>> > RM>> RM>> I can still open files that are not listed if I know their name,
RM>> > RM>> RM>> though.
RM>> > RM>> RM>>
RM>> > RM>> RM>> The NFS server is a Windows 2008 server with an OpenText NFS
RM>> > RM>> RM>> Server, which works without problems with all the other FreeBSD
RM>> > RM>> RM>> machines.
RM>> > RM>> RM>>
RM>> > RM>> RM>> So what could that be?
RM>> > RM>> RM>>
RM>> > RM>> RM>Someone else reported missing files returned via "ls" recently,
RM>> > RM>> RM>when they used a small readdirsize (below 8K). I haven't yet had
RM>> > RM>> RM>a chance to try and reproduce it or do any snooping around.
RM>> > RM>> RM>
RM>> > RM>> RM>There haven't been any recent changes to readdir in the NFS
RM>> > RM>> RM>client, except a trivial one that adds a check for vnode type
RM>> > RM>> RM>being VDIR, so I don't see that it can be a recent NFS change.
RM>> > RM>> RM>
RM>> > RM>> RM>If you can increase the readdirsize, try that to see if it
RM>> > RM>> RM>avoids the problem. "nfsstat -m" shows you what the mount options
RM>> > RM>> RM>end up being after doing the mount. The server might be limiting
RM>> > RM>> RM>the readdirsize to 4K, so you should check, even if you specify a
RM>> > RM>> RM>large value for the mount.
RM>> > RM>> RM>
RM>> > RM>> RM>rick
RM>> > RM>> RM>
RM>> > RM>> RM>> Regards,
RM>> > RM>> RM>> harti
RM>> > RM>> RM>> _______________________________________________
RM>> > RM>> RM>> freebsd-current_at_freebsd.org mailing list
RM>> > RM>> RM>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
RM>> > RM>> RM>> To unsubscribe, send any mail to
RM>> > RM>> RM>> "freebsd-current-unsubscribe_at_freebsd.org"
RM>> > RM>> RM>
RM>> > RM>
RM>
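
A note on the ktrace observation quoted above (getdirentries called once,
returning 4096, versus five times with the last call returning 0): the libc
directory routines keep reading until the kernel reports end-of-directory,
so a premature end-of-directory shows up as a short listing rather than as
an error. The following is only a minimal sketch for comparing entry counts
between a working and an affected client. It assumes nothing beyond POSIX
opendir(3)/readdir(3); count_dir_entries is a hypothetical helper name, not
something from the thread or from the FreeBSD sources.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/*
 * Count the entries in a directory by calling readdir(3) until it
 * reports end-of-directory. Returns the count (including "." and ".."
 * if the filesystem reports them), or -1 if the directory cannot be
 * opened or a read error occurs.
 *
 * count_dir_entries is a hypothetical name used for illustration only.
 */
long
count_dir_entries(const char *path)
{
	DIR *dirp;
	long count = 0;

	assert(path != NULL);

	dirp = opendir(path);
	if (dirp == NULL)
		return (-1);

	for (;;) {
		errno = 0;	/* lets us tell end-of-directory from an error */
		if (readdir(dirp) == NULL)
			break;
		count++;
	}
	if (errno != 0) {
		/* readdir(3) failed rather than reaching the end */
		closedir(dirp);
		return (-1);
	}
	closedir(dirp);
	return (count);
}
```

Calling count_dir_entries() on the same NFS directory from an affected
client and from a working one, and comparing the two results, would be one
way to confirm a truncated listing without relying on the output of ls.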
Received on Tue May 07 2013 - 05:12:09 UTC
