Re: Data corruption over NFS in -current

From: Daniel Braniss <danny_at_cs.huji.ac.il>
Date: Thu, 12 Jan 2012 10:19:01 +0200
> 
> Stefan Bethke wrote on Wed, Jan 11, 2012 at 07:14:44PM +0100: 
> > Am 11.01.2012 um 17:57 schrieb Martin Cracauer:
> > 
> > > I'm sorry for the unspecific bug report but I thought a heads-up is
> > > better than none.
> > > 
> > > $ uname -a
> > > FreeBSD wings.cons.org 10.0-CURRENT FreeBSD 10.0-CURRENT #2: Wed Dec
> > > 28 12:19:21 EST 2011
> > > cracauer_at_wings.cons.org:/usr/src/sys/amd64/compile/WINGS  amd64
> > 
> > I'm sure Rick will want to know which NFS version, which client code (default new code I'm assuming) and which mount options...
> 
> It's all default both in fstab and as reported by mount(8).
> 
> This is a diskless PXE boot but the mount affected (usr) is not the
> root filesystem, so this should come in via fstab.
> 
> BTW, my /usr/ports is another mount so the corruption is cross-mount
> (garbage from /usr/ports entering /usr).
> 
> Appending nfsstat output.
> 
> I am re-running things continuously to see how reproducible this is.
> This machine was recently updated from a -current almost a year old,
> so it's its first time with the new NFS client code.
> 
> Martin
I've seen similar problems, but they were always caused by programs running
out of resources and not reporting it correctly. On dataless machines this
happens especially when memory runs out and there is no swap available.
BTW, most of my servers are dataless (they boot via PXE but have local
swap, var, etc.).
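
To make the "not reporting it correctly" point concrete, here is a minimal
sketch (not something from the scripts in question) of a write that refuses
to fail silently. The helper name checked_write is hypothetical. It assumes
that forcing the data out with fsync makes deferred errors such as ENOSPC
visible to the caller; the read-back may be served from the client cache, so
a match is only a weak sanity check, not proof that the server has good data.

```python
import os


def checked_write(path, data):
    """Write bytes to path; raise if a step fails or the read-back differs.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        # Push pending writes to the server so that deferred errors
        # (for example ENOSPC) are raised here instead of being lost.
        os.fsync(f.fileno())
    # Weak sanity check: this may be answered from the local cache.
    with open(path, "rb") as f:
        if f.read() != data:
            raise IOError("read-back mismatch for %s" % path)
    return len(data)
```

A shell pipeline that ignores exit codes gives none of these guarantees, so
a short or failed write can go unnoticed until the file is read later.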

hth,
	danny


> 
> > > I see filesystem corruption on NFS filesystems here.  I am running a
> > > heavy shellscript that is noodling around with ascii files assembling
> > > them with awk and whatnot.  Some actions are concurrent with up to 21
> > > forks doing full-CPU load scripting.  This machine is a K8 with a
> > > total of 8 cores, diskless NFS and memory filesystem for /tmp.
> > > 
> > > I observe two problems:
> > > - for no reason whatsoever, some files change from my 
> > >  (user/group) cracauer/wheel to root/cracauer
> > > - the same files will later be corrupted.  The beginning of the file
> > >  is normal but then it has what looks like parts of /usr/ports,
> > >  including our CVS files and binary junk, mostly zeros
> > > 
> > > I did do some ports building lately but not at the same time that this
> > > problem manifested itself.  I speculate some ports blocks were still
> > > resident in the filesystem buffer cache.
> > > 
> > > Server is Linux.
> > > 
> > > Martin
> > > -- 
> > > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > > Martin Cracauer <cracauer_at_cons.org>   http://www.cons.org/cracauer/
> > > _______________________________________________
> > > freebsd-current_at_freebsd.org mailing list
> > > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > > To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
> > 
> > -- 
> > Stefan Bethke <stb_at_lassitu.de>   Fon +49 151 14070811
> > 
> > 
> > 
> 
> -- 
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> Martin Cracauer <cracauer_at_cons.org>   http://www.cons.org/cracauer/
> 
> 
> Client Info:
> Rpc Counts:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>  94392942    513117   3637266      2577  40227237   2824593    333832    304567
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>     32522      5121      4856     20363     13954    179035         0   3534382
>     Mknod    Fsstat    Fsinfo  PathConf    Commit
>         5  21127240         3      2999    521782
> Rpc Info:
>  TimedOut   Invalid X Replies   Retries  Requests
>         0         0         0         0 167678419
> Cache Info:
> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> 1933340911  73265447 1123380719   3636242  90975094    450509   4917135   2824593
> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
>  54732346      2577    599049    142917    352394         0 733726346   3534382
> 
> Server Info:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>         0         0         0         0         0         0         0         0
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>         0         0         0         0         0         0         0         0
>     Mknod    Fsstat    Fsinfo  PathConf    Commit
>         0         0         0         0         0
> Server Ret-Failed
>                 0
> Server Faults
>             0
> Server Cache Stats:
>    Inprog      Idem  Non-idem    Misses
>         0         0         0         0
> Server Write Gathering:
>  WriteOps  WriteRPC   Opsaved
>         0         0         0
> 
Received on Thu Jan 12 2012 - 07:38:22 UTC
