Re: newnfs pkgng database corruption?

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Fri, 12 Apr 2013 23:16:43 -0400 (EDT)
Baptiste Daroussin wrote:
> On Fri, Apr 12, 2013 at 12:56:10PM +0000, Eggert, Lars wrote:
> > Hi,
> >
> > On Apr 12, 2013, at 1:10, Rick Macklem <rmacklem_at_uoguelph.ca> wrote:
> > > Well, I have no idea why an NFS server would reply errno 70 if the
> > > file still exists, unless the client has somehow sent a bogus file
> > > handle to the server. (I am not aware of any client bug that might
> > > do that. I am almost suspicious that there might be a memory problem
> > > or something that corrupts bits in the network layer. Do you have
> > > TSO enabled for your network interface by any chance? If so, I'd try
> > > disabling that on the network interface. Same goes for checksum
> > > offload.)
> > >
> > > rick
> > > ps: If you can capture packets between the client and server at the
> > >    time this error occurs, looking at them in wireshark might be
> > >    useful?
> >
> > I will try all of those things.
> >
You might still try the above suggestions, but since Error 70 wasn't an
errno.h error number (where 70 would be ESTALE), it isn't a stale file
handle problem and, as such, there isn't any evidence that bits are
getting messed with by the network layers.

rick
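
The confusion comes from the number 70 meaning different things in two
namespaces: as a process exit status it is EX_SOFTWARE (sysexits.h), and as
an errno on FreeBSD it is ESTALE. A minimal Python sketch of that
distinction follows; the helper name `explain` is made up for illustration,
and the errno lookup reflects whatever platform the code runs on (on Linux,
errno 70 is not ESTALE).

```python
import errno
import os

# sysexits.h defines EX_SOFTWARE as 70. os.EX_SOFTWARE exists on Unix only,
# so fall back to the literal value elsewhere.
EX_SOFTWARE = getattr(os, "EX_SOFTWARE", 70)


def explain(code, namespace):
    """Return a symbolic name for `code` (hypothetical helper).

    namespace is "exit" for a sysexits.h process exit status, or
    "errno" for an errno.h error number on the running platform.
    """
    if namespace == "exit":
        return "EX_SOFTWARE" if code == EX_SOFTWARE else "exit status %d" % code
    if namespace == "errno":
        return errno.errorcode.get(code, "errno %d" % code)
    raise ValueError("unknown namespace: %r" % namespace)


if __name__ == "__main__":
    # Same number, two unrelated meanings.
    print("exit 70  ->", explain(70, "exit"))
    print("errno 70 ->", explain(70, "errno"))  # platform dependent
```

A tool that prints only "Error 70" without saying which namespace it means
invites exactly this misreading.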

> > But first, a question that someone who understands pkgng will be
> > able to answer: Is this "fake-pkg" process even running on the NFS
> > mount? The WRKDIR is /tmp, which is an mfs mount.
> 
> fake-pkg is run in WRKDIR, but it calls pkgng, which will open
> /var/db/pkg/local.sqlite, i.e. a file on the NFS mount.
> 
> The Error 70 is EX_SOFTWARE returned by pkgng.
> 
> Can you try the following patch:
> http://people.freebsd.org/~bapt/patch-libpkg__pkgdb.c
> 
> Just add that file to /usr/ports/ports-mgmt/pkg/files/
> 
> If that works for you, that means POSIX advisory locks are somehow
> failing on nfsv4 files.
> 
> Given it is already known to be failing on nfsv3 (because people often
> misconfigure it) I'll probably make unix-dotfile the default locking
> system when local.sqlite is stored on a network filesystem.
> 
> regards,
> Bapt
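
For readers unfamiliar with the "unix-dotfile" locking system mentioned
above: it is one of SQLite's built-in Unix VFS variants, and it uses a lock
file next to the database instead of POSIX advisory (fcntl) locks. The
sketch below shows one way to select it from Python via an SQLite URI
filename. It is not the patch from this thread (which is not reproduced
here) and not pkgng's code; the helper names and the `packages` table are
invented for illustration, and it assumes an SQLite build on a Unix-like
system where that VFS is registered.

```python
import os
import sqlite3
from urllib.parse import quote


def open_with_dotfile_locking(path):
    """Open an SQLite database using the unix-dotfile VFS (hypothetical helper).

    The vfs= query parameter of an SQLite URI filename selects the VFS,
    so locking no longer depends on fcntl() advisory locks.
    """
    uri = "file:" + quote(os.path.abspath(path)) + "?vfs=unix-dotfile"
    return sqlite3.connect(uri, uri=True)


def roundtrip(directory):
    """Create a small database in `directory`, write a row, and read it back."""
    conn = open_with_dotfile_locking(os.path.join(directory, "local.sqlite"))
    try:
        conn.execute("CREATE TABLE packages (name TEXT)")
        conn.execute("INSERT INTO packages VALUES (?)", ("pkg",))
        conn.commit()
        return conn.execute("SELECT name FROM packages").fetchall()
    finally:
        conn.close()
```

Calling `roundtrip()` on a directory that lives on the NFS mount and again
on a local directory is a quick way to compare behaviour. Dot-file locking
needs write access to the directory that holds the database.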
Received on Sat Apr 13 2013 - 01:16:50 UTC