Re: process killed: text file modification

From: Ian Lepore <ian_at_freebsd.org>
Date: Sun, 12 Mar 2017 11:47:19 -0600
On Thu, 2017-03-09 at 21:07 +0100, Gergely Czuczy wrote:
> 
> On 2017. 03. 09. 20:47, Gergely Czuczy wrote:
> > 
> > 
> > 
> > On 2017. 03. 09. 19:44, John Baldwin wrote:
> > > 
> > > On Thursday, March 09, 2017 03:31:56 PM Gergely Czuczy wrote:
> > > > 
> > > > [+freebsd-fs]
> > > > 
> > > > 
> > > > On 2017. 03. 09. 14:20, Gergely Czuczy wrote:
> > > > > 
> > > > > On 2017. 03. 09. 11:27, Gergely Czuczy wrote:
> > > > > > 
> > > > > > Hello,
> > > > > > 
> > > > > > I'm trying to build a few things from ports on an rpi3, the
> > > > > > ports
> > > > > > collection is mounted over NFS from another machine. When
> > > > > > it's trying
> > > > > > to build pkg i'm getting the error message in syslog:
> > > > > > 
> > > > > > rpi3 kernel: pid 4451 (sh), uid 0, was killed: text file
> > > > > > modification
> > > > > > 
> > > > > > The report to pkg_at_:
> > > > > > https://lists.freebsd.org/pipermail/freebsd-pkg/2017-March/
> > > > > > 002048.html 
> > > > > > 
> > > > > > 
> > > > > > In ports-mgmt/pkg's config.log It fails at the following
> > > > > > entry:
> > > > > > configure:3726: checking whether we are cross compiling
> > > > > > configure:3734: cc -o conftest -O2 -pipe  -Wno-error
> > > > > > -fno-strict-aliasing   conftest.c  >&5
> > > > > > configure:3738: $? = 0
> > > > > > configure:3745: ./conftest
> > > > > > configure:3749: $? = 137
> > > > > > configure:3756: error: in 
> > > > > > `/usr/ports/ports-mgmt/pkg/work/pkg-1.10.0':
> > > > > > configure:3760: error: cannot run C compiled programs.
> > > > > > If you meant to cross compile, use `--host'.
> > > > > > See `config.log' for more details
> > > > > > 
> > > > > > # uname -a
> > > > > > FreeBSD rpi3 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r314949:
> > > > > > Thu Mar 9
> > > > > > 08:58:46 CET 2017
> > > > > > aegir_at_marvin.harmless.hu:/tank/rpi3/crochet/work/obj/arm64.
> > > > > > aarch64/tank/rpi3/src/sys/AEGIR 
> > > > > > 
> > > > > > arm64
> > > > > So far, a few additions:
> > > > > Time is synced between the NFS server and the client.
> > > > > it's an open() call which is getting the kill, and it's not
> > > > > the file
> > > > > what's being opened, but the process executing it.
> > > > > Here's a simple code that reproduces it:
> > > > > #include <stdio.h>
> > > > > 
> > > > > int main() {
> > > > > 
> > > > >    FILE *f = fopen ("/bar", "w");
> > > > > 
> > > > >    fclose(f);
> > > > >    return 0;
> > > > > }
> > > > > 
> > > > > Conditions to reproduce it:
> > > > >   - The resulting binary must be executed from the nfs mount
> > > > >   - The binary must be built after mounting the NFS share.
> > > > > 
> > > > > I haven't tried building it on a different host, I don't have
> > > > > access
> > > > > to multiple RPis. Also, if I build the binary, umount/remount
> > > > > the NFS
> > > > > mount point, which has the binary, execute it, then it works.
> > > > > 
> > > > > I've also tried this with the raspbsd.org's image, I could
> > > > > reproduce
> > > > > it as well.
> > > > > 
> > > > > Another interesting thing is, when I first booted the RPi up,
> > > > > the NFS
> > > > > server was a 10.2-STABLE, and later got updated to 11-STABLE. 
> > > > > While it
> > > > > was 10.2 I've tried to build some port, and I don't remember
> > > > > having
> > > > > this issue.
> > > > > 
> > > > > So, could someone please help me figure this out and fix it?
> > > > > This
> > > > > stuff should work pretty much.
> > > > > 
> > > > So, this error message comes from here:
> > > > https://svnweb.freebsd.org/base/head/sys/fs/nfsclient/nfs_clbio
> > > > .c?revision=314436&view=markup#l1674 
> > > > 
> > > > 
> > > > It's the NFS_TIMESPEC_COMPARE(&np->n_mtime, &np-
> > > > >n_vattr.na_mtime)
> > > > comparision that fails, np should be the NFS node structure,
> > > > from the
> > > > vnode's v_data, and n_vattr is the attribute cache. As I've
> > > > seen these
> > > > two are being updated together, so I don't really see by the
> > > > code why
> > > > they might differ. Could someone please take a look at it, with
> > > > more
> > > > experience in the NFS code? -czg
> > > Can you print out the two mtimes?  I wonder if what's happening
> > > is that
> > > your server uses different granularity (for example just seconds)
> > > than
> > > your client, so on the client we generate a timestamp with a non-
> > > zero
> > > nanoseconds but when the server receives that timestamp it
> > > "truncates"
> > > it.  During open() we forcefully re-fetch the timestamp (for CTO
> > > consistency) and then notice it doesn't match.  For now I would
> > > start
> > > with comparing the timestamps and maybe the
> > > vfs.timestamp_precision
> > > sysctls on client and server (if server is a FreeBSD box).
> > Here are the time values:
> > Mar  9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 + 
> > -3298114786336  &np->n_vattr.na_mtime: -3298114786616 +
> > -3298114786608
> > Mar  9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed:
> > text 
> > file modification
> > Mar  9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 + 
> > -3298114786336  &np->n_vattr.na_mtime: -3298114786616 +
> > -3298114786608
> > Mar  9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed:
> > text 
> > file modification
> > 
> > Printed this way:
> >                          printf("np->n_mtime: %ji + %ji 
> > &np->n_vattr.na_mtime: %ji + %ji",
> > (intmax_t)(&np->n_mtime.tv_sec),
> > (intmax_t)(&np->n_mtime.tv_nsec),
> > (intmax_t)(&np->n_vattr.na_mtime.tv_sec),
> > (intmax_t)(&np->n_vattr.na_mtime.tv_nsec));
> Sorry, I made a typo there. Here's it now:
> Mar  9 20:05:35 rpi3 kernel: np->n_mtime: 1489089935 + 219323000 
> &np->n_vattr.na_mtime: 1489089935 + 221438000
> Mar  9 20:05:35 rpi3 kernel: pid 847 (csh), uid 0, was killed: text
> file 
> modification
> Mar  9 20:05:35 rpi3 kernel: np->n_mtime: 1489089935 + 219323000 
> &np->n_vattr.na_mtime: 1489089935 + 221438000
> Mar  9 20:05:35 rpi3 kernel: pid 847 (csh), uid 0, was killed: text
> file 
> modification
> 
> That's a difference of 2115 micro seconds.

I think this is a problem that sysctl vfs.timestamp_precision is
supposed to help solve.  Unfortunately, that knob doesn't provide very
much control; you can truncate timestamps to full seconds,
microseconds, or 1/Hz which may be milliseconds on many systems.
 You've got a difference of about 2ms, so only the full seconds
granularity might help.

It's also not clear to me whether that setting would have to be changed
on the server, or the client, or both.

-- Ian
Received on Sun Mar 12 2017 - 16:47:23 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:10 UTC