Re: r367672 broke the NFS server

From: Rick Macklem <rmacklem_at_uoguelph.ca> Date: Thu, 31 Dec 2020 05:16:27 +0000 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:26 UTC

Rick Macklem wrote:
>Kostik wrote:
> >
> >Idea of the change is to restart the syscall at top level.  So for NFS
> >server the right approach is to not send a response and also to not
> >free the request mbuf chain, but to restart processing.
> Yes. I took a look and I think restarting the operation by rolling the
> working position in the mbuf lists back and redoing the operation
> is feasible and easier than fixing the individual operations.
>
> For NFSv4, you cannot redo the entire compound, since non-idempotent
> operations like exclusive open may have already been completed.
> However, rolling back to the beginning of the operation should be
> doable.
Turned out to be quite easy. I'll stick a patch up on phabricator
tomorrow, after I do a little more testing.
NFSv4.0 is still broken, because it screws up the seqid, but I can
fix that separately.

I do see the code looping about 2-3 times before it gets a successful
ufs_create(). Does that sound reasonable?
Here's some debug printfs for the test run of 4 concurrent compiles.
(proc=8 is create and proc=12 is remove. Each line is a ERELOOKUP
 retry. This is for the 4 threads, but I had the thread tid in another printf
 and it showed 2-3 attempts for the same thread. They should be serialized
 by the exclusive lock on the directory vnode.)
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=12
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=12
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=12
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=12
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=12
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8
tryag3 stat=0 proc=12
tryag3 stat=0 proc=8
tryag3 stat=0 proc=8

Thanks for the suggestion, Kostik.

rick

> --> It will serve as a good test, in that it may expose bugs in the
>       RPC/operation code where failure (ERELOOKUP) doesn't clean
>       things up correctly.
>       --> In NFSv4, there is the open/lock state that cannot be updated
>             for this error case. (The seqid stuff in NFSv4.0 Open can be fun.
>             Its used to serialize the operations and the number must be
>             incremented for some errors, but not for others. The 10026
>             error occurs when you don't get this right.)
Note that ERELOOKUP error can only show up from the VOPs that modify the volume.
Otherwise we simply do not call into SU.  In particular, I believe that opens
in the sense of NFS are safe.

Regardless of it, there should be either a catch-all check for ERELOOKUP,
or assert that ERELOOKUP did not leaked, as it is done for syscalls

>
> I'll start working on this to-day, but I have no idea how long it might
> take?
>
> >I am sorry I forgot about NFS server when designing this fix, the only
> >mild excuse I can provide is that the change was quite complicated as is.
> >I will start looking at the fix.
> No problem. Sometimes I'd like to forget about NFS too;-).
>
> For the rollback/redo the RPC/operation case, it's probably easier for me
> to do it. As above, I'll start on it, but...
>
> My main concern is how long it will take, given the FreeBSD13 release
> starts soon.
For sure I will help you if needed, and I believe that we could ask for
testing from Peter.
_______________________________________________
freebsd-current_at_freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"