Re: EDEADLK from fcntl(F_SETFL) ?

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Fri, 4 Jul 2014 12:32:51 +0300
On Thu, Jul 03, 2014 at 07:15:51PM -0700, Adrian Chadd wrote:
> Hi,
> 
> I'm currently testing this out. It seems to be working out alright.
> 
> adrian@test3:~/work/freebsd % svn diff stable/10/src/sys/kern/
> Index: stable/10/src/sys/kern/kern_lockf.c
> ===================================================================
> --- stable/10/src/sys/kern/kern_lockf.c (revision 267627)
> +++ stable/10/src/sys/kern/kern_lockf.c (working copy)
> @@ -1425,6 +1425,14 @@
>                         if (lockf_debug & 1)
>                                 lf_print("lf_setlock: deadlock", lock);
>  #endif
> +
> +                       /*
> +                        * If the lock isn't waiting, return EAGAIN
> +                        * rather than EDEADLK.
> +                        */
> +                       if (((lock->lf_flags & F_WAIT) == 0) &&
> +                           (error == EDEADLK))
> +                               error = EAGAIN;
>                         lf_free_lock(lock);
>                         goto out;
>                 }
> 
> On 3 July 2014 17:45, Adrian Chadd <adrian.chadd_at_gmail.com> wrote:
> > Hi!
> >
> > I've seen sqlite3 crap out due to "disk IO error". It looks like the
> > F_SETFL path is returning EDEADLK when it shouldn't be - only the
> > "wait" version of this should be.
> >
> > The kernel code looks to be:
> >
> > lf_setlock() -> lf_add_outgoing() -> lf_add_edge() -> graph_add_edge()
> > -> EDEADLK
> >
> > .. and lf_setlock() will return an error from lf_add_outgoing()
> > without checking if it's (a) EDEADLK, and (b) whether we're going to
> > sleep or not.
> >
> > So, sqlite3 trips up on this. I'm sure other things do. What should
> > the correct thing be? It looks like EWOULDBLOCK is the correct value
> > to return for F_SETFL failing, not EDEADLK.
> >
> > What do those-who-know-POSIX-standards-better-than-I think?

I doubt that the patch is correct. If there is an issue in the kernel, the
patch only hides it.

Note that lf_setlock() first calls lf_getblock() to verify that there
is no contending lock on the range, and if there is a conflicting lock,
the very first statement inside the if() checks for F_WAIT.

Either you get a real deadlock, or there is a bug elsewhere.

Received on Fri Jul 04 2014 - 07:32:57 UTC
