Re: LOR tcp_input.c vs. tcp_usrreq.c (was: Re: 2 LORs on my NFS server.)

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Sat, 16 Aug 2003 14:24:46 -0700 (PDT)

On 16 Aug, Tilman Linneweh wrote:
> * Tilman Linneweh [Fr, 15 Aug 2003 at 16:17 GMT]:
>> 
>> My CURRENT is already a bit old:
>> 
>> # uname -a
>> FreeBSD polly.arved.de 5.1-CURRENT FreeBSD 5.1-CURRENT #1: Sun Jul 20
>> 01:00:14 CEST 2003    
>> tilman_at_sauna.arved.de:/usr/obj/usr/src/CURRENT/sys/POLLY  i386
> 
> I updated my CURRENT to 
> 
> polly# uname -a
> FreeBSD polly.arved.de 5.1-CURRENT FreeBSD 5.1-CURRENT #1: Sat Aug 16
> 10:11:52 CEST 2003    
> tilman_at_sauna.arved.de:/usr/obj/usr/source/CURRENT/sys/POLLY  i386
> 
> and this LOR is reproducible.
>  
>> This happened while the machine was NFS-serving around 3 clients with
>> normal UDP NFS and a fourth client tried to mount something via
>> mount_nfs -T -a 2
> 
> The problem is the client with TCP mounts. I tried this time with a single
> NetBSD client that does a TCP mount and cd'd to the mounted directory.
> 
> lock order reversal
>  1st 0xc1a17278 inp (inp) @ /usr/source/CURRENT/sys/netinet/tcp_input.c:654
>  2nd 0xc046bd6c tcp (tcp) @ /usr/source/CURRENT/sys/netinet/tcp_usrreq.c:621
> Stack backtrace:
> backtrace(1,0,ffffffff,c0445068,c04451d0) at backtrace+0x12
> witness_lock(c046bd6c,8,c03c334c,26d,0) at witness_lock+0x55e
> _mtx_lock_flags(c046bd6c,0,c03c334c,26d) at _mtx_lock_flags+0x7d
> tcp_usr_rcvd(c1ce8800,80) at tcp_usr_rcvd+0x1b
> soreceive(c1ce8800,c891ab1c,c891ab28,c891ab20,0) at soreceive+0x815
> nfsrv_rcv(c1ce8800,c1a70780,4) at nfsrv_rcv+0x75
> sowakeup(c1ce8800,c1ce884c) at sowakeup+0x7f
> tcp_input(c0b9ac00,14) at tcp_input+0x11f6
> ip_input(c0b9ac00) at ip_input+0x7c8
> swi_net(0) at swi_net+0xe6
> ithread_loop(c0b87180,c891ad48,c0b87180,c0221660,0) at ithread_loop+0x11c
> fork_exit(c0221660,c0b87180,c891ad48) at fork_exit+0xab
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xc891ad7c, ebp = 0 ---
> Debugger("witness_lock")
> Stopped at      Debugger+0x45:  xchgl   %ebx,in_Debugger.0
> 

This is a known issue.

------ Forwarded message ------
    From: Don Lewis <truckman_at_freebsd.org>
 Subject: Re: LOR in NFS server
    Date: Thu, 24 Apr 2003 21:20:56 -0700 (PDT)
      To: gordont_at_gnf.org
      Cc: current_at_freebsd.org

On 24 Apr, Gordon Tetlow wrote:
> I generated it while running nessus against my local machine.
> 
> lock order reversal
>  1st 0xc9384c44 inp (inp) @ /local/usr.src/sys/netinet/tcp_input.c:649
>  2nd 0xc05aa84c tcp (tcp) @ /local/usr.src/sys/netinet/tcp_usrreq.c:621
> Stack backtrace:
> backtrace(c04e9f03,c05aa84c,c04f0770,c04f0770,c04f1ae4) at backtrace+0x17
> witness_lock(c05aa84c,8,c04f1ae4,26d,0) at witness_lock+0x692
> _mtx_lock_flags(c05aa84c,0,c04f1ae4,26d,0) at _mtx_lock_flags+0xb2
> tcp_usr_rcvd(c8a63800,80,c04ea514,df0e9a9c,3b9aca00) at tcp_usr_rcvd+0x30
> soreceive(c8a63800,df0e9ad8,df0e9ae4,df0e9adc,0) at soreceive+0x86a
> nfsrv_rcv(c8a63800,c6d4fb00,4,34,10430) at nfsrv_rcv+0x8a
> sowakeup(c8a63800,c8a6384c,c04f11d5,434,108) at sowakeup+0x97
> tcp_input(c21f5400,14,c0304f91,df0e9c5c,c02f60ba) at tcp_input+0x1341
> ip_input(c21f5400,0,c04efede,e9,c21bd280) at ip_input+0x7b0
> swi_net(0,0,c04e4eed,217,c21c73c0) at swi_net+0x111
> ithread_loop(c21c6100,df0e9d48,c04e4d5d,314,c21c8d10) at ithread_loop+0x16c
> fork_exit(c02ec2d0,c21c6100,df0e9d48) at fork_exit+0xc0
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xdf0e9d7c, ebp = 0 ---

Hmm ... does NFS over TCP even work with a -current box as the server?
It looks like tcp_input() has grabbed the locks in tcbinfo and inp, and
then tcp_usr_rcvd() attempts to grab the same locks.
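
To illustrate the kind of check that produces the report above, here is a
minimal user-space sketch, not the kernel's witness code. It assumes the
expected order is "tcp" (tcbinfo) before "inp", which is what the report
implies. The names ranked_lock, ranked_acquire(), input_path_model() and
user_path_model() are hypothetical, and the bookkeeping is single-threaded
for simplicity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define MAX_HELD 8

/* Hypothetical ranked lock: a lower rank must be acquired first. */
struct ranked_lock {
    pthread_mutex_t mtx;
    const char *name;
    int rank;
};

/* Assumed order, inferred from the report: "tcp" before "inp". */
static struct ranked_lock tcbinfo_lock = { PTHREAD_MUTEX_INITIALIZER, "tcp", 1 };
static struct ranked_lock inp_lock = { PTHREAD_MUTEX_INITIALIZER, "inp", 2 };

/* Single-threaded bookkeeping of the locks currently held. */
static struct ranked_lock *held[MAX_HELD];
static int held_count;
static int reversals;

static void ranked_acquire(struct ranked_lock *l)
{
    /* Warn if a lock that should come later is already held. */
    for (int i = 0; i < held_count; i++) {
        if (held[i]->rank > l->rank) {
            reversals++;
            printf("lock order reversal\n 1st %s\n 2nd %s\n",
                   held[i]->name, l->name);
        }
    }
    pthread_mutex_lock(&l->mtx);
    assert(held_count < MAX_HELD);
    held[held_count++] = l;
}

static void ranked_release(struct ranked_lock *l)
{
    /* Remove l from the held set (order of the set does not matter). */
    for (int i = 0; i < held_count; i++) {
        if (held[i] == l) {
            held[i] = held[--held_count];
            break;
        }
    }
    pthread_mutex_unlock(&l->mtx);
}

/* Models the receive upcall taking the global lock. */
static void usr_rcvd_model(void)
{
    ranked_acquire(&tcbinfo_lock);
    ranked_release(&tcbinfo_lock);
}

/* Models the input path: inp is held when the upcall runs. */
static int input_path_model(void)
{
    reversals = 0;
    ranked_acquire(&inp_lock);
    usr_rcvd_model();
    ranked_release(&inp_lock);
    return reversals;
}

/* Models a path that follows the assumed order. */
static int user_path_model(void)
{
    reversals = 0;
    ranked_acquire(&tcbinfo_lock);
    ranked_acquire(&inp_lock);
    ranked_release(&inp_lock);
    ranked_release(&tcbinfo_lock);
    return reversals;
}
```

Calling input_path_model() reports one reversal (inp first, tcp second),
while user_path_model() reports none. The real kernel paths take more locks
than this; the sketch only shows why the order of acquisition matters.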

I can think of three possible ways of fixing this problem.

	1) Drop the locks in tcp_input() before calling sorwakeup() and grab
	   them again if necessary.  One has to be careful not to break
	   anything by doing this.  This also adds overhead for non-NFS
	   traffic.

	2) Never call soreceive() from nfsrv_rcv(), always wake nfsd instead.
	   This has the advantage of minimizing the amount of time that the
	   locks are held, but increases overhead under lightly loaded
	   conditions.

	3) Somehow tell tcp_usr_rcvd() not to attempt to grab the locks in
	   this specific case.
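
As a rough illustration of option 1, the following user-space sketch drops
a lock before making an upcall that needs the same lock, then takes it
again afterwards. It is an assumption-laden model, not a patch: it uses a
single pthread mutex where the kernel has two, and pcb_lock,
receive_upcall(), input_holding_lock() and input_dropping_lock() are
hypothetical names. The upcall uses trylock so that the failing case is
reported instead of hanging.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t pcb_lock = PTHREAD_MUTEX_INITIALIZER;
static int bytes_queued;    /* data appended by the input path */
static int bytes_consumed;  /* data drained by the upcall */

/*
 * Stand-in for the receive upcall. It needs pcb_lock itself and
 * returns false if the lock is already held.
 */
static bool receive_upcall(void)
{
    if (pthread_mutex_trylock(&pcb_lock) != 0)
        return false;
    bytes_consumed += bytes_queued;
    bytes_queued = 0;
    pthread_mutex_unlock(&pcb_lock);
    return true;
}

/* Problem case: the upcall runs while the caller still holds the lock. */
static bool input_holding_lock(int len)
{
    assert(len >= 0);
    pthread_mutex_lock(&pcb_lock);
    bytes_queued += len;
    bool ok = receive_upcall();
    pthread_mutex_unlock(&pcb_lock);
    return ok;
}

/* Option 1: drop the lock around the upcall, then grab it again. */
static bool input_dropping_lock(int len)
{
    assert(len >= 0);
    pthread_mutex_lock(&pcb_lock);
    bytes_queued += len;
    pthread_mutex_unlock(&pcb_lock);

    bool ok = receive_upcall();

    pthread_mutex_lock(&pcb_lock);
    /* State may have changed while unlocked; real code must recheck it. */
    pthread_mutex_unlock(&pcb_lock);
    return ok;
}
```

With this model, input_holding_lock() returns false because the upcall
cannot get the lock, and input_dropping_lock() returns true. The extra
unlock and lock pair on every call is the overhead mentioned in option 1,
and the recheck after reacquiring is where the "careful not to break
anything" part comes in.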

------ End forwarded message ------

------ Forwarded message ------
    From: Jeffrey Hsu <hsu_at_freebsd.org>
 Subject: Re: LOR in NFS server
    Date: Fri, 25 Apr 2003 01:02:56 -0700
      To: gordont_at_gnf.org
      Cc: current_at_freebsd.org

  > 1st 0xc9384c44 inp (inp) @ /local/usr.src/sys/netinet/tcp_input.c:649
  > 2nd 0xc05aa84c tcp (tcp) @ /local/usr.src/sys/netinet/tcp_usrreq.c:621

This old nag warning has been there since last year and was first reported
by Lars Eggert <larse_at_ISI.EDU>.  I made up a fix for him at the time which
you can find at http://www.freebsd.org/~hsu/hammer.diff.  Lars has verified
that this eliminates the nag warnings.

But I'm hoping to have a more unified solution to the general sowakeup
problem, so I have not committed this patch.

							Jeffrey

------ End forwarded message ------