Re: nve related LOR triggered by lots of small packets, and a hard hang

From: Andrey V. Elsukov <bu7cher_at_yandex.ru>
Date: Mon, 19 Feb 2007 14:34:58 +0300
Bjoern A. Zeeb wrote:
>> : lock order reversal:
>> :  1st 0xc3629f00 inp (tcpinp) @ /src/usr.src/sys/netinet/tcp_usrreq.c:801
>> :  2nd 0xc0a9feec tcp (tcp) @ /src/usr.src/sys/netinet/tcp_input.c:626
> 
> I add this with LOR ID 200 to the LOR page:
>     http://sources.zabbadoz.net/freebsd/lor.html#200
Hi, All.

I see this LOR, and a deadlock, on my notebook with a recent CURRENT.

My hardware is detected by nve(4) as "NVIDIA nForce MCP13 Networking Adapter":

nve0: <NVIDIA nForce MCP13 Networking Adapter> port 0x30b8-0x30bf mem 0xc0007000-0xc0007fff irq 5 at device 20.0 on pci0
nve0: Ethernet address 00:90:f5:4f:18:1b
miibus0: <MII bus> on nve0
rlphy0: <RTL8201L 10/100 media interface> PHY 1 on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
nve0: using obsoleted if_watchdog interface

With nfe(4) it is detected as "NVIDIA nForce 430 MCP13 Networking Adapter":

nfe0: <NVIDIA nForce 430 MCP13 Networking Adapter> port 0x30b8-0x30bf mem 0xc0007000-0xc0007fff irq 5 at device 20.0 on pci0
miibus0: <MII bus> on nfe0
rlphy0: <RTL8201L 10/100 media interface> PHY 1 on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
nfe0: using obsoleted if_watchdog interface

DDB message:
lock order reversal:
  1st 0xc2ec2480 inp (tcpinp) @ /usr/src/sys/netinet/tcp_usrreq.c:801
  2nd 0xc07bbc6c tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:638
KDB: stack backtrace:
db_trace_self_wrapper(c06ee408) at db_trace_self_wrapper+0x25
kdb_backtrace(0,ffffffff,c077d708,c077d730,c072f944,...) at kdb_backtrace+0x29
witness_checkorder(c07bbc6c,9,c06fcd51,27e) at witness_checkorder+0x586
_mtx_lock_flags(c07bbc6c,0,c06fcd51,27e,0,...) at _mtx_lock_flags+0x84
tcp_input(c2def300,14,c0759b80,835115ac,0,...) at tcp_input+0x432
ip_input(c2def300) at ip_input+0x5c9
netisr_dispatch(2,c2def300,0,c2be2000,c2df0800,...) at netisr_dispatch+0x58
ether_demux(c2be2000,c2def300,c2def300,c314da10,d3ee075c,...) at ether_demux+0x28a
ether_input(c2be2000,c2def300,c2b53cd8,0,c31605fb,...) at ether_input+0x21e
nve_ospacketrx(c2b53c00,d3ee0798,1,0,0,...) at nve_ospacketrx+0xa4
UpdateReceiveDescRingData(c315fc64,c315fc84,c315fd24,c315fd50,c315fd68,...) at UpdateReceiveDescRingData+0x2f8
nve_osalloc(c2b80940,d4152010,c2b53c00,c315fbcc,c315fc64,...) at nve_osalloc
_end(0,c2c5c008,3065766e,0,0,...) at 0xc2b80900
_end(c2b80940,d4152010,c2b53c00,c315fbcc,c315fc64,...) at 0xc2af3630
_end(0,c2c5c008,3065766e,0,0,...) at 0xc2b80900
< ... too many lines, trimmed ... >

db> where
Tracing pid 1045 tid 100068 td 0xc2dea510
kdb_enter(c06bbc68) at kdb_enter+0x2b
witness_checkorder(c07bbc6c,9,c06fcd51,27e) at witness_checkorder+0x599
_mtx_lock_flags(c07bbc6c,0,c06fcd51,27e,0,...) at _mtx_lock_flags+0x84
tcp_input(c2def300,14,c0759b80,835115ac,0,...) at tcp_input+0x432
ip_input(c2def300) at ip_input+0x5c9
netisr_dispatch(2,c2def300,0,c2be2000,c2df0800,...) at netisr_dispatch+0x58
ether_demux(c2be2000,c2def300,c2def300,c314da10,d3ee075c,...) at ether_demux+0x28a
ether_input(c2be2000,c2def300,c2b53cd8,0,c31605fb,...) at ether_input+0x21e
nve_ospacketrx(c2b53c00,d3ee0798,1,0,0,...) at nve_ospacketrx+0xa4
UpdateReceiveDescRingData(c315fc64,c315fc84,c315fd24,c315fd50,c315fd68,...) at UpdateReceiveDescRingData+0x2f8
nve_osalloc(c2b80940,d4152010,c2b53c00,c315fbcc,c315fc64,...) at nve_osalloc
_end(0,c2c5c008,3065766e,0,0,...) at 0xc2b80900
_end(c2b80940,d4152010,c2b53c00,c315fbcc,c315fc64,...) at 0xc2af3630

< ... too many lines, trimmed ... >

db> ps
   pid  ppid  pgrp   uid   state   wmesg     wchan    cmd
  1045  1043  1043    21  R       CPU 0               vsftpd

db> show allpcpu
Current CPU: 0

cpuid        = 0
curthread    = 0xc2dea510: pid 1045 "vsftpd"
curpcb       = 0xd3ee0d90
fpcurthread  = none
idlethread   = 0xc2a14510: pid 10 "idle"
APIC ID      = 0
currentldt   = 0x50
spin locks held:

db> show locks
exclusive sleep mutex inp (tcpinp) r = 0 (0xc2ec2480) locked @ /usr/src/sys/netinet/tcp_usrreq.c:801

db> show alllocks
Process 1045 (vsftpd) thread 0xc2dea510 (100068)
exclusive sleep mutex inp (tcpinp) r = 0 (0xc2ec2480) locked @ /usr/src/sys/netinet/tcp_usrreq.c:801
Process 21 (irq5: nvidia0 nve0) thread 0xc2ac2510 (100023)
exclusive sleep mutex tcp r = 0 (0xc07bbc6c) locked @ /usr/src/sys/netinet/tcp_input.c:638

db> call boot(0)
Waiting (max 60 seconds) for system process `vnlru' to stop...Sleeping on "suspkt" with the following non-sleepable locks held:
exclusive sleep mutex inp (tcpinp) r = 0 (0xc2ec2480) locked @ /usr/src/sys/netinet/tcp_usrreq.c:801
KDB: enter: witness_warn
[thread pid 1045 tid 100068 ]
Stopped at      kdb_enter+0x2b: nop

db> panic
panic: from debugger
Uptime: 7m18s
Physical memory: 434 MB
Dumping 47 MB: 32 16
Dump complete

If somebody is interested in helping me resolve this problem, I can easily 
reproduce the deadlock: simply downloading a big file (an AVI file, for 
example) from the FTP server running on my notebook results in a deadlock.
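
For readers unfamiliar with what the witness report above means, here is a 
minimal user-space sketch of a lock-order checker. It is not FreeBSD code: 
the OrderChecker class and its methods are hypothetical, and it assumes that 
the order recorded earlier was "tcp" before "inp", which is what the reversal 
message ("1st inp, 2nd tcp") implies. The thread names are taken from the 
"show alllocks" output.

```python
# Toy lock-order checker, loosely modelled on the kind of report
# witness(4) prints. All names are hypothetical illustrations.

class OrderChecker:
    def __init__(self):
        self.seen = set()   # (first, second) acquisition orders observed so far
        self.held = {}      # thread name -> list of lock names currently held

    def acquire(self, thread, lock):
        """Record that `thread` takes `lock`; return any reversal messages."""
        reversals = []
        held = self.held.setdefault(thread, [])
        for earlier in held:
            # A reversal: some thread previously took `lock` before `earlier`.
            if (lock, earlier) in self.seen:
                reversals.append(
                    f"lock order reversal: 1st {earlier}, 2nd {lock}")
            self.seen.add((earlier, lock))
        held.append(lock)
        return reversals

    def release(self, thread, lock):
        self.held[thread].remove(lock)


checker = OrderChecker()

# Assumed usual order (input path): tcp first, then inp.
checker.acquire("irq5", "tcp")
first = checker.acquire("irq5", "inp")
checker.release("irq5", "inp")
checker.release("irq5", "tcp")

# Order seen in the backtrace: inp already held, then tcp is requested.
checker.acquire("vsftpd", "inp")
second = checker.acquire("vsftpd", "tcp")

print(first)
print(second)
```

The first path establishes an order without complaint; the second path takes 
the same two locks the other way round and is flagged. If two threads run 
these paths at the same time, each can end up holding one lock while waiting 
for the other, which matches the state shown by "show alllocks".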

With nfe(4) I don't get this deadlock, but nfe is useless for me: it 
often prints the message "nfe0: watchdog timeout".

-- 
WBR, Andrey V. Elsukov
Received on Mon Feb 19 2007 - 10:35:28 UTC