nve related LOR triggered by lots of small packets, and a hard hang

From: Sergey Zaharchenko <doublef-ctm_at_yandex.ru>
Date: Wed, 10 Jan 2007 15:07:31 +0300

Hello -current,

While chasing that smbfs recursive locking thing, I decided to try
copying a large number of small files (/usr/src, actually) to an SMB
share that I reach through an NVIDIA nForce MCP2 network card (nve). I
came across a lock order reversal which seems related to the card. First,
some files are copied, then I see the following kernel messages, some
more files are copied, and then the system hangs hard, without responding
to the keyboard or anything else.

: lock order reversal:
:  1st 0xc3629f00 inp (tcpinp) @ /src/usr.src/sys/netinet/tcp_usrreq.c:801
:  2nd 0xc0a9feec tcp (tcp) @ /src/usr.src/sys/netinet/tcp_input.c:626
: KDB: stack backtrace:
: db_trace_self_wrapper(c0950c60) at db_trace_self_wrapper+0x25
: kdb_backtrace(0,ffffffff,c0a612a8,c0a612d0,c09f8e84,...) at kdb_backtrace+0x29
: witness_checkorder(c0a9feec,9,c095ec63,272) at witness_checkorder+0x586
: _mtx_lock_flags(c0a9feec,0,c095ec63,272,0,...) at _mtx_lock_flags+0x84
: tcp_input(c32df800,14,c3300800,100a8c0,0,...) at tcp_input+0x432
: ip_input(c32df800) at ip_input+0x5a6
: netisr_dispatch(2,c32df800,0,c32c5000,c3300800,...) at netisr_dispatch+0x58
: ether_demux(c32c5000,c32df800,c32caed8,c32df800,dd1757d4,...) at ether_demux+0x28a
: ether_input(c32c5000,c32df800,c32caed8,0,c0970133,...) at ether_input+0x202
: nve_ospacketrx(c32cae00,dd175810,1,0,0,...) at nve_ospacketrx+0xd9
: UpdateReceiveDescRingData(c08981a4,c08981c4,c0898260,c089828c,c08982a4,...) at UpdateReceiveDescRingData+0x2f8
: nve_osalloc(c32cb200,dd391010,c32cae00,c0898108,c08981a4,...) at nve_osalloc
: _end(c33a5c00,c0a9e784,3065766e,0,0,...) at 0xc32aa600
: _end(c32cb200,dd391010,c32cae00,c0898108,c08981a4,...) at 0xc3327680
: _end(c33a5c00,c0a9e784,3065766e,0,0,...) at 0xc32aa600
: _end(c32cb200,dd391010,c32cae00,c0898108,c08981a4,...) at 0xc3327680

The last two lines repeat many times (kdb seems to have a limit of
1024 stack trace lines, which came in very handy). There is no info
about the actual hang... The LOR looks like #009
(http://sources.zabbadoz.net/freebsd/lor/009.html), but is actually
different. Any ideas? BTW, what is _end?
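
In case the term is unfamiliar: a lock order reversal means two locks were
taken in one order on one code path and in the opposite order on another.
Below is a toy sketch of that kind of check, not the real witness(4) code.
Every name in it (toy_acquire, toy_release, toy_demo, LOCK_INP, LOCK_TCP)
is made up for illustration, and the "tcp before inp" first path is only an
assumption about what some other code path did before the trace above.

```c
#define MAX_LOCKS 8

/* Hypothetical lock ids standing in for the "inp" and "tcp" locks. */
enum { LOCK_INP = 0, LOCK_TCP = 1 };

/* order[a][b] != 0 means: lock a was held while lock b was acquired. */
int order[MAX_LOCKS][MAX_LOCKS];

/* Locks currently held by the single simulated thread. */
int held[MAX_LOCKS];
int nheld;

/* Acquire a lock; returns 1 if this reverses a previously seen order. */
int toy_acquire(int lock)
{
    int reversal = 0;

    for (int i = 0; i < nheld; i++) {
        if (order[lock][held[i]])
            reversal = 1;              /* seen earlier the other way round */
        else
            order[held[i]][lock] = 1;  /* remember: held[i] before lock */
    }
    held[nheld++] = lock;
    return reversal;
}

/* Release a lock by dropping it from the held set. */
void toy_release(int lock)
{
    for (int i = 0; i < nheld; i++) {
        if (held[i] == lock) {
            held[i] = held[--nheld];
            return;
        }
    }
}

/* Replays two paths; returns 1 if the second one is flagged. */
int toy_demo(void)
{
    int flagged;

    /* Assumed first path: tcp, then inp. This records the order. */
    toy_acquire(LOCK_TCP);
    toy_acquire(LOCK_INP);
    toy_release(LOCK_INP);
    toy_release(LOCK_TCP);

    /* Path as in the trace: inp is 1st, tcp is 2nd. */
    toy_acquire(LOCK_INP);
    flagged = toy_acquire(LOCK_TCP);
    toy_release(LOCK_TCP);
    toy_release(LOCK_INP);
    return flagged;
}
```

Calling toy_demo() replays both paths; the second acquisition of LOCK_TCP is
the one that gets flagged, which mirrors the "1st inp / 2nd tcp" report.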

-- 
DoubleF
No virus detected in this message. Ehrm, wait a minute...
/kernel: pid 56921 (antivirus), uid 32000: exited on signal 9
Oh yes, no virus:)

Received on Wed Jan 10 2007 - 11:08:11 UTC
