On Fri, 29 Aug 2008, John Baldwin wrote:

> Unfortunately it hung trying to dump, so all I have is the stack trace from
> DDB.  This is recent HEAD running stress2
>
> panic: _mtx_lock_sleep: recursed on non-recursive mutex rtentry _at_ ../../1

Kip and I have theorized that increased parallelism at higher layers of the
network stack is exposing route locking and reference counting to more stress
than it had done previously, and that as such we're starting to trigger races
in the routing code more than we used to.  While I wouldn't rule out a
FIB-related bug, it seems more likely to me that we've hit a general bug in
locking/references in the ethernet link layer / ARP, and we need to take a
careful look at what's going on throughout that layer.  Unfortunately, that's
not something I have time to work on currently, so it would be great if people
with an existing interest in the routing code (Julian and Qing have done the
most work there recently?) could spend a few hours looking really carefully at
what is happening.

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> cpuid = 1
> KDB: enter: panic
> [thread pid 14025 tid 100928 ]
> Stopped at      kdb_enter+0x3d: movq    $0,0x435054(%rip)
> db> tr
> Tracing pid 14025 tid 100928 td 0xffffff0003773360
> kdb_enter() at kdb_enter+0x3d
> panic() at panic+0x14b
> _mtx_lock_flags() at _mtx_lock_flags
> _mtx_lock_flags() at _mtx_lock_flags+0xc3
> rt_check_fib() at rt_check_fib+0x1ea
> arpresolve() at arpresolve+0x77
> ether_output() at ether_output+0x180
> ip_output() at ip_output+0xb4f
> udp_send() at udp_send+0x47d
> sosend_dgram() at sosend_dgram+0x1fa
> soo_write() at soo_write+0x30
> dofilewrite() at dofilewrite+0x7a
> kern_writev() at kern_writev+0x52
> write() at write+0x4d
> syscall() at syscall+0x1bf
> Xfast_syscall() at Xfast_syscall+0xab
> --- syscall (4, FreeBSD ELF64, write), rip = 0x80071cb7c, rsp =
> 0x7fffffffe628,-
> db> c
> Uptime: 1h39m18s
> Physical memory: 2038 MB
> Dumping 263 MB:pid 14025 (udp), uid 26840, was killed: exceeded maximum CPU
> limt
> pid 14099 (udp), uid 26840, was killed: exceeded maximum CPU limit
> pid 14100 (udp), uid 26840, was killed: exceeded maximum CPU limit
>
> --
> John Baldwin
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"

Received on Sat Aug 30 2008 - 07:52:09 UTC