On Fri, 8 Sep 2006, Peter Holm wrote:

> During boot of GENERIC HEAD from Sep 7 07:29 UTC I got this page
> fault:
>
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex unp r = 0 (0xc0a5520c) locked @
> kern/uipc_usrreq.c:987
> KDB: stack backtrace:
> kdb_backtrace(1,c410b000,c,c3f77a20,e43f7a28,...) at
> kdb_backtrace+0x29
> witness_warn(5,0,c0941302) at witness_warn+0x192
> trap(8,28,c4190028,c413a7a8,c4195690,...) at trap+0x108
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc06e01e6, esp = 0xe43f7a70, ebp = 0xe43f7bfc ---
> unp_connect(c41ce000,c3f797e0,c3f77a20,c0a5520c,0,...) at
> unp_connect+0x292
> uipc_connect(c41ce000,c3f797e0,c3f77a20) at uipc_connect+0x3e
> soconnect(c41ce000,c3f797e0,c3f77a20) at soconnect+0x4e
> kern_connect(c3f77a20,3,c3f797e0,c3f797e0,0,...) at kern_connect+0x76
> connect(c3f77a20,e43f7d04) at connect+0x30
> syscall(3b,3b,3b,1,8270000,...) at syscall+0x256
>
> http://people.freebsd.org/~pho/stress/log/cons207.html.
>
> The core file is toast and I missed a back trace of pid 678 :-(

This is likely one of the remaining race conditions in UNIX domain sockets
having to do with simultaneous connect and close, which occur due to dropping
locks for either a blocking name lookup or a recursion via the socket layer
into the protocol a second time. When the UNIX domain socket global lock is
dropped and re-acquired, the UNIX domain socket code needs to re-evaluate its
assumptions regarding any references it has to other UNIX domain sockets,
which may have "gone away" while the lock was released.

Interestingly, many of these races also existed in 4.x and before, but they
are more exposed with greater kernel parallelism. I recently closed a spate of
them, but it looks like a few remain. In this case, the listen socket has
possibly been closed (although possibly not) while sonewconn() is called. It
could be a reference needs to be added to so2 before dropping the unp lock.
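
As a rough user-space illustration of the pattern described above (take a
reference before dropping the lock, then re-check state after re-acquiring
it), here is a minimal C sketch. It is not the FreeBSD code: struct sock,
sock_alloc(), sock_hold(), sock_rele(), sock_close(), and connect_sketch()
are hypothetical names, a pthread mutex stands in for the unp lock, and the
racing close is simulated inline rather than run in a second thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a listening socket; NOT the kernel's struct socket. */
struct sock {
    int  refs;    /* reference count, protected by global_lock */
    bool closed;  /* set once the owner has closed the socket */
};

/* Stand-in for the global UNIX domain socket (unp) lock. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate a socket that starts with one reference, held by its owner. */
static struct sock *sock_alloc(void)
{
    struct sock *s = calloc(1, sizeof(*s));
    if (s == NULL)
        abort();
    s->refs = 1;
    return s;
}

/* Take a reference. Caller must hold global_lock. */
static void sock_hold(struct sock *s)
{
    s->refs++;
}

/* Drop a reference, freeing on the last one. Caller must hold global_lock. */
static void sock_rele(struct sock *s)
{
    assert(s->refs > 0);
    if (--s->refs == 0)
        free(s);
}

/* The owner closes the socket: mark it closed and drop the owner's reference. */
static void sock_close(struct sock *s)
{
    pthread_mutex_lock(&global_lock);
    s->closed = true;
    sock_rele(s);
    pthread_mutex_unlock(&global_lock);
}

/*
 * Connect to so2. Returns 0 on success, -1 if so2 was closed.
 * If racing_close is true, a close is simulated while the lock is dropped.
 */
static int connect_sketch(struct sock *so2, bool racing_close)
{
    int error = 0;

    pthread_mutex_lock(&global_lock);
    if (so2->closed) {
        pthread_mutex_unlock(&global_lock);
        return -1;
    }
    sock_hold(so2);                 /* pin so2 before dropping the lock */
    pthread_mutex_unlock(&global_lock);

    /* Blocking work would happen here with the lock released. */
    if (racing_close)
        sock_close(so2);            /* simulated concurrent close */

    pthread_mutex_lock(&global_lock);
    if (so2->closed)                /* re-evaluate after re-acquiring */
        error = -1;
    sock_rele(so2);                 /* may free so2 if the owner is gone */
    pthread_mutex_unlock(&global_lock);
    return error;
}
```

With the extra reference, the memory behind so2 stays valid across the
unlocked window even if the owner closes it, so the re-check of so2->closed
reads live memory instead of faulting. Without the sock_hold() call, the
simulated close would free so2 and the re-check would touch freed memory.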

I saw John's follow-up, but if ups/he don't have a fix in a few days once I
get back to the UK, I can investigate. Send me a ping next week if I appear
to forget :-).

Robert N M Watson
Computer Laboratory
University of Cambridge

Received on Sat Sep 09 2006 - 10:33:33 UTC