I've been getting a lot of these panics on recent -current builds, caused by the nvidia driver:

panic: spin locks can only use msleep_spin

I managed to compile the part of the driver that there is source code for with debug symbols, but the only things showing up in the stack trace are obfuscated function names from the binary module. Some of the addresses look very suspicious, so it seems likely that the stack is corrupted. (The nvidia module is loaded at 0xc0c2e000 - 0xc13cf000; the addresses in the 0xc590+ range don't seem to correspond to any loaded module.)

#3  0xc06de0e2 in unlock_spin (lock=Could not find the frame base for "unlock_spin".
) at /compile/src/sys/kern/kern_mutex.c:166
#4  0xc06b46fb in _cv_wait (cvp=0xc59d6a18, lock=0xc59d6a00)
    at /compile/src/sys/kern/kern_condvar.c:131
#5  0xc1012c71 in ?? ()
#6  0xc59d6a18 in ?? ()
#7  0xc59d6a00 in ?? ()
#8  0xc110a9b0 in ?? ()
#9  0x00000273 in ?? ()
#10 0xc5a17000 in ?? ()
#11 0xc593e800 in ?? ()
#12 0xff78e84c in ?? ()
#13 0xc0ce5316 in _nv009651rm ()
#14 0xc59d6a00 in ?? ()
#15 0x20000000 in ?? ()
#16 0x00000028 in ?? ()
#17 0xc5a2dd00 in ?? ()
#18 0xff78e86c in ?? ()
#19 0xc9c8e8e0 in ?? ()
#20 0xff78e86c in ?? ()
#21 0xc0cedda4 in _nv009831rm ()
#22 0xc59d6a00 in ?? ()
#23 0x00000001 in ?? ()
#24 0x00000000 in ?? ()
#25 0xc9c8e8e0 in ?? ()
#26 0xc9c8e8e0 in ?? ()
#27 0xc5a2dc00 in ?? ()
#28 0xff78e88c in ?? ()
#29 0xc1014ca8 in ?? ()
#30 0x00000000 in ?? ()
#31 0xc5a2dd00 in ?? ()
#32 0xc9332b00 in ?? ()
#33 0xc5a2dc00 in ?? ()
#34 0xc5a2dd00 in ?? ()
#35 0xc5a2de00 in ?? ()
#36 0xff78e8ac in ?? ()
#37 0xc1011e0b in ?? ()
#38 0xc5a2dc00 in ?? ()
#39 0xc5a2de00 in ?? ()
#40 0xd7769800 in ?? ()
#41 0xc5a2de00 in ?? ()
#42 0xca61faa0 in ?? ()
#43 0xc1365f60 in ?? ()
#44 0xff78e8cc in ?? ()
#45 0xc06b6c13 in giant_close (dev=0xc59d6a00, fflag=536870912, devtype=40, td=0xc5a2dd00)
    at /compile/src/sys/kern/kern_conf.c:327

(kgdb) up 3
166             panic("spin locks can only use msleep_spin");
(kgdb) print lock
Could not find the frame base for "unlock_spin".
(kgdb) up 1
#4  0xc06b46fb in _cv_wait (cvp=0xc59d6a18, lock=0xc59d6a00)
    at /compile/src/sys/kern/kern_condvar.c:131
131             lock_state = class->lc_unlock(lock);
(kgdb) print *lock
$8 = {lo_name = 0xc110a995 "rm.mutex_mtx", lo_type = 0xc110a995 "rm.mutex_mtx",
  lo_flags = 720896, lo_witness_data = {lod_list = {stqe_next = 0x0},
    lod_witness = 0x0}}
(kgdb) print *class
$10 = {lc_name = 0xc095aba9 "spin mutex", lc_flags = 10, lc_ddb_show = 0,
  lc_lock = 0xc06de0f0 <lock_spin>, lc_unlock = 0xc06de0d0 <unlock_spin>}

rm.mutex_mtx is indeed created in nvidia_os.c, with:

mtx_init(&mtx->mutex_mtx, "rm.mutex_mtx", NULL, MTX_SPIN | MTX_RECURSE);

I don't see any explicit calls to unlock_spin in the part we have source for, just mtx_unlock_spin. I'm unsure why the spin mutex class has pointers to these dummy functions that simply panic, but I'm not very well versed in the internals of the kernel lock primitives.

Any suggestions? I'm not sure whether this is an nvidia problem that we need to refer to them, or whether a change in the kernel has broken something the driver depends on.

Craig

Received on Fri Sep 21 2007 - 18:25:24 UTC
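
For illustration, the dispatch visible in frames #3 and #4 can be modelled in user space. This is only a minimal sketch, not the kernel source. The structs are trimmed to the fields needed here, the class is stored as a direct pointer (an assumption made for simplicity; the kernel looks it up from lo_flags), and model_cv_wait, fake_panic and last_panic are hypothetical names. The real panic() is replaced by recording the message so the program can continue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct lock_object;

/* Trimmed stand-in for the kernel's lock_class: only the unlock hook is modelled. */
struct lock_class {
    const char *lc_name;
    int (*lc_unlock)(struct lock_object *lock);
};

/* Trimmed stand-in for lock_object. Assumption: a direct class pointer
 * replaces the kernel's lookup via lo_flags. */
struct lock_object {
    const char *lo_name;
    const struct lock_class *lo_class;
};

/* Hypothetical replacement for panic(): record the message instead of halting. */
static const char *last_panic = NULL;
static void fake_panic(const char *msg) { last_panic = msg; }

/* A sleep mutex can be dropped around a sleep, so its hook just succeeds here. */
static int unlock_sleep(struct lock_object *lock) { (void)lock; return 0; }

/* The spin class hook does no unlocking at all; it only reports the misuse. */
static int unlock_spin(struct lock_object *lock)
{
    (void)lock;
    fake_panic("spin locks can only use msleep_spin");
    return 0;
}

static const struct lock_class lock_class_mtx_sleep = { "sleep mutex", unlock_sleep };
static const struct lock_class lock_class_mtx_spin  = { "spin mutex",  unlock_spin };

/* Models the step at kern_condvar.c:131: the interlock is released
 * through its class hook, whatever class it happens to be. */
static int model_cv_wait(struct lock_object *lock)
{
    const struct lock_class *lc = lock->lo_class;
    assert(lc != NULL && lc->lc_unlock != NULL);
    return lc->lc_unlock(lock);
}
```

Calling model_cv_wait() on an object whose class is lock_class_mtx_spin sets last_panic to the message from the trace, while a sleep-class object passes through quietly. Read this way, the trace would be consistent with a condition-variable wait being handed a mutex that was created with MTX_SPIN, rather than with anything calling unlock_spin directly; that is an inference from frames #3 and #4, not something confirmed from the driver source.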