Re: radeon_cp_texture: page fault with non-sleepable locks held

From: Andriy Gapon <avg_at_freebsd.org>
Date: Mon, 08 Nov 2010 13:50:25 +0200

on 05/11/2010 09:27 Andriy Gapon said the following:
> 
> I use FreeBSD head and KDE 4 with all the bells and whistles enabled.
> Apparently a recent KDE update has enabled even more of them, because I started
> to have panics with a kernel that has INVARIANTS and WITNESS enabled.

I tried to solve the problem by changing the drmdev lock from a mutex to an sx lock:
http://people.freebsd.org/~avg/drm-sx.diff

Things have improved: I am no longer getting the panic.
Instead, I now get this LOR:
lock order reversal:
1st 0xffffff0001b968a0 drmdev (drmdev) @ /usr/src/sys/dev/drm/drm_drv.c:791
2nd 0xffffff0072a87200 user map (user map) @ /usr/src/sys/vm/vm_map.c:3548
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801b8b3a = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff803a7a6a = kdb_backtrace+0x3a
_witness_debugger() at 0xffffffff803bd40c = _witness_debugger+0x2c
witness_checkorder() at 0xffffffff803be879 = witness_checkorder+0x959
_sx_slock() at 0xffffffff80378af8 = _sx_slock+0x88
_vm_map_lock_read() at 0xffffffff805109e6 = _vm_map_lock_read+0x36
vm_map_lookup() at 0xffffffff805127b4 = vm_map_lookup+0x54
vm_fault() at 0xffffffff805097f9 = vm_fault+0xf9
trap_pfault() at 0xffffffff80545d0f = trap_pfault+0x11f
trap() at 0xffffffff80546597 = trap+0x657
calltrap() at 0xffffffff805305c8 = calltrap+0x8
--- trap 0xc, rip = 0xffffffff8054405d, rsp = 0xffffff81241b47f0, rbp =
0xffffff81241b4870 ---
copyin() at 0xffffffff8054405d = copyin+0x3d
radeon_cp_texture() at 0xffffffff8022fbd7 = radeon_cp_texture+0x167
drm_ioctl() at 0xffffffff8020fa38 = drm_ioctl+0x318
devfs_ioctl_f() at 0xffffffff802dd649 = devfs_ioctl_f+0x109
kern_ioctl() at 0xffffffff803c1107 = kern_ioctl+0x1f7
ioctl() at 0xffffffff803c12c8 = ioctl+0x168
syscallenter() at 0xffffffff803b57be = syscallenter+0x26e
syscall() at 0xffffffff80545e52 = syscall+0x42
Xfast_syscall() at 0xffffffff805308a2 = Xfast_syscall+0xe2

Is this a serious LOR?
How can I resolve it?
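
For illustration only, here is the general shape of one way such a reversal is
commonly avoided: do the copy from user memory before taking the driver lock,
and hold the lock only around accesses that cannot fault. This is a userspace
sketch, not the actual drm code. struct fake_dev, fake_copyin() and
fake_upload() are hypothetical names; pthread_mutex_t stands in for the drmdev
lock and memcpy() stands in for copyin(). Whether radeon_cp_texture() can be
restructured this way is an assumption that would need checking against the
driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for device state guarded by the drmdev lock. */
struct fake_dev {
	pthread_mutex_t	lock;		/* plays the role of the drmdev lock */
	unsigned char	ring[64];	/* plays the role of device-owned memory */
	size_t		used;
};

/*
 * Hypothetical stand-in for copyin(9).  In the kernel this is the call
 * that may page-fault and therefore take the user map lock.
 */
static int
fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return (-1);		/* EFAULT analogue */
	memcpy(kaddr, uaddr, len);
	return (0);
}

/*
 * Stage the user data in a private buffer first, then take the lock
 * only for the part that touches device state and cannot fault.
 */
static int
fake_upload(struct fake_dev *dev, const void *ubuf, size_t len)
{
	unsigned char *staging;
	int error;

	assert(dev != NULL);
	if (len > sizeof(dev->ring))
		return (-1);
	staging = malloc(len != 0 ? len : 1);
	if (staging == NULL)
		return (-1);

	/* Step 1: the possibly-faulting copy, with no lock held. */
	error = fake_copyin(ubuf, staging, len);
	if (error != 0) {
		free(staging);
		return (error);
	}

	/* Step 2: lock held only around non-faulting accesses. */
	pthread_mutex_lock(&dev->lock);
	memcpy(dev->ring, staging, len);
	dev->used = len;
	pthread_mutex_unlock(&dev->lock);

	free(staging);
	return (0);
}
```

With this ordering a fault during the copy happens while no driver lock is
held, so neither the "non-sleepable locks held" panic nor the drmdev ->
user map ordering can arise from that copy. The cost is an extra buffer and
an extra copy per request.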

> The panic:
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex drmdev (drmdev) r = 0 (0xffffff0001b968a0) locked @
> /usr/src/sys/dev/drm/drm_drv.c:791
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0xffffffff801b8afa = db_trace_self_wrapper+0x2a
> kdb_backtrace() at 0xffffffff803a7afa = kdb_backtrace+0x3a
> _witness_debugger() at 0xffffffff803bd49c = _witness_debugger+0x2c
> witness_warn() at 0xffffffff803bed32 = witness_warn+0x322
> trap() at 0xffffffff8054639f = trap+0x39f
> calltrap() at 0xffffffff80530688 = calltrap+0x8
> --- trap 0xc, rip = 0xffffffff8054411d, rsp = 0xffffff81241917f0, rbp =
> 0xffffff8124191870 ---
> copyin() at 0xffffffff8054411d = copyin+0x3d
> radeon_cp_texture() at 0xffffffff8022fcc7 = radeon_cp_texture+0x167
> drm_ioctl() at 0xffffffff8020fa78 = drm_ioctl+0x318
> devfs_ioctl_f() at 0xffffffff802dd739 = devfs_ioctl_f+0x109
> kern_ioctl() at 0xffffffff803c1197 = kern_ioctl+0x1f7
> ioctl() at 0xffffffff803c1358 = ioctl+0x168
> syscallenter() at 0xffffffff803b584e = syscallenter+0x26e
> syscall() at 0xffffffff80545f12 = syscall+0x42
> Xfast_syscall() at 0xffffffff80530962 = Xfast_syscall+0xe2
> --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x801f96a1c, rsp = 0x7fffffffe7a8,
> rbp = 0xc020644e ---
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x832372000
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff8054411d
> stack pointer           = 0x28:0xffffff81241917f0
> frame pointer           = 0x28:0xffffff8124191870
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 3
> current process         = 3439 (initial thread)
> trap number             = 12
> panic: page fault
> cpuid = 0
> 
> 
> The cause of the panic is quite obvious: the drmdev mutex is taken and held in
> drm_ioctl(), and radeon_cp_texture() can perform copyin and/or copyout, so it's
> a matter of chance (or the right workload) to hit a page fault there.
> 
> What's not obvious is how to properly fix this.
> Any ideas?
> 
> Probably less important is what started to trigger the problem, because the
> code hasn't been changed in ages and I have never seen this issue before.
> But, d'oh, it seems that this issue has already been reported:
> http://www.mail-archive.com/freebsd-hackers_at_freebsd.org/msg67757.html
> 
> I would appreciate any help.
> Thanks!

-- 
Andriy Gapon