Re: [drm2][panic] Running XOrg with SNA enabled causes system panic after few hours on G33

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Sat, 15 Jun 2013 08:17:46 +0300
On Fri, Jun 14, 2013 at 10:16:15AM +0300, Artyom Mirgorodskiy wrote:
> Thank you! This patch also solves my issue (unable to shut down):
> http://lists.freebsd.org/pipermail/freebsd-current/2013-May/042011.html
> 
> On Tuesday 11 June 2013 12:34:16 Oleg Sidorkin wrote:
> > Hello.
> > 
> > I'm running recent 9.1/stable with the recent XOrg on the system with
> > G33 chipset.
> > My pciconf -lvb output is here: http://pastebin.com/LDzKzf1i
> > 
> > If I add
> > Option "AccelMethod" "sna"
> > to my xorg.conf, the system panics after a few hours:
> > 
> > (kgdb) bt
> > #0  doadump (textdump=<value optimized out>)
> >     at /usr/src/sys/kern/kern_shutdown.c:272
> > #1  0xffffffff8050a19f in kern_reboot (howto=260)
> >     at /usr/src/sys/kern/kern_shutdown.c:449
> > #2  0xffffffff8050a6a3 in panic (fmt=0x104 <Address 0x104 out of bounds>)
> >     at /usr/src/sys/kern/kern_shutdown.c:637
> > #3  0xffffffff80765f77 in vm_page_insert (m=0xfffffe0226126b50,
> >     object=0xfffffe0208de8488, pindex=3) at /usr/src/sys/vm/vm_page.c:914
> > #4  0xffffffff814a889d in i915_gem_pager_fault (vm_obj=0xfffffe0208de8488,
> >     offset=3, prot=<value optimized out>, mres=0xffffff824705b680)
> >     at /usr/src/sys/modules/drm2/i915kms/../../../dev/drm2/i915/i915_gem.c:1429
> > #5  0xffffffff80747fe3 in dev_pager_getpages (object=0xfffffe0208de8488,
> >     ma=0xffffff824705b680, count=1, reqpage=<value optimized out>)
> >     at /usr/src/sys/vm/device_pager.c:260
> > #6  0xffffffff80754bb6 in vm_fault_hold (map=0xfffffe000c247188,
> >     vaddr=34458505216, fault_type=2 '\002', fault_flags=0, m_hold=0x0)
> >     at vm_pager.h:128
> > #7  0xffffffff80756ca3 in vm_fault (map=0xfffffe000c247188, vaddr=34458505216,
> >     fault_type=<value optimized out>, fault_flags=0)
> >     at /usr/src/sys/vm/vm_fault.c:229
> > #8  0xffffffff8078e01f in trap_pfault (frame=0xffffff824705bc40, usermode=1)
> >     at /usr/src/sys/amd64/amd64/trap.c:762
> > #9  0xffffffff8078e864 in trap (frame=0xffffff824705bc40)
> > 
> > (kgdb) bt full
> > #0  doadump (textdump=<value optimized out>)
> >     at /usr/src/sys/kern/kern_shutdown.c:272
> > No locals.
> > #1  0xffffffff8050a19f in kern_reboot (howto=260)
> >     at /usr/src/sys/kern/kern_shutdown.c:449
> >         _ep = (struct eventhandler_entry *) 0x0
> >         _el = (struct eventhandler_list *) 0xfffffe0009c7f700
> >         first_buf_printf = 1
> > #2  0xffffffff8050a6a3 in panic (fmt=0x104 <Address 0x104 out of bounds>)
> >     at /usr/src/sys/kern/kern_shutdown.c:637
> >         td = (struct thread *) 0x0
> >         bootopt = <value optimized out>
> >         newpanic = <value optimized out>
> >         ap = {{gp_offset = 8, fp_offset = 48,
> >     overflow_arg_area = 0xffffff824705b570,
> >     reg_save_area = 0xffffff824705b490}}
> >         panic_cpu = 3
> >         buf = "vm_page_insert: page already inserted", '\0' <repeats 218 times>
> > #3  0xffffffff80765f77 in vm_page_insert (m=0xfffffe0226126b50,
> >     object=0xfffffe0208de8488, pindex=3) at /usr/src/sys/vm/vm_page.c:914
> >         root = 0x0
> > #4  0xffffffff814a889d in i915_gem_pager_fault (vm_obj=0xfffffe0208de8488,
> >     offset=3, prot=<value optimized out>, mres=0xffffff824705b680)
> > 
> > (kgdb) up 4
> > #4  0xffffffff814a889d in i915_gem_pager_fault (vm_obj=0xfffffe0208de8488,
> >     offset=3, prot=<value optimized out>, mres=0xffffff824705b680)
> >     at /usr/src/sys/modules/drm2/i915kms/../../../dev/drm2/i915/i915_gem.c:1429
> > 1429            vm_page_insert(m, vm_obj, OFF_TO_IDX(offset));
> > (kgdb) p vm_obj
> > $1 = 0xfffffe0208de8488
> > (kgdb) p m->object
> > $2 = 0xfffffe0208de8488
> > 
> > It works fine for weeks without Option "AccelMethod" "sna".
> > 
> > I replaced
> >  vm_page_insert(m, vm_obj, OFF_TO_IDX(offset));
> > with the code
> >        if (m->object==NULL){
> >            vm_page_insert(m, vm_obj, OFF_TO_IDX(offset));
> >        }
> >        else{
> >            if(m->object!=vm_obj)
> >                panic("i915_gem_pager_fault: tried to assign already assigned page to the different object");
> >        }
> > and it worked with SNA enabled for about a week with no problems. But
> > I'm not sure that is a good solution.
> > 
> > I can provide additional info if required.
> > 
> > Thanks
> > --
> > Oleg Sidorkin
> > _______________________________________________
> > freebsd-x11_at_freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-x11
> > To unsubscribe, send any mail to "freebsd-x11-unsubscribe_at_freebsd.org"
> -- 
> Artyom Mirgorodskiy
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"

I did not see the original mail with the backtrace.

FWIW, it seems that the issue is that another thread might have faulted
on the same GTT offset and bound the page before the panicked thread.
If this is indeed the situation, then the proper fix is to check for
the race, and not to just avoid the insertion.  Re-instantiating the
fences is particularly wrong IMO.

Try this patch (untested, I only compiled it).

diff --git a/sys/dev/drm2/i915/i915_gem.c b/sys/dev/drm2/i915/i915_gem.c
index 8ce8bec..c505cdb 100644
--- a/sys/dev/drm2/i915/i915_gem.c
+++ b/sys/dev/drm2/i915/i915_gem.c
@@ -1362,7 +1362,6 @@ unlocked_vmobj:
 	cause = ret = 0;
 	m = NULL;
 
-
 	if (i915_intr_pf) {
 		ret = i915_mutex_lock_interruptible(dev);
 		if (ret != 0) {
@@ -1372,6 +1371,23 @@ unlocked_vmobj:
 	} else
 		DRM_LOCK(dev);
 
+	/*
+	 * Since the object lock was dropped, other thread might have
+	 * faulted on the same GTT address and instantiated the
+	 * mapping for the page.  Recheck.
+	 */
+	VM_OBJECT_WLOCK(vm_obj);
+	m = vm_page_lookup(vm_obj, OFF_TO_IDX(offset));
+	if (m != NULL) {
+		if ((m->flags & VPO_BUSY) != 0) {
+			DRM_UNLOCK(dev);
+			vm_page_sleep(m, "915pee");
+			goto retry;
+		}
+		goto have_page;
+	} else
+		VM_OBJECT_WUNLOCK(vm_obj);
+
 	/* Now bind it into the GTT if needed */
 	if (!obj->map_and_fenceable) {
 		ret = i915_gem_object_unbind(obj);
@@ -1425,8 +1441,9 @@ unlocked_vmobj:
 		goto retry;
 	}
 	m->valid = VM_PAGE_BITS_ALL;
-	*mres = m;
 	vm_page_insert(m, vm_obj, OFF_TO_IDX(offset));
+have_page:
+	*mres = m;
 	vm_page_busy(m);
 
 	CTR4(KTR_DRM, "fault %p %jx %x phys %x", gem_obj, offset, prot,

Received on Sat Jun 15 2013 - 03:17:57 UTC
