Re: Kernel crash during video transcoding

From: Alexandre Levy <a13xlevy_at_gmail.com>
Date: Sat, 15 Aug 2020 20:35:05 +0100
Hi,

I could finally generate a crash dump even with a black screen, I had to
guess I was in the crash handler and I type "dump" and enter which worked.
The driver logs "[drm] Cannot find any crtc or sizes" which I guess is the
reason why I couldn't see anything on my screen.

Back to the initial problem, I could start a kgdb session, loaded the
i915kms.ko symbols and here are the results :

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:394
#2  0xffffffff8049c26a in db_dump (dummy=<optimized out>,
dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>) at
/usr/src/sys/ddb/db_command.c:575
#3  0xffffffff8049c02c in db_command (last_cmdp=<optimized out>,
cmd_table=<optimized out>, dopager=1) at /usr/src/sys/ddb/db_command.c:482
#4  0xffffffff8049bd9d in db_command_loop () at
/usr/src/sys/ddb/db_command.c:535
#5  0xffffffff8049f048 in db_trap (type=<optimized out>, code=<optimized
out>) at /usr/src/sys/ddb/db_main.c:270
#6  0xffffffff80c1b374 in kdb_trap (type=3, code=0, tf=<optimized out>) at
/usr/src/sys/kern/subr_kdb.c:699
#7  0xffffffff8100ca98 in trap (frame=0xfffffe00d7567300) at
/usr/src/sys/amd64/amd64/trap.c:576
#8  <signal handler called>
#9  kdb_enter (why=0xffffffff811d5de0 "panic", msg=<optimized out>) at
/usr/src/sys/kern/subr_kdb.c:486
#10 0xffffffff80bd00be in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:902
#11 0xffffffff80bcfe53 in panic (fmt=0xffffffff81c8c7c8 <cnputs_mtx>
"\b\214\031\201\377\377\377\377") at /usr/src/sys/kern/kern_shutdown.c:839
#12 0xffffffff8100cee7 in trap_fatal (frame=0xfffffe00d7567600, eva=0) at
/usr/src/sys/amd64/amd64/trap.c:915
#13 0xffffffff8100c360 in trap (frame=0xfffffe00d7567600) at
/usr/src/sys/amd64/amd64/trap.c:212
#14 <signal handler called>
#15 _rw_wowned (c=0x2659c92217d5aa52) at /usr/src/sys/kern/kern_rwlock.c:270
#16 0xffffffff80ec23ed in vm_page_busy_acquire (m=0xfffffe00040ff9e8,
allocflags=16) at /usr/src/sys/vm/vm_page.c:884
#17 0xffffffff82b4e980 in remap_io_mapping (vma=0xfffff80315148300,
addr=<optimized out>, pfn=<optimized out>, size=<optimized out>,
iomap=<optimized out>)
    at
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/intel_freebsd.c:193
#18 0xffffffff82be1c5f in i915_gem_fault (dummy=<optimized out>,
vmf=<optimized out>)
    at
/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/gem/i915_gem_mman.c:367
#19 0xffffffff82cb5ddf in linux_cdev_pager_populate
(vm_obj=0xfffff80368501420, pidx=<optimized out>, fault_type=<optimized
out>, max_prot=<optimized out>,
    first=0xfffffe00d7567868, last=0xfffffe00d7567888) at
/usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:554
#20 0xffffffff80ea9e8f in vm_pager_populate (object=0x2659c92217d5aa52,
pidx=18446741874754451944, fault_type=0, max_prot=0 '\000',
first=<optimized out>, last=<optimized out>)
    at /usr/src/sys/vm/vm_pager.h:172
#21 vm_fault_populate (fs=<optimized out>) at /usr/src/sys/vm/vm_fault.c:444
#22 vm_fault_allocate (fs=<optimized out>) at
/usr/src/sys/vm/vm_fault.c:1028
#23 vm_fault (map=<optimized out>, vaddr=<optimized out>,
fault_type=<optimized out>, fault_flags=<optimized out>, m_hold=<optimized
out>) at /usr/src/sys/vm/vm_fault.c:1338
#24 0xffffffff80ea98ee in vm_fault_trap (map=0xfffffe00c0f539e8,
vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=0,
signo=0xfffffe00d7567ac4,
    ucode=0xfffffe00d7567ac0) at /usr/src/sys/vm/vm_fault.c:585
#25 0xffffffff8100d0de in trap_pfault (frame=0xfffffe00d7567b00,
usermode=<optimized out>, signo=<optimized out>, ucode=0xffffffff81d1de80
<w_locklistdata+160624>)
    at /usr/src/sys/amd64/amd64/trap.c:817
#26 0xffffffff8100c72c in trap (frame=0xfffffe00d7567b00) at
/usr/src/sys/amd64/amd64/trap.c:340
#27 <signal handler called>
#28 0x000000080296659a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffbf38
(kgdb) list *0xffffffff82be1c5f
0xffffffff82be1c5f is in i915_gem_fault
(/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/gem/i915_gem_mman.c:367).
362             ret = i915_vma_pin_fence(vma);
363             if (ret)
364                     goto err_unpin;
365
366             /* Finally, remap it using the new GTT offset */
367             ret = remap_io_mapping(area,
368                                    area->vm_start +
(vma->ggtt_view.partial.offset << PAGE_SHIFT),
369                                    (ggtt->gmadr.start +
vma->node.start) >> PAGE_SHIFT,
370                                    min_t(u64, vma->size, area->vm_end -
area->vm_start),
371                                    &ggtt->iomap);
(kgdb) list *0xffffffff82b4e980
0xffffffff82b4e980 is in remap_io_mapping
(/usr/ports/graphics/drm-devel-kmod/work/drm-kmod-drm_v5.3_4/drivers/gpu/drm/i915/intel_freebsd.c:193).
188                 pidx++, pa += PAGE_SIZE) {
189     retry:
190                     m = vm_page_grab(vm_obj, pidx, VM_ALLOC_NOCREAT);
191                     if (m == NULL) {
192                             m = PHYS_TO_VM_PAGE(pa);
193                             if (!vm_page_busy_acquire(m,
VM_ALLOC_WAITFAIL))
194                                     goto retry;
195                             if (vm_page_insert(m, vm_obj, pidx)) {
196                                     vm_page_xunbusy(m);
197                                     VM_OBJECT_WUNLOCK(vm_obj);
(kgdb) list *0xffffffff80ec23ed
0xffffffff80ec23ed is in vm_page_busy_acquire
(/usr/src/sys/vm/vm_page.c:884).
879                     if (vm_page_tryacquire(m, allocflags))
880                             return (true);
881                     if ((allocflags & VM_ALLOC_NOWAIT) != 0)
882                             return (false);
883                     if (obj != NULL)
884                             locked = VM_OBJECT_WOWNED(obj);
885                     else
886                             locked = false;
887                     MPASS(locked || vm_page_wired(m));
888                     if (_vm_page_busy_sleep(obj, m, m->pindex, "vmpba",
allocflags,

It seems like the problem occured when calling vm_page_busy_acquire(m,
VM_ALLOC_WAITFAIL) where m might be a NULL pointer ? I am very new to
kernel debugging so not sure where to go from there.

Thanks.

Le lun. 10 août 2020 à 12:04, Hans Petter Selasky <hps_at_selasky.org> a
écrit :

> Hi,
>
> On 2020-08-10 12:59, Alexandre Levy wrote:
> > Looking at the code, the error happens during the call to VM_OBJECT_WLOCK
> > (memory page locking for write ?) in the intel_freebsd.c (see [1] below).
> > I'm out for a few days but I'll try to dig more into it when I'm back
> next
> > weekend although I have no experience in the drm-devel-kmod codebase. In
> > the meantime if you have any suggestions on debugging this further I'm
> > happy to follow them.
>
> The problem is likely that the vm_obj is NULL.
>
> I think I recall that this function is special and can only be called
> from a certain context, unlike in Linux. Will need the full backtrace
> with line numbers in order to debug this.
>
> --HPS
>
Received on Sat Aug 15 2020 - 17:35:19 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:24 UTC