Re: unkillable process consuming 100% cpu

From: Steve Kargl <sgk_at_troutmask.apl.washington.edu>
Date: Wed, 13 Nov 2019 09:49:37 -0800
On Wed, Nov 13, 2019 at 04:22:19PM +0100, Hans Petter Selasky wrote:
> On 2019-11-13 15:52, Steve Kargl wrote:
> >      at /usr/src/sys/amd64/amd64/trap.c:743
> > #7  0xffffffff808b0468 in trap (frame=0xfffffe00b460e0c0)
> >      at /usr/src/sys/amd64/amd64/trap.c:407
> > #8  <signal handler called>
> > #9  0x0000000000000000 in ?? ()
> > #10 0xffffffff817d2c0f in radeon_ttm_tt_to_gtt (ttm=0xfffff80061eeb248)
> >      at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:720
> > #11 radeon_ttm_tt_set_userptr (ttm=0xfffff80061eeb248, addr=1,
> >      flags=2147483647)
> 
> Hi,
> 
> I don't see any function call here. Can you try to double check the 
> backtrace?
> 
> Which version of FreeBSD is this?
> 

% uname -a (trimmed)
FreeBSD 13.0-CURRENT r353571

% kgdb /usr/lib/debug/boot/kernel/kernel.debug vmcore.2
(kgdb) bt
...
#7  0xffffffff808b0468 in trap (frame=0xfffffe00b460e0c0)
    at /usr/src/sys/amd64/amd64/trap.c:407
#8  <signal handler called>
#9  0x0000000000000000 in ?? ()
#10 0xffffffff817d2c0f in radeon_ttm_tt_to_gtt (ttm=0xfffff80061eeb248)
    at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:720
#11 radeon_ttm_tt_set_userptr (ttm=0xfffff80061eeb248, addr=1, 
    flags=2147483647)
    at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:804
#12 0xffffffff817adc9b in radeon_is_px (dev=0xfffff8017fe84e00)
    at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_device.c:156

Looking at radeon_ttm.c, line 720 is the if statement in this function:

static struct radeon_ttm_tt *radeon_ttm_tt_to_gtt(struct ttm_tt *ttm)
{
	if (!ttm || ttm->func != &radeon_backend_func)
		return NULL;
	return (struct radeon_ttm_tt *)ttm;
}

(kgdb) p ttm->func
$2 = (struct ttm_backend_func *) 0x2310000
(kgdb) p &radeon_backend_func
$4 = (struct ttm_backend_func *) 0xffffffff8186d870 <radeon_backend_func>

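AFAIK, 0x2310000 is not a valid kernel address; kernel pointers on
amd64, such as &radeon_backend_func above, sit in the upper half of
the address space.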
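
For reference, a minimal sketch of that sanity check. It assumes 48-bit
virtual addresses (4-level paging), so the upper canonical half begins at
0xffff800000000000; in_upper_half() is a hypothetical helper, not something
from the kernel or the radeon driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (4-level paging), so the upper
 * canonical half of the address space starts here.
 */
#define UPPER_HALF_START UINT64_C(0xffff800000000000)

/*
 * Hypothetical helper: true if addr lies in the upper canonical half,
 * which is where an amd64 kernel keeps its own mappings.
 */
static bool
in_upper_half(uint64_t addr)
{
	return (addr >= UPPER_HALF_START);
}
```

With these assumptions, 0x2310000 fails the check, while
0xffffffff8186d870 (&radeon_backend_func) and 0xfffff80061eeb248 (ttm)
both pass.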

(kgdb) p *ttm
$5 = {bdev = 0xffffffff819021ef, func = 0x2310000, dummy_read_page = 0x0, 
  pages = 0xfffff800612c0000, page_flags = 2173789980, num_pages = 0, 
  sg = 0x0, glob = 0x2a, swap_storage = 0xfffff8017fe84e00, 
  caching_state = (unknown: 145613312), 
  state = (tt_unbound | tt_unpopulated | unknown: 4294965248)}
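
The decimal fields in that dump are easier to judge in hex. Inside the
debugger, p/x does this directly; the sketch below is only a standalone
illustration, and to_hex() is a hypothetical helper name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: render a value that kgdb printed in decimal
 * as a hex string, writing at most len bytes into buf.
 */
static void
to_hex(uint64_t value, char *buf, size_t len)
{
	snprintf(buf, len, "0x%" PRIx64, value);
}
```

Applied to the values above, page_flags (2173789980) is 0x8191671c and the
unknown state bits (4294965248) are 0xfffff800. Both look more like halves
of kernel pointers than like flag words, which may mean that ttm does not
point at a real struct ttm_tt.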

Moving to frame 12 suggests that the stack is corrupt (whether that
happened in the crash itself or while writing the dump, I don't know):

(kgdb) frame 12
#12 0xffffffff817adc9b in radeon_is_px (dev=0xfffff8017fe84e00)
    at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_device.c:156
156             if (rdev->flags & RADEON_IS_PX)
(kgdb) p *dev
Cannot access memory at address 0xfffff8017fe84e00
(kgdb) p rdev
$25 = (struct radeon_device *) 0x0


-- 
Steve
Received on Wed Nov 13 2019 - 16:49:41 UTC
