Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)

From: Bakul Shah <bakul_at_iitbombay.org>
Date: Fri, 27 Nov 2020 15:07:10 -0800
> On Nov 27, 2020, at 1:47 PM, Bakul Shah <bakul_at_iitbombay.org> wrote:
> 
>> On Nov 27, 2020, at 9:09 AM, Rebecca Cran <rebecca_at_bsdio.com> wrote:
>> 
>> On 11/27/20 4:29 AM, Hans Petter Selasky wrote:
>>> 
>>> Is the problem always triggered by hald? If you disable hald in rc.conf, does the system run for a longer period of time?
>> 
>> It turns out that disabling ntpd let the system run for a longer period of time - until I ran "sysctl sys" at which point I got a panic.
>> 
>> And this time the panic actually implicates amdgpu.ko, which is an improvement:
>> 
>> 
>> #9  0x0000000000000000 in ?? ()
>> #10 0xffffffff82a14c4e in amdgpu_device_get_pcie_replay_count ()
>>   from /boot/modules/amdgpu.ko
>> #11 0xffffffff82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
>> 
>> #12 0xffffffff80c06cc1 in sysctl_root_handler_locked (oid=0xfffffe02133ff000,
>>    arg1=0xfffffe016e360980, arg2=-8724518803888, req=0xfffffe016e360980,
>>    tracker=0xfffff81099af6280) at /usr/src/sys/kern/kern_sysctl.c:184
>> #13 0xffffffff80c0610c in sysctl_root (oidp=<optimized out>,
>>    arg1=0xfffff810aa27e650, arg2=-2100190360, req=0xfffffe016e360980)
>>    at /usr/src/sys/kern/kern_sysctl.c:2211
>> 
>> 
>> Since it _is_ a problem in amdgpu, I'll stop this thread and re-post on freebsd-x11.
> 
> FWIW, I am using amdgpu on a Ryzen 5 3500U system on a couple days old
> -current (r368025). "sysctl sys" complains about "unknown oid 'sys'".
> I am running hald & ntpd.  I had a few amdgpu-related panics initially
> but they vanished once I added
> 	PORTS_MODULES=graphics/drm-devel-kmod
> to /etc/src.conf to compile it along with the kernel. I am running
> GENERIC-NODEBUG. The machine gets rebooted when I install a new kernel
> (usually once a week).
> 
> My guess is some weird interaction rather than something in amdgpu.

To get sysctl sys working I compiled a GENERIC kernel from today's
368108 revision and so far there are no problems.

$ sysctl sys.device.drmn0.pcie_replay_count
sys.device.drmn0.pcie_replay_count: 0

sysctl -a also works.
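
For anyone who wants to script the same check, here is a minimal sketch in Python. It assumes the "name: value" output format shown above and that sysctl(8) exits non-zero for an unknown OID (as with the "unknown oid 'sys'" case). The helper names parse_sysctl_line and read_sysctl are made up for illustration. Note that reading this OID calls the same handler that appears in the backtrace, so only try it on a kernel where the manual query already works.

```python
import subprocess


def parse_sysctl_line(line):
    """Split one 'name: value' line of sysctl output into (name, value)."""
    name, sep, value = line.partition(": ")
    if not sep:
        raise ValueError("not a 'name: value' line: %r" % line)
    return name.strip(), value.strip()


def read_sysctl(name, runner=subprocess.run):
    """Return the value of one OID as a string, or None if sysctl fails.

    'runner' is injectable so the function can be exercised without
    touching a real kernel; by default it runs 'sysctl -n <name>'.
    """
    result = runner(["sysctl", "-n", name], capture_output=True, text=True)
    if result.returncode != 0:
        # Assumed to cover the unknown-OID case.
        return None
    return result.stdout.strip()
```

Calling read_sysctl("sys.device.drmn0.pcie_replay_count") should return the counter as a string on a kernel that exposes the OID, and None on one that does not.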

Last commit log on drm-devel-kmod (the last item may be what you're
running into):
Author: manu <manu_at_FreeBSD.org>
Date:   Mon Nov 9 13:37:12 2020 +0000

    drm-current-kmod/drm-devel-kmod: Update to latest version

    - Use acpi code from base (thanks to wulf_at_)
    - Add radeon/i386 patches (thanks to tilj_at_)
    - Translate O_ flags for linuxulator (thanks to Greg V)
    - Lot of linuxkpi cleanup
    - Hack for amdgpu when the IP isn't init properly, this happens
      on one of my laptop with a dGPU. We still don't support it but
      we don't panic when we load amdgpu
Received on Fri Nov 27 2020 - 22:07:15 UTC
