Re: hwpstate_intel hangs kernel

From: Andreas Nilsson <andrnils_at_gmail.com>
Date: Wed, 27 May 2020 20:25:52 +0200
On Fri, May 22, 2020 at 1:57 AM Diane Bruce <db_at_db.net> wrote:

> On Wed, Feb 05, 2020 at 02:45:50PM +0100, Andreas Nilsson wrote:
>
> Ok I am going to respond to this old email from February..
>
> > Hello,
> >
> > I upgraded to a newer version, git 87d669d3863-c266265, and I do not
> > experience the random hang anymore. The machine still hangs on boot on
> > "hwpstate_intel0: <Intel Speed Shift> on cpu0" unless I set
> > 'hint.hwpstate_intel.0.disabled="1"' in loader.conf.
> >
>
> As a few others know on IRC I ran into exactly this same problem
> on a brand new Lenovo Carbon. I missed this thread somehow.
> I also had to bisect the commit. Would it be possible to put
> a note into UPDATING and default to disabled=1 for now? ;)
>


Well, I've been trying to chase this a bit more, but I could sure use some
help from more experienced kernel developers.

With debug.hwpstate_verbose="1" in loader.conf and booting in verbose mode,
I get this:

pcib0: allocated type 4 (0x3f8-0x3f8) for rid 0 of uart0
uart0 failed to probe at port 0x3f8 irq 4 on isa0
pcib0: allocated type 4 (0x2f8-0x2f8) for rid 0 of uart1
uart1 failed to probe at port 0x2f8 irq 3 on isa0
isa_probe_children: probing PnP devices
AcpiOsExecute: task queue not started
cpu0: hwpstate registered
AcpiOsExecute: task queue not started
cpu1: hwpstate registered
AcpiOsExecute: task queue not started
cpu2: hwpstate registered
AcpiOsExecute: task queue not started
cpu3: hwpstate registered
hwpstate_intel0: <Intel Speed Shift> on cpu0
hwpstate_intel0: hwpstate_attach1
hwpstate_intel0: hwpstate_attach2
hwpstate_intel0: hwpstate_attach3

where the hwpstate_attachX lines come from
device_printf(dev, "hwpstate_attachX\n"); calls I've sprinkled in to try to
find where it actually fails.

I'm not sure whether device_printf output appears immediately. The
modifications were made to the function intel_hwpstate_attach, around line
480 in sys/x86/cpufreq/hwpstate_intel.c:

        /* ecx */
        if (cpu_power_ecx & CPUID_PERF_BIAS)
                sc->hwp_perf_bias = true;

        ret = set_autonomous_hwp(sc);
        device_printf(dev, "hwpstate_attach3\n");
        if (ret) {
                device_printf(dev, "hwpstate_attach3a %i\n", ret);
                return (ret);
        }
        device_printf(dev, "hwpstate_attach4\n");

Any ideas on what to test? I'm curious about the "AcpiOsExecute: task queue
not started" lines, but I've not had the chance to see whether they are
present on a computer that boots successfully with the hwpstate driver.
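
For reference, the loader.conf settings mentioned in this thread would look
roughly like the sketch below. The first two settings are the ones quoted
above; boot_verbose is an assumption about how verbose mode was enabled (it
could just as well have been selected from the loader menu).

```conf
# /boot/loader.conf (sketch; only the first two settings come from this thread)

# Workaround from this thread: disable hwpstate_intel unit 0
hint.hwpstate_intel.0.disabled="1"

# Extra debug output from the hwpstate driver
debug.hwpstate_verbose="1"

# Assumption: one way to get a verbose boot without using the loader menu
boot_verbose="YES"
```

With the hint line active the driver is disabled, so it would have to be
commented out to reproduce the hang and see the debug output.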

Best regards
Andreas


> ...
> >
> > Best regards
> > Andreas
> >
> > On Sat, Feb 1, 2020 at 11:26 PM Andreas Nilsson <andrnils_at_gmail.com> wrote:
> >
> > > Hello Conrad,
> > >
> > > thank you Andrey for bisecting! I'll try with that hint and see how it
> > > works for me.
> > >
> > > Best regards
> > > Andreas
> > >
> > > On Sat, Feb 1, 2020, 18:18 Conrad Meyer <cem_at_freebsd.org> wrote:
> > >
> > >> Hi Andrey,
> > >>
> > >> Please try 'hint.hwpstate_intel.0.disabled="1"' as a workaround for now.
> > >>
> > >> I think I have identified at least one problematic piece of code,
> > >> although I don't know if it's the root cause.  I will go ahead and fix
> > >> that, which may not fix the hang, and also add some debug printfs that
> > >> can be enabled to help identify the real issue.
> > >>
> > >> Thanks for the report and bisect.
> > >>
> > >> Best,
> > >> Conrad
> > >>
> > >> On Sat, Feb 1, 2020 at 6:06 AM Andrey V. Elsukov <bu7cher_at_yandex.ru>
> > >> wrote:
> > >> >
> > >> > 31.01.2020 18:11, Andrey V. Elsukov wrote:
> > >> > > On 24.01.2020 19:52, Andreas Nilsson wrote:
> > >> > >> It hangs during kernel boot and the last message printed on
> > >> > >> console is:
> > >> > >> hwpstate_intel0: <Intel Speed Shift> on cpu0
> > >> > >
> > >> > > Hi,
> > >> > >
> > >> > > Did you find the cause of this hang?
> > >> > > I also tried to update today from r350816 to r357330. But my
> > >> > > Lenovo X1 Carbon 4th hangs on the same message.
> > >> > >
> > >> >
> > >> > Hi,
> > >> >
> > >> > I have bisected the bad commit, it is r357002.
>
> Yep. I also had to bisect this from what is now some 5 months ago :-(
>
> Diane
> --
> - db_at_FreeBSD.org db_at_db.net http://www.db.net/~db
>
Received on Wed May 27 2020 - 16:26:06 UTC