Re: Suddenly poweroff in 11-Current r300097

From: Kevin Oberman <rkoberman_at_gmail.com>
Date: Thu, 2 Jun 2016 17:08:48 -0700
On Thu, Jun 2, 2016 at 1:46 PM, O. Hartmann <ohartman_at_zedat.fu-berlin.de>
wrote:

> On Thu, 2 Jun 2016 10:26:22 -0700
> Kevin Oberman <rkoberman_at_gmail.com> wrote:
>
> > On Thu, Jun 2, 2016 at 7:41 AM, Hans Petter Selasky <hps_at_selasky.org>
> > wrote:
> >
> > > On 06/02/16 03:07, RayCherng Yu wrote:
> > >
> > > >> I got a suddenly poweroff in r300097 (and previous revision in April and
> > > >> May) when I built textproc/docproj.
> > > >> My machine is Macbook Pro 13 2011 early. I have checked the Apple website.
> > > >> My bios is the latest version.
> > > >> Actually it also happened in 10.3-STABLE.
> > > >> It happened when the machine load was heavy. Before it shutdown, the fan
> > > >> started to run very loudly. After several seconds (20 or 30 seconds), my
> > > >> laptop shutdown (poweroff directly) suddenly. It seems not happen with the
> > > >> AC power supply connected.
> > >>
> > > >> I installed both Mac OSX and FreeBSD (dual boot). It never happened in Mac
> > > >> OSX.
> > >>
> > >> My dmesg:
> > >> http://pastebin.com/QjZmbGCB
> > >>
> > >> My sysctl hw.acpi:
> > >>
> > >> hw.acpi.acline: 0
> > >> hw.acpi.battery.info_expire: 5
> > >> hw.acpi.battery.units: 1
> > >> hw.acpi.battery.state: 1
> > >> hw.acpi.battery.time: 87
> > >> hw.acpi.battery.life: 59
> > >> hw.acpi.cpu.cx_lowest: C8
> > >> hw.acpi.reset_video: 0
> > >> hw.acpi.handle_reboot: 1
> > >> hw.acpi.disable_on_reboot: 0
> > >> hw.acpi.verbose: 0
> > >> hw.acpi.s4bios: 0
> > >> hw.acpi.sleep_delay: 1
> > >> hw.acpi.suspend_state: S3
> > >> hw.acpi.standby_state: NONE
> > >> hw.acpi.lid_switch_state: NONE
> > >> hw.acpi.sleep_button_state: S3
> > >> hw.acpi.power_button_state: S5
> > >> hw.acpi.supported_sleep_state: S3 S4 S5
> > >>
> > >>
> > > Hi,
> > >
> > > > Do you have a temperature sysctl? Usually FreeBSD will shutdown the system
> > > > if the ACPI temperature exceeds some value. Maybe it would be better to
> > > > reduce the CPU load when the temperature goes up instead of facing a
> > > > shutdown?
> > >
> > > --HPS
> >
> >
> > The relevant information is probably found in dev.cpu. That is where all
> > temperature information is located as it is per-CPU, not per-system. Of
> > particular interest is dev.cpu.0.cx_lowest, dev.cpu.0.cx_supported, and
> > dev.cpu.0.freq_levels. A snapshot of dev.cpu.0 when the fan has cranked up,
> > but before shutdown would be nice, too.
> >
> > I see no hw.acpi.thermal information. This is very odd. These values
> > indicate what the system will do and is doing if it starts getting too hot.
> >
> > Is coretemp loaded? It is required to see the core temperatures and those
> > are almost certainly significant. It may account for the lack of thermal
> > information. Finally, a dmesg might be useful as it will tell us more about
> > just what thermal control techniques are enabled.
> >
> > Just to explain a bit on how this should work: when the temperature exceeds
> > some BIOS defined point, the system should "throttle" by pausing one of
> > every 8 clock cycles. If that does not fix the problem, then it rests for
> > two of every 8 and so on until the temperature is reduced. If it continues
> > to rise and reaches another BIOS set point, it will initiate an emergency
> > shutdown. If it reaches a CPU defined temperature, the power will shut off
> > immediately. Note that this is entirely a hardware function with no BIOS or
> > OS involvement. It should NEVER happen in normal operation as it is
> > triggered by a significant overtemp that threatens to destroy the CPU. I've
> > only seen it once when the CPU heat sink came loose on an old P4 system
> > several years ago.
> >
> > I should mention that I have zero experience with Apple hardware and it is
> > possible that they do some things differently than I have seen on other
> > hardware.
> > --
> > Kevin Oberman, Part time kid herder and retired Network Engineer
> > E-mail: rkoberman_at_gmail.com
> > PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683
>
> I have had such problems many times with older hardware. In most cases a
> "dried out" thermal conductive pad or grease was the cause: the CPU
> overheated due to ineffective thermal conductivity from the CPU's surface
> to the heat spreader/cooler. I recently had two laptops with this problem;
> using high-quality thermal grease solved it for me. In both cases, the
> formerly highly viscous thermal grease had become like dry mud. Same with
> pads.
>

Valid suggestion. If you have not worked with it before, keep the layer of
grease as thin as possible. Use quality grease, not pads or tape; they just
don't work as well. Good silicone thermal grease should remain effective for
a minimum of 10 years.

Also, clean your heat sinks! I clean the ones on my laptop about once a
year (I have to remove the keyboard to blow them out) and I see the
quiescent temperature drop by 10-15C and the temp under load can drop by
20C. As active cooling works on my laptop, it does not overheat, but it
does slow down on "buildworld -j6" and building ports like chromium and
libreoffice. Very significant.
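
If you want numbers to compare before and after cleaning (or to catch the
temperature while the fan is cranked up), something like the following rough
Python sketch could log the core temperatures during a build. It assumes
coretemp is loaded so that dev.cpu.N.temperature exists, and that sysctl
prints the value in a form like "54.0C". The function names, the two-core
default and the 5 second interval are only illustrative choices, not
anything taken from this thread.

```python
#!/usr/bin/env python3
"""Log CPU core temperatures via sysctl (sketch; assumes coretemp is loaded)."""
import re
import subprocess
import time


def parse_temperature(text):
    """Convert sysctl output such as '54.0C' to a float in degrees Celsius.

    Returns None when the text does not look like a temperature.
    """
    match = re.fullmatch(r"\s*(-?\d+(?:\.\d+)?)C\s*", text)
    return float(match.group(1)) if match else None


def read_core_temperature(core):
    """Read dev.cpu.<core>.temperature; None if it cannot be read."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", f"dev.cpu.{core}.temperature"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None  # no sysctl binary on this system
    if result.returncode != 0:
        return None  # OID missing, e.g. coretemp not loaded
    return parse_temperature(result.stdout)


def log_temperatures(cores, interval=5.0, samples=12):
    """Print one timestamped line of per-core readings every `interval` seconds."""
    for _ in range(samples):
        readings = [read_core_temperature(core) for core in cores]
        stamp = time.strftime("%H:%M:%S")
        cells = ["n/a" if r is None else f"{r:.1f}C" for r in readings]
        print(stamp, " ".join(cells))
        time.sleep(interval)


if __name__ == "__main__":
    # Core numbers are an assumption; adjust to the CPUs your system reports.
    log_temperatures(cores=range(2))
```

Run it in a second terminal while the build is going. If every reading shows
"n/a", the temperature sysctl is not there and coretemp is the first thing
to check.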
--
Kevin Oberman, Part time kid herder and retired Network Engineer
E-mail: rkoberman_at_gmail.com
PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683
Received on Thu Jun 02 2016 - 22:08:49 UTC
