power/temp management on a Dell D620 w/C2D

From: Doug Barton <dougb_at_FreeBSD.org>
Date: Sat, 26 Dec 2009 17:17:45 -0800

I'm starting a new thread since my problems do not seem directly
related to the nvidia drivers, and the other thread(s) sort of
diverged. :)

I'm very grateful to everyone who has provided suggestions, and I am
happy to report that I have powerd working fine about 90% of the time
now that I have disabled throttling. On about 1 boot in 10 I still get the
"X begins but doesn't finish" problem that I've described previously, which
finally resulted in the panic that I just sent to the list.

More below ...

b. f. wrote:
> On 12/21/09, Doug Barton <dougb_at_freebsd.org> wrote:
>> Doug Barton wrote:
>>> I did, but the problem got worse. With the following: 
>>> performance_cx_lowest="C2"      # Online CPU idle state 
>>> economy_cx_lowest="C2"          # Offline CPU idle state
> 
> I don't see any obvious problems in your listings.  But since
> others have reported difficulties when using the nvidia driver with
> both throttling and powerd(8), why don't you disable throttling,
> and see what happens?:

OK, I not only did that, but went the whole hog on the
recommendations listed at http://wiki.freebsd.org/TuningPowerConsumption:

hw.pci.do_power_nodriver=3
hint.p4tcc.0.disabled=1
hint.acpi_throttle.0.disabled=1
hint.apic.0.clock=0
kern.hz=100
hint.atrtc.0.clock=0
hint.pcm.0.buffersize=65536
hint.pcm.1.buffersize=65536
hw.snd.feeder_buffersize=65536
hw.snd.latency=7

I also have in rc.conf:

performance_cpu_freq="NONE"     # Online CPU frequency
economy_cpu_freq="NONE"         # Offline CPU frequency
performance_cx_lowest="C3"      # Online CPU idle state
economy_cx_lowest="C3"          # Offline CPU idle state

The combination of all these has brought
hw.acpi.thermal.tz0.temperature to around 68-72C at idle, and to
75-80C in "normal" use, which is back to where it used to be.

If I add powerd:
powerd_flags="-a adaptive -b adaptive -n adaptive"

It drops the temp further. It idles in the low 60s and gets up to the
high 60s - low 70s for extra work (like "compact folders" in
thunderbird). Also, the occasional "flicker" on the screen that I used
to experience with powerd is gone, I assume because it and throttling
are no longer fighting one another.

> in /boot/device.hints or /boot/loader.conf.  Make sure that 
> dev.cpu.0.freq_levels, dev.est.0.freq_settings, etc. show a
> reasonable range of frequencies,

Those 3 all show the same ranges:
2333/31000 2000/26000 1667/22000 1333/17000 1000/13000
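
For reference, cpufreq(4) documents each entry in that list as
frequency/power, in MHz and milliwatts. A minimal Python sketch,
assuming the sysctl value is available as a plain string; the names
parse_freq_levels and LEVELS are made up for illustration:

```python
def parse_freq_levels(text):
    """Split a freq_levels string into (MHz, mW) tuples."""
    levels = []
    for entry in text.split():
        mhz, milliwatts = entry.split("/")
        levels.append((int(mhz), int(milliwatts)))
    return levels


# The value reported in this message for dev.cpu.0.freq_levels.
LEVELS = parse_freq_levels(
    "2333/31000 2000/26000 1667/22000 1333/17000 1000/13000"
)

lowest = min(LEVELS)    # tuple with the smallest frequency
highest = max(LEVELS)   # tuple with the largest frequency
print(f"range: {lowest[0]}-{highest[0]} MHz, "
      f"{lowest[1] / 1000:.0f}-{highest[1] / 1000:.0f} W")
# prints "range: 1000-2333 MHz, 13-31 W"
```

Read this way, the lowest level (1000 MHz) is listed at roughly 13 W
against 31 W for the highest.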

> and that your cpus are using the lowest when lightly loaded. 

I assume that you mean dev.cpu.0.freq? That seems to stay at 1000 when
the system is lightly loaded. If I toggle debug.cpufreq.verbose and
watch the log I get something like this:

cpufreq: skipping info-only driver acpi_perf1
cpufreq: adding abs setting 2333 at head
cpufreq: adding abs setting 2000 after 2333
cpufreq: adding abs setting 1667 after 2000
cpufreq: adding abs setting 1333 after 1667
cpufreq: adding abs setting 1000 after 1333
cpufreq: setting abs freq 1000 on est1 (cpu 1)
cpufreq: get returning known freq 1000
last message repeated 123 times

Which all seems nice.

> If they don't, or if there continue to
> be problems, consider setting debug.cpufreq.lowest to remove
> problematic frequencies, as in cpufreq(4). 

That one is a bit weird, since it's currently set to 0.

> And keep an eye on the
> reported temperatures, because the computer may run hotter without
> throttling. Can you run X without problems?

Yes, I've run X on it all along.

> You may also want to try:
> 
> hint.ata.0.pm_level=1

What will that do?

> I also think that ~75C is a bit high for a lightly loaded machine. 

Agreed. The 65-70C that it is averaging now with all the settings
above is a lot healthier in my opinion. This laptop has always run
hot; it's endemic to the breed, but there is hot and then there is HOT.

> Earlier, you said that you noticed an increase in operating 
> temperatures, beginning several weeks ago.  Do you remember typical
> values for the temperatures before the increase? 

Yes, it was common for the temp to stay somewhere in the 70s for
typical light usage (X, thunderbird, firefox, pidgin) and jump into
the high 80s to low 90s when building world, especially with -j2.

> Did you increase  the machine's workload, or change BIOS settings?

No.

> What temperatures
> are reported under Windows with power-saving when the machine is
> lightly loaded? 

I found an interesting utility called SpeedFan. It reports a very
thorough list of temperatures from various sensors (assuming it is to
be believed). Here is an "average" reading:

GPU:	86C
HD0:	48C
Temp1:	76C
Core0:	70C
Core1:	71C
Core:	80C
DIMM:	80C
Temp4:	64C

That seemed to indicate (as someone else mentioned) that the GPU is
actually the culprit in terms of the major heat source.
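
To make that comparison explicit, here is a minimal Python sketch that
ranks the readings above, hottest first. The names READINGS and
parse_readings are illustrative only; the values are copied from the
list above, with the tab after each label written as a space:

```python
READINGS = """\
GPU: 86C
HD0: 48C
Temp1: 76C
Core0: 70C
Core1: 71C
Core: 80C
DIMM: 80C
Temp4: 64C
"""


def parse_readings(text):
    """Map each sensor label to its temperature in whole degrees Celsius."""
    readings = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        value = value.strip()
        if not value.endswith("C"):
            continue  # skip blank or malformed lines
        readings[name.strip()] = int(value[:-1])
    return readings


temps = parse_readings(READINGS)
# Sort by temperature, highest first.
ranked = sorted(temps.items(), key=lambda item: item[1], reverse=True)
for name, celsius in ranked:
    print(f"{name:6s} {celsius}C")
```

The first line printed is the GPU at 86C, 6C above the next-hottest
sensors in the list.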

> You said that you blew out the ducts and grilles,
> but did you look to see that there were no remaining obstructions
> afterward? 

I had some repair work done on this laptop almost a year ago, where
they removed everything to put in a new motherboard, and I haven't
had the case open since. I visually inspected the ducts/grilles from
the outside. I plan to open it up soonish, but I've been busy with
the holidays, etc.

> Have you looked to see if the heat sink is firmly
> seated on the cpu, with no air gap, but only an adequate amount of
> thermal interface material between the two?

No, I don't know for sure. I will take a look at it when I have the
case open.


Thanks again,

Doug

-- 

	Improve the effectiveness of your Internet presence with
	a domain name makeover!    http://SupersetSolutions.com/
Received on Sun Dec 27 2009 - 00:17:47 UTC
