Re: Runaway intr, not flash related

From: Doug Barton <dougb_at_FreeBSD.org>
Date: Sat, 14 Aug 2010 20:30:44 -0700

On 08/14/2010 09:54, b. f. wrote:
>> My "runaway intr" problem with flash has been continuing all
>> along, but since no one has been interested in helping with it I
>> haven't reported it for a while. However, today, for the first
>> time, it happened when I had not run flash at all since I booted.
>>
>> My system: Dell D620, C2D, i386, SMP, r210908
>>
>> swi4: clock is the culprit again this time, but when flash
>> triggers this problem I sometimes see hdac as the culprit, FYI.
>
> I wouldn't say that no one is interested in helping.

Yes, that was perhaps too strongly worded, sorry. What I meant to say
was something more along the lines of, "Since I've tried everything that
everyone has suggested so far, and none of it has helped, and no new
suggestions have magically appeared, but I'm still having the same
problem ...."

> (And I think you've received a few more suggestions than your other
> recent message to freebsd-developers suggests.)  For my part, I find
> it a bit difficult to track the status of your interrupt problem, and
> the interactivity problem, which may or may not be related.

Sorry that I'm making your life difficult. :)  I have actually tried to
report stuff that people have asked for, but I'll give you the full
status update here.

Before I answer all your questions below though, I wanted to say 
something about why I don't think it's hardware related (although none 
of this is conclusive, and I agree I could be wrong).

The problem happens MOST often when I'm viewing a flash video, but it
can also happen other times. Interestingly, what often happens is that
everything is fine while I'm viewing the video, but intr runs away after
I close the window. I was sort of surprised by this myself, but now I
have verified it numerous times.

It happens without running flash (as I pointed out in this message) and
last night I was able to trigger it several times without running X at all.

It DOESN'T happen with loads that produce a lot more heat than my
typical desktop workloads (like, say, make -j2 buildworld).

The usual device that runs away is the clock, but sometimes (about 1 in
20) it's hdac.
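
One way to compare hardware interrupt sources such as hdac is to rank the
output of "vmstat -i" by rate. The following is only a minimal sketch: it
assumes the usual three-column layout (name, total, rate), the function
name rank_interrupts is hypothetical, and the sample figures are invented
for illustration, not taken from this machine. Interrupt rates alone also
would not show how much CPU time a thread like swi4: clock is using.

```python
def rank_interrupts(vmstat_text):
    """Return (name, total, rate) tuples sorted by rate, highest first.

    Assumes each data line ends with two integer columns (total, rate);
    everything before them is treated as the interrupt name.
    """
    rows = []
    for line in vmstat_text.splitlines():
        parts = line.rsplit(None, 2)  # split off the last two columns
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # skips the "interrupt total rate" header line
        name = name.strip()
        if name == "Total":
            continue  # skip the summary line
        rows.append((name, int(total), int(rate)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Invented sample in the shape of "vmstat -i" output (not real data).
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                        1520          0
irq16: hdac0 uhci3                 98211         45
cpu0: timer                      4321000       1999
cpu1: timer                      4320100       1998
Total                            8740831       4042
"""

if __name__ == "__main__":
    for name, total, rate in rank_interrupts(SAMPLE):
        print(f"{rate:>8}/s  {name}")
```

To use it on a live system, the text of "vmstat -i" would be passed in
place of SAMPLE, once while things are fine and once during a runaway.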

Back to flash, for a while I couldn't get it to work at all from the
browser, but the hulu desktop binary worked great. Right now, the hulu
desktop app causes the runaway almost immediately, but flash is working
great in the browser.

To me this sounds a lot like software, not hardware.

> --Have you ruled out any contribution from overheating, like I think
>  you were experiencing before with this machine?

I think so, yes. I've been keeping a very close eye on the heat, and
blowing out the heat sink a lot more often. Also, see above about the
fact that higher-heat loads don't make the problem happen more often.

> At one point, you were following some of mav_at_'s suggestions for
> power-saving, but then you posted a configuration that suggested
> that you had abandoned some of these settings and returned to the
> defaults.

Right, the problem was happening with those settings, and I wanted to
make sure that those weren't the cause. I didn't experience any more
heat problems with the defaults, so I left it alone.

> So are you running hot, or being throttled now?

No, and yes. :)  I'm still using powerd, and it seems to be working as 
expected.

> Have you tried running at a kern.hz < 1000, with throttling disabled,
> to see if that ameliorates the problem?

That's how I was running with the settings mav suggested. I just tried
reducing kern.hz back to 100, but didn't adjust the throttling settings.
We'll see how that works.

> --What graphics driver are you using?

For X, the nv driver from xorg. When I was on the console last night I
was using the xterm console driver.

> You were using x11/nvidia-driver, but then after the kib_at_ and alc_at_'s
> vm changes that led to problems with that driver, I thought you were
> using x11-drivers/xf86-video-nv -- is that still the case?  Does
> switching drivers seem to influence the frequency or severity of the
>  problems?

I've tried numerous versions of the nvidia driver over the last several
months, with various locking patches to make them work on -current, and
they introduce their own problems (generally freeze ups, or random
panics, as I've reported previously). Therefore I haven't been using the
nvidia driver at all recently.

> --When do you experience these problems?  Do they ever occur when you
> are _not_ running X?

Yes.

> Have you tried temporarily disabling your usb and network devices, to
> see if they are contributing to the problem?

I haven't tried disabling usb, but I have tried using a different
network card (and turning off the built-in one with the hardware
switch); no help there.

> Are you able to watch flash videos from local media, as opposed to
> those from a remote site, without problems?

I haven't tried that, but I will, thanks.

> --Did you follow mav_at_'s suggestion to use something other than your
> hpet for the eventcounter and timecounter?  The possible use of the
> hpet is one of the main differences between the new and old timing
> code, and you reported some problems with your hpet earlier.

Yes, I tried his suggestions; they didn't help. If anyone has other
suggestions I'd be glad to try them.

> --Did you follow attilio_at_'s suggestion to obtain scheduling traces
> for the interactivity problem, as described in
> src/tools/sched/schedgraph.py?

IIRC when I asked him about that Attilio said that I was actually better
off with the dtrace suggestion I also received, but it's possible that I
misunderstood. I'll take another look at this.


Doug

-- 

	Improve the effectiveness of your Internet presence with
	a domain name makeover!    http://SupersetSolutions.com/

	Computers are useless. They can only give you answers.
			-- Pablo Picasso
Received on Sun Aug 15 2010 - 01:30:47 UTC
