Re: X becomes unresponsive with nvidia and hardlocks with gdb (was Re: X becomes unresponsive with nvidia / xscreensaver and desktop panics)

From: Garrett Cooper <yanefbsd_at_gmail.com>
Date: Sun, 25 Jan 2009 01:18:48 -0800
On Wed, Jan 14, 2009 at 9:04 AM, Stefan Ehmann <shoesoft_at_gmx.net> wrote:
> On Tuesday 13 January 2009 15:49:07 O. Hartmann wrote:
>> Christoph Mallon wrote:
>> > O. Hartmann schrieb:
>> >> Garrett Cooper wrote:
>> >>> - I've rebuilt my xorg-server a few times and it's still claiming that
>> >>> it was built with 7.1-RC2 -_-...
>> >>> - I can get the Xorg server to go full tilt by just compiling
>> >>> something, like buildworld, via an xterm.
>> >>
>> >> I also experienced this, but not only with the mentioned 'nv' driver,
>> >> also with 'vesa'. Compiling a kernel or running buildworld, even with no
>> >> -jX option, sometimes leaves the box unresponsive: the mouse jumps and
>> >> the keyboard does not respond, sometimes for more than a minute. This
>> >> happens on a FBSD 8.0-CUR/AMD64 UP box and it also happens on a FreeBSD
>> >> 7.1-STABLE box (also amd64, 4 cores). But on SMP boxes I noticed that
>> >> the problem is not as severe as on UP boxes.
>> >> We also had several P4 32bit machines with HTT enabled around, one of
>> >> them was built with FreeBSD 7.1-STABLE and Xorg, and I never noticed the
>> >> bumpy X11 there, even with HTT disabled, running UP with Xorg's vesa driver.
>> >>
>> >> Well, it also seems to make no difference whether I use the USB2 stack
>> >> (in FreeBSD 8) or the old one.
>> >
>> > I regularly can observe that batch jobs like large compile jobs get a
>> > lower priority number (i.e. they get preferred by the scheduler) than
>> > X on my UP machine with SCHED_ULE (7.0-STABLE from early July). Just a
>> > bit of X activity (switching desktops, scrolling in a browser etc.) is
>> > enough to make its priority number higher than that of make+gcc.
>> > This also causes interesting cascades like stuttering music:
>> > - gcc preferred over X
>> > - X cannot redraw xterm fast enough
>> > - buffer of xterm fills
>> > - mplayer cannot write its status line to xterm and blocks
>> > - because mplayer blocks it cannot feed more data to the sound device
>> > - music stutters
>>
>> ... try moving/dragging an xterm rapidly over your screen while playing
>> music, copying a file or encoding, decoding or even compiling something.
>> In my case, those activities suddenly stop running. It is sometimes only
>> noticeable when listening to music.
>> I have also noticed those ghost-stops without X11, when high disk I/O
>> and/or network I/O happens. This is especially harsh on an NFS server. As I
>> mentioned, this is most significant on UP boxes, but can also be observed on
>> some slower/older SMP hardware (both with FreeBSD 7.1-STABLE and FreeBSD
>> 8.0-CURRENT).
>
> I've been observing this since 7.0 IIRC. With 6.x I never noticed this.
>
> When performing a portupgrade or running mencoder everything sometimes
> becomes very sluggish. E.g. redrawing windows takes several seconds, and even
> moderately sized videos can't be played back smoothly anymore. 4BSD seems to
> be better than ULE but still not perfect.
>
> As a workaround I run load intensive tasks with idprio(1); nice(1) doesn't
> really help that much.
>
> Unfortunately I've never been able to reduce this to a simple reproducible
> test case. For example, I don't know whether Xorg has to be involved or what
> kind of load causes the problems.
>
> But my conclusions were similar to your findings. Simply using lots of CPU
> doesn't affect the responsiveness of the system. But when high disk or network
> I/O is involved the problem occurs.
>
> Also, I figured this would not happen on every system. Otherwise there would
> have been more complaints on the lists. This also seems to correlate with your
> statement that the problem is significant on UP hardware; I run a 4+ year old
> Athlon XP CPU.

I just upgraded Xorg and manually patched the nvidia-driver Makefile
to pick up the latest version from nvidia's site, and I'm no longer
seeing the high CPU issues with compiling like I used to, so this very
problem may have gone away... The release notes for the newest driver
hint at a performance regression which was caught after the fact, and
also mention that they enabled more card features and fixed a few
bugs.

I submitted a ports PR for this, so hopefully it'll hit the tree in a few days.

The other nvidia-driver sets should probably be updated as well.

Cheers :),
-Garrett
Received on Sun Jan 25 2009 - 08:19:29 UTC