Re: nvidia-driver crashing kernel on head

From: David Naylor <naylor.b.david_at_gmail.com>
Date: Sat, 17 Jul 2010 16:24:54 +0200
On Sunday 11 July 2010 22:14:44 Doug Barton wrote:
> On 07/08/10 14:52, Rene Ladan wrote:
> > On 08-07-2010 22:09, Doug Barton wrote:
> >> On Thu, 8 Jul 2010, John Baldwin wrote:
> >>> These freezes and panics are due to the driver using a spin mutex
> >>> instead of a
> >>> regular mutex for the per-file descriptor event_mtx.  If you patch the
> >>> driver
> >>> to change it to be a regular mutex I think that should fix the
> >>> problems.
> >> 
> >> Can you give an example? :) I don't mind creating a patch for all of
> >> them if you can illustrate what needs to be changed.
> > 
> > See the attached patch
> 
> In order to use 195.36.15 it was necessary to use the patch Rene sent,
> the suggestion from jhb previously to remove some locks, plus a bit
> more. The patch that got it working on HEAD for me (specifically
> r209633) is attached. With that patch I could start X, and run it for a
> while, but performance was very poor, even in comparison with the stock
> nv driver, and it crashed a couple times (although not nearly as bad as
> previously).
> 
> So based on other suggestions I tried the newest release version at
> nvidia, 256.35. Some of the same locking stuff was needed to patch it, a
> patch for the port which includes the locking patch is also attached. If
> you are running an amd64 system you'll have to type 'make makesum' after
> applying this patch to the port. I'm not sure this patch is complete, or
> what Alexey might want to do with the update, but it does create an
> accurate plist which means you can cleanly deinstall/pkg_delete when
> you're done.
> 
> With 256.35 performance and stability have both been quite good,
> comparable even to before the drama started. The only concern I have
> at this point is that I'm periodically getting a strange sort of "flash"
> popping up on my screen that I didn't get while I was running the nv
> driver recently. It looks sort of like the default X background (the
> tiny gray crosshatch) is popping through for just a split second.

I've been getting these messages on the console:

NVRM: Xid (0001:00): 16, Head 00000000 Count 000218d5
NVRM: Xid (0001:00): 8, Channel 00000000
NVRM: Xid (0001:00): 16, Head 00000000 Count 000218d6
NVRM: Xid (0001:00): 8, Channel 00000002

These messages are preceded by X locking hard.  I cannot VT switch to a normal
console, and sometimes the computer needs a hard reset (i.e. it does not respond
to the power button).  It appears to trigger only under heavy load, e.g.
make -C /usr/src -j8 buildworld

It also seems to interfere with interrupts for other subsystems: my network
drivers have been unreliable of late (watchdog timeouts).

This happens with 195.36.15 unpatched and 256.35 patched.  

I have not yet checked whether booting with WITNESS enabled works.
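
For anyone who wants to compare console logs, the NVRM Xid lines quoted earlier
in this message can be tallied with a short script.  This is only a rough
sketch: the regular expression is inferred from the four lines I pasted and
nothing else, so other driver versions may print a different format, and
parse_xid / count_xids are names I made up for illustration, not part of any
NVIDIA or FreeBSD tool.

```python
import re
from collections import Counter

# Pattern inferred only from the four console lines quoted in this message
# (assumption: other driver versions may format Xid messages differently).
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F]+:[0-9a-fA-F]+)\): (\d+), (.*)")


def parse_xid(line):
    """Return (device, xid_code, detail), or None if the line does not match."""
    m = XID_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), m.group(3).strip()


def count_xids(lines):
    """Count how often each Xid code appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        parsed = parse_xid(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "NVRM: Xid (0001:00): 16, Head 00000000 Count 000218d5",
        "NVRM: Xid (0001:00): 8, Channel 00000000",
        "NVRM: Xid (0001:00): 16, Head 00000000 Count 000218d6",
        "NVRM: Xid (0001:00): 8, Channel 00000002",
    ]
    for code, n in sorted(count_xids(sample).items()):
        print(f"Xid {code}: {n} occurrence(s)")
```

Feeding it a saved copy of the console output should make it easier to see
whether the same Xid codes show up with both driver versions.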

Regards

Received on Sat Jul 17 2010 - 12:25:07 UTC
