Re: nvidia drivers mutex lock

From: blubee blubeeme <gurenchan_at_gmail.com>
Date: Thu, 8 Jun 2017 02:27:51 +0800
I was just looking through dmesg and noticed these:

Jun  6 21:40:52 blubee kernel: nvidia-modeset: Allocated GPU:0 (GPU-54a7b304-c99d-efee-0117-0ce119063cd6) @ PCI:0000:01:00.0
Jun  6 21:41:05 blubee kernel: NVRM: GPU at PCI:0000:01:00: GPU-54a7b304-c99d-efee-0117-0ce119063cd6
Jun  6 21:41:05 blubee kernel: NVRM: GPU Board Serial Number:
Jun  6 21:41:05 blubee kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus.
Jun  6 21:41:05 blubee kernel:
Jun  6 21:41:05 blubee kernel: NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Jun  6 21:41:05 blubee kernel: NVRM: GPU is on Board .
Jun  6 21:41:05 blubee kernel: NVRM: A GPU crash dump has been created. If possible, please run
Jun  6 21:41:05 blubee kernel: NVRM: nvidia-bug-report.sh as root to collect this data before
Jun  6 21:41:05 blubee kernel: NVRM: the NVIDIA kernel module is unloaded.
Jun  6 21:41:05 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
Jun  6 21:41:05 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
Jun  6 21:41:05 blubee kernel: vgapci0: child nvidia0 requested pci_enable_io
Jun  6 21:41:05 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
Jun  6 21:41:06 blubee kernel: nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000927c:0:0:0x0000000f
Jun  6 21:41:22 blubee kernel: .

That led me to this NVIDIA forum thread:
https://devtalk.nvidia.com/default/topic/985037/gtx-1070-quot-gpu-has-fallen-off-the-bus-quot-running-3d-games-in-arch-linux-/

maybe it could help somehow?
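
For anyone who wants to pick the Xid events out of a longer log, here is a
rough Python sketch. It assumes the syslog line format seen in this message
and that long lines may have been wrapped by a mail client. The names
unwrap_records and find_xid_events are made up for this example; they are
not part of any NVIDIA or FreeBSD tool.

```python
import re

# A syslog-style record starts with a timestamp such as "Jun  6 21:41:05 ".
TIMESTAMP_RE = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")
# Pattern modelled on the single Xid line quoted in this message (assumption:
# other Xid reports follow the same "Xid (PCI:<addr>): <code>, <text>" shape).
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9A-Fa-f:.]+)\): (\d+), (.*)")


def unwrap_records(log_text):
    """Join wrapped continuation lines back onto the record they belong to."""
    records = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if TIMESTAMP_RE.match(line) or not records:
            records.append(line)  # a new timestamped record begins here
        else:
            records[-1] += " " + line  # continuation of the previous record
    return records


def find_xid_events(log_text):
    """Return (pci_address, xid_code, message) tuples found in log_text."""
    events = []
    for record in unwrap_records(log_text):
        match = XID_RE.search(record)
        if match:
            events.append((match.group(1), int(match.group(2)), match.group(3)))
    return events


# Two records copied from the log in this message, wrapped as the mail was.
SAMPLE = """Jun  6 21:41:05 blubee kernel: NVRM: Xid (PCI:0000:01:00): 79, GPU has
fallen off the bus.
Jun  6 21:41:05 blubee kernel:
"""

if __name__ == "__main__":
    for pci, code, message in find_xid_events(SAMPLE):
        print(f"PCI {pci}: Xid {code}: {message}")
```

Fed the excerpt from this message, it should report a single event, Xid 79
on PCI 0000:01:00. The text of a saved log file can be passed in the same
way.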

Best,
Owen

On Tue, Jun 6, 2017 at 10:08 PM, blubee blubeeme <gurenchan_at_gmail.com> wrote:

> This is getting out of hand. Sometimes I can't even keep X running for ten
> minutes.
> I've tested all the suggestions in this thread and they just don't work.
>
> I have posted the output of "sysctl hw" here: https://paste2.org/
>
> With this CPU: hw.model: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
> In the BIOS on this laptop I can set graphics to either discrete or mshybrid.
>
> In the past I've tried disabling discrete and running mshybrid, but that
> always comes up with 0 screens found, even when just running Xorg -configure.
>
> Does anyone have tips on disabling the nvidia driver and running this CPU on
> the iGPU for a while?
>
> Best,
> Owen
>
> On Sun, Jun 4, 2017, 18:11 blubee blubeeme <gurenchan_at_gmail.com> wrote:
>
>> Thanks a lot! I'll give it a shot in a bit.
>>
>> Best,
>> Owen
>>
>> On Sun, Jun 4, 2017, 16:59 Tomoaki AOKI <junchoon_at_dec.sakura.ne.jp> wrote:
>>
>>> Yes. FreeBSD patches in x11/nvidia-driver/files are applied as usual.
>>>
>>> But beware! Sometimes upstream changes make some of the FreeBSD patches
>>> inapplicable (upstream incorporates them, makes incompatible changes, ...).
>>>
>>> For 381.22, current patchset applies and builds fine for me.
>>>
>>>
>>> On Sun, 04 Jun 2017 08:04:50 +0000
>>> blubee blubeeme <gurenchan_at_gmail.com> wrote:
>>>
>>> > I'm running with svn and I build with make.
>>> > If I use these steps, will the BSD-related patches be applied, etc.?
>>> >
>>> > Best,
>>> > Owen
>>> >
>>> > On Sun, Jun 4, 2017, 15:53 Tomoaki AOKI <junchoon_at_dec.sakura.ne.jp> wrote:
>>> >
>>> > > Hi.
>>> > >
>>> > > Not in the ports tree, but easily overridden by adding
>>> > >
>>> > >   DISTVERSION=381.22 -DNO_CHECKSUM
>>> > >
>>> > > on the make command line. The Makefile of x11/nvidia-driver has a
>>> > > mechanism to do so for anyone who requires a newer version (newer GPU
>>> > > support, etc.).
>>> > >
>>> > > If you're using portupgrade,
>>> > >
>>> > >   portupgrade -m 'DISTVERSION=381.22 -DNO_CHECKSUM' -f x11/nvidia-driver
>>> > >
>>> > > would do the same.
>>> > >
>>> > > If you installed it via pkg, there's no way to try. :-(
>>> > > (As it's pre-built.)
>>> > >
>>> > >
>>> > > On Sun, 04 Jun 2017 07:04:01 +0000
>>> > > blubee blubeeme <gurenchan_at_gmail.com> wrote:
>>> > >
>>> > > > Hi @tomoaki
>>> > > > Is that version of the nvidia driver currently in the ports tree? I
>>> > > > just checked but it seems not to be.
>>> > > >
>>> > > > @jeffrey
>>> > > > I just generated a new xorg.conf based on the force composition
>>> > > > setting and merged it with my previous one. I'll reboot and see if it
>>> > > > gives better performance.
>>> > > >
>>> > > > It seems like my system is locking up more frequently now. Sometimes
>>> > > > the screen locks right after a reboot, and it's reboot and pray.
>>> > > >
>>> > > > Best,
>>> > > > Owen
>>> > > >
>>> > > > On Sat, Jun 3, 2017, 21:59 Jeffrey Bouquet <jeffreybouquet_at_yahoo.com> wrote:
>>> > > >
>>> > > > > SOME LINES BOTTOM POSTED, SEE...
>>> > > > > --------------------------------------------
>>> > > > > On Fri, 6/2/17, Tomoaki AOKI <junchoon_at_dec.sakura.ne.jp> wrote:
>>> > > > >
>>> > > > >  Subject: Re: nvidia drivers mutex lock
>>> > > > >  To: freebsd-current_at_freebsd.org
>>> > > > >  Cc: "Jeffrey Bouquet" <jeffreybouquet_at_yahoo.com>, "blubee blubeeme" <gurenchan_at_gmail.com>
>>> > > > >  Date: Friday, June 2, 2017, 11:25 PM
>>> > > > >
>>> > > > >  Hi.
>>> > > > >  Version 381.22 (5 days newer than 375.66) of the driver states... [1]
>>> > > > >
>>> > > > >   Fixed hangs and crashes that could occur when an OpenGL context is
>>> > > > >   created while the system is out of available memory.
>>> > > > >
>>> > > > >  Can this be related to your hang?
>>> > > > >
>>> > > > >  IMHO, possibly allocating a new resource (using the os.lock_mtx
>>> > > > >  guard) without checking the lock first, while a previous request is
>>> > > > >  waiting for another, can cause the duplicated lock situation. And
>>> > > > >  high memory pressure would easily cause the situation.
>>> > > > >
>>> > > > >   [1] http://www.nvidia.com/Download/driverResults.aspx/118527/en-us
>>> > > > >
>>> > > > >  Hope it helps.
>>> > > > >
>>> > > > >
>>> > > > >  On Thu, 1 Jun 2017 22:35:46 +0000 (UTC)
>>> > > > >  Jeffrey Bouquet <jeffreybouquet_at_yahoo.com> wrote:
>>> > > > >
>>> > > > >  > I see the same message, upon load, ...
>>> > > > >  > --------------------------------------------
>>> > > > >  > On Thu, 6/1/17, blubee blubeeme <gurenchan_at_gmail.com> wrote:
>>> > > > >  >
>>> > > > >  >  Subject: nvidia drivers mutex lock
>>> > > > >  >  To: freebsd-ports_at_freebsd.org, freebsd-current_at_freebsd.org
>>> > > > >  >  Date: Thursday, June 1, 2017, 11:35 AM
>>> > > > >  >
>>> > > > >  >  I'm running nvidia-drivers 375.66 with a GTX 1070 on FreeBSD-Current
>>> > > > >  >
>>> > > > >  >  This problem just started happening recently but, every so often my
>>> > > > >  >  laptop screen will just blank out and then I have to power cycle to
>>> > > > >  >  get the machine up and running again.
>>> > > > >  >
>>> > > > >  >  It seems to be a problem with nvidia drivers acquiring duplicate
>>> > > > >  >  lock. Any info on this?
>>> > > > >  >
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: acquiring duplicate lock of same type: "os.lock_mtx"
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: 1st os.lock_mtx @ nvidia_os.c:841
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: 2nd os.lock_mtx @ nvidia_os.c:841
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: stack backtrace:
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: #0 0xffffffff80ab7770 at witness_debugger+0x70
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: #1 0xffffffff80ab7663 at witness_checkorder+0xe23
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: #2 0xffffffff80a35b93 at __mtx_lock_flags+0x93
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: #3 0xffffffff82f4397b at os_acquire_spinlock+0x1b
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: #4 0xffffffff82c48b15 at _nv012002rm+0x185
>>> > > > >  >  Jun  2 02:29:41 blubee kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20170303/nsarguments-205)
>>> > > > >  >  Jun  2 02:29:42 blubee kernel: nvidia-modeset: Allocated GPU:0 (GPU-54a7b304-c99d-efee-0117-0ce119063cd6) @ PCI:0000:01:00.0
>>> > > > >  >
>>> > > > >  >  Best,
>>> > > > >  >  Owen
>>> > > > >  >
>>> > > > >  >  _______________________________________________
>>> > > > >  >  freebsd-ports_at_freebsd.org mailing list
>>> > > > >  >  https://lists.freebsd.org/mailman/listinfo/freebsd-ports
>>> > > > >  >  To unsubscribe, send any mail to "freebsd-ports-unsubscribe_at_freebsd.org"
>>> > > > >  >
>>> > > > >  >
>>> > > > >  >
>>> > > > >  > ... then Xorg will run happily twelve hours or so.  The lockups here
>>> > > > >  > usually happen when too many tabs, or too-large web pages with
>>> > > > >  > complex CSS etc., are opened at a time.
>>> > > > >  >     So no help, just a 'me too'.
>>> > > > >  >
>>> > > > >  > _______________________________________________
>>> > > > >  > freebsd-current_at_freebsd.org mailing list
>>> > > > >  > https://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> > > > >  > To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
>>> > > > >  >
>>> > > > >  >
>>> > > > >
>>> > > > >
>>> > > > >  --
>>> > > > >  Tomoaki AOKI    <junchoon_at_dec.sakura.ne.jp>
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > ........................
>>> > > > > This might be a workaround for the laptop freezing.
>>> > > > > Xorg/nvidia ran all night with this:
>>> > > > >    nvidia-settings >> X server display configuration >> Advanced >> Force Full Composition Pipeline
>>> > > > > Could not hurt to try.  Then use "merge with Xorg.conf" from
>>> > > > > nvidia-settings...
>>> > > > > ......................
>>> > > > > 18 hours uptime so far, even past the 3 am periodic scripts.  I have
>>> > > > > not restarted Xorg since, though, so the setting may need to be edited
>>> > > > > out of xorg.conf if applying it at Xorg startup behaves differently
>>> > > > > from applying it at run time.
>>> > > > > ........
>>> > > > >
>>> > > > >
>>> > > >
>>> > > >
>>> > >
>>> > >
>>> > > --
>>> > > Tomoaki AOKI    <junchoon_at_dec.sakura.ne.jp>
>>> > >
>>>
>>>
>>> --
>>> Tomoaki AOKI    <junchoon_at_dec.sakura.ne.jp>
>>>
>>
Received on Wed Jun 07 2017 - 16:27:52 UTC
