On Thu, Dec 7, 2017 at 2:33 AM, Andriy Gapon <avg_at_freebsd.org> wrote: > > [cc-ing current_at_ to raise more awareness] > > On 05/12/2017 16:03, Alexey Dokuchaev wrote: > > On Fri, Nov 24, 2017 at 11:31:51AM +0200, Andriy Gapon wrote: > >> > >> I have reported a couple of nvidia-driver issues in the FreeBSD section > >> of the nVidia developer forum, but no replies so far. > >> > >> Well, the first issue is not with the driver, but with a utility that > >> comes with it, nvidia-smi: > >> https://devtalk.nvidia.com/default/topic/1026589/freebsd/ > nvidia-smi-query-gpu-spins-forever-on-freebsd-head-amd64-/ > >> I wonder if I am the only one affected or if I see the problem because > >> I am on head or something else. > >> I am pretty sure that the problem is caused by a programming bug related > >> to strtok_r. > > > > I'll try to reproduce it and report back. > > I've done some work with a debugger and it seems that there is code that > does > something like this: > > char *last = NULL; > > while (1) { > if (last == NULL) > p = strtok_r(str, sep, &last); > else > p = strtok_r(NULL, sep, &last); > if (p == NULL) > break; > ... > } > > The problem is that when 'p' points to the last token, 'last' is NULL (in > FreeBSD implementation of strtok_r). That means that when we go to the > next > iteration the parsing starts all over again leading to the endless loop. > The code is incorrect from the standards point of view, because the value > of > 'last' is completely opaque and should not be used for anything else but > passing > it back to strtok_r. > > I used gdb -w to change the logic to: > > char *last = 1; > > While (1) { > if (last == 1) > p = strtok_r(str, sep, &last); > else > p = strtok_r(NULL, sep, &last); > ... > } > > Where 1 is used as an "impossible" pointer value which is neither NULL nor > a > valid pointer that can be set by strtok_r. It's not ideal, but binary code > editing is not as easy as that of source code. > > The binary patch is here: https://people.freebsd.org/~ > avg/nvidia-smi.bsdiff > > >> The second issue is with the FreeBSD support for the kernel driver: > >> https://devtalk.nvidia.com/default/topic/1026645/freebsd/ > panic-related-to-nvkms_timers-lock-sx-lock-/ > >> I would like to get some feedback on my analysis. > >> I am testing this patch right now: > >> https://people.freebsd.org/~avg/extra-patch-src_nvidia- > modeset_nvidia-modeset-freebsd.c > > > > Unfortunately, I'm not an expert on kernel locking primitives to give you > > a proper review, let's see what others have to say. > > It's been a while since I posted the patch and there are no comments yet. > I can only add that I am running an INVARIANTS and WITNESS enabled kernel > all > the time and before the patch I was getting kernel panics every now and > then. > Since I started using the patch I haven't had a single nvidia panic yet. > > >> Also, what's the best place or who are the best people with whom to > >> discuss such issues? > > > > Yes, this is a problem now: since Christian Zander had left nVidia, he > > could not tell me who'd be their next liaison to talk to from FreeBSD > > community. :-( > > Oh, I didn't know about Christian's departure. > So, we are not in a very good position now. How about Aaron Plattner (CC'd). Aaron, are you still working on FreeBSD driver issues? -AlanReceived on Thu Dec 07 2017 - 14:35:49 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:14 UTC