On Tue, 8 Dec 2020 19:10:26 +0100
Alban Hertroys <haramrae_at_gmail.com> wrote:

> > On 8 Dec 2020, at 16:40, John Kennedy <warlock_at_phouka.net> wrote:
> >
> > On Tue, Dec 08, 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
> >> This seems to have gotten lost in the moderate queue, but after a
> >> week I am no closer to a solution, so here’s a resend:
> >>
> >> I’ve been trying to get a fresh world running (for the eventual
> >> purpose of running amdgpu against my recent graphics adapter), but
> >> I run into trouble with core loadable kernel modules, such as
> >> zfs.ko from the subject. It also happens with other modules that I
> >> tried randomly, for example, geom_mirror.ko.
> >>
> >> I updated to the latest current using svn up in /usr/src, then:
> >>     make clean
> >>     make buildworld kernel -j12
> >>     shutdown -r now
> >>
> >>     boot to single user mode
> >>
> >>     kldload zfs
> >
> > I'm not sure you've provided enough information for a one-shot
> > armchair diagnosis, but some things seem factually wrong. For
> > example, my normal rebuild procedure is:
> >
> >     cd /usr/src && make buildworld && make buildkernel
> >     make installkernel
> >     shutdown -r now
> >
> >     cd /usr/src && mergemaster -pi
> >     make installworld
> >     mergemaster -Fi
> >     make -DBATCH_DELETE_OLD_FILES delete-old
>
> Aha! So that’s how to prevent having to press ‘y’ for every
> deprecated file!
>
> >     shutdown -r now
> >
> >     cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs
> >
> > (I'm on a desktop system here. You haven't described your setup.)
>
> This is also a desktop system.
>
> > You didn't say that you've installed the new kernel, which at least
> > starts you down the road towards a driver/kernel mismatch. You
> > presumably have a non-ZFS boot+root.
>
> I’m fairly sure I did, actually.
>
> Last time I checked, "make buildworld buildkernel" was equivalent to
> "make buildworld && make buildkernel", while "make kernel" is a
> shorthand for “make buildkernel && make installkernel”
>
> So, unless I’m mistaken, “make buildworld kernel” should be
> equivalent to your first two lines.
>
> Nevertheless, I retried without these assumptions, the result was the
> same. I forgot to “make delete-old” though, I rarely remember to do
> that…
>
> > Did you mess around with the ZFS from ports (ZoL -> ZoF)
> > at some point so you're not using the kernel's ZFS drivers? What
> > ZFS entries do you have in /etc/loader.conf, /etc/rc.conf, and some
> > of the variants that may get dragged in? (see rc.conf(5) for
> > possibilities)
>
> Nope, stock modules here.
>
> > At the bottom of your email, you say / is UFS and /usr is ZFS, but
> > I guess we have the extra fun of wondering what is under /usr on
> > your /? If you have a pre-ZFS /usr that is populated by something
> > now presumably very old (because all the new, current stuff went
> > onto ZFS /usr, now unavailable).
>
> There is no populated directory /usr on the UFS file-system. This
> install was created on a fresh NVME disk based on an existing install
> on a spinning platter. The install happened with /usr mounted at the
> ZFS file-system.
>
> I had to copy over several files from /etc and /usr/local/etc and
> re-installed the most important packages. This was admittedly a bit
> messy, it is possible that I forgot to copy something over.
> (Originally my intention was to dd the contents of the spinning disk
> over, but apparently that disk has a few wonky sectors, dd failed
> after a few device timeouts)
>
>
> I did sort-of manage to fix things, but recent kernels keep causing
> the same issue:
>
> I noticed that uname -a said I was at revision 366335, while I had
> the source tree up-to-date.
> For a test, I reverted back to that
> revision and went through:
>     make buildworld
>     make buildkernel
>
> Which broke on /usr/local/sys/drm-current-kmod, which it turned out I
> had installed through pkg. There have been changes to the linux_kpi
> shortly after above revision - probably what broke compatibility
> between HEAD and r366335.
>
> After removing that pkg, the kernel built and installed, world
> installed fine too and I have a working system again, with kernel and
> world in sync.
>
> So I tried again to move to HEAD:
>
>     cd /usr/src
>     svn up
>     make buildworld -j12
>     make buildkernel -j12
>     make installkernel
>     shutdown -r now
>     <single user mode>
>     mount -u /
>     zpool import -Nf system (my /usr FS)
>
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>
> >
> >> Which results in dmesg messages:
> >>
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >
> > Be sure to check out /var/log/messages for extra issues. For
> > example, with the bug I mentioned below, I couldn't load my nvidia
> > driver and that manifested as one driver having issues because it
> > depended on another, which had the root of the problem.
>
> I forgot to look there. If I find anything suspicious there, I’ll let
> you know. That system doesn’t have a convenient mail client yet, so
> for now it’s copying output to files and scp-ing that to the Mac.
>
> >> I can load the zfs kernel module from kernel.old just fine:
> >>
> >> ZFS filesystem version: 5
> >> ZFS storage pool version: features support (5000)
> >
> > I kicked my more bleeding-edge system over from 12.2-rel (r366954)
> > up into 13.0-current (r367044, 1300123) on 2020/10/26. OpenZFS
> > kicked in 2020/8/24? I think the CFT was ~2018/8/21, not sure when
> > we had the OpenZFS ports. Current bumps the ABI version pretty
> > frequently so I'd think you'd have tripped across versioning issues
> > a long time ago if you had some drivers not being rebuilt.
>
> Having a conflict between kernel and world was what I was expecting
> too, but I can’t figure out what got me into that situation. For all
> I know, they should be in sync now, especially after I reverted the
> tree back to rev 366335 and made world again (acc. to above method).
>
> >
> >> This happens with any kernel module I’ve tried, such as
> >> geom_mirror and amdgpu (from ports/graphics/drm-current-kmod - the
> >> latter causes a kernel panic with kernel.old BTW).
> >>
> >> I’ve gone back as far as Oct 7 (before changes to
> >> kern/elf_load_obj.c off the top of my head), looked at mailing
> >> list archives and forums etc, all to no avail.
> >>
> >> I have / on UFS+J and /usr on ZFS and nothing in /etc/src.conf. I
> >> had /etc/malloc.conf with the recommended symlink from UPDATING,
> >> but the same happens with that moved out of the way. Nothing seems
> >> to help.
> >>
> >> Do I need to go back further to get into a usable state or is
> >> there something else I should be doing?
> >
> > With very few exceptions (bug 250897, 2020/11/6), I've found
> > 13-current bootable since 10/26 (up through my current system, 13.0
> > r368388 (2020/12/6)). You obviously need to make sure that any extra
> > drivers you add in are compiled against the kernel, but ZFS is
> > typically one of those.
>
> I think we covered that.
>
> Thanks for the help and the pointers, but unfortunately the mystery
> remains.
>

Do you have anything in /boot/modules? (wild shot)

-m

--
Michael Gmelin
Received on Tue Dec 08 2020 - 17:19:35 UTC
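
One rough way to look into the /boot/modules question is sketched below. It is not from the thread: `list_kmods` and `duplicate_kmods` are hypothetical helper names, and the assumption is only that a module sitting in /boot/modules with the same name as one in /boot/kernel deserves a second look. Nothing here confirms that this is the cause of the zfs.ko error.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread: list the *.ko files that sit
# directly in a directory, one path per line. Prints nothing when the
# directory is missing or holds no modules.
list_kmods() {
    dir=$1
    [ -d "$dir" ] || return 0
    for f in "$dir"/*.ko; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# Hypothetical helper: print the names of modules found in EXTRA_DIR that
# also exist in KERNEL_DIR, i.e. candidates for a stale duplicate.
duplicate_kmods() {
    kernel_dir=$1
    extra_dir=$2
    list_kmods "$extra_dir" | while IFS= read -r path; do
        name=${path##*/}
        if [ -e "$kernel_dir/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
    return 0
}
```

With the functions loaded, `list_kmods /boot/modules` shows what is installed there and `duplicate_kmods /boot/kernel /boot/modules` shows names present in both places; empty output from either simply means nothing matched.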