> On 8 Dec 2020, at 16:40, John Kennedy <warlock_at_phouka.net> wrote:
>
> On Tue, Dec 08, 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
>> This seems to have gotten lost in the moderate queue, but after a week I am no closer to a solution, so here's a resend:
>>
>> I've been trying to get a fresh world running (for the eventual purpose of running amdgpu against my recent graphics adapter), but I run into trouble with core loadable kernel modules, such as zfs.ko from the subject. It also happens with other modules that I tried randomly, for example, geom_mirror.ko.
>>
>> I updated to the latest current using svn up in /usr/src, then:
>> make clean
>> make buildworld kernel -j12
>> shutdown -r now
>>
>> boot to single user mode
>>
>> kldload zfs
>
> I'm not sure you've provided enough information for a one-shot armchair
> diagnosis, but some things seem factually wrong. For example, my normal
> rebuild procedure is:
>
> cd /usr/src && make buildworld && make buildkernel
> make installkernel
> shutdown -r now
>
> cd /usr/src && mergemaster -pi
> make installworld
> mergemaster -Fi
> make -DBATCH_DELETE_OLD_FILES delete-old

Aha! So that’s how to prevent having to press ‘y’ for every deprecated file!

> shutdown -r now
>
> cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs
>
> (I'm on a desktop system here. You haven't described your setup.)

This is also a desktop system.

> You didn't say that you've installed the new kernel, which at least starts
> you down the road towards a driver/kernel mismatch. You presumably have a
> non-ZFS boot+root.

I’m fairly sure I did, actually. Last time I checked, “make buildworld buildkernel” was equivalent to “make buildworld && make buildkernel”, while “make kernel” is a shorthand for “make buildkernel && make installkernel”.

So, unless I’m mistaken, “make buildworld kernel” should be equivalent to your first two lines.

Nevertheless, I retried without these assumptions; the result was the same.
I forgot to “make delete-old” though; I rarely remember to do that…

> Did you mess around with the ZFS from ports (ZoL -> ZoF)
> at some point so you're not using the kernel's ZFS drivers? What ZFS
> entries do you have in /etc/loader.conf, /etc/rc.conf, and some of the
> varients that may get dragged in? (see rc.conf(5) for possibilities)

Nope, stock modules here.

> At the bottom of your email, you say / is UFS and /usr is ZFS, but I guess we
> have the extra fun of wondering what is under /usr on your /? If you have a
> pre-ZFS /usr that is populated by something now presumably very old (because
> all the new, current stuff went onto ZFS /usr, now unavailable).

There is no populated directory /usr on the UFS file-system.

This install was created on a fresh NVME disk based on an existing install on a spinning platter. The install happened with /usr mounted at the ZFS file-system. I had to copy over several files from /etc and /usr/local/etc and re-install the most important packages. This was admittedly a bit messy; it is possible that I forgot to copy something over.
(Originally my intention was to dd the contents of the spinning disk over, but apparently that disk has a few wonky sectors; dd failed after a few device timeouts.)

I did sort-of manage to fix things, but recent kernels keep causing the same issue:

I noticed that uname -a said I was at revision 366335, while I had the source tree up-to-date. For a test, I reverted back to that revision and went through:
make buildworld
make buildkernel

Which broke on /usr/local/sys/drm-current-kmod, which it turned out I had installed through pkg. There have been changes to the linux_kpi shortly after the above revision - probably what broke compatibility between HEAD and r366335.

After removing that pkg, the kernel built and installed, world installed fine too and I have a working system again, with kernel and world in sync.
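As a rough illustration of what "kernel and world in sync" can be checked against, here is a minimal sketch that compares the kernel's and the userland's __FreeBSD_version numbers. It assumes a FreeBSD uname(1) that supports -K (kernel version) and -U (userland version); the function name versions_in_sync is made up for this example. Note that this is only a coarse check: the version number is bumped occasionally, not on every revision, so equal numbers do not prove that kernel, modules and world were built from the same source revision.

```shell
#!/bin/sh
# Hypothetical helper: compare two __FreeBSD_version numbers (for example
# 1300123) and report whether kernel and userland appear to match.
versions_in_sync() {
    kernel_ver=$1
    world_ver=$2
    if [ "$kernel_ver" = "$world_ver" ]; then
        echo "in sync: $kernel_ver"
        return 0
    fi
    echo "mismatch: kernel=$kernel_ver world=$world_ver"
    return 1
}

# On FreeBSD, uname -K prints the running kernel's version number and
# uname -U the installed userland's. Other systems may not have these
# flags, so the check only runs when both commands succeed.
if k=$(uname -K 2>/dev/null) && u=$(uname -U 2>/dev/null); then
    versions_in_sync "$k" "$u" || true
fi
```

A mismatch here would point at a kernel/world skew; a match does not rule out a stale module in /boot/kernel.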
So I tried again to move to HEAD:
cd /usr/src
svn up
make buildworld -j12
make buildkernel -j12
make installkernel
shutdown -r now

<single user mode>
mount -u /
zpool import -Nf system (my /usr FS)

KLD zfs.ko: depends on kernel - not available or version mismatch
linker_load_file: /boot/kernel/zfs.ko - unsupported file type

>> Which results in dmesg messages:
>>
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>
> Be sure to check out /var/log/messages for extra issues. For example, with
> the bug I mentioned below, I couldn't load my nvidia driver and that manifested
> as one driver having issues because it depended on another, which had the root
> of the problem.

I forgot to look there. If I find anything suspicious there, I’ll let you know. That system doesn’t have a convenient mail client yet, so for now it’s copying output to files and scp-ing that to the Mac.

>> I can load the zfs kernel module from kernel.old just fine:
>>
>> ZFS filesystem version: 5
>> ZFS storage pool version: features support (5000)
>
> I kicked my more bleeding-edge system over from 12.2-rel (r366954) up into
> 13.0-current (r367044, 1300123) on 2020/10/26. OpenZFS kicked in 2020/8/24?
> I think the CFT was ~2018/8/21, not sure when we had the OpenZFS ports.
> Current bumps the ABI version pretty frequently so I'd think you'd have
> tripped across versioning issues a long time ago if you had some drivers not
> being rebuilt.

Having a conflict between kernel and world was what I was expecting too, but I can’t figure out what got me into that situation. For all I know, they should be in sync now, especially after I reverted the tree back to rev 366335 and made world again (according to the above method).

>
>> This happens with any kernel module I've tried, such as geom_mirror and amdgpu (from ports/graphics/drm-current-kmod - the latter causes a kernel panic with kernel.old BTW).
>>
>> I've gone back as far as Oct 7 (before changes to kern/elf_load_obj.c off the top of my head), looked at mailing list archives and forums etc, all to no avail.
>>
>> I have / on UFS+J and /usr on ZFS and nothing in /etc/src.conf. I had /etc/malloc.conf with the recommended symlink from UPDATING, but the same happens with that moved out of the way. Nothing seems to help.
>>
>> Do I need to go back further to get into a usable state or is there something else I should be doing?
>
> With very few exceptions (bug 250897, 2020/11/6), I've found 13-current
> bootable since 10/26 (up through my current system, 13.0 r368388 (2020/12/6).
> You obviously need to make sure that an extra drivers you add in are compiled
> against the kernel, but ZFS is typically one of those.

I think we covered that.

Thanks for the help and the pointers, but unfortunately the mystery remains.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Received on Tue Dec 08 2020 - 17:10:32 UTC