(quoting last post for convenience; more history at
http://www.usenetarticles.com/thread/952336.html)

> > vnode 0xffffff00037473e0: tag devfs, type VDIR
> > usecount 0, writecount 0, refcount 1 mountedhere 0xffffff0003745ca0
> > flags (VV_ROOT)
> > lock type devfs: EXCL (count 1) by thread 0xffffff00010e6680 (pid 1)
>
> Some additional facts:
>
> Looking at the printouts, there is always a sequence of three or more
> vrele():s of the same vnode (three at least twice; more than three at
> least once), in both the successful case and the panicking case. There
> are no vrele():s of any other vnodes in either case.
>
> Inserting enter/exit debug printouts in mountcheckdirs() confirms that
> all the vrele() calls occur within the bounds of a single call to
> mountcheckdirs(). Does this not imply there is some locking mismatch in
> the non-ZFS-specific code? I must admit I find the locking confusing,
> with several locking/unlocking functions/macros intermixed at different
> levels in the call stack. My (incorrect) reading was that this panic
> should always be happening, which is obviously not the case.
>
> Running with vfs.zfs.debug=1 confirms that vdev_geom open/attach/detach
> happens prior to any vrele() even in the panicking case (i.e., zfs pool
> discovery seems to complete).
>
> In the case of an expected provider not being found, vd->vdev_devid is
> NULL in vdev_geom_open(), based on the "provider not found" debug
> printout (perhaps normal?).

I *think* I just experienced the same problem on 7.0-BETA3, except the
kernel does not have WITNESS/INVARIANTS, so I just get a hang instead of
a panic. I wanted to post the information I have for completeness; I
realize what follows is a bunch of anecdotal mumbo-jumbo.

The boot-up process hangs right before the would-be "trying to mount
root from..." message, after all the glabel tasting has completed. This
was on a completely different system than the one in the original post,
but it also has root-on-zfs (this time on a 5-disk raidz2). It's a
dual-core amd64 machine with a low-end motherboard and low-end SATA
controllers (SiI and some built-in nVidia chipset).

It all started when I was booting back into FreeBSD after having had
Windows booted for a while. It wouldn't boot. I fiddled some with
vfs.zfs.debug=1 and removed a CD from the drive (in case it affected
timing), but it did not help. I did not try the boot-7-live CD trick
this time as I did originally on the other machine.

I looked carefully to make sure all drives were detected, including GEOM
tasting on all but one of the drives that are in the zfs pool. The I/O
indicator LEDs on the drives that are part of the zfs pool did not
indicate any I/O after the hang. I waited 5+ minutes at least once in
the hope that it was a drive timing out.

After several attempts I turned off the machine and let it do a cold
boot - at this point the system booted fine.

This is different from before, in that previously the behavior was
seemingly triggered by changes in system configuration (loss of a drive,
etc.). This time it was just a reboot. I *did* touch a bunch of cables
in between, and blew some air on components (for reasons unrelated to
this), which I originally figured could explain the problem.

Before this incident, the system had booted with root-on-zfs many times
(at least 25, probably more like 50+) without any kind of problem, ever.
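For reference, the enter/exit instrumentation mentioned in the quoted
text would have been of roughly the following form (a minimal sketch
against sys/kern/vfs_mount.c; the printout format and fields shown are
illustrative, not the actual patch):

    /* sys/kern/vfs_mount.c - illustrative enter/exit debug printouts */
    static void
    mountcheckdirs(struct vnode *olddp, struct vnode *newdp)
    {
            /*
             * Log entry. Reading v_usecount without the vnode
             * interlock is racy, but acceptable for a debug printout.
             */
            printf("mountcheckdirs: enter olddp=%p usecount=%d\n",
                olddp, olddp->v_usecount);

            /*
             * ... existing body: walk the process list, replacing
             * fd_cdir/fd_rdir references to olddp with newdp and
             * vrele():ing olddp once per replaced reference ...
             */

            printf("mountcheckdirs: exit olddp=%p usecount=%d\n",
                olddp, olddp->v_usecount);
    }

Any vrele() printouts that then appear between the enter and exit lines
demonstrably fall within a single mountcheckdirs() call. (As for
vfs.zfs.debug=1: assuming the 7.0 ZFS port exposes it as a tunable, it
can be set at the loader prompt with "set vfs.zfs.debug=1" or in
/boot/loader.conf.)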
--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller_at_infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey_at_scode.org
E-Mail: peter.schuller_at_infidyne.com
Web: http://www.scode.org