[I add notes about a problem that happens after the
"fsck -B". Also forgot to mention: production style
kernel and world builds were in use. And I tried a
powerpc64 build and it works the same.]

On 2017-Jul-7, at 11:09 PM, Mark Millard <markmi_at_dsl-only.net> wrote:

> [This note has more information than one sent with extra text
> in the subject but with a partially different "to" list.]
>
> Peter Jeremy peter at rulingia.com wrote on
> Sat Jul 8 02:00:47 UTC 2017 :
>
>> When did you first notice this (what SVN revision)?
>> Do you know what the last good SVN revision was?
>> Is this a new or old filesystem?
>> Is the filesystem mounted/active or not when you dump it?
>> What are the relevant parameters for the filesystem on ada0s3a?
>> Are you running softupdates, journalling etc?
>> Which dump(8) phase is reporting the errors?
>> What are the exact dump and fsck commands you ran?
>
> I can add a little information with some contrast
> and only "fsck -B" in use (with an unclean file
> system from a prior crash), no dump use. Still:
> a snapshot is involved in the below.
>
> Unfortunately two problems with major consequences
> for my involved context limit the svn range that I
> can cover for the activity, the problem version
> ranges being:
>
> -r319722 through -r320651 (fixed by -r320652)
> (actually this is why I had used "boot -s"
> in what I report later: I could get to a
> shell prompt that way instead of crashing
> before any login prompt; the crashes left
> the file system in need of repair)
>
> -r320509 through -r320561 (fixed by -r320570)
>
> So I was using -r320570 to avoid one of the
> two problems.
>
>
>
> Context: 32-bit powerpc FreeBSD used on PowerMac G5
> so-called "Quad-core". (So big-endian as well.)
> Softupdates, no journalling. Long-in-use file
> system having lots of FreeBSD versions updates
> and port rebuilds over the time.
>
> The following is from now, not from the time of the
> example messages:
>
> # dumpfs / | more
> magic 19540119 (UFS2) time Fri Jul 7 22:53:34 2017
> superblock location 65536 id [ <OMITTED> ]
> ncg 158 size 25165823 blocks 24372006
> bsize 32768 shift 15 mask 0xffff8000
> fsize 4096 shift 12 mask 0xfffff000
> frag 8 shift 3 fsbtodb 3
> minfree 8% optim time symlinklen 120
> maxbsize 32768 maxbpg 4096 maxcontig 4 contigsumsize 4
> nbfree 2130375 ndir 65518 nifree 11769796 nffree 425065
> bpg 20032 fpg 160256 ipg 80128 unrefs 0
> nindir 4096 inopb 128 maxfilesize 2252349704110079
> sbsize 4096 cgsize 32768 csaddr 5048 cssize 4096
> sblkno 24 cblkno 32 iblkno 40 dblkno 5048
> cgrotor 127 fmod 0 ronly 0 clean 0
> metaspace 6408 avgfpdir 64 avgfilesize 16384
> flags soft-updates trim
> fsmnt /
> volname FBSDG4Srootfs swuid 0 providersize 25165823
> . . .
>
>
>
> What I had done that produced the messages was:
>
> <Prior failed multi-user boot from system problem
> leaves root (only) file system not marked clean
> so fsck -B will actually do something below>
>
> boot -s (so: single user mode)
> # The next 3 lines are the content of a generic, manually-run script.
> mount -u /
> mount -a -t ufs (but there is no other file system)
> swapon -a (there is a swap partition)
> #
> fsck -B
>
> That "fsck -B" caused the same kinds of lines
> reported by Michael Butler, happening as fsck
> makes a snapshot for the background processing
> to use. (I have camera pictures and could type
> in some of the lines if needed.)
>
> After those lines was text like (typed in from
> an example camera picture):
>
> ** //.snap/fsck_snapshot
> ** Last Mount on /
> ** Root file system
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> ** Phase 5 - Check Cyl groups
> Reclaimed: 0 directories, 1 files, 22680 fragments
> 780914 files, 4797127 used, 19552199 free (443479 frags, 3288590 blocks, 1.8% fragmentation)
>
> ***** FILE SYSTEM MARKED CLEAN *****

[I forgot to mention that the context was a production
style kernel and world build, no invariants or other such.]

Since I'm running a patched -r320570 for the issue:

-r319722 through -r320651 (fixed by -r320652)

I went back and forced a power-off without shutdown
and did the sequence:

boot -s (so: single user mode)
# The next 3 lines are the content of a generic, manually-run script.
mount -u /
mount -a -t ufs (but there is no other file system)
swapon -a (there is a swap partition)
#
fsck -B

but always waited briefly after the fsck -B finished.
Like before, the following happens as it tries to trim
(typed in from camera picture):

panic: ffs_blkfree_cq: freeing free block
cpuid = 2 (varies, of course)
time = (varies)
KDB: stack backtrace (stack addresses can vary: just an example here)
0xd23b17e0: at kdb_backtrace+0x5c
0xd23b1850: at vpanic+0x1e8
0xd23b18c0: at panic+0x54
0xd23b1910: at ffs_blkfree_cq+0x278
0xd23b1980: at ffs_blkfree_trim_task+0x60
0xd23b19b0: at taskqueue_run_locked+0x10
0xd23b1a10: at taskqueue_thread_loop+0x174
0xd23b1a50: at fork_exit+0xf4
0xd23b1a80: at fork_trampoline+0xc
KDB: enter: panic
[ thread pid 0 tid 1000082 ]
Stopped at kdb_enter_0x70: addi r0,r0,0x0

I've tried this on a powerpc64 and it works the same,
complete with the "freeing free block" issue.

===
Mark Millard
markmi at dsl-only.net

Received on Sat Jul 08 2017 - 14:45:11 UTC