[Just adding back traces. ffs_mount use of uma_zcreate is involved in the kernel thread, as is uma_reclaim_worker's use of uma_reclaim_locked in the uma thread.] On 2017-Dec-12, at 4:20 PM, Mark Millard <markmi at dsl-only.net> wrote: > On 2017-Dec-12, at 3:19 PM, Mark Millard <markmi at dsl-only.net> wrote: > >> On 2017-Dec-12, at 2:02 PM, Andreas Schwarz <freebsd.asc at strcmp.org> wrote: >> >>> On 12.12.17, Mark Millard wrote: >>> >>>> I initially jumped from -r326192 to -r326726 and >>>> ended up with a rpi2 that would normally hang >>>> somewhere around release APs being displayed. >>>> (I have had a couple of completed boots but many >>>> dozens of hung-up attempts.) Both a debug kernel >>>> and a non-debug kernel hang the same way. >>>> >>>> Bisecting the kernel (holding world -r326726 >>>> constant) showed: >>>> >>>> -r326346 did not hang (nor did before) >>>> -r326347 and later hung. >>> >>> JFYI, the latest kernel (and world) running at one of my >>> RPI2-B is r326631, without any issues. >> >> Interesting. (By the way: My context >> is with a V1.1 Cortex-A7 based rpi2, >> not V1.2 and Cortex-A53.) >> >> I've almost always run the root file >> system being on a USB SSD instead of >> on mmcsd0 . I wonder if that is >> somehow involved since it may be >> unusual. UFS file system. >> >> The USB SSD is on a powered hub that >> is in turn plugged into the rpi2. >> >> [I had the hang problem before the >> following and after.] >> >> The mechanism for holding mmcsd0 in >> failed recently but the ejection >> mechanism still works. So I hold >> in mmcsd0 until after I get a USB >> SSD boot now. (Interrupt boot, unload, >> boot/autoboot, picks up the kernel >> from the USB SSD.) >> >> This means that I effectively can >> not avoid the USB SSD any more >> unless I get my hands on a different >> V1.1 rpi2. > > Looks like I'll get my hands on a different > rpi2 V1.1 in a few days. So I should then > be able to do reasonable mmcsd0-only > experiments. At least once I find the time. > > FYI, in case boot details are involved > in reproducing the problem. . . > > On the mmcsd0 I have /boot/loader.conf with: > > kern.cam.boot_delay="10000" > vfs.mountroot.timeout="10" > > and /etc/fstab with: > > /dev/ufs/RPI2rootfs / ufs rw,noatime 1 1 > /dev/label/RPI2Aswap none swap sw 0 0 > /dev/label/RPI2Aboot /boot/msdos msdosfs rw,noatime 0 0 > > where the /dev/ufs/RPI2rootfs was the USB SSD. > > However, I interrupt the loader and unload and > then boot or autoboot. (But the hangs started > before this extra sequence was involved.) > > On the USB SSD I have /boot/loader.conf with: > > kern.cam.boot_delay="10000" > vfs.mountroot.timeout="10" > > and /etc/fstab with: > > /dev/da0p1 / ufs rw,noatime 1 1 > /dev/da0p2 none swap sw 0 0 > > > What db> showed does point to things that > -r326347 involve: > > chain 32: > thread 100001 (pid 1, kernel) blocked on sx "umadrain" XLOCK > thread 100077 (pid 23, uma) is on a run queue > > But for all I know -r326347 could be depending > on something required to be true but not correct > elsewhere in the rpi2 support. I'm not claiming > that -r326347 is wrong, just that it is involved. > I've way to little knowledge to claim to know > what is wrong on the evidence that I have. > > I've not yet tried a bpi-m3 Cortex-A7 context or > a pine64+ 2GB or rpi3 Cortex-A53 context. Nor > powerpc64 nor powerpc. At some point I'll get the > time for one or more of these. I've not had amd64 > problems in this area. > > I may not be able to test the bpi-m3: its barrel > connector for power seems flaky and it is > difficult to keep the board powered for long > periods in recent times. For this new example -r326347 kernel based boot-hang: chain 35: thread 100001 (pid 1, kernel) blocked on sx "umadrain" XLOCK thread 100073 (pid 23, uma) is on a run queue db> bt Tracing pid 1 tid 100001 td 0xd429f000 cpu_switch() at cpu_switch+0x18 pc = 0xc0584f80 lr = 0xc0299ad8 (sched_switch+0x5c0) sp = 0xd42a56b8 fp = 0xd42a56e8 sched_switch() at sched_switch+0x5c0 pc = 0xc0299ad8 lr = 0xc0275f28 (mi_switch+0x258) sp = 0xd42a56f0 fp = 0xd42a5718 r4 = 0x0001f0f5 r5 = 0x00000000 r6 = 0xd429f000 r7 = 0x0006612d r8 = 0x00000000 r9 = 0x00000104 r10 = 0xc08073f0 mi_switch() at mi_switch+0x258 pc = 0xc0275f28 lr = 0xc02c1508 (sleepq_switch+0x1c0) sp = 0xd42a5720 fp = 0xd42a5748 r4 = 0xd429f000 r5 = 0xc0947dc4 r6 = 0xc09a9c68 r7 = 0xc0947dc0 r8 = 0x00000000 r9 = 0xc09440c0 r10 = 0x000000f4 sleepq_switch() at sleepq_switch+0x1c0 pc = 0xc02c1508 lr = 0xc02c1308 (sleepq_wait+0x48) sp = 0xd42a5750 fp = 0xd42a5760 r4 = 0xd429f000 r5 = 0x00000000 r6 = 0xc09a9c68 r7 = 0xc06b415c r8 = 0x00000001 r9 = 0xc09a9c78 r10 = 0x00000000 sleepq_wait() at sleepq_wait+0x48 pc = 0xc02c1308 lr = 0xc02748b0 (_sx_slock_hard+0x298) sp = 0xd42a5768 fp = 0xd42a57c8 r4 = 0xc09a9c68 r5 = 0xc0823ac0 r6 = 0xc06ae146 r7 = 0x00000000 _sx_slock_hard() at _sx_slock_hard+0x298 pc = 0xc02748b0 lr = 0xc027457c (_sx_slock_int+0x140) sp = 0xd42a57d0 fp = 0xd42a57f8 r4 = 0x00000078 r5 = 0x00000765 r6 = 0xc09a9c68 r7 = 0x00000765 r8 = 0xc09a9c78 r9 = 0xd7ad8740 r10 = 0x00000000 _sx_slock_int() at _sx_slock_int+0x140 pc = 0xc027457c lr = 0xc052b024 (uma_zcreate+0x10c) sp = 0xd42a5800 fp = 0xd42a5850 r4 = 0x00000078 r5 = 0xc09a9c68 r6 = 0xc06df86e r7 = 0xc06dd153 r8 = 0x00000000 r9 = 0x00000000 r10 = 0x00000000 uma_zcreate() at uma_zcreate+0x10c pc = 0xc052b024 lr = 0xc050cc90 (ffs_mount+0x80) sp = 0xd42a5858 fp = 0xd42a5980 r4 = 0xd7844000 r5 = 0x00000000 r6 = 0x00000003 r7 = 0xc09a99c0 r8 = 0x00000000 r9 = 0xd429f000 r10 = 0xd828c400 ffs_mount() at ffs_mount+0x80 pc = 0xc050cc90 lr = 0xc032e1d4 (vfs_donmount+0xeec) sp = 0xd42a5988 fp = 0xd42a5b38 r4 = 0xffffffff r5 = 0xd74ec120 r6 = 0xd429f000 r7 = 0x00000000 r8 = 0x00000000 r9 = 0xc425acd0 r10 = 0xd828c400 vfs_donmount() at vfs_donmount+0xeec pc = 0xc032e1d4 lr = 0xc0330ea8 (kernel_mount+0x70) sp = 0xd42a5b40 fp = 0xd42a5b78 r4 = 0xd82a9000 r5 = 0x00000000 r6 = 0x00004000 r7 = 0x00000000 r8 = 0xd87e2050 r9 = 0xd87e2040 r10 = 0x00000000 kernel_mount() at kernel_mount+0x70 pc = 0xc0330ea8 lr = 0xc03335dc (parse_mount+0x458) sp = 0xd42a5b80 fp = 0xd42a5c60 r4 = 0xd4263500 r5 = 0xd87e2054 r6 = 0xd82a9000 r7 = 0x00000000 parse_mount() at parse_mount+0x458 pc = 0xc03335dc lr = 0xc0331a44 ($a.2+0x28) sp = 0xd42a5c68 fp = 0xd42a5dc8 r4 = 0xd7845018 r5 = 0x00000000 r6 = 0xd7845018 r7 = 0xd78404da r8 = 0xc06bfbd4 r9 = 0xfffffff7 r10 = 0xd7836380 $a.2() at $a.2+0x28 pc = 0xc0331a44 lr = 0xc020a82c (start_init+0x5c) sp = 0xd42a5dd0 fp = 0xd42a5e20 r4 = 0xc06a370c r5 = 0xc0823ad0 r6 = 0x00000000 r7 = 0x00000000 r8 = 0xd42a5e48 r9 = 0x00000000 r10 = 0xd429b000 start_init() at start_init+0x5c pc = 0xc020a82c lr = 0xc0230f10 (fork_exit+0xa0) sp = 0xd42a5e28 fp = 0xd42a5e40 r4 = 0xd429f000 r5 = 0xd429b000 r6 = 0xc020a7d0 r7 = 0x00000000 r8 = 0xd42a5e48 r9 = 0x00000000 r10 = 0x00000000 fork_exit() at fork_exit+0xa0 pc = 0xc0230f10 lr = 0xc0564c8c (swi_exit) sp = 0xd42a5e48 fp = 0x00000000 r4 = 0xc020a7d0 r5 = 0x00000000 r6 = 0x00000000 r7 = 0x00000000 r8 = 0x00000000 r10 = 0x00000000 swi_exit() at swi_exit pc = 0xc0564c8c lr = 0xc0564c8c (swi_exit) sp = 0xd42a5e48 fp = 0x00000000 db> bt Tracing pid 23 tid 100073 td 0xd7ad8740 cpu_switch() at cpu_switch+0x18 pc = 0xc0584f80 lr = 0xc0299ad8 (sched_switch+0x5c0) sp = 0xd87dad58 fp = 0xd87dad88 sched_switch() at sched_switch+0x5c0 pc = 0xc0299ad8 lr = 0xc0275f28 (mi_switch+0x258) sp = 0xd87dad90 fp = 0xd87dadb8 r4 = 0x00000025 r5 = 0x00000000 r6 = 0xd7ad8740 r7 = 0x000240ea r8 = 0x00000000 r9 = 0x00000100 r10 = 0xc08073f0 mi_switch() at mi_switch+0x258 pc = 0xc0275f28 lr = 0xc052d9f8 (uma_reclaim_locked+0x200) sp = 0xd87dadc0 fp = 0xd87dade8 r4 = 0x00000003 r5 = 0xc08073f0 r6 = 0xc08073f0 r7 = 0x00000000 r8 = 0x00000000 r9 = 0xc0824310 r10 = 0xc06df86e uma_reclaim_locked() at uma_reclaim_locked+0x200 pc = 0xc052d9f8 lr = 0xc052de70 (uma_reclaim_worker+0x4c) sp = 0xd87dadf0 fp = 0xd87dae20 r4 = 0xc09a9c68 r5 = 0xc4242d80 r6 = 0xc06df86e r7 = 0x00000000 r8 = 0x00000100 r9 = 0xc09b3d08 r10 = 0xc09a9c7c uma_reclaim_worker() at uma_reclaim_worker+0x4c pc = 0xc052de70 lr = 0xc0230f10 (fork_exit+0xa0) sp = 0xd87dae28 fp = 0xd87dae40 r4 = 0xd7ad8740 r5 = 0xd70ce000 r6 = 0xc052de24 r7 = 0x00000000 r8 = 0xd87dae48 r9 = 0xd7ada3a0 r10 = 0xc0824e8c fork_exit() at fork_exit+0xa0 pc = 0xc0230f10 lr = 0xc0564c8c (swi_exit) sp = 0xd87dae48 fp = 0x00000000 r4 = 0xc052de24 r5 = 0x00000000 r6 = 0x7ff6d83f r7 = 0xd6f583a0 r8 = 0xc0824dcc r10 = 0xc0824e8c swi_exit() at swi_exit pc = 0xc0564c8c lr = 0xc0564c8c (swi_exit) sp = 0xd87dae48 fp = 0x00000000 db> show thread 100001 Thread 100001 at 0xd429f000: proc (pid 1): 0xd429b000 name: kernel stack: 0xd42a4000-0xd42a5fff flags: 0x4 pflags: 0x20000000 state: INHIBITED: {SLEEPING} wmesg: umadrain wchan: 0xc09a9c68 sleeptimo 0. 0 (curr 26. e3fd787f00000000) priority: 84 container lock: sleepq chain (0xc0947dc4) last voluntary switch: 26523 ms ago last involuntary switch: 26695 ms ago db> show thread 100073 Thread 100073 at 0xd7ad8740: proc (pid 23): 0xd70ce000 name: uma stack: 0xd87d9000-0xd87dafff flags: 0x4 pflags: 0x200000 state: RUNQ priority: 84 container lock: sched lock 3 (0xc0942ec0) last voluntary switch: 26694 ms ago db> show lock 0xc0942ec0 class: spin mutex name: sched lock 3 flags: {SPIN, RECURSE} state: {UNOWNED} === Mark Millard markmi at dsl-only.netReceived on Wed Dec 13 2017 - 00:46:07 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:14 UTC