On 2017-Apr-27, at 10:26 PM, Mark Millard <markmi at dsl-only.net> wrote: > [As the text does not really follow from the earlier text > I'd sent directly I'm top posting a hypothesis about where > so much active memory came from to be so low in available > memory to get an reclaim_pv_chunk attempt.] > > My hypothesis is that the "mdconfig -d"s from vm_copy_base > (from /usr/src/release/tools/vmimage.subr ) did not actually > release the memory resources involved (from vnode backed > mdconfig use): I watched with "vmstat 1" and "mdconfig -d" did release memory (including RAM) each time. > . . . > From the prior top report for the failure, > partially repeated here: > > . . . > Mem: 1618M Active, 17M Inact, 315M Wired, 204M Buf, 15M Free > Swap: 6144M Total, 34M Used, 6110M Free, 348K Out > > PID USERNAME THR PRI NICE SIZE RES SWAP STATE C TIME CPU COMMAND > 48988 root 4 31 0 651M 27048K 0K RUN 0 0:03 87.60% xz -T 0 /usr/obj/DESTDIRs/vmimage-aarch64/vmimages/FreeBSD-12.0-CURRENT- > . . . > > The combination 1618M Mem:Active but 34M Swap:Used > and 651M xz:SIZE but 27M xz:RES and 0K xz:SWAP just > seems very odd, like it should not happen. The 17M > Mem:Inact is odd for the context as well. (Mem:Wired, > Mem:Buf, and Mem:Free look normal.) > > An alternate hypothesis would be the memory "leak" > is from mkimg not having it memory-use cleaned up. > This happens after vm_copy_base but before the cp/xz > sequence and is what produces vm.raw. mkimg also release its memory (including RAM) each time. But the later "cp -p" of the large vm.raw that I was producing ended up without leaving the free memory that is expected after it finished. I worked around the issue with (just a personal workaround that helps in other respects as well): Index: Makefile.vm =================================================================== --- Makefile.vm (revision 317015) +++ Makefile.vm (working copy) _at__at_ -119,15 +119,20 _at__at_ vm-install: .if defined(WITH_VMIMAGES) && !empty(WITH_VMIMAGES) mkdir -p ${DESTDIR}/vmimages -. for FORMAT in ${VMFORMATS} - cp -p ${VMBASE}.${FORMAT} \ - ${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT} -. endfor . if defined(WITH_COMPRESSED_VMIMAGES) && !empty(WITH_COMPRESSED_VMIMAGES) . for FORMAT in ${VMFORMATS} - # Don't keep the originals. There is a copy in ${.OBJDIR} if needed. - ${XZ_CMD} ${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT} + # Tradeoff "cp -p" property for less memory, I/O, and time use. + # Also: -3 got > 30 MiByte/sec effective (source file) for 32 GiByte, mostly-zero image + # on a Pine64+ 2GB and used about 600 MiBytes of "active virtual memory". The about 16 + # minutes was vastly faster than the "cp -p" --and the sha512 and sha256 are vastly + # faster on the compressed file as well. + ${XZ_CMD} -T 0 -3 --stdout --memlimit-compress=50% -v ${VMBASE}.${FORMAT} > ${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT}.xz . endfor +. else +. for FORMAT in ${VMFORMATS} + cp -p ${VMBASE}.${FORMAT} \ + ${DESTDIR}/vmimages/${OSRELEASE}.${FORMAT} +. endfor . endif cd ${DESTDIR}/vmimages && sha512 ${OSRELEASE}* > \ ${DESTDIR}/vmimages/CHECKSUM.SHA512 and using WITH_COMPRESSED_VMIMAGES to avoid a "cp -p" based copy of the large file. Prior reports from capturing text: On 2017-Apr-27, at 7:31 PM, Mark Millard <markmi at dsl-only.net> wrote: > [Another example panic. Again no dump. But I have what > a top -PCwaopid froze at this time.] > > On 2017-Apr-27, at 4:22 PM, Mark Millard <markmi at dsl-only.net> wrote: > >> Unfortunately for this FYI the attempt to produce a dump >> failed and so all the information that I have is what I >> first captured from the console output: a backtrace. >> >> The context was head -r317015 on a Pine64+ 2GB. At the >> time I was experimenting with trying to build a vm.raw >> from my own build of FreeBSD. The (root) file system >> is on a USB SSD off of a powered USB hub. >> >> panic: ARM64TODO: reclaim_pv_chunk >> cpuid = 1 >> time = 1493332968 >> KDB: stack backtrace: >> db_trace_self() at db_trace_self_wrapper+0x28 >> pc = 0xffff000000605cc0 lr = 0xffff0000000869cc >> sp = 0xffff000083ba4f00 fp = 0xffff000083ba5110 >> >> db_trace_self_wrapper() at vpanic+0x164 >> pc = 0xffff0000000869cc lr = 0xffff00000031d464 >> sp = 0xffff000083ba5120 fp = 0xffff000083ba5190 >> >> vpanic() at panic+0x4c >> pc = 0xffff00000031d464 lr = 0xffff00000031d2fc >> sp = 0xffff000083ba51a0 fp = 0xffff000083ba5220 >> >> panic() at reclaim_pv_chunk+0x10 >> pc = 0xffff00000031d2fc lr = 0xffff00000061a234 >> sp = 0xffff000083ba5230 fp = 0xffff000083ba5230 >> >> reclaim_pv_chunk() at get_pv_entry+0x240 >> pc = 0xffff00000061a234 lr = 0xffff000000616184 >> sp = 0xffff000083ba5240 fp = 0xffff000083ba5260 >> >> get_pv_entry() at pmap_enter+0x694 >> pc = 0xffff000000616184 lr = 0xffff0000006156a0 >> sp = 0xffff000083ba5270 fp = 0xffff000083ba5300 >> >> pmap_enter() at vm_fault_hold+0x28c >> pc = 0xffff0000006156a0 lr = 0xffff0000005b9740 >> sp = 0xffff000083ba5310 fp = 0xffff000083ba5460 >> >> vm_fault_hold() at vm_fault+0x70 >> pc = 0xffff0000005b9740 lr = 0xffff0000005b9464 >> sp = 0xffff000083ba5470 fp = 0xffff000083ba54a0 >> >> vm_fault() at data_abort+0xe0 >> pc = 0xffff0000005b9464 lr = 0xffff00000061ad94 >> sp = 0xffff000083ba54b0 fp = 0xffff000083ba5560 >> >> data_abort() at handle_el1h_sync+0x70 >> pc = 0xffff00000061ad94 lr = 0xffff000000607870 >> sp = 0xffff000083ba5570 fp = 0xffff000083ba5680 >> >> handle_el1h_sync() at kern_select+0x9fc >> pc = 0xffff000000607870 lr = 0xffff00000037db3c >> sp = 0xffff000083ba5690 fp = 0xffff000083ba58f0 >> >> kern_select() at sys_select+0x5c >> pc = 0xffff00000037db3c lr = 0xffff00000037dc58 >> sp = 0xffff000083ba5900 fp = 0xffff000083ba5930 >> >> sys_select() at do_el0_sync+0xa48 >> pc = 0xffff00000037dc58 lr = 0xffff00000061b91c >> sp = 0xffff000083ba5940 fp = 0xffff000083ba5a70 >> >> do_el0_sync() at handle_el0_sync+0x6c >> pc = 0xffff00000061b91c lr = 0xffff0000006079e8 >> sp = 0xffff000083ba5a80 fp = 0xffff000083ba5b90 >> >> handle_el0_sync() at 0x4948c >> pc = 0xffff0000006079e8 lr = 0x000000000004948c >> sp = 0xffff000083ba5ba0 fp = 0x0000ffffffffd960 > > > This time I got to record from top: > (swap is on a swap partition) > (pid 49888's SIZE vs. RES and SWAP might be interesting) > (as might the Active figure) > > last pid: 48988; load averages: 0.64, 0.44, 0.38 up 0+04:21:01 19:19:50 > 32 processes: 2 running, 30 sleeping > CPU 0: 13.2% user, 0.0% nice, 23.2% system, 0.3% interrupt, 63.3% idle > CPU 1: 4.6% user, 0.0% nice, 23.9% system, 0.0% interrupt, 71.5% idle > CPU 2: 2.1% user, 0.0% nice, 23.2% system, 0.0% interrupt, 74.8% idle > CPU 3: 3.3% user, 0.0% nice, 23.8% system, 0.0% interrupt, 72.8% idle > Mem: 1618M Active, 17M Inact, 315M Wired, 204M Buf, 15M Free > Swap: 6144M Total, 34M Used, 6110M Free, 348K Out > > PID USERNAME THR PRI NICE SIZE RES SWAP STATE C TIME CPU COMMAND > 48988 root 4 31 0 651M 27048K 0K RUN 0 0:03 87.60% xz -T 0 /usr/obj/DESTDIRs/vmimage-aarch64/vmimages/FreeBSD-12.0-CURRENT-arm64-aarch64.raw > 11983 root 1 22 0 5068K 0K 0K wait 3 0:00 0.00% make vm-image vm-install DESTDIR=/usr/obj/DESTDIRs/vmimage-aarch64 (<make>) > 11981 root 1 42 0 7320K 0K 1516K wait 1 0:00 0.00% sh /root/sys_build_scripts.aarch64-host/make_noscript_aarch64_nodebug_clang_bootstrap-aarch64-host.sh vm-image vm-install > 11980 root 1 20 0 6656K 1548K 0K select 0 0:02 0.00% [script] > 11977 root 1 30 0 7320K 0K 1516K wait 3 0:00 0.00% /bin/sh /root/sys_build_scripts.aarch64-host/make_aarch64_nodebug_clang_bootstrap-aarch64-host.sh vm-image vm-install DEST > 2694 root 1 20 0 8804K 2072K 0K CPU2 2 0:07 0.17% top -PCwaopid > 827 root 1 20 0 7320K 0K 360K wait 0 0:00 0.00% su (<sh>) > 826 markmi 1 22 0 10372K 0K 1532K wait 3 0:00 0.00% su (<su>) > 820 markmi 1 24 0 7320K 0K 1516K wait 1 0:00 0.00% -sh (<sh>) > 819 markmi 1 20 0 18416K 1152K 0K select 1 0:21 0.00% sshd: markmi_at_pts/1 (sshd) > 816 root 1 20 0 18416K 3276K 0K select 0 0:00 0.00% sshd: markmi [priv] (sshd) > 765 root 1 20 0 7320K 0K 224K wait 2 0:00 0.00% su (<sh>) > 764 markmi 1 23 0 10372K 0K 1532K wait 0 0:00 0.00% su (<su>) > 758 markmi 1 31 0 7320K 0K 1516K wait 1 0:00 0.00% -sh (<sh>) > 757 markmi 1 20 0 18416K 228K 904K select 3 0:01 0.01% sshd: markmi_at_pts/0 (sshd) > 754 root 1 25 0 18416K 3276K 0K select 1 0:00 0.00% sshd: markmi [priv] (sshd) > 746 root 1 27 0 7320K 1532K 0K ttyin 0 0:00 0.00% -sh (sh) > 745 root 1 20 0 10372K 0K 1532K wait 1 0:00 0.00% login [pam] (<login>) > 700 root 1 20 0 6948K 0K 168K nanslp 1 0:00 0.00% /usr/sbin/cron -s (<cron>) > 696 smmsp 1 20 0 10460K 0K 184K pause 0 0:00 0.00% sendmail: Queue runner_at_00:30:00 for /var/spool/clientmqueue (<sendmail>) > 693 root 1 20 0 10460K 1392K 0K select 1 0:00 0.03% sendmail: accepting connections (sendmail) > 690 root 1 20 0 15800K 968K 0K select 2 0:00 0.00% /usr/sbin/sshd > 661 root 1 20 0 6656K 344K 0K select 2 0:01 0.00% /usr/sbin/powerd > 658 root 2 20 0 12788K 12672K 0K select 0 0:02 0.01% /usr/sbin/ntpd -g -c /etc/ntp.conf -p /var/run/ntpd.pid -f /var/db/ntpd.drift > 620 root 32 52 0 6384K 1100K 0K rpcsvc 1 0:00 0.00% nfsd: server (nfsd) > 619 root 1 52 0 6384K 704K 0K select 1 0:00 0.00% nfsd: master (nfsd) > 617 root 1 20 0 6684K 688K 0K select 1 0:00 0.00% /usr/sbin/mountd -r > 478 root 1 20 0 6676K 596K 0K select 3 0:00 0.00% /usr/sbin/rpcbind > 469 root 1 20 0 6680K 572K 0K select 2 0:00 0.00% /usr/sbin/syslogd -s > 396 root 1 20 0 9580K 32K 0K select 0 0:00 0.00% /sbin/devd > 308 _dhcp 1 20 0 6800K 532K 0K select 2 0:00 0.00% dhclient: awg0 (dhclient) > 307 root 1 52 0 6800K 424K 0K select 2 0:00 0.00% dhclient: awg0 [priv] (dhclient) > > And here is the backtrace: > > timeout stopping cpus > panic: ARM64TODO: reclaim_pv_chunk > cpuid = 0 > time = 1493345992 > KDB: stack backtrace: > db_trace_self() at db_trace_self_wrapper+0x28 > pc = 0xffff000000605cc0 lr = 0xffff0000000869cc > sp = 0xffff000083d301d0 fp = 0xffff000083d303e0 > > db_trace_self_wrapper() at vpanic+0x164 > pc = 0xffff0000000869cc lr = 0xffff00000031d464 > sp = 0xffff000083d303f0 fp = 0xffff000083d30460 > > vpanic() at panic+0x4c > pc = 0xffff00000031d464 lr = 0xffff00000031d2fc > sp = 0xffff000083d30470 fp = 0xffff000083d304f0 > > panic() at reclaim_pv_chunk+0x10 > pc = 0xffff00000031d2fc lr = 0xffff00000061a234 > sp = 0xffff000083d30500 fp = 0xffff000083d30500 > > reclaim_pv_chunk() at get_pv_entry+0x240 > pc = 0xffff00000061a234 lr = 0xffff000000616184 > sp = 0xffff000083d30510 fp = 0xffff000083d30530 > > get_pv_entry() at pmap_enter+0x694 > pc = 0xffff000000616184 lr = 0xffff0000006156a0 > sp = 0xffff000083d30540 fp = 0xffff000083d305d0 > > pmap_enter() at vm_fault_hold+0x28c > pc = 0xffff0000006156a0 lr = 0xffff0000005b9740 > sp = 0xffff000083d305e0 fp = 0xffff000083d30730 > > vm_fault_hold() at vm_fault+0x70 > pc = 0xffff0000005b9740 lr = 0xffff0000005b9464 > sp = 0xffff000083d30740 fp = 0xffff000083d30770 > > vm_fault() at data_abort+0xe0 > pc = 0xffff0000005b9464 lr = 0xffff00000061ad94 > sp = 0xffff000083d30780 fp = 0xffff000083d30830 > > data_abort() at handle_el0_sync+0x6c > pc = 0xffff00000061ad94 lr = 0xffff0000006079e8 > sp = 0xffff000083d30840 fp = 0xffff000083d30950 > > handle_el0_sync() at 0x400a3de4 > pc = 0xffff0000006079e8 lr = 0x00000000400a3de4 > sp = 0xffff000083d30960 fp = 0x0000ffffbfdfcd30 > > KDB: enter: panic > [ thread pid 48988 tid 100230 ] > Stopped at kdb_enter+0x44: undefined d4200000 > === Mark Millard markmi at dsl-only.net _______________________________________________ freebsd-arm_at_freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-arm To unsubscribe, send any mail to "freebsd-arm-unsubscribe_at_freebsd.org"Received on Sun Apr 30 2017 - 23:31:07 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:11 UTC