Konstantin Belousov wrote: >On Wed, May 20, 2020 at 11:58:50PM -0700, Ryan Libby wrote: >> On Wed, May 20, 2020 at 6:04 PM Rick Macklem <rmacklem_at_uoguelph.ca> wrote: >> > >> > Hi, >> > >> > Since I hadn't upgraded a kernel through the winter, it took me a while >> > to bisect this, but r358252 seems to be the culprit. >> > >> > If I do a kernel build over NFS using my not so big Pentium 4 (single core, >> > 1.25Gbytes RAM, i386), about every second attempt will hang. >> > When I do a "ps" in the debugger, I see processes sleeping on btalloc. >> > If I revert to r358251, I cannot reproduce this. >> > >> > Any ideas? >> > >> > I can easily test any change you might suggest to see if it fixes the >> > problem. >> > >> > If you want more debug info, let me know, since I can easily >> > reproduce it. >> > >> > Thanks, rick >> >> Nothing obvious to me. I can maybe try a repro on a VM... >> >> ddb ps, acttrace, alltrace, show all vmem, show page would be welcome. >> >> "btalloc" is "We're either out of address space or lost a fill race." > >Yes, I would be not surprised to be out of something on 1G i386 machine. >Please also add 'show alllocks'. Ok, I used an up to date head kernel and it took longer to reproduce a hang. This time, none of the processes are stuck on "btalloc". I'll try and give you most of the above, but since I have to type it in by hand from the screen, I might not get it all. (I'm no real typist;-) > show alllocks exclusive lockmgr ufs (ufs) r = 0 locked _at_ kern/vfs_subr.c: 3259 exclusive lockmgr nfs (nfs) r = 0 locked _at_ kern/vfs_lookup.c:737 exclusive sleep mutex kernel area domain (kernel arena domain) r = 0 locked _at_ kern/subr_vmem.c:1343 exclusive lockmgr bufwait (bufwait) r = 0 locked _at_ kern/vfs_bio.c:1663 exclusive lockmgr ufs (ufs) r = 0 locked _at_ kern/vfs_subr.c:2930 exclusive lockmgr syncer (syncer) r = 0 locked _at_ kern/vfs_subr.c:2474 Process 12 (intr) thread 0x.. (1000008) exclusive sleep mutex Giant (Giant) r = 0 locked _at_ kern/kern_intr.c:1152 > ps - Not going to list them all, but here are the ones that seem interesting... 18 0 0 0 DL vlruwt 0x11d939cc [vnlru] 16 0 0 0 DL (threaded) [bufdaemon] 100069 D qsleep [bufdaemon] 100074 D - [bufspacedaemon-0] 100084 D sdflush 0x11923284 [/ worker] - and more of these for the other UFS file systems 9 0 0 0 DL psleep 0x1e2f830 [vmdaemon] 8 0 0 0 DL (threaded) [pagedaemon] 100067 D psleep 0x1e2e95c [dom0] 100072 D launds 0x1e2e968 [laundry: dom0] 100073 D umarcl 0x12cc720 [uma] … a bunch of usb and cam ones 100025 D - 0x1b2ee40 [doneq0] … 12 0 0 0 RL (threaded) [intr] 100007 I [swi6: task queue] 100008 Run CPU 0 [swi6: Giant taskq] … 100000 D swapin 0x1d96dfc [swapper] - and a bunch more in D state. Does this mean the swapper was trying to swap in? > acttrace - just shows the keyboard kdb_enter() at kdb_enter+0x35/frame vt_kbdevent() at vt_kdbevent+0x329/frame kdbmux_intr() at kbdmux_intr+0x19/frame taskqueue_run_locked() at taskqueue_run_locked+0x175/frame taskqueue_run() at taskqueue_run+0x44/frame taskqueue_swi_giant_run(0) at taskqueue_swi_giant_run+0xe/frame ithread_loop() at ithread_loop+0x237/frame fork_exit() at fork_exit+0x6c/frame fork_trampoline() at 0x../frame > show all vmem vmem 0x.. 'transient arena' quantum: 4096 size: 23592960 inuse: 0 free: 23592960 busy tags: 0 free tags: 2 inuse size free size 16777216 0 0 1 23592960 vmem 0x.. 'buffer arena' quantum: 4096 size: 94683136 inuse: 94502912 free: 180224 busy tags: 1463 free tags: 3 inuse size free size 16384 2 32768 1 16384 32768 39 1277952 1 32768 65536 1422 93192192 0 0 131072 0 0 1 131072 vmem 0x.. 'i386trampoline' quantum: 1 size: 24576 inuse: 20860 free: 3716 busy tags: 9 free tags: 3 inuse size free size 32 1 48 1 52 64 2 208 0 0 128 2 280 0 0 2048 1 2048 1 3664 4096 2 8192 0 0 8192 1 10084 0 0 vmem 0x.. 'kernel rwx arena' quantum: 4096 size: 0 inuse: 0 free: 0 busy tags: 0 free tags: 0 vmem 0x.. 'kernel area dom' quantum: 4096 size: 56623104 inuse: 56582144 free: 40960 busy tags: 11224 free tags: 3 inuse size free size 4096 11091 45428736 0 0 8192 63 516096 0 0 16384 12 196608 0 0 32768 6 196608 0 0 40960 2 81920 1 40960 65536 16 1048576 0 0 94208 1 94208 0 0 110592 1 110592 0 0 131072 15 2441216 0 0 262144 15 3997696 0 0 524288 1 524288 0 0 1048576 1 1945600 0 0 vmem 0x.. 'kernel arena' quantum: 4096 size: 390070272 inuse: 386613248 free: 3457024 busy tags: 873 free tags: 3 inuse size free size 4096 35 143360 1 4096 8192 2 16384 2 16384 12288 1 12288 0 0 16384 30 491520 0 0 20480 140 2867200 0 0 65536 1 65536 0 0 131072 631 82706432 0 0 1048576 0 0 1 1339392 2097152 27 56623104 1 2097152 8388608 1 13774848 0 0 16777216 3 74883072 0 0 33554432 1 36753408 0 0 67108864 1 118276096 0 0 > alltrace - I can't face typing too much more, but I'll put a few here that look interesting - for csh sched_switch() mi_switch() kern_yield() getblkx() breadn_flags() ffs_update() ufs_inactive() VOP_INACTIVE() vinactivef() vput_final() vm_object_deallocate() vm_map_process_deferred() kern_munmap() sys_munmap() - For cc sched_switch() mi_switch() sleepq_switch() sleepq_timedwait() _sleep() pause_sbt() vmem_bt_alloc() keg_alloc_slab() zone_import() cache_alloc() cache_alloc_retry() uma_zalloc_arg() bt_fill() vmem_xalloc() vmem_alloc() kmem_alloc() kmem_malloc_domainset() page_alloc() keg_alloc_slab() zone_import() cache_alloc() cache_alloc_retry() uma_zalloc_arg() nfscl_nget() nfs_create() vop_sigdefer() nfs_vnodeops_bypass() VOP_CREATE_APV() vn_open_cred() vn_open() kern_openat() sys_openat() Then there are a bunch of these... for cc, make sched_switch() mi_switch() sleepq_switch() sleepq_catch_signals() sleepq_wait_sig() kern_wait6() sys_wait4() - for vnlru sched_switch() mi_switch() sleepq_switch() sleepq_timedwait() _sleep() vnlru_proc() fork_exit() fork_trampoline() - for syncer sched_switch() mi_switch() critical_exit_preempt() intr_event_handle() intr_execute_handlers() lapic_handle_intr() Xapic_isr1() - interrupt memset() cache_alloc() cache_alloc_retry() uma_zalloc_arg() vmem_xalloc() vmem_bt_alloc() keg_alloc_slab() zone_import() cache_alloc() cache_alloc_retry() uma_zalloc_arg() bt_fill() vmem_xalloc() vmem_alloc() bufkva_alloc() getnewbuf() getblkx() breadn_flags() ffs_update() ffs_sync() sync_fsync() VOP_FSYNC_APV() sched_sync() fork_exit() fork_trampoline() - For bufdaemon (a bunch of these) sched_switch() mi_switch() sleepq_switch() sleepq_timedwait() _sleep() buf_daemon() fork_exit() fork_trampoline() vmdaemon and pagedaemon are basically just like above, sleeping in vm_daemon() or vm_pageout_worker() or vm_pageout_laundry_worker() or uma_reclaim_worker() That's all the typing I can take right now. I can probably make this happen again if you want more specific stuff. rickReceived on Fri May 22 2020 - 21:46:30 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:24 UTC