Konstantin Belousov kostikbel at gmail.com wrote on
Fri Sep 13 05:53:41 UTC 2019 :

> Basically, page fault handler waits for vm.pfault_oom_wait *
> vm.pfault_oom_attempts for a page allocation before killing the process.
> Default is 30 secs, and if you cannot get a page for 30 secs, there is
> something very wrong with the machine.

The following was not for something like a Ryzen, but for an armv7
board using a USB device for the file system and swap/paging
partition. Still it may be a suggestive example of writing out a
large amount of laundry.

There was an exchange I had with Warner L. that suggested long waits
in the queue are easy to get when trying to write out the laundry (or
other such) in low-end contexts. I extract some of it below.

dT: 1.006s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
   56    312      0      0    0.0    312  19985  142.6      0      0    0.0   99.6| da0

Note: L(q) could be a lot bigger than 56, but I work with the example
figures that I used at the time and that Warner commented on. The
142.6 ms/w includes time waiting in the queue and was vastly more
stable than the L(q) figures.

Warner wrote, in part:

QUOTE
142.6ms/write is the average of the time that the operations that
completed during the polling interval took to complete. There's no
estimating here. So, at 6 or 7 per second for the operation to
complete, coupled with a parallel factor of 1 (typical for low end
junk flash), we wind up with 56 operations in the queue taking 8-10s
to complete.
END QUOTE
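
(Added Note: as a rough, after-the-fact sanity check on Warner's
8-10s figure, using only the gstat sample above. This arithmetic is
mine, not from the original exchange:)

    # 56 queued operations, each taking ~142.6 ms to complete, with an
    # effective parallelism of 1, need roughly 56 * 0.1426 s to drain:
    echo "scale=1; 56 * 142.6 / 1000" | bc
    # => 7.9   (seconds, i.e. on the order of the 8-10s Warner estimated)
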
Things went on from there but part of it was based on a reporting
patch that Mark Johnston had provided.

Me:
It appears to me that, compared to an observed capacity of roughly
20 MiBytes/sec for writes, large amounts of bytes are being queued up
to be written in a short time, for which it just takes a while for
the backlog to be finished.

Warner:
Yes. That matches my expectation as well. In other devices, I've
found that I needed to rate-limit things to more like 50-75% of the
max value to keep variance in performance low. It's the whole reason
I wrote the CAM I/O scheduler.

Me:
The following is from multiple such runs, several manually stopped
but some killed because of sustained low free memory. I had left
vm.pageout_oom_seq=12 in place for this, making the kills easier to
get than the 120 figure would. It does not take very long generally
for some sort of message to show up. (Added Note: 65s and 39s were at
the large end of what I reported at the time.)

. . .
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
waited 65s for async swap write
waited 65s for swap buffer
waited 65s for async swap write
waited 65s for async swap write
waited 65s for async swap write
v_free_count: 955, v_inactive_count: 1
Aug 20 06:11:49 pine64 kernel: pid 1047 (stress), uid 0, was killed: out of swap space
waited 5s for async swap write
waited 5s for swap buffer
waited 5s for async swap write
waited 5s for async swap write
waited 5s for async swap write
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314021, size: 12288
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314084, size: 32768
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314856, size: 32768
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314638, size: 131072
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312518, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312416, size: 16384
waited 39s for async swap write
waited 39s for swap buffer
waited 39s for async swap write
waited 39s for async swap write
waited 39s for async swap write
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314802, size: 24576
. . .

Warner:
These numbers are consistent with the theory that the swap device
becomes overwhelmed, spiking latency and causing crappy down-stream
effects. You can use the I/O scheduler to limit the write rates at
the low end. You might also be able to schedule a lower write queue
depth at the top end as well, but I've not seen good ways to do that.
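
(Added Note: a minimal sketch of adjusting the knobs mentioned in
this thread, assuming they behave as ordinary vm.* sysctls. The
specific values are only illustrative choices of mine, not
recommendations from Warner or Konstantin:)

    # Make the pageout-based OOM kills harder to trigger than the
    # vm.pageout_oom_seq=12 used in my runs (the "120 figure" above):
    sysctl vm.pageout_oom_seq=120
    # Stretch the page fault handler's total wait before it kills a
    # process; per kib's note at the top, the total is
    # vm.pfault_oom_wait * vm.pfault_oom_attempts (30 secs by default).
    # Illustrative values giving about 90 secs:
    sysctl vm.pfault_oom_wait=15
    sysctl vm.pfault_oom_attempts=6
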
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early
2018-Mar)

Received on Sat Sep 14 2019 - 21:56:35 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:21 UTC