My main server is running 8.0/amd64 from between RC1 and RC2, and I've recently had a couple of long-duration hangs on it, during which processes doing I/O stop responding. The first time, it stopped responding for about 25 minutes and then spontaneously corrected itself. I was logged in remotely the whole time and Ctrl-T was responding throughout (claiming the process was 'runnable'). I tried starting a second session, which got as far as reporting the SSH banner I have configured and then did nothing. The second hang lasted about 5 minutes.

I can't find anything in any log files or dmesg. 'vmstat -m' output looks sensible. Unfortunately, I didn't have access to the console on either occasion.

The system is a dual-core Athlon with the base OS (root/usr/var) on UFS and the remaining filesystems on ZFS. It's running SCHED_ULE. It runs a pair of BOINC processes in the background. The first time, it should have been otherwise unused apart from a mairix (mail indexing tool) process that I'd just started. The second time, it would have been running a buildkernel.

Based on it managing to report the ssh banner (which is stored in /etc) but not getting to a shell prompt (my home directory is on ZFS), my initial suspicion was ZFS, but it occurs to me that it could be a priority-inversion problem with the BOINC processes.

Can anyone suggest where to go looking for a cause?

-- 
Peter Jeremy
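One way to test the UFS-versus-ZFS suspicion above would be to leave a small probe running that periodically times a tiny synchronous write on each filesystem and logs the result. The following is only a sketch, not a diagnostic tool from the post: the function names (`time_probe`, `probe_all`, `watch`) and the thresholds are made up for illustration, and the directories passed in are whatever UFS and ZFS paths apply locally. Probes run one after another, so a stalled filesystem shows up as a single very large elapsed time once the hang clears.

```python
import os
import tempfile
import time


def time_probe(directory):
    """Create, fsync and delete a small file in `directory`; return seconds taken."""
    start = time.monotonic()
    fd, path = tempfile.mkstemp(dir=directory, prefix="ioprobe.")
    try:
        os.write(fd, b"probe\n")
        os.fsync(fd)  # force the write out to the underlying filesystem
    finally:
        os.close(fd)
        os.unlink(path)
    return time.monotonic() - start


def probe_all(directories):
    """Probe each directory in order; return {directory: elapsed_seconds}."""
    return {d: time_probe(d) for d in directories}


def watch(directories, rounds, interval=10.0, slow=1.0):
    """Run `rounds` probe passes, printing and returning one log line per probe.

    `slow` is an arbitrary threshold (seconds) used only to flag long probes.
    """
    lines = []
    for i in range(rounds):
        for d, elapsed in probe_all(directories).items():
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            flag = " SLOW" if elapsed >= slow else ""
            line = f"{stamp} {d} {elapsed:.3f}s{flag}"
            print(line, flush=True)
            lines.append(line)
        if i + 1 < rounds:
            time.sleep(interval)
    return lines
```

Called as, say, `watch(["/var/tmp", "/home/peter"], rounds=8640)` (both paths are placeholders for one UFS and one ZFS directory) with its output redirected to a file on UFS, the log would show whether only the ZFS probe stalls or whether both do. If both stall together, that would point more towards scheduling or starvation than towards ZFS itself.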