On 2007-Aug-23 16:29:26 +0100, Robert Watson <rwatson_at_FreeBSD.org> wrote:
>> all the ports and a bit of hair removal, I tracked the problem to the
>> 'idprio' command in /usr/local/etc/rc.d/boinc - without that, all works
>> fine.  If I include that, boinc_client stops sending heartbeat messages.
>
>When I run "idprio 20 echo hi", it seems to execute per normal as root, and
>not at all as an unprivileged user, which I think is the desired symptom of
>using it.

The offending command in the rc.d is (as root):
idprio 31 su - boinc -c '/usr/local/bin/boinc_client ...&'

boinc_client forks the actual computation process (setiathome etc).
The setiathome process is basically CPU bound but calls usleep()
occasionally (so on an otherwise idle system, top shows it using
around 95% CPU and in nanslp).

boinc_client is supposed to write a watchdog flag in a shared SysV SHM
block every second.  The setiathome process regularly polls the SHM
and if it doesn't see the watchdog for 31 seconds, it will abort.
boinc_client basically sits in a loop and uses select() timeouts.

I wrote a program to monitor the SHM and it shows that SHM is not
being updated.  It looks like the kernel isn't cleanly handling the
situation where there are multiple idprio processes.  I will try
some more experimenting this evening.

>If you're running things with idprio, is it definitely the case that your
>system is sometimes idle allowing the program to run once in a while?

It used to work fine even whilst doing a buildworld, now it won't work
on an otherwise idle system...

-- 
Peter Jeremy
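
For illustration only, the heartbeat check described above can be modelled
roughly as follows.  This is a minimal sketch, not BOINC code: the class
name HeartbeatMonitor, its poll() method and the way the flag is passed in
are all hypothetical, and the real client and worker exchange the flag
through a SysV SHM block rather than function arguments.  Only the
31-second limit is taken from the message.

```python
HEARTBEAT_TIMEOUT = 31.0  # seconds; the limit quoted in the message


class HeartbeatMonitor:
    """Hypothetical model of the worker-side watchdog check (not BOINC code)."""

    def __init__(self, now, timeout=HEARTBEAT_TIMEOUT):
        self.timeout = timeout
        self.last_seen = now  # time the watchdog flag was last observed

    def poll(self, flag_set, now):
        """Return True if the worker should keep running, False to abort."""
        if flag_set:
            # The client managed to write the flag since the last poll.
            self.last_seen = now
        return (now - self.last_seen) < self.timeout


# Usage: the client is scheduled once, then starved (flag never set again).
monitor = HeartbeatMonitor(now=0.0)
print(monitor.poll(flag_set=True, now=1.0))    # flag seen, keep running
print(monitor.poll(flag_set=False, now=30.0))  # 29 s without a flag, still ok
print(monitor.poll(flag_set=False, now=32.0))  # 31 s without a flag, abort
```

In this model, a client that never gets scheduled is indistinguishable from
one that has died, which matches the symptom reported: the worker aborts
even though boinc_client is still alive.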