I believe I've found a case where sched_ule can never honor cpu_set affinity, and I've attached a suggested fix -- if I'm way off on how this should be implemented, please let me know!

Imagine the case where sched_affinity is called while the thread is on a run queue (RUNQ) of the wrong CPU. sched_affinity simply exits, relying on something else in the scheduler to migrate the thread if need be. The next time the thread is chosen to run, it runs on the wrong CPU. Worse, if the thread never goes to sleep and is never picked by the load balancer to be moved, it will continue running on the wrong CPU indefinitely.

Attached is a suggestion for how to change the scheduler to honor affinity in this case. With the attached diff, sched_ule will still allow a thread to run on the wrong CPU for one slice. Then, when the thread passes through sched_switch, it is migrated to the right CPU if it was running on the wrong one.

I've written a test in which one thread binds another thread to a particular CPU (using cpusets), then waits until that thread is running on the right CPU before binding it to a different CPU (and so on, ad nauseam). Without the change, the test will sometimes hang waiting for the second thread to get onto the correct CPU -- with the change, it works every time.

Thanks!
-Justin
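
To make the scenario above concrete, here is a toy user-space model of it. This is only a sketch: it is neither the attached diff nor the real sched_ule code, and every name in it (toy_thread, toy_set_affinity, toy_switch, NCPU, the migrate_on_switch flag) is made up for illustration. The assumption is simply that changing affinity updates the allowed mask without moving a queued thread, and that the proposed behaviour adds a check on the switch path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU 4  /* hypothetical CPU count for the model */

/* Toy stand-in for a thread; NOT the kernel's struct thread. */
struct toy_thread {
    int      cpu;      /* CPU whose run queue currently holds the thread */
    uint32_t allowed;  /* bitmask of CPUs the thread may run on */
};

/* True if the thread's mask permits the given CPU. */
static bool cpu_allowed(const struct toy_thread *td, int cpu)
{
    assert(cpu >= 0 && cpu < NCPU);
    return (td->allowed & (UINT32_C(1) << cpu)) != 0;
}

/* Lowest-numbered permitted CPU, or -1 if the mask is empty. */
static int first_allowed_cpu(const struct toy_thread *td)
{
    for (int cpu = 0; cpu < NCPU; cpu++) {
        if (cpu_allowed(td, cpu))
            return cpu;
    }
    return -1;
}

/*
 * Models the reported behaviour: the mask changes, but a thread that is
 * already queued on some CPU is left where it is.
 */
static void toy_set_affinity(struct toy_thread *td, uint32_t mask)
{
    td->allowed = mask;
}

/*
 * Models one pass through the switch path after the thread has used a
 * slice. With migrate_on_switch set, a thread found on a disallowed CPU
 * is moved to an allowed one; without it, nothing ever moves the thread.
 * Returns the CPU the thread will run on next.
 */
static int toy_switch(struct toy_thread *td, bool migrate_on_switch)
{
    if (migrate_on_switch && !cpu_allowed(td, td->cpu)) {
        int target = first_allowed_cpu(td);
        if (target >= 0)
            td->cpu = target;
    }
    return td->cpu;
}
```

In this model, a thread queued on CPU 0 and then bound to CPU 2 stays on CPU 0 for every call to toy_switch(&td, false), which corresponds to the hang in the test. With toy_switch(&td, true) it lands on CPU 2 after the first switch, i.e. after at most one slice on the wrong CPU.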