I'm having a problem where userland threads keep running at a loaned priority (via mutex priority propagation) even after they have released the mutex. This lets a low-priority userland thread starve high-priority interrupt threads. One scenario (which I am seeing on FreeBSD 8.2, though the priority-lending code does not seem to have changed in HEAD):

1) I have several swi threads that pull packets off of interfaces and forward them. Each swi thread is pinned to a CPU.
2) A remote user initiates an scp of a large file to this machine.
3) sshd (which handles the scp) takes a mutex on its socket.
4) An interrupt or taskqueue thread belonging to the driver that is receiving the scp traffic tries to take the same mutex. It can't, so it lends its priority to the sshd thread.
5) My swi thread is woken. It is pinned to the same CPU as sshd, but its priority is lower than the lent priority, so it is placed on the runqueue.
6) The sshd thread releases the mutex. sched_unlend_prio is called to set its priority back to a normal user priority. However, it does not check whether any higher-priority threads are waiting on the runqueue.
7) The sshd thread is allowed to run for its full quantum (100ms). That is easily long enough for my swi thread to be starved, and I drop packets.

This is similar to a problem fixed by jhb@ a while ago, which was discussed here:

http://lists.freebsd.org/pipermail/freebsd-arch/2010-December/010835.html

However, in my case the thread being started is at an ithread priority, not a userland realtime thread. I don't see why sched_unlend_prio can't check for preemption. I know that in jhb's case we didn't want to check for preemption in userret for performance reasons. In my case, sched_unlend_prio already holds the relevant scheduler lock (and hopefully mutex contention is a rare case anyway), so I don't see that the same performance argument applies.

The following proof-of-concept patch for ULE resolves the starvation for me.
Any comments?

diff --git a/sys/kern/sched_ule.c b/sys/kern/sched_ule.c
index 01559ee..2659614 100644
--- a/sys/kern/sched_ule.c
+++ b/sys/kern/sched_ule.c
@@ -1586,6 +1586,22 @@ sched_pctcpu_update(struct td_sched *ts)
 	ts->ts_ftick = ts->ts_ltick - SCHED_TICK_TARG;
 }
 
+static void
+sched_check_preempt(struct tdq *tdq, struct thread *td)
+{
+
+	KASSERT(TD_IS_RUNNING(td), ("thread is not running"));
+	TDQ_LOCK_ASSERT(tdq, MA_OWNED);
+	KASSERT(tdq == TDQ_CPU(td->td_sched->ts_cpu), ("tdq does not match td"));
+
+	if (tdq == TDQ_SELF()) {
+		if (td->td_priority > tdq->tdq_lowpri &&
+		    sched_shouldpreempt(tdq->tdq_lowpri, td->td_priority, 0))
+			td->td_owepreempt = 1;
+	} else
+		tdq_notify(tdq, td);
+}
+
 /*
  * Adjust the priority of a thread.  Move it to the appropriate run-queue
  * if necessary.  This is the back-end for several priority related
@@ -1635,8 +1651,10 @@ sched_thread_priority(struct thread *td, u_char prio)
 		td->td_priority = prio;
 		if (prio < tdq->tdq_lowpri)
 			tdq->tdq_lowpri = prio;
-		else if (tdq->tdq_lowpri == oldpri)
+		else if (tdq->tdq_lowpri == oldpri) {
 			tdq_setlowpri(tdq, td);
+			sched_check_preempt(tdq, td);
+		}
 		return;
 	}
 	td->td_priority = prio;

Received on Fri Jan 25 2013 - 19:01:22 UTC