confusing KTR_SCHED traces

From: Andriy Gapon <avg_at_FreeBSD.org> Date: Fri, 17 Feb 2017 20:48:57 +0200 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:10 UTC

First, an example, three consecutive entries for the same thread (from top to
bottom):
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"sleep",
attributes: prio:84, wmesg:"-", lockname:"(null)"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"spinning",
attributes: lockname:"sched lock 1"
KTRGRAPH group:"thread", id:"zio_write_intr_3 tid 100260", state:"running",
attributes: none

Any automatic analysis tool including schedgraph.py will assume that the thread
ends up in the running state.  In reality, of course, the thread is in the
sleeping state.
The confusing trace is a result of logging the thread's intention to switch out
in mi_switch() before calling sched_switch().  In ULE's sched_switch() we
acquire the "TDQ_LOCK" which could be contested.  In that case the thread spins
waiting for the lock to be released.  This is reported as "spinning" and then
"running" states.

I would like to fix that, but not sure how to do that best.
One idea is to move the mi_switch() trace closer to the cpu_switch() call
similarly to DTrace sched:cpu-off and sched:cpu-on probes.

Any suggestions are welcome.
Thanks!
-- 
Andriy Gapon