At $WORK, we're working on adding support for high-precision RTT calculations in TCP. The goal is to reduce the retransmission timeout significantly to help mitigate the impact of TCP incast. This means that the retransmit callout for TCP sockets gets scheduled significantly more often with a shorter timeout period, but in the normal case it is expected to be canceled or rescheduled before it times out.

What I have noticed is that when the retransmit callout is canceled or rescheduled, the callout subsystem will not reschedule its currently pending interrupt. The result is that my system takes a significant number of "spurious" timer interrupts with no callouts to service, and this is having a measurable performance impact.

Unfortunately, neither the callout subsystem nor the eventtimers subsystem really seems to be designed for canceling interrupts. It's not easy to find the "next" event in the callout wheel, and the current code doesn't even try when handling an interrupt; the next interrupt is simply scheduled at a seemingly arbitrary point in the future.

I know that when the callout system was reworked, the callout wheel data structure was kept so that insertion and deletion remain O(1). However, I question whether that was the right decision, given that if callouts are frequently deleted, as in my case, we incur the significant overhead of a spurious timer interrupt. Does anybody know if actual performance measurements were taken to justify this decision?
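
To make the tradeoff I'm asking about concrete, here is a toy userland sketch of a hashed timing wheel. This is not the kernel's callout(9) implementation, just my mental model of it under simplified assumptions (single wheel, integer ticks, no locking, made-up toy_* names): insertion and removal are O(1) pointer surgery, but after a removal there is no cheap way to learn the new "next" expiry short of walking buckets, which I assume is why cancellation leaves the pending hardware event alone.

    /*
     * Toy model of a hashed timing wheel -- NOT the real callout(9) code,
     * only an illustration of why insert/remove are O(1) while "find the
     * next pending event" is not.
     */
    #include <stdio.h>

    #define WHEEL_SIZE 256                  /* buckets; power of two */
    #define WHEEL_MASK (WHEEL_SIZE - 1)

    struct toy_callout {
        struct toy_callout *next, **prevp;  /* doubly linked for O(1) removal */
        int expire;                         /* absolute expiry tick */
    };

    static struct toy_callout *wheel[WHEEL_SIZE];

    /* O(1): hash the expiry tick into a bucket and link at the head. */
    static void
    toy_callout_insert(struct toy_callout *c, int expire)
    {
        struct toy_callout **bp = &wheel[expire & WHEEL_MASK];

        c->expire = expire;
        c->next = *bp;
        if (c->next != NULL)
            c->next->prevp = &c->next;
        c->prevp = bp;
        *bp = c;
    }

    /* O(1): unlink in place; note we learn nothing about the next event. */
    static void
    toy_callout_remove(struct toy_callout *c)
    {
        *c->prevp = c->next;
        if (c->next != NULL)
            c->next->prevp = c->prevp;
        c->next = NULL;
        c->prevp = NULL;
    }

    /*
     * Finding the earliest pending expiry is the expensive part: in the
     * worst case every bucket (and every colliding entry) is scanned,
     * which is why a cancellation can't cheaply reprogram the pending
     * hardware interrupt.
     */
    static int
    toy_callout_next(int now, int horizon)
    {
        struct toy_callout *c;
        int t;

        for (t = now; t < now + horizon; t++)
            for (c = wheel[t & WHEEL_MASK]; c != NULL; c = c->next)
                if (c->expire == t)
                    return (t);
        return (-1);                        /* nothing pending in the horizon */
    }

    int
    main(void)
    {
        struct toy_callout rto, other;

        toy_callout_insert(&rto, 10);       /* short retransmit timeout */
        toy_callout_insert(&other, 500);    /* some long-lived callout */
        toy_callout_remove(&rto);           /* ACK arrives, RTO canceled... */
        /* ...but discovering the next event now costs a bucket scan: */
        printf("next event at tick %d\n", toy_callout_next(0, 1024));
        return (0);
    }

In the cheap-cancel world I'd like, toy_callout_remove() (or the real callout_stop()/callout_reset() path) would be able to answer "what is the next expiry now?" without that scan, so the eventtimer could be pushed out instead of firing spuriously.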