Hi.

This patch takes callout(9) and redesigns both the KPI and the implementation. The main objective of this work is to make the subsystem tickless. This possibility has been discussed widely over the last several years (http://markmail.org/message/q3xmr2ttlzpqkmae), but until now no one had actually implemented it. If you want a complete history of what has been done in the last months, you can check the calloutng project repository: http://svnweb.freebsd.org/base/projects/calloutng/

For lazy people, here's a summary:

1) callout(9) is no longer constrained to the resolution that a periodic "hz" clock can give. To achieve this, the eventtimers(4) subsystem is used as the backend.
2) Contrary to what was discussed in the past, we kept the callwheel as the underlying data structure for tracking outstanding timeouts. This choice has a couple of advantages; in particular, we still benefit from the O(1) average complexity of the wheel for all operations. We also considered unacceptable the code duplication that would arise from a two-stage backend for callout (e.g. a wheel for coarse-resolution events and another data structure, such as a heap, for high-resolution events). In fact, since callout gained the ability to migrate from one CPU to another, a double backend would mean doubling the code for the migration path.
3) A way to run callouts directly from hardware interrupt context has been implemented, using a special callout flag. This has limited applicability, but it avoids dispatching a SWI thread to handle specific callouts, which in turn avoids waking up another CPU for processing and a (relatively useless) context switch.
4) Since the new callout mechanism deals with bintime rather than ticks, time is now specified as absolute rather than relative. The current time is obtained with binuptime() or getbinuptime(), and a sysctl is introduced to choose between the two functions based on a precision threshold.
5) A mechanism for specifying precision tolerance has been implemented. The callout processing mechanism has been adapted and the callout data structure augmented so that the code path can aggregate events whose tolerance windows overlap in time.

The new proposed KPI for callout is the following:

    callout_reset_bt_on(..., struct bintime time, struct bintime pr, ..., int flags)

where the ‘time’ argument represents the time at which the callout should fire, ‘pr’ represents the precision tolerance expressed as an absolute value, and ‘flags’ can be used to request new features; for now, that means the possibility of running the callout from fast interrupt context.

The old KPI has been extended with the callout_reset_flags() function, which behaves like callout_reset*() but takes an additional ‘int flags’ argument that can be used in the same way as the ‘flags’ argument of the new KPI. Through ‘flags’, consumers can also specify a relative precision tolerance as a power-of-two fraction of the timeout passed in ticks. With this strategy, the new precision mechanism can be used by existing services without major modifications.

Some consumers have been ported to the new KPI, in particular nanosleep(), poll() and select(), because they benefit immediately from the arbitrary precision offered by the new infrastructure. For some statistics about the outcome of the conversion to the new service, please refer to the end of this e-mail: http://lists.freebsd.org/pipermail/freebsd-arch/2012-July/012756.html

We did not measure any significant performance regression with hwpmc(4) using the following benchmark programs:

http://people.freebsd.org/~davide/poll_test/poll_test.c
http://people.freebsd.org/~mav/testsleep.c
http://people.freebsd.org/~mav/testidle.c

We tested the code on amd64, MIPS and arm. Any kind of testing or comment would be really appreciated.
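To make the aggregation idea from point 5 a bit more concrete, here is a small userspace sketch. It is not the code in the patch (which works on the callwheel and on struct bintime); it only models each timeout as a window [time, time + pr], with plain nanosecond integers, and counts how many wakeups are needed when overlapping windows are served together. The names struct ev and count_wakeups() are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model of a pending timeout; not struct callout. */
struct ev {
	uint64_t time;	/* earliest acceptable firing time, in ns */
	uint64_t pr;	/* tolerance: may fire as late as time + pr */
};

/* Order events by their earliest acceptable firing time. */
static int
ev_cmp(const void *a, const void *b)
{
	const struct ev *x = a, *y = b;

	return ((x->time > y->time) - (x->time < y->time));
}

/*
 * Sort the events by time and group those whose windows share at least
 * one common instant.  Returns the number of groups, i.e. of wakeups.
 * If fire is not NULL it must have room for n entries and receives one
 * firing time per group, chosen as the latest 'time' in that group.
 */
static size_t
count_wakeups(struct ev *evs, size_t n, uint64_t *fire)
{
	size_t groups = 0;
	uint64_t hi = 0;	/* earliest window end in the current group */

	qsort(evs, n, sizeof(*evs), ev_cmp);
	for (size_t i = 0; i < n; i++) {
		uint64_t end = evs[i].time + evs[i].pr;

		if (groups == 0 || evs[i].time > hi) {
			/* No overlap with the current group: new wakeup. */
			groups++;
			hi = end;
		} else if (end < hi) {
			/* Joins the group and tightens its deadline. */
			hi = end;
		}
		if (fire != NULL)
			fire[groups - 1] = evs[i].time;
	}
	return (groups);
}
```

With events (time, pr) of (100, 50), (120, 40), (140, 20) and (300, 5), the first three windows all contain t = 140, so this sketch reports two wakeups instead of four. The more tolerance consumers grant, the more events can share one interrupt.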
The full diff of the work against HEAD can be found at: http://people.freebsd.org/~davide/calloutng.diff

If no one has objections, we plan to merge the repository into HEAD in a week or so.

Thanks,

Davide

Received on Thu Dec 13 2012 - 22:12:49 UTC