Experiments with dummynet have shown ineffective support for very short
tick-based callouts. The new version fixes that, allowing as many
tick-based callout events as the hz value permits, while still being able
to aggregate events and generate a minimum of interrupts.

Also, this version modifies the system load average calculation to fix
some cases existing in the HEAD and 9 branches that could be fixed with
the new direct callout functionality.

http://people.freebsd.org/~mav/calloutng_12_17.patch

With several important changes made last time, I am going to delay the
commit to HEAD for another week to do more testing. Comments and new test
cases are welcome. Thanks for staying tuned and commenting.

On 17.12.2012 01:37, Alexander Motin wrote:
> Here is one more version. Unless something new is found or reported,
> this may be the last one, because Davide and I are quite satisfied with
> the results. If everything goes fine, I think we could commit it to
> HEAD closer to the end of the week:
> http://people.freebsd.org/~mav/calloutng_12_16.patch
>
> Changes in this version:
> -- Removed a couple of redundant variables in the callout
> implementation, which reduced sizeof(struct callout) by two pointers
> and simplified some internal code.
> -- The syscons driver was made to schedule only 1-2 callouts per second,
> instead of 20-30 before, when the console is in graphical mode and there
> are few other things to do. Now my laptop has only about 30 interrupts
> per second total during idle periods with X running.
> -- The i8254 eventtimer driver was optimized to work faster in one-shot
> mode, which is disabled by default.
> -- A few kernel functions were added to make the KPIs more complete.
> -- Man pages were updated.
> -- Some style fixes were made.
>
> On 15.12.2012 18:55, Alexander Motin wrote:
>> I'm sorry to interrupt the review, but as usual good ideas came during
>> the final testing, causing another round. :) Here is an updated patch
>> for HEAD that includes several new changes:
>> http://people.freebsd.org/~mav/calloutng_12_15.patch
>>
>> The new changes are:
>> -- Precision and event aggregation code was reworked. Instead of the
>> previous -prec/+prec representation, precision is now single-sided --
>> -0/+prec. This significantly improved precision on long time
>> intervals for APIs which imply that an event should not happen before
>> the specified time. Depending on CPU activity, the error for long time
>> intervals will now never be more than 1-500ms, even if the specified
>> precision allows more.
>> -- Some minor optimizations were made to reduce callout overhead and
>> latency by 1.5-2us. Now on a Core2Duo amd64 system with the LAPIC
>> eventtimer and TSC timecounter, a usleep(1) call from user level
>> executes in just 5-6us, instead of 7-8us before. Now it can do 180K
>> cycles per second on a single CPU with only partial CPU load.
>> -- A number of kernel subsystems (dcons, syscons, yarrow, led, atkbd,
>> setrlimit) were modified to reduce the number of interrupts, also with
>> event aggregation by explicit specification of the acceptable event
>> precision. Now my Core2Duo test system has only 30 interrupts per
>> second in idle. If not for the remaining syscons events, it could
>> easily be 15. My IvyBridge ultrabook for the first time in its history
>> showed 5.5 hours of battery time with full screen brightness and 10
>> hours with the lid closed.
>> -- Some kernel functions were added to make the KPIs more complete.
>>
>> I've successfully tested this patch on amd64 and arm.

-- 
Alexander Motin

Received on Tue Dec 18 2012 - 08:03:52 UTC
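
For readers who want to see how the single-sided -0/+prec precision from
the quoted 15.12.2012 message lets events share one interrupt, here is a
minimal user-space sketch in Python. It is not the kernel code from the
patch. The function name aggregate(), the (time, prec) tuple layout and
the microsecond units are assumptions made for illustration only. Each
event may fire anywhere in [time, time + prec] but never early; events
whose windows overlap are grouped and fired together at the latest
requested time in the group.

```python
def aggregate(events):
    """Group (time, prec) events so that each group shares one wakeup.

    Hypothetical illustration, not the calloutng implementation.
    An event may fire anywhere in [time, time + prec], never before time.
    Returns a list of (fire_time, group_events) tuples.
    """
    groups = []
    current = []      # events collected for the wakeup being built
    lo = hi = None    # intersection of the allowed windows in `current`
    for t, prec in sorted(events):
        if current and t <= hi:
            # Window overlaps the running intersection: join the group.
            lo = t                     # sorted input, so t >= previous lo
            hi = min(hi, t + prec)
            current.append((t, prec))
        else:
            # No overlap: close the previous group and start a new one.
            if current:
                groups.append((lo, current))
            lo, hi = t, t + prec
            current = [(t, prec)]
    if current:
        groups.append((lo, current))
    return groups


# Times and precisions in microseconds (assumed units).
events = [(0, 10000), (4000, 10000), (9000, 2000), (20000, 5000)]
for fire_time, group in aggregate(events):
    print(fire_time, group)
```

In this example the first three events share a single wakeup and the
fourth gets its own, so four requested events cost two interrupts, and no
event fires before the time it asked for.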