A little while ago I decided to write a small program to analyse the historical memory usage of my FreeBSD system, based on the system accounting files in /var/account. The reasons for this are obscure and irrelevant, but I was somewhat surprised to discover that about 43% of the records had no memory usage registered. Initially I thought it was an error in my parser (written in Haskell), but that turned out not to be the case: normal "sa -u" output gives the same result.

I started digging into it and found it quite easy to reproduce, e.g. running "repeat 1000 /usr/bin/time -lh ls" will show 0 in most, but not all, of the output.

To avoid any misunderstanding: as far as I can see, the observations below only affect data presented to the end user (through acct(2) and getrusage(2)), and I don't think there are any issues in the kernel's internal resource accounting (at least none that I can see).

The reason turns out to be that around 43% of the commands (in a "make buildworld buildkernel") simply finish before statclock() runs for the first time.

Historically (I am basing this on "The Design and Implementation of the FreeBSD Operating System", section 3.4, pages 57-59, covering 5.2), on a system with an assumed clock of 100 ticks per second, statclock() was expected to run at 128 ticks per second, i.e. 28% faster than hz. The current kernel seems to use 128 Hz or thereabouts (stathz is 133 with kern.eventtimer.periodic=1 and 127 without), independent of what the scheduling clock is doing.

acct(2) bases its information on the rusage struct td_ru, whose memory information is updated in statclock(). The same struct is used by getrusage(2), hence /usr/bin/time shows the same issue.
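To make the mechanism concrete, here is a minimal sketch in Python, not kernel code. It assumes statclock() fires strictly periodically at stathz and that a process starts at a uniformly random point within a tick period. Under those simplifying assumptions a process that lives for d seconds sees no tick at all with probability max(0, 1 - d * stathz). The function name and the sample lifetimes are made up for illustration and are not measurements.

```python
def miss_probability(lifetime_s: float, stathz: float) -> float:
    """Probability that no statclock tick lands inside the process lifetime.

    Assumption: a strictly periodic tick and a uniformly random start phase.
    The real kernel timer behaviour is more involved than this.
    """
    period_s = 1.0 / stathz
    return max(0.0, 1.0 - lifetime_s / period_s)


if __name__ == "__main__":
    # Hypothetical process lifetimes, chosen only to show the trend.
    for stathz in (128, 1280):
        for lifetime_ms in (0.5, 2.0, 5.0, 10.0):
            p = miss_probability(lifetime_ms / 1000.0, stathz)
            print(f"stathz={stathz:5d} lifetime={lifetime_ms:4.1f} ms miss={p:.3f}")
```

In this model any process shorter than one tick period (about 7.8 ms at 128 Hz) has a real chance of never being sampled, which is in line with short-lived compiler and shell invocations showing zero memory usage.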
With periodic set, the kernel shows:

    kern.sched.quantum: 97734
    kern.clockrate: { hz = 1000, tick = 1000, profhz = 8112, stathz = 133 }
    kern.eventtimer.periodic: 1
    kern.eventtimer.timer: HPET
    kern.eventtimer.idletick: 0
    kern.eventtimer.singlemul: 2
    kern.eventtimer.choice: HPET(350) HPET1(340) HPET2(340) HPET3(340) LAPIC(100) i8254(100) RTC(0)
    kern.eventtimer.et.HPET3.quality: 340
    kern.eventtimer.et.HPET3.frequency: 14318180
    kern.eventtimer.et.HPET3.flags: 3
    kern.eventtimer.et.HPET2.quality: 340
    kern.eventtimer.et.HPET2.frequency: 14318180
    kern.eventtimer.et.HPET2.flags: 3
    kern.eventtimer.et.HPET1.quality: 340
    kern.eventtimer.et.HPET1.frequency: 14318180
    kern.eventtimer.et.HPET1.flags: 3
    kern.eventtimer.et.HPET.quality: 350
    kern.eventtimer.et.HPET.frequency: 14318180
    kern.eventtimer.et.HPET.flags: 3
    kern.eventtimer.et.RTC.quality: 0
    kern.eventtimer.et.RTC.frequency: 32768
    kern.eventtimer.et.RTC.flags: 17
    kern.eventtimer.et.i8254.quality: 100
    kern.eventtimer.et.i8254.frequency: 1193182
    kern.eventtimer.et.i8254.flags: 1
    kern.eventtimer.et.LAPIC.quality: 100
    kern.eventtimer.et.LAPIC.frequency: 0
    kern.eventtimer.et.LAPIC.flags: 15

(stathz = 127 if the periodic flag is set to 0.)

Whereas I would have expected stathz to be around 1280.
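The arithmetic behind the expected value of 1280 can be sketched as follows. This assumes the ratio from the book (128 statclock ticks for every 100 scheduling ticks) is meant to scale with hz; that is how the book is read here, not something the kernel source is known to promise. The constant and function names are illustrative.

```python
from fractions import Fraction

BOOK_HZ = 100      # scheduling clock rate assumed in the book
BOOK_STATHZ = 128  # statclock rate described in the book


def scaled_stathz(hz: int) -> Fraction:
    """stathz if the 128:100 ratio were kept for a given hz (an assumption)."""
    return Fraction(BOOK_STATHZ, BOOK_HZ) * hz


def period_ms(rate_hz) -> float:
    """Time between two ticks, in milliseconds."""
    return 1000.0 / float(rate_hz)


if __name__ == "__main__":
    expected = scaled_stathz(1000)
    print(f"expected stathz at hz=1000: {expected}")
    for rate in (127, 133, expected):
        print(f"stathz={int(rate):5d} -> one sample every {period_ms(rate):.2f} ms")
```

At 127 or 133 Hz the sampling interval is well over 7 ms, while at 1280 Hz it is under 1 ms, so far fewer short-lived commands would finish between two samples.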
I don't know if anybody actually uses acct(2) and getrusage(2) for anything, but the impact is interesting (I have done a whole heap of analysis on it in R).

The following shows the proportion of commands across 3 different "make buildworld buildkernel" runs on the system running with default values. As should be clear, about 47% of all commands terminate before statclock() gets to run. To be clear, the isnull variable is TRUE where there is no memory data in the "sa -u" output.

    w128hz$isnull
           n  missing distinct
      645822        0        2

    Value       FALSE   TRUE
    Frequency  343806 302016
    Proportion  0.532  0.468

With kern.eventtimer.periodic=1 it shows roughly the same proportion (in this case from just a single "make buildworld buildkernel"):

    w133hz$isnull
           n  missing distinct
      202309        0        2

    Value       FALSE   TRUE
    Frequency  103799  98510
    Proportion  0.513  0.487

If this is changed to mirror the behaviour described in the book, the sampling error (or whatever we should call it) falls to 4-5% rather than 43-48%:

    w1280hz$isnull
           n  missing distinct
      404128        0        2

    Value       FALSE   TRUE
    Frequency  385411  18717
    Proportion  0.954  0.046

Fiddling with hz and stathz you can get it below 1% (hz=3000, stathz=3840), but that seems silly. A sampling error margin below 5% seems acceptable. Either way, more than 43% of records incorrectly showing no memory usage should at least warrant a warning in the getrusage(2) and acct(2) man pages if the decision is to keep the current 128 Hz statclock behaviour.

I have made a patch that changes the behaviour (kern_clocksource.c), but I am not sure whether the edge case caught here also applies to the last update in kern_exit.c (I don't believe so). I have not looked at the accuracy of the other fields in struct rusage, but they seem more sensible than the memory values were.

The thing that nags me is that I have a recollection that this was discussed ages ago, but given that I have been FreeBSDing since V1, I cannot remember when it was. Some argument was made that anything above 128 Hz was wasteful and suboptimal.
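As a quick cross-check of the proportions quoted above, the shares can be recomputed from the FALSE/TRUE frequencies in the R output. This is just arithmetic on the numbers already given; the dictionary layout and function name are illustrative.

```python
# (records with memory data, records without) taken from the R output above
counts = {
    "w128hz": (343806, 302016),
    "w133hz": (103799, 98510),
    "w1280hz": (385411, 18717),
}


def null_share(with_mem: int, without_mem: int) -> float:
    """Fraction of accounting records that carry no memory data."""
    return without_mem / (with_mem + without_mem)


if __name__ == "__main__":
    for name, (with_mem, without_mem) in counts.items():
        total = with_mem + without_mem
        share = null_share(with_mem, without_mem)
        print(f"{name:8s} n={total:7d} isnull share={share:.3f}")
```

The totals match the n values reported by R, and the computed shares round to the Proportion rows shown in the output.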
The issue with changing it is that if anybody actually uses getrusage(2) and acct(2) for anything, it would change behaviour and potentially be a POLA violation, so maybe some kind of flag is needed. The more fundamental question is whether the statclock approach is correct as such.

Sorry for the long mail.

Received on Wed Mar 13 2019 - 13:52:07 UTC