Re: Process accounting/timing has broken recently

From: John Baldwin <jhb_at_freebsd.org>
Date: Mon, 6 Dec 2010 13:01:13 -0500
On Monday, December 06, 2010 11:38:30 am Steve Kargl wrote:
> On Mon, Dec 06, 2010 at 09:44:03AM -0500, John Baldwin wrote:
> > On Sunday, December 05, 2010 6:18:29 pm Steve Kargl wrote:
> > > Sometime in the last 7-10 days, someone made a
> > > change that has broken process accounting/timing.
> > > 
> > > laptop:kargl[42] foreach i ( 0 1 2 3 4 5 6 7 8 9 )
> > > foreach? time ./testf
> > > foreach? end
> > > Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04
> > >        69.55 real        38.39 user        30.94 sys
> > > Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04
> > >        68.82 real        40.95 user        27.60 sys
> > > 
> > > testf is a numerically intensive program that tests the
> > > accuracy of expf() in a tight loop.  User time varies
> > > by ~3 seconds on my lightly loaded 2 GHz core2 duo processor.
> > > I'm fairly certain that the code does not suddenly grow/lose
> > > 6 GFLOP of operations.
> > 
> > The user/sys thing is a hack (and has been).  We sample the PC at stathz (~128
> > Hz) to figure out a user vs sys split and use that to divide up the total
> > runtime (which actually is fairly accurate).  All you need is for the clock 
> > ticks to fire just a bit differently between runs to get a swing in user vs 
> > system time.
> > 
> > What I would like is to keep separate raw bintimes for user vs system time in
> > the raw data instead, but that would involve checking the CPU ticker more 
> > often (e.g. twice for each syscall, interrupt, and trap in addition to the 
> > current once per context switch).  So far folks seem to be more worried about
> > the extra overhead than the loss of accuracy.
> > 
> 
> John,
> 
> Thanks for the comment.  It seems this splitting has become
> worse (for some definition of worse) in that previously the
> user time variation was on the order of a tenth of a second, not
> seconds.  In thinking about the issue, I recalled that some
> changes to npx.c were committed 10 days ago.  Perhaps, there
> is slightly more context switch overhead in dealing with the
> FPU registers, and this has increased the sys time.

Hmm, I wonder if the eventtimer stuff that has gone into HEAD recently could
be a factor?  It might change when statclock() is called.
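
To illustrate why the timing of the ticks matters, here is a minimal sketch
of the sampling-based split described above.  It is not the kernel's code:
the function names and the synthetic workload (a fixed user/sys pattern that
repeats exactly once per tick period) are assumptions made up for the
example, and the workload is a deliberately contrived worst case.  The total
runtime is taken as exact and only the tick phase changes between runs.

```python
STATHZ = 128  # approximate statclock rate quoted above


def split_runtime(total_runtime, user_ticks, sys_ticks):
    """Divide an accurately measured runtime by the ratio of sampled ticks."""
    ticks = user_ticks + sys_ticks
    if ticks == 0:
        return total_runtime, 0.0
    user = total_runtime * user_ticks / ticks
    return user, total_runtime - user


def sample_ticks(total_runtime, period, user_fraction, phase):
    """Count ticks that land in user vs. system mode.

    Hypothetical workload: it repeats every `period` seconds and spends the
    first `user_fraction` of each period in user mode, the rest in the
    kernel.  `phase` is the time of the first tick.
    """
    user_ticks = sys_ticks = 0
    n = 0
    while True:
        t = phase + n / STATHZ
        if t >= total_runtime:
            break
        if (t % period) < user_fraction * period:
            user_ticks += 1
        else:
            sys_ticks += 1
        n += 1
    return user_ticks, sys_ticks


if __name__ == "__main__":
    total = 10.0
    # Same workload, same total runtime, two different tick phases.
    for phase in (0.001, 0.005):
        ticks = sample_ticks(total, 1 / STATHZ, 0.5, phase)
        user, system = split_runtime(total, *ticks)
        print(f"phase={phase}: user={user:.2f} sys={system:.2f}")
```

In this sketch user + sys always adds up to the measured total, but how the
total is divided depends entirely on where the ticks happen to fall.  A real
workload would not line up with the tick period this neatly, so the swing
would be smaller.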

-- 
John Baldwin
Received on Mon Dec 06 2010 - 17:04:07 UTC