Re: what is fsck's "slowdown"?

From: Dan Nelson <dnelson_at_allantgroup.com>
Date: Sat, 4 Sep 2004 15:20:33 -0500

In the last episode (Sep 04), Matthew Dillon said:
> :This sort of thing was my initial thought, but the posted CPU usage
> :statistics show that fsck is burning up most of its CPU cycles in
> :userland.
> :
> :>> load: 0.99  cmd: fsck 67 [running] 15192.26u 142.30s 99% 184284k
> :Increasing MAXBUFSPACE looks like it would make the problem worse
> :because getdatablk() does a linear search.
> 
> Oh my. I didn't even notice.  That code dates all the way back to
> 1994 so I won't bash the author too badly, but it is pretty awful
> coding.
> 
> Hashing the buffer cache is trivial.  I'll do it for DragonFly and
> post the patch as a template for you guys to do it in FreeBSD (or you
> could just do it on your own, it really does look trivial).

In my tests, the lookup time for the cache was basically zero, and
prefetching disk blocks helped much more.  This is on mainly-static
filesystems under 80 GB holding lots of small files (ports and other CVS
trees, CVS repos, etc).  The hardest part was dealing with the fact that
the bp list doesn't cache disk blocks but arbitrary-sized objects,
any of which may be dirtied by a later write.  If you prefetch N bytes,
you need to add N/objectsize separate entries to the cache.  Simply
instrumenting getdatablk to print the offset and size of each read,
plus whether it was a cache hit, should generate all the data you need
to determine what kind of optimizations are useful.
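
To make that concrete, here is a rough sketch of the two ideas: splitting
a prefetch of N bytes into N/objectsize cache entries, and logging the
offset, size, and hit/miss status of every read.  It is a toy model, not
fsck's code: every name in it (toy_entry, toy_cache, toy_lookup,
toy_insert, toy_prefetch, toy_getblk, TOY_CACHE_MAX) is made up for
illustration, no disk I/O is done, and the real getdatablk and bp list
look different.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical toy cache; NOT fsck's real buffer list. */
#define TOY_CACHE_MAX 64

struct toy_entry {
    long long offset;   /* byte offset of the cached object */
    long size;          /* size of the cached object in bytes */
};

static struct toy_entry toy_cache[TOY_CACHE_MAX];
static int toy_count;
static long toy_hits, toy_misses;

/* Linear search, like the lookup style discussed in this thread. */
static int toy_lookup(long long offset, long size)
{
    for (int i = 0; i < toy_count; i++)
        if (toy_cache[i].offset == offset && toy_cache[i].size == size)
            return 1;
    return 0;
}

/* Add one entry; silently drops it when the toy cache is full. */
static void toy_insert(long long offset, long size)
{
    if (toy_count < TOY_CACHE_MAX) {
        toy_cache[toy_count].offset = offset;
        toy_cache[toy_count].size = size;
        toy_count++;
    }
}

/*
 * Model a prefetch of nbytes at offset: the cache holds objects, not
 * raw disk blocks, so one entry is added per objsize-sized object.
 * Returns the number of entries actually added.
 */
static int toy_prefetch(long long offset, long nbytes, long objsize)
{
    assert(objsize > 0);
    int added = 0;
    for (long done = 0; done + objsize <= nbytes; done += objsize) {
        if (!toy_lookup(offset + done, objsize) && toy_count < TOY_CACHE_MAX) {
            toy_insert(offset + done, objsize);
            added++;
        }
    }
    return added;
}

/*
 * Stand-in for an instrumented getdatablk(): record whether the request
 * was a cache hit and, if trace is non-NULL, print one line per read.
 */
static int toy_getblk(long long offset, long size, FILE *trace)
{
    int hit = toy_lookup(offset, size);
    if (hit) {
        toy_hits++;
    } else {
        toy_misses++;
        toy_insert(offset, size);
    }
    if (trace != NULL)
        fprintf(trace, "read off=%lld size=%ld %s\n",
                offset, size, hit ? "hit" : "miss");
    return hit;
}
```

Calling toy_getblk(offset, size, stderr) for each read gives one trace
line per request; sorting or counting those lines should show whether
misses cluster at sequential offsets (where prefetching would help) or
whether the lookup itself is the cost.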

-- 
	Dan Nelson
	dnelson_at_allantgroup.com