Re: HEADS UP! KSE needs more attention

From: Peter Wemm <peter_at_wemm.org>
Date: Mon, 7 Jun 2004 09:26:11 -0700

On Monday 07 June 2004 07:33 am, Thomas Moestl wrote:
> On Sun, 2004/06/06 at 14:59:21 -0700, Kris Kennaway wrote:
> > On Sun, Jun 06, 2004 at 03:49:13PM -0600, Scott Long wrote:
> > > amd64 is approaching critical mass for tier-1.  There are a
> > > number of developers that own amd64 hardware now, and a number of
> > > users who are asking about it on the mailing lists.  Peter is
> > > finishing up the last blocking item for it (kld's) not including
> > > the observed KSE problems. It's very close and I _will_ hold up
> > > the release for it to get done. amd64 is the future of commodity
> > > computing and we aren't going to ignore it for 5-STABLE.
> >
> > amd64 has a bug with swapping - when something begins to access
> > swap, the entire system becomes almost entirely unresponsive (e.g.
> > no mouse response for up to 10 seconds) until it stops.  Peter has
> > some ideas about it, but it's a serious enough bug that it forced
> > me to stop using amd64 as my desktop machine (hello, kde!).
>
> Hmmm, I have encountered a similar problem on sparc64 once; the
> reason was that vm_pageout_map_deactivate_pages() calls
> pmap_remove() for the range from the start to the end of the
> process's vm_map when a process is swapped out. Start and end
> are VM_MIN_ADDRESS and VM_MAXUSER_ADDRESS respectively, and on
> 64-bit architectures, that range is very large (128TB on ia64
> if I'm not mistaken), so the iteration in pmap_remove() must
> be carefully designed to make as large steps as possible to
> avoid long run times (or to not iterate over the range at all
> if it becomes too large, which we did on sparc64).
>
> It seems that the amd64 version of pmap_remove() will essentially
> always iterate in 2MB (level 2 page table) steps, regardless of
> whether there is a mapping for the respective level 2 table in the
> table levels above; that means that in the previously mentioned case,
> the outer loop will usually run for about 67 million iterations (the
> resident count guard may not be of much use here if a stack page is
> left at the very end of the address space). Since there are a few
> memory accesses needed in each iteration, that may already be the
> cause of such a delay.

You know, this sounds spot-on!  Thanks for the tip!
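
For illustration only, here is a rough user-space sketch of the arithmetic
in the quoted analysis. It is not the amd64 pmap_remove() code. It assumes
a 128TB (2^47 byte) user range and the usual amd64 4-level paging sizes
(2MB, 1GB and 512GB per entry at levels 2, 3 and 4). The names mapped,
region_populated, next_region, count_fixed_steps and count_skipping_steps
are all hypothetical, and the "page tables" are just two made-up addresses:
one near the bottom and one stack page at the very top.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes covered by one entry at each level of amd64 4-level paging. */
#define PDE_SHIFT   21  /* 2MB   per level 2 entry */
#define PDPE_SHIFT  30  /* 1GB   per level 3 entry */
#define PML4E_SHIFT 39  /* 512GB per level 4 entry */

/* Assumed user range for this sketch: 0 .. 128TB. */
#define USER_END (UINT64_C(1) << 47)

/* Hypothetical stand-in for the page tables: two mapped addresses. */
static const uint64_t mapped[] = {
    UINT64_C(0x400000),             /* something near the bottom */
    USER_END - UINT64_C(0x1000),    /* a stack page at the very top */
};
#define NMAPPED (sizeof(mapped) / sizeof(mapped[0]))

/* Does the aligned 2^shift region containing va hold any mapping? */
static int
region_populated(uint64_t va, unsigned shift)
{
    for (size_t i = 0; i < NMAPPED; i++)
        if ((mapped[i] >> shift) == (va >> shift))
            return 1;
    return 0;
}

/* Start of the next aligned 2^shift region after the one containing va. */
static uint64_t
next_region(uint64_t va, unsigned shift)
{
    assert(shift < 64);
    return ((va >> shift) + 1) << shift;
}

/* Always advance by 2MB, regardless of what the upper levels say. */
uint64_t
count_fixed_steps(void)
{
    uint64_t n = 0;

    for (uint64_t va = 0; va < USER_END; va = next_region(va, PDE_SHIFT))
        n++;
    return n;
}

/* Take the largest step that an empty upper-level entry allows. */
uint64_t
count_skipping_steps(void)
{
    uint64_t n = 0;
    uint64_t va = 0;

    while (va < USER_END) {
        n++;
        if (!region_populated(va, PML4E_SHIFT))
            va = next_region(va, PML4E_SHIFT);  /* skip 512GB */
        else if (!region_populated(va, PDPE_SHIFT))
            va = next_region(va, PDPE_SHIFT);   /* skip 1GB */
        else
            va = next_region(va, PDE_SHIFT);    /* walk 2MB */
    }
    return n;
}
```

With these assumptions the fixed 2MB stride takes 2^47 / 2^21 = 67,108,864
iterations, which matches the "about 67 million" figure above, while the
skipping variant takes 2,300 for the same two mappings.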

-- 
Peter Wemm - peter_at_wemm.org; peter_at_FreeBSD.org; peter_at_yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5