Re: int/long confusion with maxbcache and maxswzone (fixes 6.0 on >12GB machines)

From: Max Laier <max_at_love2party.net>
Date: Wed, 12 Oct 2005 12:36:50 +0200
On Tuesday 11 October 2005 23:38, Kris Kennaway wrote:
> A few weeks ago I reported that bufinit() on sparc64 machines with >12GB
> of RAM goes into an infinite loop because of a 32-bit integer counter
> overflowing.  On 5.x it was possible to work around this with the
> kern.maxbcache tunable, but this didn't work on 6.0 or above.
>
> It turns out the problem began here:
>
> ----
> Revision 1.67, Mon Nov 8 18:20:02 2004 UTC (11 months ago) by des
> Branch: MAIN
> Changes since 1.66: +17 -17 lines
>
> #include <vm/vm_param.h> instead of <machine/vmparam.h> (the former
> includes the latter, but also declares variables which are defined
> in kern/subr_param.c).
>
> Change some VM parameters from quad_t to unsigned long.  They refer to
> quantities (size limits for text, heap and stack segments) which must
> necessarily be smaller than the size of the address space, so long is
> adequate on all platforms.
>
> MFC after:	1 week
> ----
>
> which contained:
>
> -int	maxswzone;			/* max swmeta KVA storage */
> -int	maxbcache;			/* max buffer cache KVA storage */
> +long	maxswzone;			/* max swmeta KVA storage */
> +long	maxbcache;			/* max buffer cache KVA storage */
>
> However, des forgot to change the other definition of maxbcache in
> <sys/buf.h>:
>
> extern int      maxbcache;              /* Max KVA for buffer cache */
>
> In fact, it's a good thing he didn't.  On sparc64 if you make that
> variable a long it causes 32-bit integer overflows elsewhere, which
> lead to severe filesystem damage on systems with >12GB RAM.  With the
> above bug this is reduced to a hang at boot.
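
For illustration, here is a minimal sketch of what a mismatched
"extern int" declaration can end up reading when the object is really
defined as a long.  It assumes an LP64 ABI.  read_long_as_int() and
is_big_endian() are hypothetical helpers written for this note, not
kernel code, and the real mismatch is undefined behaviour that the
sketch only imitates by copying the first sizeof(int) bytes.  sparc64
is big-endian, so there the narrow read lands on the upper half of the
value, which may be why a modest kern.maxbcache setting had no effect
on 6.0.

```c
#include <string.h>

/*
 * Imitate an int-sized load from the address of a long object by
 * copying only the first sizeof(int) bytes of its representation.
 */
static int read_long_as_int(long value)
{
	int narrow;

	memcpy(&narrow, &value, sizeof(narrow));
	return narrow;
}

/* Return 1 if this machine stores the most significant byte first. */
static int is_big_endian(void)
{
	unsigned int probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 0;
}
```

On a big-endian LP64 machine read_long_as_int(200L * 1024 * 1024)
returns 0.  On a little-endian machine it returns 209715200, which
would hide the problem.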

Isn't it enough to introduce the maximum values below?  I imagine that the
ultimate goal is to get rid of the constraints, which will be easier if we
already have enough bits.

> The hang is because maxbcache is not capped to a maximum value on
> sparc64, and a loop termination condition never occurs because of a
> 32-bit integer overflow.  On amd64 it's capped to
>
> /*
>  * Ceiling on size of buffer cache (really only effects write queueing,
>  * the VM page cache is not effected), can be changed via
>  * the kern.maxbcache /boot/loader.conf variable.
>  */
> #ifndef VM_BCACHE_SIZE_MAX
> #define VM_BCACHE_SIZE_MAX      (400 * 1024 * 1024)
> #endif
>
> so large-memory amd64 systems never see it.  ia64 and ppc would also
> hang at boot with >12GB, I think.
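
The failure mode described above can be sketched in user space.  This
is not the bufinit() code: wrap32(), shrink_steps(), BLOCK_SIZE,
MAX_STEPS and the sample numbers are assumptions made for
illustration.  The idea is that a size computed in 32 bits wraps
negative, and a loop that halves a counter until it fits under that
size can then never satisfy its exit test.

```c
#include <stdint.h>

#define BLOCK_SIZE 16384	/* hypothetical per-buffer size in bytes */
#define MAX_STEPS  64		/* safety cap so the demo itself cannot hang */

/* Truncate a 64-bit value to 32 bits with two's-complement wraparound. */
static int32_t wrap32(int64_t v)
{
	uint32_t u = (uint32_t)v;

	if (u >= 0x80000000u)
		return (int32_t)((int64_t)u - 4294967296LL);
	return (int32_t)u;
}

/*
 * Halve count until count * BLOCK_SIZE no longer exceeds 3/4 of limit.
 * Returns the number of halvings, or -1 if MAX_STEPS was exceeded,
 * which stands in for "loops forever".
 */
static int shrink_steps(int64_t count, int64_t limit)
{
	int steps = 0;

	while (count * BLOCK_SIZE > 3 * limit / 4) {
		if (++steps > MAX_STEPS)
			return -1;
		count >>= 1;
	}
	return steps;
}
```

With a hypothetical 200000 buffers, the 64-bit product
200000 * 16384 = 3276800000 exceeds INT32_MAX and wrap32() yields
-1018167296.  shrink_steps(1000, 3276800000LL) returns 0, while
shrink_steps(1000, wrap32(3276800000LL)) hits the cap and returns -1:
once count reaches 0, the test "0 > negative limit" stays true.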
>
> On 5.x, the same hang exists, but you can work around it with the
> tunable.  This tunable was broken by the long/int mismatch on 6.0, so
> sparc64 systems with >12GB were unusable.
>
> This patch reverts the above int->long change, and adds definitions
> for VM_BCACHE_SIZE_MAX and VM_SWZONE_SIZE_MAX on sparc64 copied from
> amd64.  Actually, they should probably be added on other architectures
> too (ia64, ppc).
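
As a rough sketch of why a ceiling helps, the helper below clamps a
buffer count so that the total size stays under a maximum.
capped_nbuf() and BKVA_SIZE (16384 here) are hypothetical names and
values chosen for this note.  Only VM_BCACHE_SIZE_MAX and its 400 MB
value come from the amd64 snippet quoted above.  Treating a cap of 0
as "no limit" is an assumption of the sketch.

```c
#include <stdint.h>

#ifndef VM_BCACHE_SIZE_MAX
#define VM_BCACHE_SIZE_MAX	(400 * 1024 * 1024)
#endif

#define BKVA_SIZE 16384		/* hypothetical per-buffer size in bytes */

/*
 * Clamp nbuf so that nbuf * BKVA_SIZE does not exceed max_bytes.
 * A max_bytes of 0 (or less) leaves nbuf unchanged.
 */
static int64_t capped_nbuf(int64_t nbuf, int64_t max_bytes)
{
	if (max_bytes > 0 && nbuf > max_bytes / BKVA_SIZE)
		nbuf = max_bytes / BKVA_SIZE;
	return nbuf;
}
```

capped_nbuf(200000, VM_BCACHE_SIZE_MAX) returns 25600, and
25600 * 16384 = 419430400 fits comfortably in a signed 32-bit integer,
so later 32-bit arithmetic on the total has room to spare.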
>
> Can someone please review?

-- 
/"\  Best regards,                      | mlaier_at_freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier_at_EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News

Received on Wed Oct 12 2005 - 08:36:27 UTC
