Re: 5.2-RC: odd df(1) output

From: Bruce Evans <bde_at_zeta.org.au> Date: Wed, 10 Dec 2003 18:15:38 +1100 (EST) · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:33 UTC

On Wed, 10 Dec 2003, Mitsuru IWASAKI wrote:

> Hi, I've found a odd df(1) output when filesystem available space
> is negative value.
>
> % df -k /usr
> Filesystem  1K-blocks     Used  Avail Capacity  Mounted on
> /dev/ad0s1g  18084782 16722518 -84518   101%    /usr
> % df -m /usr
> Filesystem  1M-blocks  Used             Avail Capacity  Mounted on
> /dev/ad0s1g     17660 16329 36028797018963886   101%    /usr
>
> It seems that this happen in case df block size is bigger than
> fs block size(eg. df -m on 2048-byte block filesystem).
> And we can reproduce the problem with simple program, such as:

Sigh.  This sign extension morass used to be avoided by not using
unsigned types in struct statfs even when the types were smaller
so that the extra bit for unsigned types would have been more
useful.

> Quick fix is attached below.  Thanks
>
> Index: df.c
> ===================================================================
> RCS file: /home/ncvs/src/bin/df/df.c,v
> retrieving revision 1.53
> diff -u -r1.53 df.c
> --- df.c	12 Nov 2003 21:47:42 -0000	1.53
> +++ df.c	10 Dec 2003 05:26:48 -0000
> _at__at_ -398,9 +398,15 _at__at_
>   * Convert statfs returned file system size into BLOCKSIZE units.
>   * Attempts to avoid overflow for large file systems.
>   */
> -#define fsbtoblk(num, fsbs, bs) \
> -	(((fsbs) != 0 && (fsbs) < (bs)) ? \
> -		(num) / ((bs) / (fsbs)) : (num) * ((fsbs) / (bs)))

The macro was supposed to be type-generic.  If fed signed or unsigned args
then it gives a signed or unsigned result, respectively, and types are
promoted naturally too.  The bug was feeding it a mixture of signed and
unsigned args of the same size, so that it gives an unsigned result even
when its block count arg is signed; worse, it converts the signed block
count arg to unsigned before scaling it, so small negative counts become
large positive ones.

The attempt to avoid overflow has also rotted.  "Large" file systems in
this context are now ones larger than 2^63-1 bytes since the type of
`num' is int64_t or uint64_t, but such file systems are not supported
since off_t is int64_t on all supported machines.

> +static intmax_t
> +fsbtoblk(int64_t num, uint64_t fsbs, u_long bs)
> +{
> +	if (fsbs != 0 && fsbs < bs) {
> +		return (intmax_t)(num / (intmax_t)(bs / fsbs));
> +	} else {
> +		return (intmax_t)(num * (fsbs / bs));
> +	}
> +}

This is no longer type-generic.  It does the following ugly conversions:
- if the count (num) starts as an uint64_t, it is converted to an intmax_t
  divided as an intmax_t and returned as an intmax_t.  If intmax_t is
  int64_t, as it is on all supported machines, then counts larger than
  INT64_MAX won't actually work.  Not that they are likely to occur.
  This shows that it was is foot-shooting to use an unsigned type
  for block counts.
- the fs block size (fsbs) starts as a uint64_t and is used as an intmax_t.
  If intmax_t is int64_t, then block sizes larger than INTMAX_T won't
  actually work.  There are impossible anyway.  This shows that it is just
  foot-shooting to use an unsigned type for block sizes.

The unsigned type for block size is the primary cause of the bug here.
If it were signed like it used to be, then it would be neutral in the
type-generic macro (provided bs is also signed).  The `num' arg could
have any type or size relative to the (signed) block size types and
things would be promoted naturally enough to give the correct result.
The macro can be fixed by just casting its bogus size ratio type:

#define fsbtoblk(num, fsbs, bs) \
	(((fsbs) != 0 && (fsbs) < (bs)) ? \
	    (num) / (intmax_t)((bs) / (fsbs)) : \
	    (num) * (intmax_t)((fsbs) / (bs)))

exceept this might cause warnings about sign mismatches when `num' is
unsigned.

Simpler version assuming that file system sizes in bytes fit in an off_t:

#define fsbtoblk(num, fsbs, bs)	((off_t)(num) * (off_t)(fsbs) / (off_t)(bs))

This has essentially the same lack of type-genericness that I don't like
in your version :-).

Bruce