Re: 64-bit inodes (ino64) Status Update and Call for Testing

From: Conrad Meyer <cem_at_freebsd.org>
Date: Wed, 24 May 2017 20:49:04 -0700
Hi Jilles,

Thanks for bringing this up.  And of course, thanks to kib_at_ for
including the d_namlen size bump and for his work in driving the rest
of this change through to completion.

On Sun, May 21, 2017 at 5:14 AM, Jilles Tjoelker <jilles_at_stack.nl> wrote:
> We have another type in this area which is too small in some situations:
> uint8_t for struct dirent.d_namlen. For filesystems that store filenames
> as upto 255 UTF-16 code units, the name to be stored in d_name may be
> upto 765 bytes long in UTF-8. This was reported in PR 204643. The code
> currently handles this by returning the short (8.3) name, but this name
> may not be present or usable, leaving the file inaccessible.

We've been working to add such support to our FreeBSD-derivative
product.  A big piece of it is expanding d_namlen out to 16 bits.
We've also been trying to divorce system-wide constants like MAXNAMLEN
/ NAME_MAX and MAXPATHLEN / PATH_MAX from filesystem-specific
limitations (UFS' limit of 255 bytes).  And push that upstream when
possible, e.g., r313475, r316509.

Bumping d_namlen in FreeBSD reduces the amount of ABI breakage we have
to introduce in our product relative to FreeBSD, and leaves open the
possibility of supporting 255-unicode-character filesystems natively
in FreeBSD down the road.

> Actually allowing longer names seems too complicated to add to the ino64
> change, but changing d_namlen to uint16_t (using d_pad0 space) and
> skipping entries with d_namlen > 255 in libc may be helpful.
>
> Note that applications using the deprecated readdir_r() will not be able
> to read such long names, since the API does not allow specifying that a
> larger buffer has been provided. (This could be avoided by making struct
> dirent.d_name 766 bytes long instead of 256.)

We're looking at 255 Unicode code points, which can be 4 bytes a piece
in UTF8, or 1020 bytes potentially.

> Unfortunately, the existence of readdir_r() also prevents changing
> struct dirent.d_name to the more correct flexible array.

Best,
Conrad
Received on Thu May 25 2017 - 01:55:37 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:11 UTC