Re: 64-bit inodes (ino64) Status Update and Call for Testing

From: Jilles Tjoelker <jilles_at_stack.nl>
Date: Sun, 21 May 2017 14:14:56 +0200

On Thu, Apr 20, 2017 at 10:43:14PM +0300, Konstantin Belousov wrote:
> Inodes are data structures corresponding to objects in a file system,
> such as files and directories. FreeBSD has historically used 32-bit
> values to identify inodes, which limits file systems to somewhat under
> 2^32 objects. Many modern file systems internally use 64-bit identifiers
> and FreeBSD needs to follow suit to properly and fully support these
> file systems.

> The 64-bit inode project, also known as ino64, started life many years
> ago as a project by Gleb Kurtsou (gleb_at_).  After that time several
> people have had a hand in updating it and addressing regressions, after
> mckusick_at_ picked up and updated the patch, and acted as a flag-waver.

> Sponsored by the FreeBSD Foundation I have spent a significant effort
> on outstanding issues and integration -- fixing compat32 ABI, NFS and
> ZFS, addressing ABI compat issues and investigating and fixing ports
> failures.  rmacklem_at_ provided feedback on NFS changes, emaste_at_ and
> jhb_at_ provided feedback and review on the ABI transition support. pho_at_
> performed extensive testing and identified a number of issues that
> have now been fixed.  kris_at_ performed an initial ports investigation
> followed by an exp-run by antoine_at_. emaste_at_ helped with organization
> of the process.

> This note explains how to perform useful testing of the ino64 branch,
> beyond typical smoke tests.

> 1. Overview.

> The ino64 branch extends the basic system types ino_t and dev_t from
> 32-bit to 64-bit, and nlink_t from 16-bit to 64-bit.  The struct dirent
> layout is modified due to the larger size of ino_t, and also gains a
> d_off (directory offset) member. As ino64 implies an ABI change anyway
> the struct statfs f_mntfromname[] and f_mntonname[] array length
> MNAMELEN is increased from 88 to 1024, to allow for longer mount path
> names.

> ABI breakage is mitigated by providing compatibility using versioned
> symbols, ingenious use of the existing padding in structures, and by
> employing other tricks.  Unfortunately, not everything can be fixed,
> especially outside the base system.  For instance, third-party APIs
> which pass struct stat around are broken in a backward- and
> forward-incompatible way.

We have another type in this area which is too small in some situations:
uint8_t for struct dirent.d_namlen. For filesystems that store filenames
as up to 255 UTF-16 code units, the name to be stored in d_name may be
up to 765 bytes long in UTF-8. This was reported in PR 204643. The code
currently handles this by returning the short (8.3) name, but this name
may not be present or usable, leaving the file inaccessible.
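
To make the 765 figure concrete: a UTF-16 code unit in the BMP needs at
most 3 bytes in UTF-8, and a surrogate pair (2 code units) needs 4, so
the worst case is 3 bytes per code unit, or 255 * 3 = 765. The sketch
below only illustrates that arithmetic; the function names are made up
for this mail and are not taken from any file system code.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one Unicode code point in UTF-8. */
static size_t
utf8_len(uint32_t cp)
{
	if (cp < 0x80)
		return (1);
	if (cp < 0x800)
		return (2);
	if (cp < 0x10000)
		return (3);
	return (4);
}

/*
 * UTF-8 length of a UTF-16 string of n code units.  A valid surrogate
 * pair is combined into one code point; an unpaired surrogate is
 * counted like any other BMP code unit (3 bytes) in this sketch.
 */
static size_t
utf16_to_utf8_len(const uint16_t *s, size_t n)
{
	size_t i, len = 0;

	for (i = 0; i < n; i++) {
		uint32_t cp = s[i];

		if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < n &&
		    s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
			cp = 0x10000 + ((cp - 0xD800) << 10) +
			    (s[i + 1] - 0xDC00);
			i++;	/* consumed the low surrogate too */
		}
		len += utf8_len(cp);
	}
	return (len);
}
```

A name made of 255 copies of U+20AC (the euro sign) hits the worst
case, while a 255-character ASCII name still fits in today's d_name.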

Actually allowing longer names seems too complicated to add to the ino64
change, but changing d_namlen to uint16_t (using d_pad0 space) and
skipping entries with d_namlen > 255 in libc may be helpful.
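
Roughly what I have in mind for the libc side, as a minimal sketch: the
record header below is hypothetical (it is not the real struct dirent
layout), and next_short_entry() is an invented helper, not an existing
libc function. It walks a buffer of variable-length records and hides
any entry whose widened d_namlen exceeds the legacy limit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	LEGACY_NAME_MAX	255

/* Hypothetical record header; NOT the real struct dirent layout. */
struct xdirent_hdr {
	uint64_t d_fileno;
	uint16_t d_reclen;	/* total length of this record in bytes */
	uint16_t d_namlen;	/* widened from uint8_t in this sketch */
};

/*
 * Return the offset of the next record at or after *pos whose name
 * fits the legacy limit, and advance *pos past it.  Records with
 * longer names are skipped.  Returns (size_t)-1 at the end of the
 * buffer or on a malformed record.
 */
static size_t
next_short_entry(const unsigned char *buf, size_t buflen, size_t *pos)
{
	struct xdirent_hdr h;

	while (*pos + sizeof(h) <= buflen) {
		size_t off = *pos;

		memcpy(&h, buf + off, sizeof(h));
		if (h.d_reclen < sizeof(h) || h.d_reclen > buflen - off)
			break;		/* malformed record */
		*pos = off + h.d_reclen;
		if (h.d_namlen <= LEGACY_NAME_MAX)
			return (off);
	}
	return ((size_t)-1);
}
```

The point of skipping rather than truncating is that a truncated name
would not refer to any existing file.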

Note that applications using the deprecated readdir_r() will not be able
to read such long names, since the API does not allow specifying that a
larger buffer has been provided. (This could be avoided by making struct
dirent.d_name 766 bytes long instead of 256.)

Unfortunately, the existence of readdir_r() also prevents changing
struct dirent.d_name to the more correct flexible array.
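
For illustration, a flexible-array layout could look like the sketch
below. The struct and the fdirent_new() helper are hypothetical and
only show the allocation pattern. Every entry has to be allocated with
room for its own name, which is exactly what a readdir_r() caller that
simply declares a struct dirent variable does not do.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout for illustration; not FreeBSD's struct dirent. */
struct fdirent {
	uint64_t d_fileno;
	uint16_t d_namlen;
	char	 d_name[];	/* flexible array, sized at allocation */
};

/*
 * Allocate an entry exactly large enough for the name and its
 * terminating NUL.  Returns NULL if the name does not fit in d_namlen
 * or if the allocation fails.  The caller frees the result.
 */
static struct fdirent *
fdirent_new(uint64_t fileno, const char *name)
{
	size_t len = strlen(name);
	struct fdirent *e;

	if (len > UINT16_MAX)
		return (NULL);
	e = malloc(offsetof(struct fdirent, d_name) + len + 1);
	if (e == NULL)
		return (NULL);
	e->d_fileno = fileno;
	e->d_namlen = (uint16_t)len;
	memcpy(e->d_name, name, len + 1);	/* copies the NUL as well */
	return (e);
}
```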

-- 
Jilles Tjoelker
Received on Sun May 21 2017 - 10:15:08 UTC