[Moved to -current for further discussion.] David Schultz wrote: >Tim Kientzle wrote: >>Pawel Jakub Dawidek wrote: >> >>> Introduce new field 'fts_bignum' ... >>> This work is part of the BigDisk project: >>> http://www.FreeBSD.org/projects/bigdisk/ >> >>Any plans to deal with other fts limits ... ? > > Removing FTS_LOGICAL's path length limitation is nontrivial, but > given your experience with bsdtar, I'd say you're an ideal person > to do it. ;-) As it happens, I started tinkering with these ideas a while ago but haven't found time to finish it all. Here's a snapshot of the current WIP: http://people.freebsd.org/~kientzle/libarchive/src/tree.tgz This includes a simple main() test function that just does the rough equivalent of "find .". > .. fts() effectively uses chdir("..") to > climb up one level in the tree in chdir mode. If it chdir'd > across a symlink earlier, this would put it in the wrong place. > Perhaps you have a better solution, but my best idea is to keep > the parent directory open when chdiring ... The tree package does exactly this. It just keeps a flag with each ancestor node indicating the type of traversal that's needed. > A more uniform approach would be to ... always keep the > parent open when descending a level. Unfortunately, this limits > the traversal depth to the number of available file descriptors. The tree package does not do this for exactly this reason. I have some ideas for handling the case where the number of symlinks exceeds the available file descriptors, but that doesn't seem particularly pressing at the moment. <grin> > (On the other hand, I would argue that anyone with a directory > tree a few thousand levels deep is asking for trouble.) I thought you *wanted* to support big disks! ;-) I started this work partly because I wanted to really stress the long pathname support in bsdtar. I've archived directory trees with megabyte pathnames (several thousand directory levels crossing several hundred symlinks) in testing. Of course, I can't yet *dearchive* such deep trees. That seems to be a harder problem. The tree package also keeps a *lot* less data in memory than fts. It has no trouble with million-entry directories, for example. In comparison, the current ls crashes on such large directories in part because of the memory required for fts. The tree package is quite a bit different in many respects: * The traversal is in a very different order, for instance. * It has a completely different API than fts. It's fully opaque (so should be easier to change in the future, unlike fts). * It takes a very different approach to determine when to visit a child. In particular, instead of the client specifying a mode and optionally inhibiting the descent through a "prune" request, the tree package has the client specifically request descent. If you request descent for *every* item, you'll end up with a logical traversal; if you request descent for every dir, you'll end up with a physical traversal. (The tree package is smart enough to ignore any descent request that isn't for a dir or a link to a dir.) Feedback, suggestions, etc. appreciated. TimReceived on Thu Jan 13 2005 - 23:48:52 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:26 UTC