> On Wed, Jul 21, 2004 at 05:14:27PM +0200, Daniel Lang wrote:
> > Hi,
> >
> > Jan Grant wrote on Wed, Jul 21, 2004 at 02:44:42PM +0100:
> > [..]
> > > You're correct, in that filesystem semantics don't require an
> > > archiver to recreate holes. There are storage efficiency gains to
> > > be made in identifying holes, that's true - particularly in the
> > > case of absolutely whopping but extremely sparse files. In those
> > > cases, a simple userland-view-of-the-filesystem-semantics approach
> > > to identifying areas that _might_ be holes (just for archive
> > > efficiency) can still be expensive and might involve scanning
> > > multiple gigabytes of "virtual" zeroes.
> > >
> > > Solaris offers an fcntl to identify holes (IIRC) for just this
> > > purpose. If the underlying filesystem can't be made to support it,
> > > there's an efficiency loss but otherwise it's no great shakes.
> >
> > I don't get it.
> >
> > I assume that for any consumer it is totally transparent whether
> > possibly existing chunks of 0-bytes are actually blocks full of
> > zeroes or just non-allocated blocks, correct?
> >
> > Second, it is true that there is a gain in terms of occupied disk
> > space if chunks of zeroes are not allocated at all, correct?
> >
> > So, from my point of view, if a sparse file is archived and then
> > extracted, it is totally irrelevant whether the areas that contain
> > zeroes end up as unallocated blocks in exactly the same manner
> > or not.
> >
> > So, all I guess an archiver must do is:
> >
> > - read the file
> > - scan the file for consecutive blocks of zeroes
> > - archive these blocks in an efficient way
> > - on extraction, create a sparse file with the previously
> >   identified empty blocks, regardless of whether these blocks
> >   were 'sparse' blocks in the original file or not
> >
> > I do not see why it is important whether the original file was
> > sparse at all, or sparse in different places.
>
> Since sparse files overcommit the disk, they should only be created
> deliberately. Otherwise you can easily get into trouble if you try to
> use reserved space later, since it won't actually be reserved.
> Consider the case of a file system image created with
> "dd if=/dev/zero ...; newfs ...". If your archiver decides to be
> "smart" and restore a copy of that file sparse, and you then use up
> the available blocks on your disk, you're going to be in a world of
> hurt. I wouldn't be surprised if that resulted in a panic.

If the file has 'holes' and they are read as zero, then doesn't
compressing the tar file nicely reduce it?

    dd if=/dev/zero of=junk count=100
    tar czf junk.tar.gz junk
    ls -ls junk*
    50 -rw-r--r--  1 danny  wheel  51200 Jul 22 10:28 junk
     2 -rw-r--r--  1 danny  wheel    170 Jul 22 10:33 junk.tar.gz

danny
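The "fcntl to identify holes" Jan recalls corresponds to what Solaris
eventually shipped as the SEEK_HOLE/SEEK_DATA whence values for
lseek(2), later adopted by FreeBSD and Linux. A minimal sketch of
walking a file's data and hole extents with that interface - assuming
a platform that defines SEEK_HOLE and SEEK_DATA; on a filesystem with
no hole support the whole file simply reports as one data extent:

    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Print the data and hole extents of a file. Sketch only:
     * assumes lseek(2) supports SEEK_DATA and SEEK_HOLE.
     */
    int
    main(int argc, char **argv)
    {
        off_t off, end, data, hole;
        int fd;

        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            return 1;
        }
        if ((fd = open(argv[1], O_RDONLY)) < 0) {
            perror("open");
            return 1;
        }
        end = lseek(fd, 0, SEEK_END);

        for (off = 0; off < end; off = hole) {
            data = lseek(fd, off, SEEK_DATA);
            if (data < 0)           /* only a trailing hole remains */
                data = end;
            hole = lseek(fd, data, SEEK_HOLE);
            if (hole < 0)
                hole = end;
            if (data > off)
                printf("hole: %jd-%jd\n", (intmax_t)off, (intmax_t)data);
            if (hole > data)
                printf("data: %jd-%jd\n", (intmax_t)data, (intmax_t)hole);
        }
        close(fd);
        return 0;
    }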
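Daniel's four-step recipe needs no filesystem help at all: read the
file, compare each block against zeroes, and seek instead of write on
extraction. A minimal sketch of that idea as a sparse-preserving copy,
with an arbitrary 4 KB block size chosen for illustration:

    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define BLK 4096    /* illustrative block size, not from the thread */

    /*
     * Copy src to dst, seeking over all-zero blocks instead of
     * writing them, so the destination comes out sparse no matter
     * how the source was allocated. Pure userland.
     */
    int
    main(int argc, char **argv)
    {
        static const char zeroes[BLK];  /* implicitly all zero */
        char buf[BLK];
        ssize_t n;
        int in, out;

        if (argc != 3) {
            fprintf(stderr, "usage: %s src dst\n", argv[0]);
            return 1;
        }
        in = open(argv[1], O_RDONLY);
        out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (in < 0 || out < 0) {
            perror("open");
            return 1;
        }
        while ((n = read(in, buf, BLK)) > 0) {
            if (n == BLK && memcmp(buf, zeroes, BLK) == 0)
                lseek(out, BLK, SEEK_CUR);  /* leave a hole */
            else if (write(out, buf, n) != n) {
                perror("write");
                return 1;
            }
        }
        /* pin the final size so a trailing hole is preserved */
        ftruncate(out, lseek(out, 0, SEEK_CUR));
        close(in);
        close(out);
        return 0;
    }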
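The overcommit warning is concrete: blocks backing a hole are only
allocated when first written, so a write into a restored-sparse image
can fail with ENOSPC long after the file was created, even though its
size suggested the space was already there. A small sketch of the
failure mode ("image" and the 10 MB offset are hypothetical):

    #include <errno.h>
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Write into a region of a (hypothetical) sparse image file.
     * If the region is a hole and the disk is full, the write
     * fails with ENOSPC at write time, not at create time.
     */
    int
    main(void)
    {
        char block[512] = { 1 };
        int fd = open("image", O_WRONLY);

        if (fd < 0) {
            perror("open");
            return 1;
        }
        if (pwrite(fd, block, sizeof block, 10 * 1024 * 1024) < 0) {
            if (errno == ENOSPC)
                fprintf(stderr, "no free blocks to back the hole\n");
            else
                perror("pwrite");
            return 1;
        }
        close(fd);
        return 0;
    }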