Re: Tar problem

From: Tim Kientzle <tim_at_kientzle.com>
Date: Fri, 21 May 2004 15:29:41 -0700

Pete Carah wrote:
> On Fri, May 21, 2004 at 12:10:39PM -0700, Tim Kientzle wrote:
>>Pete Carah wrote:
>>
>>>When I unpack the ports collection, the symlink 
>>>has been overwritten with the real directory
>>
>>This is deliberate; bsdtar does essentially the
>>same thing.
> 
>>o  ... absolute pathnames.  ... bsdtar removes the leading / ...
>>o  ... pathnames that include .. components.
>>o  ... exploit symbolic links to restore files to other directories.

> GNU tar didn't do this a year or two ago...

GNU tar was just one of many archiving programs that
were recently reamed on a security forum for this
type of problem.  It has, of course, been stripping
absolute paths for a long time.

Consider, for example, a malicious FreeBSD package
that happens to include one of the following:
   * A trojaned /bin/sh (stored with an absolute path)
   * A trojaned ../../../bin/sh (stored using ..)
   * A symlink bin -> /bin  followed by a trojaned bin/sh
   * A symlink foo -> /bin/sh followed by a trojaned foo
A naive tar program would overwrite /bin/sh in any of
these cases.  Similar issues, of course, can arise with
any tar, cpio, zip, or other archiving program that tries
to properly handle symlinks.  Both gtar and bsdtar try to
provide default behavior that protects casual users from
such malicious archives.
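
To make the first two cases concrete, here is a minimal Python sketch
of the purely lexical part of such a check.  The function names are
hypothetical and this is not bsdtar's or gtar's actual code; it only
illustrates flagging absolute paths and .. components, and stripping
a leading / as described above.  The symlink cases cannot be caught
this way, since they depend on what is on disk at extraction time.

```python
def lexical_risk(member_name):
    """Return a short reason if an archive path looks unsafe, else None.

    Purely lexical: it never touches the filesystem, so it cannot
    detect the symlink-based cases.
    """
    if member_name.startswith("/"):
        return "absolute path"
    if ".." in member_name.split("/"):
        return "contains .. component"
    return None


def strip_leading_slashes(member_name):
    """Turn an absolute archive path into a relative one."""
    return member_name.lstrip("/")
```

With these definitions, lexical_risk("/bin/sh") and
lexical_risk("../../../bin/sh") both return a reason, while
lexical_risk("bin/sh") returns None even though bin may turn out to
be a symlink.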

One difficulty is that any tar program (in particular)
must treat each entry extracted as a separate operation.
So, if you create /usr/ports -> /other/ports, then
tar is going to have to make a decision about each
of the following as it extracts it:
     usr/ports (directory to be extracted)
     usr/ports/foo (file within that directory to be extracted)
Any reasonably safe handling of symlinks will destroy the
pre-existing symlink in the process of restoring
one of these.  If you know a strategy that doesn't,
please let me know.  (Preferably, a strategy that doesn't
require keeping a list of every directory and/or symlink restored
during the course of extraction.)  Keeping track of a "top-level dir"
isn't really an option, as there is no such concept within
a tar archive.  (Which is, of course, part of the problem.)
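
A rough Python sketch of what a per-entry decision might look like,
assuming a hypothetical helper first_symlink_in_path() that walks the
components of one entry's path below the extraction root.  It keeps
no state between entries, which is why it cannot tell a pre-existing
"top-level" symlink from one planted by the archive.  This is an
illustration only, not how bsdtar or gtar is implemented.

```python
import os
import tempfile


def first_symlink_in_path(root, member_name):
    """Return the first existing symlink on the way to member_name, or None.

    The final component is checked too, so an entry that would replace
    a symlink is also reported.
    """
    current = root
    for part in member_name.split("/"):
        if part in ("", "."):
            continue
        current = os.path.join(current, part)
        if os.path.islink(current):
            return current
        if not os.path.lexists(current):
            return None  # nothing on disk from here on
    return None


def demo():
    """Recreate the usr/ports -> other/ports layout in a scratch directory."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "other", "ports"))
        os.makedirs(os.path.join(root, "usr"))
        os.symlink(os.path.join(root, "other", "ports"),
                   os.path.join(root, "usr", "ports"))
        hit = first_symlink_in_path(root, "usr/ports/foo")
        return os.path.relpath(hit, root)
```

Here demo() reports usr/ports as the offending component for the
entry usr/ports/foo; what to do next (remove the symlink, or refuse
the entry) is the policy question discussed in this thread.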

The best way around this is to create the ports archive by:
    cd /usr/ports && tar cjf file.tbz *
and restore it with:
    cd /usr/ports && tar xf file.tbz
Then, the /usr/ports symlink is never inspected and you
can redirect it however you want.
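
For anyone scripting this rather than typing it, the same idea can be
sketched with Python's tarfile module.  The helper names are
hypothetical.  One difference to be aware of: the shell's * skips
dotfiles, while os.listdir() does not.  The point is only that no
entry in the archive names the top-level directory itself.

```python
import os
import tarfile


def archive_contents(directory, archive_path):
    """Archive the contents of directory, not the directory entry itself."""
    with tarfile.open(archive_path, "w:bz2") as tar:
        for name in sorted(os.listdir(directory)):
            # arcname drops the leading directory, like running tar from inside it
            tar.add(os.path.join(directory, name), arcname=name)


def member_names(archive_path):
    """List the entry names stored in the archive."""
    with tarfile.open(archive_path, "r:*") as tar:
        return tar.getnames()
```

An archive built this way contains names such as Makefile and
editors/vim/Makefile, so extraction never has to decide what to do
with an existing ports symlink.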

Similar care can handle other problems with "top-level" symlinks.

> And I may well want to accomplish the 2nd and 3rd (e.g. restore a
> complete system image with symlinks containing .. chars...)
> (ln -s ../netscape-6/netscape netscape in /usr/local/bin, for example...

bsdtar certainly does not screen symlink contents, so
this example would not be affected by the security checks.
It will restore such symlinks; it just won't restore
something else with a symlink as part of the path.
I.e., you could not restore the symlink you describe
and then restore a regular file called 'netscape'
through it.  (In this case, both gtar and bsdtar will simply
remove the symlink and put the regular file in its place.  If the
symlink were an intermediate dir, gtar would
remove it and create the intermediate dirs, while bsdtar
would refuse to extract the file.)

> symlinks pointing to dirs and symlinks pointing to files
> are qualitatively different ...

Not really, no.  I've constructed trojan tar archives
that exploit both symlinks to files and symlinks to dirs
to overwrite files outside of the target directory.
I constructed these to test gtar's and bsdtar's security
checks.
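
As an illustration of what such a test archive looks like (these are
not the actual archives mentioned above), the Python sketch below
builds a harmless in-memory tar containing a symlink bin -> /bin
followed by a placeholder bin/sh, then scans it without extracting
anything.  All names are hypothetical.  Note that the scan remembers
every symlink seen so far, which is exactly the bookkeeping a
streaming extractor would prefer to avoid.

```python
import io
import tarfile


def build_test_archive():
    """Return an in-memory tar: symlink bin -> /bin, then a file bin/sh."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        link = tarfile.TarInfo("bin")
        link.type = tarfile.SYMTYPE
        link.linkname = "/bin"
        tar.addfile(link)
        payload = b"placeholder, not a real shell\n"
        entry = tarfile.TarInfo("bin/sh")
        entry.size = len(payload)
        tar.addfile(entry, io.BytesIO(payload))
    buf.seek(0)
    return buf


def entries_behind_symlinks(fileobj):
    """List entries whose path is, or passes through, an earlier symlink entry."""
    symlinks = set()
    flagged = []
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        for member in tar:
            parts = member.name.split("/")
            prefixes = {"/".join(parts[:i]) for i in range(1, len(parts) + 1)}
            if prefixes & symlinks:
                flagged.append(member.name)
            if member.issym():
                symlinks.add(member.name)
    return flagged
```

Calling entries_behind_symlinks(build_test_archive()) flags bin/sh.
The same scan would also flag a regular file foo that follows a
symlink entry named foo, so it treats symlinks to files and symlinks
to dirs alike.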

> Also note that my complaint involves NONE of these; the ports collection archive contains
> no symlinks.  The symlink preexists and moves only the top-level directory...

I suspect that gtar and bsdtar would both end up doing
the same thing, even though they would get there in
slightly different ways.  For bsdtar, the key is that
the "top-level directory" is being restored by the
archive, so the existing symlink will get deleted and
replaced with a real directory.

> The only completely secure way ... is to remove symlinks from the system entirely ...

No one's suggesting that.  But, symlinks can be used to
fool unsuspecting users and, as such, need to be handled
carefully.  The tricky part is finding a good balance between
safety and flexibility that feels "natural" to as many people
as possible.

> And tar doesn't always handle hardlinks right either...  (e.g. - a hardlink on the source system which crosses
> a filesystem boundary on the target.  What tar does here is restore multiple copies.  
> A sysadmin would usually apply a symlink here instead...)

Good point.  Right now, I think bsdtar simply fails to restore
the hardlink.  Converting it to a symlink is an interesting
idea, though it carries risks as well.  Duplicating the
file is arguably less risky.  (Hardlinks are symmetric and
have the property that deleting one link does not affect
the other links. Symlinks are not symmetric; tar cannot
be expected to guess which copy should be kept and which
converted to symlinks.)
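
The asymmetry is easy to see in a scratch directory.  The Python
sketch below (a hypothetical demo, assuming a POSIX filesystem that
supports both link types) creates a file, a hardlink to it, and a
symlink to it, then deletes the original name and reports which of
the two links still resolves.

```python
import os
import tempfile


def link_survival():
    """Return (hardlink_still_resolves, symlink_still_resolves)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        hard = os.path.join(d, "hard")
        soft = os.path.join(d, "soft")
        with open(original, "w") as f:
            f.write("data\n")
        os.link(original, hard)      # second name for the same file
        os.symlink(original, soft)   # pointer to the first name
        os.remove(original)
        # os.path.exists() follows symlinks, so a dangling one reports False
        return os.path.exists(hard), os.path.exists(soft)
```

link_survival() returns (True, False): the hardlink still reaches the
data, while the symlink dangles.  A tar that converted hardlinks to
symlinks would have to pick which name plays the role of "original".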

> And the gtar behavior changed from historical without 
> explanation or notice... in either the (unofficial) man 
> page or the (more official) info

If you would like to submit some diffs to the gtar.1
man page, I'd be happy to commit them.  I can't speak
to other gtar issues, though, as I'm (for obvious reasons)
trying hard to avoid reading gtar source.

Tim
Received on Fri May 21 2004 - 13:30:09 UTC
