Re: Bsdtar and archive torture tests

From: Tim Kientzle <kientzle_at_freebsd.org>
Date: Tue, 27 Sep 2005 22:43:10 -0700

Ed,

Try the attached patch (for /usr/src/lib/libarchive) and
let me know if that fixes it for you.

libarchive was actually skipping the UTF-8 conversion when
storing the long linkname but then (correctly) converting
from UTF-8 on extraction.  The patch fixes the pax archive
writer so it does correctly convert to UTF-8.
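
As a rough illustration of the mismatch (a standalone sketch, not the
libarchive code), the C below models a writer and a reader for a link
target. Every name in it (widen, utf8_encode, utf8_decode,
link_target_survives, USTAR_LINKNAME_MAX, MAX_NAME) is hypothetical.
It assumes a single-byte "C" locale in which each byte maps to one wide
character, and it assumes targets of 100 bytes or less are stored and
read back verbatim from the standard header. The encoder and decoder
only cover one- and two-byte UTF-8 sequences, which is all that widened
bytes can produce.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define USTAR_LINKNAME_MAX 100	/* size of the standard header link field */
#define MAX_NAME 256		/* arbitrary cap for this sketch */

/* Assumed "C" locale behaviour: every byte becomes one wide character. */
static size_t
widen(const unsigned char *mbs, size_t n, uint32_t *wcs)
{
	for (size_t i = 0; i < n; i++)
		wcs[i] = mbs[i];
	return (n);
}

/* Encode code points below U+0800 as UTF-8; returns bytes written. */
static size_t
utf8_encode(const uint32_t *cp, size_t n, unsigned char *out)
{
	size_t o = 0;

	for (size_t i = 0; i < n; i++) {
		if (cp[i] < 0x80) {
			out[o++] = (unsigned char)cp[i];
		} else {
			out[o++] = (unsigned char)(0xC0 | (cp[i] >> 6));
			out[o++] = (unsigned char)(0x80 | (cp[i] & 0x3F));
		}
	}
	return (o);
}

/* Decode one- and two-byte UTF-8 sequences; anything else gives U+FFFD. */
static size_t
utf8_decode(const unsigned char *in, size_t n, uint32_t *cp)
{
	size_t i = 0, o = 0;

	while (i < n) {
		if (in[i] < 0x80) {
			cp[o++] = in[i++];
		} else if ((in[i] & 0xE0) == 0xC0 && i + 1 < n &&
		    (in[i + 1] & 0xC0) == 0x80) {
			cp[o++] = ((uint32_t)(in[i] & 0x1F) << 6) |
			    (uint32_t)(in[i + 1] & 0x3F);
			i += 2;
		} else {
			cp[o++] = 0xFFFD;
			i++;
		}
	}
	return (o);
}

/*
 * Returns 1 if a link target of n bytes reads back unchanged, 0 if not.
 * writer_converts == 0 models the old writer (raw bytes stored in the
 * extended attribute); writer_converts == 1 models the patched writer.
 */
static int
link_target_survives(const unsigned char *name, size_t n, int writer_converts)
{
	uint32_t wcs[MAX_NAME], back[2 * MAX_NAME];
	unsigned char stored[2 * MAX_NAME];
	size_t nw, ns, nb;

	if (n > MAX_NAME)
		return (0);
	/* Short targets fit the standard header: no conversion either way. */
	if (n <= USTAR_LINKNAME_MAX)
		return (1);

	nw = widen(name, n, wcs);
	if (writer_converts) {
		ns = utf8_encode(wcs, nw, stored);
	} else {
		memcpy(stored, name, n);
		ns = n;
	}
	/* The reader always treats the extended attribute as UTF-8. */
	nb = utf8_decode(stored, ns, back);
	return (nb == nw && memcmp(back, wcs, nw * sizeof(wcs[0])) == 0);
}
```

With the \303\240 pair repeated 50 times (100 bytes) the target stays
within the standard header and is untouched in this model; at 51
repetitions (102 bytes) the unconverted bytes are decoded as UTF-8 on
the way back and no longer match, unless the writer converts first.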

Tim

Tim Kientzle wrote:
> Hmmm.... Looking at the internals of the generated archive
> shows that the extended attribute is definitely getting
> stored incorrectly.  I'll look into this.
> 
> If you see any other problems, please let me know!
> 
> Tim
> 
> 
> Ed Maste wrote:
> 
>> On Mon, Sep 26, 2005 at 08:16:50PM -0400, Ed Maste wrote:
>>
>>
>>> Hmm, good point.  I haven't set it to anything; locale(1) shows
>>> that the LC_ variables are set to "C".  So then I can see how this
>>> happens, but it's still surprising (to me) behaviour.
>>
>>
>>
>> Ok, now I've definitely encountered some non-obvious behaviour.
>> A symlink target of 100 bytes or less keeps the same name, while
>> a target of more than 100 bytes gets munged from the conversion
>> to UTF-8 and back.
>>
>> For example, the symlink created by the following script doesn't
>> change the link target:
>>
>> #!/bin/sh
>> fname=$(printf $(jot -b \\303\\240 -s '' 50))
>> ln -fs $fname test
>> tar -cf - test | tar -tvf -
>>
>> but if the 50 in the jot command is changed to 51, the target
>> changes.  So I guess that the link target doesn't fit in the
>> standard header anymore, and needs an extended tag.  Having
>> different behaviour for the two cases does seem odd.
>>
>> -- 
>> Ed Maste, Sandvine Incorporated
>> _______________________________________________
>> freebsd-current_at_freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to 
>> "freebsd-current-unsubscribe_at_freebsd.org"
>>
>>


Index: archive_entry.c
===================================================================
RCS file: /home/ncvs/src/lib/libarchive/archive_entry.c,v
retrieving revision 1.31
diff -u -r1.31 archive_entry.c
--- archive_entry.c	21 Sep 2005 04:25:05 -0000	1.31
+++ archive_entry.c	28 Sep 2005 05:36:04 -0000
@@ -203,6 +203,8 @@
 static const char *
 aes_get_mbs(struct aes *aes)
 {
+	if (aes->aes_mbs == NULL && aes->aes_wcs == NULL)
+		return NULL;
 	if (aes->aes_mbs == NULL && aes->aes_wcs != NULL) {
 		/*
 		 * XXX Need to estimate the number of byte in the
@@ -224,6 +226,8 @@
 static const wchar_t *
 aes_get_wcs(struct aes *aes)
 {
+	if (aes->aes_wcs == NULL && aes->aes_mbs == NULL)
+		return NULL;
 	if (aes->aes_wcs == NULL && aes->aes_mbs != NULL) {
 		/*
 		 * No single byte will be more than one wide character,
@@ -463,6 +467,12 @@
 	return (aes_get_mbs(&entry->ae_hardlink));
 }
 
+const wchar_t *
+archive_entry_hardlink_w(struct archive_entry *entry)
+{
+	return (aes_get_wcs(&entry->ae_hardlink));
+}
+
 ino_t
 archive_entry_ino(struct archive_entry *entry)
 {
@@ -536,6 +546,12 @@
 	return (aes_get_mbs(&entry->ae_symlink));
 }
 
+const wchar_t *
+archive_entry_symlink_w(struct archive_entry *entry)
+{
+	return (aes_get_wcs(&entry->ae_symlink));
+}
+
 uid_t
 archive_entry_uid(struct archive_entry *entry)
 {
Index: archive_entry.h
===================================================================
RCS file: /home/ncvs/src/lib/libarchive/archive_entry.h,v
retrieving revision 1.17
diff -u -r1.17 archive_entry.h
--- archive_entry.h	10 Sep 2005 22:58:06 -0000	1.17
+++ archive_entry.h	28 Sep 2005 05:36:05 -0000
@@ -80,6 +80,7 @@
 gid_t			 archive_entry_gid(struct archive_entry *);
 const char		*archive_entry_gname(struct archive_entry *);
 const char		*archive_entry_hardlink(struct archive_entry *);
+const wchar_t		*archive_entry_hardlink_w(struct archive_entry *);
 ino_t			 archive_entry_ino(struct archive_entry *);
 mode_t			 archive_entry_mode(struct archive_entry *);
 time_t			 archive_entry_mtime(struct archive_entry *);
@@ -92,6 +93,7 @@
 int64_t			 archive_entry_size(struct archive_entry *);
 const struct stat	*archive_entry_stat(struct archive_entry *);
 const char		*archive_entry_symlink(struct archive_entry *);
+const wchar_t		*archive_entry_symlink_w(struct archive_entry *);
 uid_t			 archive_entry_uid(struct archive_entry *);
 const char		*archive_entry_uname(struct archive_entry *);
 
Index: archive_write_set_format_pax.c
===================================================================
RCS file: /home/ncvs/src/lib/libarchive/archive_write_set_format_pax.c,v
retrieving revision 1.30
diff -u -r1.30 archive_write_set_format_pax.c
--- archive_write_set_format_pax.c	21 Sep 2005 04:25:05 -0000	1.30
+++ archive_write_set_format_pax.c	28 Sep 2005 05:36:06 -0000
@@ -393,11 +393,14 @@
 
 	/* If link name is too long, add 'linkpath' to pax extended attrs. */
 	linkname = hardlink;
-	if (linkname == NULL)
+	if (linkname == NULL) {
 		linkname = archive_entry_symlink(entry_main);
+		wp = archive_entry_symlink_w(entry_main);
+	} else
+		wp = archive_entry_hardlink_w(entry_main);
 
 	if (linkname != NULL && strlen(linkname) > 100) {
-		add_pax_attr(&(pax->pax_header), "linkpath", linkname);
+		add_pax_attr_w(&(pax->pax_header), "linkpath", wp);
 		if (hardlink != NULL)
 			archive_entry_set_hardlink(entry_main,
 			    "././@LongHardLink");
Received on Wed Sep 28 2005 - 03:43:25 UTC