libarchive / libarchive

Multi-format archive and compression library

Home Page: http://www.libarchive.org

File name duplication in macOS pkg archives (xar archives)

Blzut3 opened this issue · comments

Version of libarchive: 3006bc5 (3.7.5 dev), also present in 3.7.2 and 3.6.0 from Ubuntu 24.04 and 22.04 respectively.

It appears that Apple's package-building tools create TOCs with duplicated name tags in file entries. In unsigned packages the issue is less pronounced but can still be present; when a package is signed with productsign, an additional redundant name tag appears to be added to each file entry. libarchive seems to append to the pathname on every name tag it encounters, which results in file names like Python_Documentation.pkgPython_Documentation.pkg/PackageInfoPython_Documentation.pkgPython_Documentation.pkg/PackageInfoPython_Documentation.pkgPython_Documentation.pkg/PackageInfo.

Apple's xar tool displays the archive contents with the expected file names without the redundancy.

Example TOC entry from a signed pkg:

   <file id="5">
    <name>PackageInfo</name>
    <name>PackageInfo</name>
    <name>PackageInfo</name>
    <type>file</type>
    <data>
     <archived-checksum style="sha1">5dd628f31fd255e29407c0ebf486b22c8c2be77b</archived-checksum>
     <extracted-checksum style="sha1">19e65900bf5204105ae8a838764127c8a5778f5c</extracted-checksum>
     <encoding style="application/x-gzip"/>
     <size>563</size>
     <offset>1502405</offset>
     <length>315</length>
    </data>
   </file>

A possible fix would be to check HAS_PATHNAME and ignore any extraneous name tags once a pathname has been set. I'm not sure whether there's a better fix, and I don't know whether macOS's xar tool treats the first or the last name as the canonical one.

diff --git a/libarchive/archive_read_support_format_xar.c b/libarchive/archive_read_support_format_xar.c
index cefb3641..3892a4f3 100644
--- a/libarchive/archive_read_support_format_xar.c
+++ b/libarchive/archive_read_support_format_xar.c
@@ -2707,6 +2707,7 @@ xml_data(void *userData, const char *s, int len)
 
        switch (xar->xmlsts) {
        case FILE_NAME:
+               if (xar->file->has & HAS_PATHNAME) break;
                if (xar->file->parent != NULL) {
                        archive_string_concat(&(xar->file->pathname),
                            &(xar->file->parent->pathname))
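
To make the effect of the guard easier to see, here is a small standalone model of the name handling. It is only a sketch and not libarchive code: `toy_file`, `on_name`, `parse_entry`, the buffer size, and the flag value are all hypothetical stand-ins, and it assumes the first name tag is the one to keep, which has not been confirmed against Apple's xar.

```c
#include <stdio.h>
#include <string.h>

#define HAS_PATHNAME 0x1u   /* hypothetical flag value for this sketch */
#define TOY_PATH_CAP 512    /* arbitrary buffer size for this sketch */

/* Simplified stand-in for the reader's per-file state. */
struct toy_file {
	struct toy_file *parent;
	unsigned has;
	char pathname[TOY_PATH_CAP];
};

/* Append src to dst, truncating at TOY_PATH_CAP - 1 characters. */
static void append(char *dst, const char *src)
{
	size_t used = strlen(dst);
	snprintf(dst + used, TOY_PATH_CAP - used, "%s", src);
}

/*
 * Models the handling of the text inside one <name> tag.
 * With skip_duplicates == 0 every tag appends parent + "/" + name again;
 * with skip_duplicates != 0 only the first tag is used (assumed policy).
 */
static void on_name(struct toy_file *f, const char *name, int skip_duplicates)
{
	if (skip_duplicates && (f->has & HAS_PATHNAME))
		return;
	if (f->parent != NULL) {
		append(f->pathname, f->parent->pathname);
		append(f->pathname, "/");
	}
	f->has |= HAS_PATHNAME;
	append(f->pathname, name);
}

/* Resets f and feeds it the same <name> text `repeats` times. */
static void parse_entry(struct toy_file *f, struct toy_file *parent,
    const char *name, int repeats, int skip_duplicates)
{
	memset(f, 0, sizeof(*f));
	f->parent = parent;
	for (int i = 0; i < repeats; i++)
		on_name(f, name, skip_duplicates);
}
```

Feeding a directory entry two `Python_Documentation.pkg` names and a child entry three `PackageInfo` names with `skip_duplicates` set to 0 reproduces the tripled path from the report; with it set to 1 the child path comes out as `Python_Documentation.pkg/PackageInfo`.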

The issue can be observed with the Python packages (specifically tested with python-3.12.3-macos11.pkg): https://www.python.org/downloads/macos/ However, I would expect just about any macOS pkg file to reproduce it, since I first noticed the issue with my own pkgs generated with CPack's productbuild generator.

This looks like a straightforward fix. It would be great to have a test for this case as well. Would you be able to submit a PR for this?