libarchive / libarchive

Multi-format archive and compression library

Home Page: http://www.libarchive.org

re-review commits

emaste opened this issue

In light of the xz backdoor (https://www.cve.org/CVERecord?id=CVE-2024-3094) it seems prudent to (at least) review again commits by the same author.

I believe these are the associated commits:

f27c173 addressed by 6110e9c (see discussion in #1609)

3fec581 - valid man page typo fix.

The function definition is:

archive_read_disk_set_matching(struct archive *_a, struct archive *_ma,
    void (*_excluded_func)(struct archive *, void *, struct archive_entry *),
    void *_client_data)

The man page text is referring to the 2nd arg, but note that _ma does not appear elsewhere in the man page - that is, the declaration in the man page omits args:

     int
     archive_read_disk_set_matching(struct archive *, struct archive *,
         void
         (*excluded_func)(struct archive *, void *, struct archive entry *),
         void *);

fc4aa72 - adding missing functions to man pages. There is at least one minor issue (spurious comma after archive_read_disk_set_metadata_filter_callback)

@emaste de36667 and 02cfa8a smell to me. Worth a closer look by someone who understands the library better.

0eee3c6 appears to be the change suggested by @mmatuska in #1591

> @emaste de36667 and 02cfa8a smell to me. Worth a closer look by someone who understands the library better.

02cfa8a moves from deflate to default, which could become any value in future code changes.

Also, the check for zlib header is gone.

> In light of the xz backdoor (https://www.cve.org/CVERecord?id=CVE-2024-3094) it seems prudent to (at least) review again commits by the same author.
>
> I believe these are the associated commits:

The bigger problem here is the possibility of other commits under other usernames with the same guise. Obviously no easy task, although a wider search would be advisable considering the possible implications.

54d2b66 looks innocent enough, no?

Thank you for this work, @emaste.

Theoretically, all the unverified commits could have been authored by a malicious actor as impersonation is possible.
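
As a rough aid for the point above, git can report a per-commit signature status through the `%G?` pretty-format placeholder (`N` means no signature). The sketch below is only a filter over that output; the function name `unsigned_commits` is made up for illustration, and a signature only ties a commit to a key, not to a trustworthy person.

```shell
# Hypothetical helper: read "STATUS HASH AUTHOR" lines on stdin and
# print only the commits whose signature status is "N" (no signature).
unsigned_commits() {
    awk '$1 == "N" { sub(/^N /, ""); print }'
}

# Possible usage inside a clone (not run here):
#   git log --since=2021-01-01 --format='%G? %h %aN <%aE>' | unsigned_commits
```

Commits with status `E` (signature present but the key is missing locally) are deliberately not listed by this filter, so the result depends on the reviewer's keyring.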

> 0eee3c6 appears to be the change suggested by @mmatuska in #1591

yes, 0eee3c6 seems ok.

archive_read_disk_can_descend does the same check, and it also checks for the archive magic. However, the archive magic is checked twice as archive_check_magic() is already called just before calling archive_read_disk_can_descend().

0f744f4 should be safe. canXX() functions all do the same thing (by using a different binary).

#2104

Not sure if this is related though

> 54d2b66 looks innocent enough, no?

What creates a top-level bin/ directory though? Building via cmake is typically done in build/ and I don't think the autoconf build will.

A bin directory is created in the cmake working directory and it's declared here. When executing cmake without arguments in the libarchive directory, the bin directory gets created there.
On Windows the directory is also created by MSBuild.exe when building the solution.

I'll take a close look at #1598, as I've been working in that area recently.

#1589, if I understand it correctly, redirects the "cannot" test cases somewhere else. I'm not sure whether that means test cases that should fail no longer fail.

#1589 enables some test cases that were previously disabled. This is a good thing that improves overall test coverage. I don't see any obvious problem with it.

If you look at the test suite that #1589 is touching, the suite is designed as a matrix of different combinations, not all of which are testable on all platforms. The initial implementation entirely disabled the combinations that weren't universally testable; #1589 adds checks so that those combinations can be tested on platforms where they are actually supported. It's possible that #1589 doesn't actually achieve this due to logic errors elsewhere, but I don't see anything wrong with #1589 by itself.

I've gone over #1598 pretty carefully. It's possible I missed something, but I really can't see anything nefarious here.

#1606 looks fine. Just a routine refactoring/code cleanup.

#1593 also looks innocuous.

@emaste -- Unless you've found something I've missed, I think we can probably close this now. Any remaining concerns?

#1589 does change the systemf() command in test_utils/test_main.c from lzma %s to lzma --help %s. I think this would result in the return code always being zero?

EDIT: Never mind. I looked at the actual C file. I did not realize that redirectArgs was just redirecting stdout+stderr.

@kientzle
I really don't know.

jiat75 did not act alone; there appear to be others involved (or pseudonyms) who at some point may have committed something malicious, or unprotected code that could be used for something malicious in the future.

With the exception of mmatuska, Kientzle, emaste, and many other long-time trusted committers, it would be crazy to think that other authors should be reviewed as well (almost 250 commits since 2021).
Yes, crazy... maybe insane.

$ git log --since=2021-01-01 --format='%aN <%aE>' | sort | uniq -c | sort -nr |
  while read -r count author_email; do
      # Keep only the name part (everything before '<'), dropping leading blanks.
      author_name=$(echo "$author_email" | cut -d'<' -f1 | sed 's/^[[:space:]]*//')
      # Note: --author takes a regular expression, so a name such as "lgtm-com[bot]" will not match itself.
      last_commit_date=$(git log -n 1 --author="$author_name" --format="%ad" --date=format:'%Y-%m-%d')
      echo "$count commit(s) - last commit: $last_commit_date - $author_email"
  done

89 commit(s) - last commit: 2022-12-07 - Martin Matuška martin@matuska.org
61 commit(s) - last commit: 2023-12-10 - Martin Matuska martin@matuska.de
35 commit(s) - last commit: 2023-12-10 - Martin Matuska martin@matuska.org
22 commit(s) - last commit: 2024-03-23 - Tim Kientzle kientzle@acm.org
21 commit(s) - last commit: 2023-12-08 - Emil Velikov emil.l.velikov@gmail.com
15 commit(s) - last commit: 2021-10-30 - jiat75 jiat0218@gmail.com
12 commit(s) - last commit: 2023-07-14 - Steve Lhomme robux4@ycbcr.xyz
12 commit(s) - last commit: 2023-12-12 - Mostyn Bramley-Moore mostyn@antipode.se
12 commit(s) - last commit: 2021-05-08 - Christos Zoulas christos@zoulas.com
9 commit(s) - last commit: 2021-03-06 - Tom Ivar Helbekkmo tih@hamartun.priv.no
7 commit(s) - last commit: 2024-03-23 - Dag-Erling Smørgrav des@des.no
6 commit(s) - last commit: 2021-08-29 - Samanta Navarro ferivoz@riseup.net
6 commit(s) - last commit: 2024-03-23 - AtariDreams 83477269+AtariDreams@users.noreply.github.com
5 commit(s) - last commit: 2023-04-25 - Sarah Gilmore sgilmore@mathworks.com
5 commit(s) - last commit: 2023-01-09 - Rosen Penev rosenp@gmail.com
4 commit(s) - last commit: 2023-05-27 - uyjulian uyjulian@gmail.com
4 commit(s) - last commit: 2021-12-19 - linear cannon dev@linear.network
4 commit(s) - last commit: 2023-06-16 - Wei-Cheng Pan legnaleurc@gmail.com
4 commit(s) - last commit: 2023-12-10 - Martin Matuska martin.matuska@axelspringer.com
4 commit(s) - last commit: 2021-11-20 - Jonas Witschel diabonas@archlinux.org
4 commit(s) - last commit: 2024-03-29 - Ed Maste emaste@freebsd.org
3 commit(s) - last commit: 2021-03-24 - Russell Mullens the_pimaster@hotmail.com
3 commit(s) - last commit: 2022-04-28 - Reshetnikov Alexandr hemn.still@gmail.com
3 commit(s) - last commit: 2023-05-27 - Mingye Wang arthur200126@gmail.com
3 commit(s) - last commit: 2023-07-19 - Michał Górny mgorny@gentoo.org
3 commit(s) - last commit: 2021-02-13 - Grzegorz Antoniak ga@anadoxin.org
3 commit(s) - last commit: 2021-10-13 - Dustin Howett duhowett@microsoft.com
3 commit(s) - last commit: 2022-02-16 - Brad King brad.king@kitware.com
3 commit(s) - last commit: 2024-03-29 - Alexandr Reshetnikov hemnstill@users.noreply.github.com
3 commit(s) - last commit: 2020-09-17 - Alex Richardson Alexander.Richardson@cl.cam.ac.uk
2 commit(s) - last commit: 2023-09-06 - tomaThomas thomas.moeller@mailbox.org
2 commit(s) - last commit: 2021-03-31 - jo620kix 81214276+jo620kix@users.noreply.github.com
2 commit(s) - last commit: 2022-02-19 - cielavenir cielartisan@gmail.com
2 commit(s) - last commit: 2022-11-18 - banjiuqingshan 63209634+banjiuqingshan@users.noreply.github.com
2 commit(s) - last commit: 2018-10-12 - Zack Weger ZWeger@StrozFriedberg.com
2 commit(s) - last commit: 2022-07-27 - Yuri Gribov tetra2005@gmail.com
2 commit(s) - last commit: 2021-12-17 - Walter Lozano walter.lozano@collabora.com
2 commit(s) - last commit: 2022-10-01 - Vincent Torri vtorri@outlook.fr
2 commit(s) - last commit: 2022-12-29 - TERESH1 svyatoslavtereshin@yandex.ru
2 commit(s) - last commit: 2021-11-19 - Ryan Libby rlibby@FreeBSD.org
2 commit(s) - last commit: 2021-08-25 - Russell Greene russellgreene8@gmail.com
2 commit(s) - last commit: 2023-01-09 - Rose 83477269+AtariDreams@users.noreply.github.com
2 commit(s) - last commit: 2023-09-11 - Pedro Nacht pnacht@google.com
2 commit(s) - last commit: 2021-05-12 - Owen W. Taylor otaylor@fishsoup.net
2 commit(s) - last commit: 2021-03-31 - Ondrej Dubaj odubaj@redhat.com
2 commit(s) - last commit: 2023-06-10 - Luke Mewburn Luke@Mewburn.net
2 commit(s) - last commit: 2022-02-08 - Jung-uk Kim jkim@FreeBSD.org
2 commit(s) - last commit: 2021-10-07 - Joerg Sonnenberger joerg@bec.de
2 commit(s) - last commit: 2022-02-07 - Jairo kidandcat@gmail.com
2 commit(s) - last commit: 2022-10-06 - Erik Olofsson erik@olofsson.info
2 commit(s) - last commit: 2023-05-12 - Enji Cooper 1574099+ngie-eign@users.noreply.github.com
2 commit(s) - last commit: 2023-11-21 - Dustin L. Howett dustin@howett.net
2 commit(s) - last commit: 2024-03-23 - Duncan Horn 40036384+dunhor@users.noreply.github.com
2 commit(s) - last commit: 2022-04-08 - Christian Hesse mail@eworm.de
2 commit(s) - last commit: 2022-04-13 - Biswapriyo Nath nathbappai@gmail.com
2 commit(s) - last commit: 2021-01-11 - Alexandre Janniaux ajanni@videolabs.io
2 commit(s) - last commit: 2023-07-22 - Adrian Vovk adrianvovk@gmail.com
2 commit(s) - last commit: 2021-05-19 - Adrian Ebeling devl@adrian-ebeling.de
1 commit(s) - last commit: 2022-04-09 - wangkerong 44842104+wangkerong@users.noreply.github.com
1 commit(s) - last commit: 2024-03-02 - tnias phil@grmr.de
1 commit(s) - last commit: 2024-03-24 - terrynini terrynini38514@gmail.com
1 commit(s) - last commit: 2022-05-21 - tarsin yuanqingxiang233@163.com
1 commit(s) - last commit: 2021-01-10 - r0ptr r0ptr@protonmail.com
1 commit(s) - last commit: 2022-07-22 - obiwac obiwac@gmail.com
1 commit(s) - last commit: 2024-02-08 - nooriro noorirogit@gmail.com
1 commit(s) - last commit: 2023-12-31 - nielash nielronash@gmail.com
1 commit(s) - last commit: - lgtm-com[bot] <43144390+lgtm-com[bot]@users.noreply.github.com>
1 commit(s) - last commit: 2023-12-11 - grembo freebsd@grem.de
1 commit(s) - last commit: 2024-03-24 - asolwa 53942085+asolwa@users.noreply.github.com
1 commit(s) - last commit: 2023-09-23 - alice alice@ayaya.dev
1 commit(s) - last commit: 2022-01-11 - Zdenek Zambersky zzambers@redhat.com
1 commit(s) - last commit: 2022-02-07 - Younes El-karama yelkarama@users.noreply.github.com
1 commit(s) - last commit: 2023-12-05 - YAMASHINA Hio 168243+hio@users.noreply.github.com
1 commit(s) - last commit: 2023-03-02 - Xin "Russell" Liu ginshio78@gmail.com
1 commit(s) - last commit: 2023-07-19 - Wong Hoi Sing Edison hswong3i@gmail.com
1 commit(s) - last commit: 2022-11-18 - Vladimir Kikhtenko kikht@yandex-team.ru
1 commit(s) - last commit: 2021-12-17 - Todd Richmond todd_richmond@hotmail.com
1 commit(s) - last commit: 2021-11-19 - Theo Buehler tb@openbsd.org
1 commit(s) - last commit: 2024-03-31 - Spacefish timo.witte@gmail.com
1 commit(s) - last commit: 2022-06-25 - Sergey Bobrenok bobrofon@gmail.com
1 commit(s) - last commit: 2022-06-22 - Sean McBride sean@rogue-research.com
1 commit(s) - last commit: 2023-07-26 - Samuel Marks 807580+SamuelMarks@users.noreply.github.com
1 commit(s) - last commit: 2021-04-06 - Rolf Eike Beer eike@sf-mail.de
1 commit(s) - last commit: 2023-09-16 - Roland Clobus rclobus@rclobus.nl
1 commit(s) - last commit: 2023-02-19 - Po-Chuan Hsieh sunpoet@sunpoet.net
1 commit(s) - last commit: 2021-12-23 - Petr Malat oss@malat.biz
1 commit(s) - last commit: 2022-12-29 - Peter Pentchev roam@ringlet.net
1 commit(s) - last commit: 2022-12-29 - Peter Pentchev roam@debian.org
1 commit(s) - last commit: 2023-04-03 - Peter Kaestle peter.kaestle@nokia.com
1 commit(s) - last commit: 2023-09-11 - Pedro Nacht pedro.k.night@gmail.com
1 commit(s) - last commit: 2023-09-17 - Pedro Kaj Kjellerup Nacht pnacht@google.com
1 commit(s) - last commit: 2021-01-27 - Oliver Ford ojford@gmail.com
1 commit(s) - last commit: 2021-01-22 - Oleg Smirnov oleg.v.smirnov@gmail.com
1 commit(s) - last commit: 2022-03-10 - Michael Osipov michael.osipov@siemens.com
1 commit(s) - last commit: 2024-02-25 - Matt Smith matt-sm@users.noreply.github.com
1 commit(s) - last commit: 2022-03-04 - Mateusz Piotrowski 0mp@FreeBSD.org
1 commit(s) - last commit: 2021-03-16 - Masalskaya, Anna anna.masalskaya@intel.com
1 commit(s) - last commit: 2024-03-22 - Mark Johnston markjdb@gmail.com
1 commit(s) - last commit: 2023-07-26 - Luke Rewega lrewega@c32.ca
1 commit(s) - last commit: 2023-02-14 - Li kunyu kunyu@nfschina.com
1 commit(s) - last commit: 2023-12-04 - Klaus Holst Jacobsen 48069914+klausholstjacobsen@users.noreply.github.com
1 commit(s) - last commit: 2022-07-25 - Khem Raj raj.khem@gmail.com
1 commit(s) - last commit: 2022-03-05 - Ken Matsui 26405363+ken-matsui@users.noreply.github.com
1 commit(s) - last commit: 2023-05-24 - Kai 2644614+Schweinepriester@users.noreply.github.com
1 commit(s) - last commit: 2022-10-13 - Julien Voisin jvoisin@google.com
1 commit(s) - last commit: 2023-07-24 - Joshua Root jmr@macports.org
1 commit(s) - last commit: 2022-10-28 - Joris Clement joris.clement@posteo.de
1 commit(s) - last commit: 2022-08-26 - John Reiser jreiser@BitWagon.com
1 commit(s) - last commit: 2022-06-30 - Joel Uckelman juckelman@strozfriedberg.co.uk
1 commit(s) - last commit: 2023-11-24 - Jeffrey Walton noloader@gmail.com
1 commit(s) - last commit: 2023-09-04 - Jarred Sumner jarred@jarredsumner.com
1 commit(s) - last commit: 2022-06-05 - Jan Starý hans@stare.cz
1 commit(s) - last commit: 2021-10-14 - JFranklin13 jfranklin13@protonmail.com
1 commit(s) - last commit: 2021-10-03 - IohannRabeson IohannRabeson@users.noreply.github.com
1 commit(s) - last commit: 2024-02-08 - Haelwenn Monnier contact+github.com@hacktivis.me
1 commit(s) - last commit: 2021-12-21 - Graham Percival gperciva@tarsnap.com
1 commit(s) - last commit: 2022-05-12 - Gaël PORTAY gael.portay@collabora.com
1 commit(s) - last commit: 2022-09-07 - Ewgeni Wolowik ewgeni.wolowik@scheer-group.com
1 commit(s) - last commit: 2022-09-30 - Eric van Gyzen eric@vangyzen.net
1 commit(s) - last commit: 2023-05-12 - Enji Cooper yaneurabeya@gmail.com
1 commit(s) - last commit: 2024-03-17 - Elvis Angelaccio elvisangelaccio@users.noreply.github.com
1 commit(s) - last commit: 2024-03-29 - Ed Maste emaste@FreeBSD.org
1 commit(s) - last commit: 2023-04-18 - Dimitry Andric dimitry@andric.com
1 commit(s) - last commit: 2022-04-17 - David Macek david.macek.0@gmail.com
1 commit(s) - last commit: 2024-03-23 - Collin Funk collin.funk1@gmail.com
1 commit(s) - last commit: 2021-11-28 - Charly C changaco@changaco.oy.lc
1 commit(s) - last commit: 2023-11-20 - Brooks Davis brooks@one-eyed-alien.net
1 commit(s) - last commit: 2022-05-03 - BogDan Vatra bogdan@kde.org
1 commit(s) - last commit: 2023-01-20 - Bernhard M. Wiedemann githubbmw2020@lsmod.de
1 commit(s) - last commit: 2023-01-20 - Bernhard M. Wiedemann bwiedemann@suse.de
1 commit(s) - last commit: 2022-07-19 - Ben Wagner bungeman@chromium.org
1 commit(s) - last commit: 2021-07-13 - Andy Brown andyjtbrown@gmail.com
1 commit(s) - last commit: 2023-11-27 - Alfred Wingate parona@protonmail.com
1 commit(s) - last commit: 2022-01-23 - Alexey Pelykh alexey.pelykh@gmail.com
1 commit(s) - last commit: 2021-09-17 - Alex Xu 351006+Hello71@users.noreply.github.com
1 commit(s) - last commit: 2023-05-24 - Albert Jin albert.jin@gmail.com
1 commit(s) - last commit: 2021-10-26 - AdamKorcz adam@adalogics.com
1 commit(s) - last commit: 2023-12-08 - Aaron Lindros lindros.aaron@gmail.com
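
The same kind of tally can be sketched in a single pass over `git log` output, which avoids calling `git log --author=...` once per author (the `--author` value is a pattern, which may be why the lgtm-com[bot] line above has no date). The function name `tally_authors` and the tab-separated input layout are assumptions made for this sketch.

```shell
# Hypothetical helper: stdin has one line per commit in the form
# "YYYY-MM-DD<TAB>Name <email>". Prints the commit count and the most
# recent date per author, highest count first.
tally_authors() {
    awk -F '\t' '
        { count[$2]++; if ($1 > last[$2]) last[$2] = $1 }
        END {
            for (a in count)
                printf "%d commit(s) - last commit: %s - %s\n", count[a], last[a], a
        }
    ' | sort -nr
}

# Possible usage inside a clone (not run here):
#   git log --since=2021-01-01 --date=short --format='%ad%x09%aN <%aE>' | tally_authors
```

Because both the count and the date come from the same `--since` window, the dates reported this way never fall outside that window.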

> #1589 enables some test cases that were previously disabled.

Indeed - one thought is if one of the now-enabled test cases was malicious. None of the xz, lzip, and lzma tests have "interesting" recent changes.

Closing - @kientzle has reviewed the remaining commits.

> #1589 does change the systemf() command in test_utils/test_main.c from lzma %s to lzma --help %s. I think this would result in the return code always being zero?

This is fine. The point of this test is to see whether the lzma command is present or not. Adding --help ensures that it will exit cleanly and quickly if it is present.
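
The idea can be illustrated in shell. This is only a sketch of the probing technique, not the code in test_utils/test_main.c (which is C and may differ in detail); the name `can_run` is hypothetical.

```shell
# Hypothetical probe: succeed if the program can be started and exits
# cleanly when given --help. All output is discarded; only the exit
# status matters. A missing program yields a non-zero status.
can_run() {
    "$1" --help >/dev/null 2>&1
}

# Possible usage:
#   if can_run lzma; then echo "lzma is available"; fi
```

With `--help`, a program that is present returns promptly instead of waiting for input, while a missing one still fails, so the exit status continues to distinguish the two cases.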

> jiat75 did not act alone; there appear to be others involved (or pseudonyms) who at some point may have committed something malicious, or unprotected code that could be used for something malicious in the future.

That is a reasonable assumption to make. Perhaps more backdoors will be exposed now that folks are looking more closely. If the code from the Jia account had not caused spikes in CPU usage, it might not have been discovered (or only at a later time).

I see Jia Tan has 10 repos, most of which are forks, e.g. squashfs-tools.
I would think these projects should also be closely examined (by their contributors), as they may have been potential targets or training projects before settling on xz.
It's worth noting that Jia Tan was heavily involved in tests, so that would be a good place to look within those projects, regardless of the contributor's name.

> With the exception of mmatuska, Kientzle, emaste, and many other long-time trusted committers, it would be crazy to think that other authors should be reviewed as well (almost 250 commits since 2021).
> Yes, crazy... maybe insane.

I took that list and filtered the e-mail addresses against haveibeenpwned.com, with the idea that throwaway addresses won't appear in any data dumps. (The idea is borrowed from someone on Mastodon; it's not my original idea.) The resulting list comes to about 170 commits, which is only a little more manageable to review than 250. Leaving the filtered list here in case someone wants to review it (or not!):

61 commit(s) - last commit: 2023-12-10 - Martin Matuska martin@matuska.de
15 commit(s) - last commit: 2021-10-30 - jiat75 jiat0218@gmail.com
12 commit(s) - last commit: 2023-12-12 - Mostyn Bramley-Moore mostyn@antipode.se
6 commit(s) - last commit: 2021-08-29 - Samanta Navarro ferivoz@riseup.net
6 commit(s) - last commit: 2024-03-23 - AtariDreams 83477269+AtariDreams@users.noreply.github.com
5 commit(s) - last commit: 2023-04-25 - Sarah Gilmore sgilmore@mathworks.com
4 commit(s) - last commit: 2021-12-19 - linear cannon dev@linear.network
4 commit(s) - last commit: 2021-11-20 - Jonas Witschel diabonas@archlinux.org
3 commit(s) - last commit: 2021-02-13 - Grzegorz Antoniak ga@anadoxin.org
3 commit(s) - last commit: 2024-03-29 - Alexandr Reshetnikov hemnstill@users.noreply.github.com
3 commit(s) - last commit: 2020-09-17 - Alex Richardson Alexander.Richardson@cl.cam.ac.uk
2 commit(s) - last commit: 2023-09-06 - tomaThomas thomas.moeller@mailbox.org
2 commit(s) - last commit: 2021-03-31 - jo620kix 81214276+jo620kix@users.noreply.github.com
2 commit(s) - last commit: 2022-11-18 - banjiuqingshan 63209634+banjiuqingshan@users.noreply.github.com
2 commit(s) - last commit: 2021-12-17 - Walter Lozano walter.lozano@collabora.com
2 commit(s) - last commit: 2022-10-01 - Vincent Torri vtorri@outlook.fr
2 commit(s) - last commit: 2021-11-19 - Ryan Libby rlibby@FreeBSD.org
2 commit(s) - last commit: 2023-01-09 - Rose 83477269+AtariDreams@users.noreply.github.com
2 commit(s) - last commit: 2023-09-11 - Pedro Nacht pnacht@google.com
2 commit(s) - last commit: 2021-03-31 - Ondrej Dubaj odubaj@redhat.com
2 commit(s) - last commit: 2023-05-12 - Enji Cooper 1574099+ngie-eign@users.noreply.github.com
2 commit(s) - last commit: 2024-03-23 - Duncan Horn 40036384+dunhor@users.noreply.github.com
2 commit(s) - last commit: 2021-01-11 - Alexandre Janniaux ajanni@videolabs.io
2 commit(s) - last commit: 2021-05-19 - Adrian Ebeling devl@adrian-ebeling.de
1 commit(s) - last commit: 2022-04-09 - wangkerong 44842104+wangkerong@users.noreply.github.com
1 commit(s) - last commit: 2024-03-02 - tnias phil@grmr.de
1 commit(s) - last commit: 2022-05-21 - tarsin yuanqingxiang233@163.com
1 commit(s) - last commit: 2021-01-10 - r0ptr r0ptr@protonmail.com
1 commit(s) - last commit: 2024-02-08 - nooriro noorirogit@gmail.com
1 commit(s) - last commit: 2023-12-31 - nielash nielronash@gmail.com
1 commit(s) - last commit: - lgtm-com[bot] <43144390+lgtm-com[bot]@users.noreply.github.com>
1 commit(s) - last commit: 2024-03-24 - asolwa 53942085+asolwa@users.noreply.github.com
1 commit(s) - last commit: 2023-09-23 - alice alice@ayaya.dev
1 commit(s) - last commit: 2022-02-07 - Younes El-karama yelkarama@users.noreply.github.com
1 commit(s) - last commit: 2023-12-05 - YAMASHINA Hio 168243+hio@users.noreply.github.com
1 commit(s) - last commit: 2023-03-02 - Xin "Russell" Liu ginshio78@gmail.com
1 commit(s) - last commit: 2022-11-18 - Vladimir Kikhtenko kikht@yandex-team.ru
1 commit(s) - last commit: 2021-11-19 - Theo Buehler tb@openbsd.org
1 commit(s) - last commit: 2023-07-26 - Samuel Marks 807580+SamuelMarks@users.noreply.github.com
1 commit(s) - last commit: 2021-12-23 - Petr Malat oss@malat.biz
1 commit(s) - last commit: 2022-12-29 - Peter Pentchev roam@debian.org
1 commit(s) - last commit: 2023-04-03 - Peter Kaestle peter.kaestle@nokia.com
1 commit(s) - last commit: 2023-09-17 - Pedro Kaj Kjellerup Nacht pnacht@google.com
1 commit(s) - last commit: 2024-02-25 - Matt Smith matt-sm@users.noreply.github.com
1 commit(s) - last commit: 2022-03-04 - Mateusz Piotrowski 0mp@FreeBSD.org
1 commit(s) - last commit: 2021-03-16 - Masalskaya, Anna anna.masalskaya@intel.com
1 commit(s) - last commit: 2023-07-26 - Luke Rewega lrewega@c32.ca
1 commit(s) - last commit: 2023-02-14 - Li kunyu kunyu@nfschina.com
1 commit(s) - last commit: 2023-12-04 - Klaus Holst Jacobsen 48069914+klausholstjacobsen@users.noreply.github.com
1 commit(s) - last commit: 2022-03-05 - Ken Matsui 26405363+ken-matsui@users.noreply.github.com
1 commit(s) - last commit: 2023-05-24 - Kai 2644614+Schweinepriester@users.noreply.github.com
1 commit(s) - last commit: 2022-10-13 - Julien Voisin jvoisin@google.com
1 commit(s) - last commit: 2022-10-28 - Joris Clement joris.clement@posteo.de
1 commit(s) - last commit: 2021-10-14 - JFranklin13 jfranklin13@protonmail.com
1 commit(s) - last commit: 2021-10-03 - IohannRabeson IohannRabeson@users.noreply.github.com
1 commit(s) - last commit: 2024-02-08 - Haelwenn Monnier contact+github.com@hacktivis.me
1 commit(s) - last commit: 2021-12-21 - Graham Percival gperciva@tarsnap.com
1 commit(s) - last commit: 2022-05-12 - Gaël PORTAY gael.portay@collabora.com
1 commit(s) - last commit: 2022-09-07 - Ewgeni Wolowik ewgeni.wolowik@scheer-group.com
1 commit(s) - last commit: 2024-03-17 - Elvis Angelaccio elvisangelaccio@users.noreply.github.com
1 commit(s) - last commit: 2024-03-23 - Collin Funk collin.funk1@gmail.com
1 commit(s) - last commit: 2023-01-20 - Bernhard M. Wiedemann githubbmw2020@lsmod.de
1 commit(s) - last commit: 2023-11-27 - Alfred Wingate parona@protonmail.com
1 commit(s) - last commit: 2021-09-17 - Alex Xu 351006+Hello71@users.noreply.github.com
1 commit(s) - last commit: 2021-10-26 - AdamKorcz adam@adalogics.com

> 61 commit(s) - last commit: 2023-12-10 - Martin Matuska martin@matuska.de

This is of course another email for Martin Matuška, the current and long-time maintainer of libarchive.

> 15 commit(s) - last commit: 2021-10-30 - jiat75 jiat0218@gmail.com

These have all been carefully re-reviewed already. The only issue found was fixed in #2101

That would narrow it down to 113 commits, the vast majority of which are only 1 or 2 from the same user. Should these be reviewed?

I would like to look carefully at any of those commits which included binary test files or changes to the build scripts. (Those are sensitive areas that have not always received careful scrutiny.)

If any of the local git scripting experts could share a way to enumerate such changes, I would appreciate the help.

This shows all the commit messages + files changed in my commits, which makes it easy to see when test files are added/modified: git log --stat --author='Mostyn Bramley-Moore <mostyn@antipode.se>'
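
To narrow this kind of listing to the sensitive areas mentioned in the request (binary test files and build scripts), the output of `git log --name-only` can be filtered by path. The sketch below is one possible approach; the function names and the path patterns are illustrative assumptions, not an authoritative list for libarchive.

```shell
# Hypothetical classifier: succeed if the path looks like a uuencoded
# test file or a build script. Patterns are assumptions for this sketch.
is_sensitive_path() {
    case "$1" in
        *.uu|*.m4|*.cmake|CMakeLists.txt|*/CMakeLists.txt) return 0 ;;
        configure.ac|Makefile.am|build/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read "commit ..." header lines followed by file names on stdin and
# print each header once, followed by only the sensitive paths it touches.
filter_sensitive() {
    header=''
    printed=0
    while IFS= read -r line; do
        case "$line" in
            "commit "*) header=$line; printed=0 ;;
            "") ;;
            *)
                if is_sensitive_path "$line"; then
                    if [ "$printed" -eq 0 ]; then
                        printf '%s\n' "$header"
                        printed=1
                    fi
                    printf '    %s\n' "$line"
                fi ;;
        esac
    done
}

# Possible usage inside a clone (not run here):
#   git log --since=2021-01-01 --name-only --format='commit %h %aN <%aE>' | filter_sensitive
```

Commits that touch none of the listed patterns are omitted entirely, which keeps the review list short, at the cost of missing any sensitive file the patterns do not cover.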

Since I'm at the top of the list, I can summarise the test files I added:

7911ce4 added four tests:

  1. libarchive/test/test_read_format_7zip_solid_zstd.7z.uu
  2. libarchive/test/test_read_format_7zip_zstd.7z.uu
  3. libarchive/test/test_read_format_7zip_zstd_bcj.7z.uu
  4. libarchive/test/test_read_format_7zip_zstd_nobcj.7z.uu

(1) and (2) both contain the same four small files, with some basic strings inside.
(1) was compressed with "solid" mode, which compresses all the files together within the .7z archive.
(2) was compressed in "non-solid" mode, which compresses each file separately within the .7z archive.

(3) and (4) both contain the same linux/amd64 C hello world program. I can't remember which compiler I used, nor do I have the source.
It was something like the following, which when I compile with gcc -o hw hw.c on ubuntu 23.04 produces a file that is only 8 bytes larger
than the version in the two test files.

#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("hello, world\n");
    return 0;
}

(3) was compressed using the BCJ filter, which is part of 7zip and, IIRC, LZMA.
(4) was compressed without using the BCJ filter.

a96cb07 added two test files:

  1. libarchive/test/test_read_format_7zip_lzma2_arm.7z.uu
  2. libarchive/test/test_read_format_7zip_zstd_arm.7z.uu

These both contain the same linux C hello world program, compiled for gnueabihf.

(1) Was compressed with LZMA2 and the ARM filter.
(2) Was compressed with zstandard and the ARM filter.

eb2b5ad added two test files:

  1. libarchive/test/test_read_format_7zip_deflate_arm64.7z.uu
  2. libarchive/test/test_read_format_7zip_lzma2_arm64.7z.uu

These both contain the same arm64 linux C hello world program. For some reason it's 72K, whereas if I create and compile the same code now, it produces a 16K executable. I will see if I can figure out why.

a7ea541 added one test file, with a description of the contents in the commit message (using simple strings).

  1. libarchive/test/test_read_format_7zip_win_attrib.7z.uu

> eb2b5ad added two test files:
>
>   1. libarchive/test/test_read_format_7zip_deflate_arm64.7z.uu
>   2. libarchive/test/test_read_format_7zip_lzma2_arm64.7z.uu
>
> These both contain the same arm64 linux C hello world program. For some reason it's 72K, whereas if I create and compile the same code now, it produces a 16K executable. I will see if I can figure out why.

I think I have figured this part out.

This binary was produced on an amd64 ubuntu 23.04 machine, using aarch64-linux-gnu-gcc-13. If I compile a C hello-world program using that compiler, it produces a sparse file that takes 16Kb of space on my ext4 filesystem, with a logical filesize of 70312 bytes. I assume that the sparseness is lost during compression and/or decompression. I also see the same sparse output using a regular host gcc on an arm64 linux machine.

# Note that the original file is sparse:
$ ls -l hw
-rwxrwxr-x 1 mostyn mostyn 70312 Apr  6 23:25 hw
$ du -sh hw
16K	hw

# Copy the binary, in a way that will make the destination non-sparse:
$ cat hw > hw2
$ ls -l hw2 
-rw-rw-r-- 1 mostyn mostyn 70312 Apr  6 23:37 hw2
$ du -sh hw2 
72K	hw2

# Both copies are logically identical, despite the differing sizes on disk:
$ sha256sum hw hw2 
ef5d8311b25d349d4f2f007a7dd2505d30b463e566515f8ebfd273e26dc1b7b5  hw
ef5d8311b25d349d4f2f007a7dd2505d30b463e566515f8ebfd273e26dc1b7b5  hw2
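
The comparison of `ls -l` and `du` above can be turned into a small check. This is a minimal sketch that assumes GNU coreutils `stat` (the `-c` format option; BSD `stat` uses different flags), and the function name `is_sparse` is hypothetical.

```shell
# Hypothetical check: succeed if the file occupies fewer bytes on disk
# than its logical size. Returns 2 if stat fails (e.g. missing file).
# %s = logical size in bytes, %b = allocated blocks, %B = bytes per block.
is_sparse() {
    size=$(stat -c '%s' -- "$1") || return 2
    blocks=$(stat -c '%b' -- "$1") || return 2
    blksz=$(stat -c '%B' -- "$1") || return 2
    [ $(( blocks * blksz )) -lt "$size" ]
}

# Possible usage:
#   is_sparse hw && echo "hw is sparse"
```

Filesystems with transparent compression or inline storage of small files can also report fewer allocated bytes than the logical size, so a positive result is a hint rather than proof of holes in the file.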