kspalaiologos / bzip3

A better and stronger spiritual successor to BZip2.


Decompression performances

notorand-it opened this issue · comments

Why not post those numbers as well?
Quite often you compress once and decompress many times.

Even if you don't post decompression numbers, it would be helpful if the etc/benchmark.png graphic linked from the top-level README.md stated that the reported running times are for compression.

Note also that different implementations of the same file format can have significantly different performance (running times). Decompression has only one correct output and no "speed versus size" configuration knobs, but even so, for the gzip file format, I'll quote https://nigeltao.github.io/blog/2021/fastest-safest-png-decoder.html#gzip

The gzip file format is, roughly speaking, DEFLATE compression combined with a CRC-32 checksum. Like example/crc32, Wuffs’ example/zcat program is roughly equivalent to Debian’s /bin/zcat, other than being 3.1x faster (2.680s vs 8.389s) [emphasis added] on the same 178 MiB file and also running in a self-imposed SECCOMP_MODE_STRICT sandbox.

That's compared to /bin/zcat rather than /bin/gunzip, but the point still stands: there may be faster implementations.

From https://nigeltao.github.io/blog/2021/fastest-safest-png-decoder.html#crc-32, further up the same page:

As for performance, Wuffs’ example/crc32 program is roughly equivalent to Debian’s /bin/crc32, other than being 7.3x faster (0.056s vs 0.410s) [emphasis added] on this 178 MiB file.

You may be able to re-use its SIMD techniques for bzip3's own CRC-32 implementation.

Before suggesting unnecessary optimisations, please consider profiling. You can profile bzip3 on common test cases, such as the Silesia corpus, using Valgrind's callgrind tool; any GUI front-end (e.g. KCachegrind) can then be used to visualise the results.

As a result of such profiling: CRC32 accounts for approximately 1.1% of the runtime. The total on the Silesia corpus is 17.42s, so CRC32 takes an absolutely astonishing amount of time - 170 milliseconds.

Secondly, the decompression numbers are unsurprising: LZ compressors spend a lot of time compressing data and then unpack it quickly, whereas bzip3, like bzip2 and every other Burrows-Wheeler transform/CM/AC based compressor, exhibits the opposite behaviour, and I don't want to draw attention to the fact that it can take longer to decompress archives. bzip3 is not a replacement for zstandard, lzma or any other LZ-based compressor in all use cases. I added the benchmarks only to provide some sense of scale and to please people who misunderstand the point of this project.