Add a test mode that only returns the uncompressed size, verifies it, and maybe also checks the CRC32
mxmlnkn opened this issue · comments
Getting the uncompressed size is a common use case.
Avoiding the copies for backward pointers might improve performance for this particular use case.
And marker replacement is not necessary in this case! Allocations and the copy out of the deflate decompressor can also be avoided. Taken together, this should yield a significant performance increase.
A --count option similar to --count-lines also wouldn't be out of line (pun intended).
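For the "checks it" part, here is a minimal sketch of what the size check could look like. It relies only on the gzip trailer layout from RFC 1952 (the last 8 bytes of a member are CRC32 followed by ISIZE, both little-endian, with ISIZE being the uncompressed size modulo 2^32). The names `GzipTrailer`, `parseTrailer` and `sizeMatchesTrailer` are made up for illustration and are not part of pragzip, and the sketch assumes a single-member file held fully in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Trailer of a gzip member as defined by RFC 1952.
struct GzipTrailer
{
    std::uint32_t crc32;
    std::uint32_t isize;  // uncompressed size modulo 2^32
};

inline std::uint32_t
readLittleEndian32( const std::uint8_t* p )
{
    return static_cast<std::uint32_t>( p[0] )
           | ( static_cast<std::uint32_t>( p[1] ) << 8U )
           | ( static_cast<std::uint32_t>( p[2] ) << 16U )
           | ( static_cast<std::uint32_t>( p[3] ) << 24U );
}

// Hypothetical helper: interpret the last 8 bytes of a gzip member as its trailer.
inline GzipTrailer
parseTrailer( const std::vector<std::uint8_t>& member )
{
    assert( member.size() >= 8 );
    const auto* const end = member.data() + member.size();
    return { readLittleEndian32( end - 8 ), readLittleEndian32( end - 4 ) };
}

// Hypothetical check for a count-only mode: only the lower 32 bits of the
// counted size can be compared because ISIZE wraps around at 4 GiB.
inline bool
sizeMatchesTrailer( std::uint64_t countedSize, const GzipTrailer& trailer )
{
    return static_cast<std::uint32_t>( countedSize & 0xFFFFFFFFULL ) == trailer.isize;
}
```

Because ISIZE wraps, it cannot replace actually counting the decompressed bytes for files larger than 4 GiB; it can only serve as a consistency check on the counted value.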
I couldn't observe any speed improvement (see the count-only branch) for the single-core case, even for a highly compressed file:
m pragzip && src/tools/pragzip -v --count test-files/small/small.gz
Decompressed in total 536870912 B in 2.17922 s -> 246.36 MB/s
m pragzip && src/tools/pragzip -v -d -o /dev/null -P 1 test-files/small/small.gz
Decompressed in total 536870912 B in 2.17986 s -> 246.287 MB/s
Decompressed in total 536870912 B in 2.73646 s -> 196.192 MB/s
Decompressed in total 536870912 B in 2.15831 s -> 248.746 MB/s
m pragzip && src/tools/pragzip -v --count test-files/small/small.gz
Decompressed in total 536870912 B in 2.08869 s -> 257.037 MB/s
m pragzip && src/tools/pragzip -v --count test-files/small/small.gz
Decompressed in total 536870912 B in 2.17042 s -> 247.358 MB/s
src/tools/pragzip -d -o /dev/null CTU-13-Dataset.tar.gz
Decompressed in total 79747543040 B in 43.8358 s -> 1819.23 MB/s
src/tools/pragzip --count CTU-13-Dataset.tar.gz
Decompressed in total 79747543040 B in 207.621 s -> 384.102 MB/s
src/tools/pragzip -d -P 1 -o /dev/null CTU-13-Dataset.tar.gz
Decompressed in total 79747543040 B in 204.375 s -> 390.202 MB/s
And the amount of added code is not negligible, even for the sequential case. Adding a specialization for the parallel case does not seem worth it.
Another aspect would be memory usage, which could be hugely reduced. This should also integrate more easily into the existing parallel gzip reader. There could simply be a flag that skips copying the ranges returned by deflate::Block::read and only accumulates their size.
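A rough sketch of that flag idea, under assumptions: `BufferView` and `OutputSink` are hypothetical names, and `BufferView` is only a stand-in for whatever range type deflate::Block::read actually returns. This is not the code from the branch, just the shape of the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one range of decoded bytes handed out by the decoder.
struct BufferView
{
    const std::uint8_t* data;
    std::size_t size;
};

// Hypothetical sink: always accumulates the size and copies the bytes
// only when the count-only flag is not set.
class OutputSink
{
public:
    explicit OutputSink( bool countOnly ) : m_countOnly( countOnly ) {}

    void
    consume( const std::vector<BufferView>& views )
    {
        for ( const auto& view : views ) {
            assert( ( view.data != nullptr ) || ( view.size == 0 ) );
            m_totalSize += view.size;
            if ( !m_countOnly ) {
                // The copy (and the allocation behind it) is the part the flag skips.
                m_output.insert( m_output.end(), view.data, view.data + view.size );
            }
        }
    }

    std::uint64_t totalSize() const { return m_totalSize; }
    const std::vector<std::uint8_t>& output() const { return m_output; }

private:
    const bool m_countOnly;
    std::uint64_t m_totalSize{ 0 };
    std::vector<std::uint8_t> m_output;
};
```

With the flag set, the sink holds only a 64-bit counter regardless of the decompressed size, which is where the memory savings would come from.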
Implemented with e9b27b21.