mxmlnkn / rapidgzip

Gzip Decompression and Random Access for Modern Multi-Core Machines

Add test mode that only returns the uncompressed size and checks it and maybe also the CRC32

mxmlnkn opened this issue

Getting the uncompressed size is a common use case.

Avoiding the copies for backward pointers might improve performance for this particular use case.

And marker replacement is not necessary in this case! Furthermore, allocations and copying the results out of the deflate decompressor can be avoided. All together, this should add up to a significant performance increase.

A --count option similar to --count-lines also wouldn't be out of line (pun intended).
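For illustration, such a count-only pass can be sketched in Python with the standard zlib module (an illustrative analogue only; pragzip's actual implementation is C++ and its internal deflate decoder, not zlib). It streams the compressed input in fixed-size chunks and only accumulates the uncompressed byte and newline counts instead of ever storing the output:

```python
import zlib


def count_gzip(path, chunk_size=1 << 20):
    """Stream-decompress a (single-member) gzip file and accumulate only
    the uncompressed byte count and line count, never storing the output.
    Illustrative sketch of a --count / --count-lines mode, not the actual
    C++ implementation in pragzip/rapidgzip."""
    # wbits=31 tells zlib to expect a gzip header and a 15-bit window.
    decompressor = zlib.decompressobj(wbits=31)
    total_bytes = 0
    total_lines = 0
    with open(path, "rb") as file:
        while compressed := file.read(chunk_size):
            decompressed = decompressor.decompress(compressed)
            total_bytes += len(decompressed)
            total_lines += decompressed.count(b"\n")
    return total_bytes, total_lines
```

Even this sketch has to materialize each decompressed chunk because zlib's Python binding returns bytes objects; the point of the proposed mode is that the C++ decoder can skip even that copy.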

I couldn't observe any speed improvements (see the count-only branch) for the single-core case, even for a highly compressed file:

m pragzip && src/tools/pragzip -v --count test-files/small/small.gz
    Decompressed in total 536870912 B in 2.17922 s -> 246.36 MB/s

m pragzip && src/tools/pragzip -v -d -o /dev/null -P 1 test-files/small/small.gz
    Decompressed in total 536870912 B in 2.17986 s -> 246.287 MB/s
    Decompressed in total 536870912 B in 2.73646 s -> 196.192 MB/s
    Decompressed in total 536870912 B in 2.15831 s -> 248.746 MB/s

m pragzip && src/tools/pragzip -v --count test-files/small/small.gz
    Decompressed in total 536870912 B in 2.08869 s -> 257.037 MB/s
m pragzip && src/tools/pragzip -v --count test-files/small/small.gz
    Decompressed in total 536870912 B in 2.17042 s -> 247.358 MB/s
    
src/tools/pragzip -d -o /dev/null CTU-13-Dataset.tar.gz
    Decompressed in total 79747543040 B in 43.8358 s -> 1819.23 MB/s
src/tools/pragzip --count CTU-13-Dataset.tar.gz
    Decompressed in total 79747543040 B in 207.621 s -> 384.102 MB/s
src/tools/pragzip -d -P 1 -o /dev/null CTU-13-Dataset.tar.gz
    Decompressed in total 79747543040 B in 204.375 s -> 390.202 MB/s

And the amount of added code is not negligible, even for the sequential case alone. Adding a further specialization for the parallel case does not seem worth it.

Another aspect is memory usage, which could be reduced hugely. This should also integrate more easily into the existing parallel gzip reader: there could simply be a flag that skips copying the ranges returned by deflate::Block::read and only accumulates their size.
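A minimal sketch of such a flag, again in Python with zlib rather than the real C++ classes (the function and parameter names here are hypothetical): the reader either collects the decoded ranges into an output buffer or, in count-only mode, merely adds up their lengths, so no output accumulates in memory.

```python
import zlib


def read_gzip(path, count_only=False, chunk_size=1 << 20):
    """Decompress a (single-member) gzip file. With count_only=True, skip
    storing the decoded data and only accumulate its size, which keeps
    memory usage flat. Hypothetical Python analogue of the proposed flag;
    the real code would live in the C++ deflate::Block / gzip reader."""
    decompressor = zlib.decompressobj(wbits=31)  # gzip header, 15-bit window
    size = 0
    chunks = []  # only filled when the caller actually wants the output
    with open(path, "rb") as file:
        while compressed := file.read(chunk_size):
            decoded = decompressor.decompress(compressed)
            size += len(decoded)
            if not count_only:
                chunks.append(decoded)  # the copy that the flag avoids
    return size if count_only else b"".join(chunks)
```

With count_only=True, peak memory stays bounded by the chunk size instead of growing with the uncompressed stream, which is the memory reduction described above.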

Implemented with e9b27b21.