Not decompressing entire stream
coder543 opened this issue
Something about the gzip encoder used to create the CommonCrawl archives doesn't play well with libflate. It only seems to decompress the first few hundred bytes.
If I gunzip it then gzip it again, libflate is able to decompress the entire file correctly... so, it's interesting.
Using flate2, I'm able to decompress the entire file, but I had to use MultiGzDecoder, so I think the issue is that the file consists of several gzip streams in sequence. Supposedly, the gzip standard allows this.
Thank you for the information.
I tried adding gzip::MultiDecoder in commit deadd75.