nodeca / pako

high speed zlib port to javascript, works in browser & node.js

Home Page: http://nodeca.github.io/pako/

"incorrect header check" when there's extra input data

Tschrock opened this issue

Background:
I'm trying to read a record from a git packfile using pako. This format embeds zlib data inside each record, and relies on zlib to indicate when the compressed data is done. In pako 1 this is easy enough - after the decompression is done I can use inflate.strm.avail_in to check how much input was consumed so I can read the next packfile record from the right spot. In pako 2, however, I no longer get a clean indication of the end of the compressed data, and instead get a -3 "incorrect header check" error.

Issue:
When inflating a zlib stream with extra input data at the end (that's not another zlib stream), pako 2 gives me an "incorrect header check". This works fine in pako 1.

It looks like the inflate wrapper in pako 2 assumes the extra input data is another zlib stream and tries to start reading it again, which fails in this case because the extra data is unrelated.

pako/lib/inflate.js

Lines 237 to 245 in 0398fad

    // Skip snyc markers if more data follows and not raw mode
    while (strm.avail_in > 0 &&
           status === Z_STREAM_END &&
           strm.state.wrap > 0 &&
           data[strm.next_in] !== 0)
    {
      zlib_inflate.inflateReset(strm);
      status = zlib_inflate.inflate(strm, _flush_mode);
    }

Removing this from the wrapper allows it to work. Maybe this could be a configuration setting? Or changed to only happen for gzip streams? The gzip spec has repeated records, but zlib does not.

Small example:

const pako = require("pako");

const dataHex = 
    "789C0B492D2E5170492C49E4020013630345" // deflate("Test Data")
    + "14303893"; // 4 bytes of extra data

const data = new Uint8Array(dataHex.match(/.{1,2}/g).map(b => parseInt(b, 16))); // hex to Uint8Array

const inflate = new pako.Inflate({ windowBits: 15 });
inflate.push(data);

console.log("avail_in", inflate.strm.avail_in);
console.log("msg", inflate.msg);
console.log("result", new TextDecoder().decode(inflate.result));

Pako 1.0.11

avail_in: 4
msg: 
result: Test Data

Pako 2.*

avail_in: 2
msg: incorrect header check
result: 
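
The error message and the `avail_in` of 2 in the pako 2 output are consistent with the wrapper restarting inflate on the trailing bytes and rejecting their first two bytes as a zlib header. As a rough illustration, the check that RFC 1950 defines for those two header bytes (CMF and FLG) can be sketched as below. `looksLikeZlibHeader` is a hypothetical helper written for this note, not part of pako, and it only covers the method and FCHECK fields:

```javascript
// Sketch of the RFC 1950 header check applied to the first two bytes of a
// zlib stream. `looksLikeZlibHeader` is a hypothetical helper, not a pako API.
function looksLikeZlibHeader(cmf, flg) {
  // CM (low nibble of CMF) must be 8, meaning "deflate".
  const methodIsDeflate = (cmf & 0x0f) === 8;
  // FCHECK: CMF and FLG read as a 16-bit big-endian value must be a multiple of 31.
  const checkPasses = ((cmf << 8) | flg) % 31 === 0;
  return methodIsDeflate && checkPasses;
}

// First two bytes of the example stream (0x78 0x9C).
const realHeader = looksLikeZlibHeader(0x78, 0x9c);
// First two bytes of the 4 bytes of extra data (0x14 0x30).
const trailingBytes = looksLikeZlibHeader(0x14, 0x30);

console.log("0x78 0x9C:", realHeader);
console.log("0x14 0x30:", trailingBytes);
```

The real header passes, while the first two trailing bytes fail the modulo-31 test, which is the condition zlib reports as "incorrect header check".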

It's hard to make the wrapper universal. v2 fixed many other messy edge cases.

If you have time, I'd suggest inspecting the node.js wrapper for zlib.

Maybe that will give you an idea of what can be improved. It was used as the base for the v2 wrapper, but maybe I missed (or did not understand) something important.
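
For comparison, here is a minimal sketch of the same input run through Node's built-in `zlib` module instead of pako. It assumes a Node version where `inflateSync` accepts the `info: true` option and where `engine.bytesWritten` reports the number of input bytes the engine consumed. It only shows how Node's wrapper handles this case and is not a proposed pako fix:

```javascript
const zlib = require("zlib"); // Node.js built-in module, not pako

const dataHex =
    "789C0B492D2E5170492C49E4020013630345" // zlib stream from the example in this issue
    + "14303893"; // 4 bytes of unrelated trailing data
const data = Buffer.from(dataHex, "hex");

// With `info: true`, the sync API returns { buffer, engine } instead of only the buffer.
const { buffer, engine } = zlib.inflateSync(data, { info: true });

// Assumption: bytesWritten is the count of input bytes processed by the engine,
// so the next packfile record would start at this offset.
const consumed = engine.bytesWritten;
const remaining = data.length - consumed;

console.log("consumed", consumed);
console.log("remaining", remaining);
console.log("result", buffer.toString("utf8"));
```

If Node behaves as assumed here, `remaining` is the 4 trailing bytes, matching what pako 1 reported through `inflate.strm.avail_in`.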

I have the same question.