pieroxy / lz-string

LZ-based compression algorithm for JavaScript
Unable to decompress data saved to disk - loading issue?

Genhain opened this issue · comments

Information

Just like #101, I'm experiencing this issue. I'm compressing data and then saving it to Realm; when I get it back out of Realm and decompress it, I hit the issue. For reference, when I compress and then immediately decompress, it appears to be fine; it's only after writing to Realm and reading it back that the problem shows up. Obviously this might be an issue with Realm itself; I'm just putting this forward in case you know why it might be happening.

Been hitting this myself for both raw compression and Uint8Array compression. If it's the same cause, then it's an issue with NodeJS loading the data again and turning the Buffer back into usable raw data. Using any of the other (current) modes works perfectly well with no issues, so I hope someone sees this and has a look at it (specifically bin/cli.cjs; as soon as that works, the test.sh script will have those tests passing, since it can already save the data perfectly well).

How is the read for decompression being done? Pretty much all, if not all, of the built-in methods for reading data into a buffer sanitize it into a "safe" string, with things like control characters removed or escaped.

I had to write my own functions to handle this process for my projects; maybe something similar, as a helper suite, would be worth adding to the main library?

That's pretty much the problem. The raw.bin for "tattoo" (test suite) is 972 bytes; loading it in NodeJS via readFileSync gives a string of roughly 1092 bytes (I can't remember the exact figure), while buffer.byteLength reports the correct 972 bytes. I've tried various combinations of TextDecoder and manual iteration, and I can't end up with a 972-byte string to pass to the stock decode. A lot of testing (and the internal validation) shows that everything, including the file writing, is correct; it's only the re-loading step that fails.

Have you got a public repo to show how you managed it?

Yeah the functions are defined here: https://github.com/JackalLabs/jackal.js/blob/v3-upgrade/src/utils/converters.ts#L100

stringToUint8Array and stringToUint16Array

The functions are used in a few places, on both compressed strings and standard ones.
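
For readers who don't want to follow the link, the idea behind helpers like these is simple: each char code of the string becomes one element of the typed array. The bodies below are an illustrative sketch, not the jackal.js source.

```javascript
// Sketch of string -> typed array helpers (names mirror the linked
// converters; bodies are illustrative). Each char code maps to one element.
function stringToUint8Array(str) {
  const out = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) out[i] = str.charCodeAt(i) & 0xff; // low byte only
  return out;
}

function stringToUint16Array(str) {
  const out = new Uint16Array(str.length);
  for (let i = 0; i < str.length; i++) out[i] = str.charCodeAt(i); // full 16-bit code
  return out;
}

console.log(Array.from(stringToUint16Array("ab"))); // [97, 98]
```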

It's the other direction that has the problem: NodeJS loads the file as a Buffer (basically a Uint8Array), and converting that back to a string is where it goes wrong. :-/

I thought the on-disk saved version was the issue? Anyway, this handles conversion from a buffer-like object to a string.

https://github.com/JackalLabs/jackal.js/blob/v3-upgrade/src/utils/converters.ts#L70

I did some experimenting before moving on due to a time crunch, but I think Node actually uses a bastardized UTF: Buffer <> string conversion doesn't behave 1:1 with what we see in the browser.
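
One concrete way the roundtrip can lose data (a hedged illustration, not necessarily the exact failure here): lz-string's raw output can contain arbitrary 16-bit char codes, including lone surrogates, and a Buffer utf8 roundtrip silently replaces those with U+FFFD.

```javascript
// Sketch: a lone surrogate is a legal JS string value but not valid
// Unicode, so Node's utf8 encoder replaces it with U+FFFD on the way
// through a Buffer, and the original char code is unrecoverable.
const raw = String.fromCharCode(0xd800); // lone high surrogate
const roundTripped = Buffer.from(raw, "utf8").toString("utf8");

console.log(raw.charCodeAt(0));          // 55296 (0xD800)
console.log(roundTripped.charCodeAt(0)); // 65533 (U+FFFD) — original lost
```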

It does something weird. I'd actually tried that one without the spread; I'll give it a try later when I get a chance. I've specifically passed null for the encoding so it doesn't try to UTF-8 it, but I don't know what it's doing. :-/

Finally managed to figure it out: treat the data as a potential Uint8Array that might contain an odd number of bytes. That handling is included here, and it's used when decompressing from Uint8Array since that path supports more inputs; note that the encoding side is a breaking change (it now supports data of non-even length).
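
A sketch of the odd-length-aware direction described above. The big-endian pairing and the handling of a lone trailing byte (kept as its own high-byte char) are assumptions for illustration, not lz-string's documented wire format.

```javascript
// Sketch: pair bytes big-endian into 16-bit char codes; if the input
// has an odd length, the last byte becomes its own char (in the high
// byte). Pairing order here is an assumption, not the library's spec.
function bytesToString(u8) {
  let out = "";
  const even = u8.length - (u8.length % 2);
  for (let i = 0; i < even; i += 2) {
    out += String.fromCharCode((u8[i] << 8) | u8[i + 1]);
  }
  if (u8.length % 2 === 1) {
    out += String.fromCharCode(u8[u8.length - 1] << 8); // lone trailing byte
  }
  return out;
}

console.log(bytesToString(new Uint8Array([0x01, 0x02, 0x03])).length); // 2
```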