valgur / ncompress

LZW compression and decompression in Python and C++

Compressed Image Band Unable to be Decompressed

seandomal opened this issue

I have some imaging data in which the image is split into bands; some bands are compressed and some are not. For one band, for example, the compressed length is 291 and the original length is 32768. The image data itself is 16-bit, so each value takes two bytes. This is the compressed byte stream for that band:

b"\x80\x00 P8$\x16\r\x07\x84BaP\xb8d6\x1d\x0f\x88DbQ8\xa4V-\x17\x8cFcQ\xb8\xe4v=\x1f\x90HdR9$\x96M'\x94JeR\xb9d\xb6]/\x98LfS9\xa4\xd6m7\x9cNgS\xb9\xe4\xf6}?\xa0PhT:%\x16\x8dG\xa4RiT\xbae6\x9dO\xa8TjU:\xa5V\xadW\xacVkU\xba\xe5v\xbd_\xb0XlV;%\x96\xcdg\xb4ZmV\xbbe\xb6\xddo\xb8\\nW;\xa5\xd6\xedw\xbc^oW\xbb\xe5\xf6\xfd\x7f\xc0`pX<&\x17\r\x87\xc4bqX\xbcf7\x1d\x8f\xc8drY<\xa6W-\x97\xccfsY\xbc\xe6w=\x9f\xd0htZ=&\x97M\xa7\xd4juZ\xbdf\xb7]\xaf\xd8lv[=\xa6\xd7m\xb7\xdcnw[\xbd\xe6\xf7}\xbf\xe0px\\>'\x17\x8d\xc7\xe4ry\\\xbeg7\x9d\xcf\xe8tz]>\xa7W\xad\xd7\xecv{]\xbe\xe7w\xbd\xdf\xf0x|^?'\x97\xcd\xe7\xf4z}^\xbfg\xb7\xdd\xef\xf8|~_?\xa7\xd7\xed\xf7\xfc~\x7f_\xbf\xe7\xf7\xfc\xff\xb0\x08\x08"

When I try to decompress this using your functions, they do not recognize it as LZW format. However, according to the documentation for the imaging data, it was compressed using LZW.

Would it be possible to look into adapting the decompress function to accept this? Or at least help me understand why this isn't working? @valgur

@seandomal That byte string is not a well-formed LZW-format compressed file. At the very least, it should start with the \x1F\x9D file signature. Prepending the signature bytes to the string, or to any substring of it, does not produce any valid-looking output either.
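
As a quick illustration of the signature check, here is a minimal sketch. `has_compress_signature` is a hypothetical helper name, not part of this library's API, and it only inspects the first two bytes:

```python
# Two-byte signature that opens a file written by the compress/ncompress tool.
COMPRESS_MAGIC = b"\x1f\x9d"


def has_compress_signature(data: bytes) -> bool:
    """Return True if data starts with the two-byte compress signature."""
    return data[:2] == COMPRESS_MAGIC


# First bytes of the band posted in this issue: they are not the signature.
band_prefix = b"\x80\x00 P8$"
print(has_compress_signature(band_prefix))          # False
print(has_compress_signature(b"\x1f\x9d\x90rest"))  # True
```

A passing check says nothing about whether the rest of the stream is valid; it only rules out inputs that cannot be in this format.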

You will have to dig into the image format specification or look into reverse-engineering the compression format. There's a decent chance it's not even using LZW as there are many very similar compression formats around (https://en.wikipedia.org/wiki/List_of_algorithms#Lossless_compression_algorithms). I can take a closer look at the image format spec, if you can share one, to see if I can help with figuring out the compression format.
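
One possible first step in that kind of reverse-engineering is to look at how the leading bits of the stream group into codes. The sketch below is only a probe, not a decoder, and it rests on assumptions the thread does not confirm: fixed 9-bit codes packed most-significant-bit first, which is the packing TIFF-style LZW starts with. `read_codes_msb` is a hypothetical helper name.

```python
def read_codes_msb(data: bytes, width: int = 9, count: int = 5) -> list[int]:
    """Unpack up to `count` fixed-width codes from data, MSB first."""
    codes = []
    bit_buffer = 0  # bits read from data but not yet consumed
    bits_held = 0   # number of valid bits in bit_buffer
    for byte in data:
        bit_buffer = (bit_buffer << 8) | byte
        bits_held += 8
        while bits_held >= width and len(codes) < count:
            bits_held -= width
            codes.append((bit_buffer >> bits_held) & ((1 << width) - 1))
        # Keep only the bits that have not been consumed yet.
        bit_buffer &= (1 << bits_held) - 1
        if len(codes) == count:
            break
    return codes


# First six bytes of the band posted in this issue.
print(read_codes_msb(b"\x80\x00 P8$"))  # [256, 0, 258, 259, 260]
```

In TIFF-style LZW, 256 is the Clear code and 257 the end-of-information code, so a stream that opens with 256 and then steps through 258, 259, 260 would be consistent with that variant compressing a run of identical bytes. That is a hint to check against the image format spec, not a confirmed identification.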

Regardless of the actual format, I would not consider adding it to this library as it is just a faithful port of the standard ncompress tool. For experimentation and hacking I recommend starting with the pure-Python version of the LZW decompression function available here: https://github.com/scivision/unlzw3/blob/main/src/unlzw3/__init__.py.