miurahr / pyppmd

pyppmd provides classes and functions for compressing and decompressing text data, using the PPM (Prediction by Partial Matching) compression algorithm, variations H and I.2. It provides an API similar to Python's zlib/bz2/lzma modules.

Home Page: https://pyppmd.readthedocs.io/en/latest/

The return value of Ppmd8_DecodeSymbol should be stored in an int, not an unsigned char, and decoding should stop if it is < 0

cielavenir opened this issue

Describe the bug

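https://github.com/miurahr/pyppmd/blob/v0.15.2/src/ext/_ppmdmodule.c#L1451: the return value of Ppmd8_DecodeSymbol should be stored in an int, not an unsigned char.

If Ppmd8_DecodeSymbol returns a value < 0, decoding should terminate immediately. A char -1 and an int -1 are different values: once narrowed to unsigned char, -1 becomes 255 and can no longer be told apart from a real 0xFF byte (this is the same reason the result of fgetc() should be stored in an int, not a char).

Keeping the value as an int will help: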

  • to identify EOF when -1 is returned
  • to identify a data error when -2 is returned
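The difference can be sketched in a few lines of C. This is only an illustration under stated assumptions: `demo_decoder`, `demo_decode_symbol`, `decode_all` and `decode_narrow` are hypothetical names, not part of pyppmd or the PPMd sources. The stand-in decoder simply follows the convention described above (a byte value 0..255, -1 at end of stream, -2 on a data error).

```c
#include <stddef.h>

/* Hypothetical stand-in for a symbol decoder such as Ppmd8_DecodeSymbol.
 * It replays a scripted list of return values and reports -1 once the
 * script is exhausted. A scripted -2 simulates a data error. */
typedef struct {
    const int *script;
    size_t len;
    size_t pos;
} demo_decoder;

static int demo_decode_symbol(demo_decoder *d)
{
    if (d->pos >= d->len)
        return -1;                 /* end of stream */
    return d->script[d->pos++];
}

/* Keeps the result in an int, so negative sentinels survive.
 * Returns 0 if the output buffer filled up, otherwise the negative
 * code that stopped decoding (-1 = EOF, -2 = data error). */
static int decode_all(demo_decoder *d, unsigned char *out, size_t cap,
                      size_t *written)
{
    size_t n = 0;
    int status = 0;
    while (n < cap) {
        int sym = demo_decode_symbol(d);
        if (sym < 0) {             /* stop immediately on EOF or error */
            status = sym;
            break;
        }
        out[n++] = (unsigned char)sym;
    }
    *written = n;
    return status;
}

/* Problematic variant: narrowing to unsigned char turns -1 into 255,
 * so the loop cannot see the sentinel and always fills the buffer. */
static size_t decode_narrow(demo_decoder *d, unsigned char *out, size_t cap)
{
    size_t n = 0;
    while (n < cap) {
        unsigned char c = (unsigned char)demo_decode_symbol(d);
        out[n++] = c;
    }
    return n;
}
```

With a three-symbol script that contains a genuine 0xFF byte, `decode_all` stops after three bytes and reports -1, while `decode_narrow` keeps writing 0xFF until the buffer is full, because the end-of-stream sentinel and the real byte look identical after narrowing.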

Additional context

  • You can determine EOF from the Ppmd8_DecodeSymbol return value; do you really need an "end marker"?
  • Since you are going to rewrite the buffering, I would be glad if you could take a look at this as well.

Additional context 2

Actually, I wrote a ppmd handler for zipfile, but the additional end marker creates an incompatible stream.

Could you submit your proposed change on top of #33 (once it is merged)?
It is OK to drop the "end marker".

Note: the current "end mark" implementation is partially compatible with what unrar does.

#33 has been changed to do this. (It does not touch the end mark code.)

@miurahr

Firstly, both 7z and rar use PPMdH, and zip uses PPMdI.

PPMd actually has quite a few dialects: although 7z and rar both use PPMdH, their range coders are different. They also differ from the original PPMd program that unpacks the PPMd archive format described at http://www.compression.ru/ds/ .

So different software is free to handle the endmark differently.

Maybe adding an option to set the endmark handling would be a good idea.

Now you can pass an endmark option to the encoder and decoder.

ref: #39

@miurahr Thank you.

@miurahr I confirmed the compatibility of the current pyppmd and was able to release https://pypi.org/project/zipfile-ppmd/ as a zipfile module patcher, which works the same way as zipfile-zstd. Thank you.