PPMd8 decompression produces output one byte shorter than expected
miurahr opened this issue · comments
Describe the bug
PPMd8 decompression with restore_method == PPMD8_RESTORE_METHOD_CUT_OFF, on Python 3.8 on Windows, sometimes produces output that is one byte shorter than expected.
To Reproduce
https://github.com/miurahr/pyppmd/runs/3322119476?check_suite_focus=true
Environment (please complete the following information):
- OS: Windows 10, Linux
- Python: CPython 3.7
- project version: v0.16.0
Additional context
2021-08-13T12:53:50.1591006Z tests/test_ppmd8.py::test_ppmd8_encode_decode[1048576-1] FAILED
2021-08-13T12:53:50.1591880Z
2021-08-13T12:53:50.1592431Z ================================== FAILURES ===================================
2021-08-13T12:53:50.1593036Z _____________________ test_ppmd8_encode_decode[1048576-1] _____________________
2021-08-13T12:53:50.1593498Z
2021-08-13T12:53:50.1594801Z tmp_path = WindowsPath('C:/Users/runneradmin/AppData/Local/Temp/pytest-of-unknown/pytest-0/test_ppmd8_encode_decode_104851')
2021-08-13T12:53:50.1596160Z mem_size = 1048576, restore_method = 1
2021-08-13T12:53:50.1597050Z
2021-08-13T12:53:50.1598047Z     @pytest.mark.parametrize(
2021-08-13T12:53:50.1598706Z         "mem_size, restore_method",
2021-08-13T12:53:50.1599210Z         [
2021-08-13T12:53:50.1599804Z             (8 << 20, pyppmd.PPMD8_RESTORE_METHOD_RESTART),
2021-08-13T12:53:50.1600803Z             (8 << 20, pyppmd.PPMD8_RESTORE_METHOD_CUT_OFF),
2021-08-13T12:53:50.1601474Z             (1 << 20, pyppmd.PPMD8_RESTORE_METHOD_RESTART),
2021-08-13T12:53:50.1602274Z             (1 << 20, pyppmd.PPMD8_RESTORE_METHOD_CUT_OFF),
2021-08-13T12:53:50.1602817Z         ],
2021-08-13T12:53:50.1603210Z     )
2021-08-13T12:53:50.1603705Z     @pytest.mark.timeout(20)
2021-08-13T12:53:50.1604464Z     def test_ppmd8_encode_decode(tmp_path, mem_size, restore_method):
2021-08-13T12:53:50.1605133Z         length = 0
2021-08-13T12:53:50.1605742Z         m = hashlib.sha256()
2021-08-13T12:53:50.1606433Z         with testdata_path.joinpath("10000SalesRecords.csv").open("rb") as f:
2021-08-13T12:53:50.1607227Z             with tmp_path.joinpath("target.ppmd").open("wb") as target:
2021-08-13T12:53:50.1608142Z                 enc = pyppmd.Ppmd8Encoder(6, mem_size, restore_method=restore_method, endmark=True)
2021-08-13T12:53:50.1608921Z                 data = f.read(READ_BLOCKSIZE)
2021-08-13T12:53:50.1609443Z                 while len(data) > 0:
2021-08-13T12:53:50.1609930Z                     m.update(data)
2021-08-13T12:53:50.1610522Z                     length += len(data)
2021-08-13T12:53:50.1611141Z                     target.write(enc.encode(data))
2021-08-13T12:53:50.1611786Z                     data = f.read(READ_BLOCKSIZE)
2021-08-13T12:53:50.1612414Z                 target.write(enc.flush())
2021-08-13T12:53:50.1613540Z         shash = m.digest()
2021-08-13T12:53:50.1614660Z         m2 = hashlib.sha256()
2021-08-13T12:53:50.1615351Z         assert length == 1237262
2021-08-13T12:53:50.1615937Z         length = 0
2021-08-13T12:53:50.1616543Z         with tmp_path.joinpath("target.ppmd").open("rb") as target:
2021-08-13T12:53:50.1617283Z             with tmp_path.joinpath("target.csv").open("wb") as out:
2021-08-13T12:53:50.1618171Z                 dec = pyppmd.Ppmd8Decoder(6, mem_size, restore_method=restore_method, endmark=True)
2021-08-13T12:53:50.1619007Z                 data = target.read(READ_BLOCKSIZE)
2021-08-13T12:53:50.1619588Z                 while len(data) > 0 or not dec.eof:
2021-08-13T12:53:50.1620140Z                     res = dec.decode(data)
2021-08-13T12:53:50.1620632Z                     m2.update(res)
2021-08-13T12:53:50.1621116Z                     out.write(res)
2021-08-13T12:53:50.1621603Z                     length += len(res)
2021-08-13T12:53:50.1622161Z                     data = target.read(READ_BLOCKSIZE)
2021-08-13T12:53:50.1622711Z >       assert length == 1237262
2021-08-13T12:53:50.1623150Z E       assert 1237261 == 1237262
2021-08-13T12:53:50.1623560Z E         +1237261
2021-08-13T12:53:50.1623925Z E         -1237262
2021-08-13T12:53:50.1624222Z
2021-08-13T12:53:50.1624726Z tests\test_ppmd8.py:105: AssertionError
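The assertion above only compares lengths, so it does not show where the byte goes missing. The following is a minimal diagnostic sketch, not part of pyppmd or its test suite: `describe_mismatch` is a hypothetical helper that uses only built-in Python and reports whether the decoded bytes are a truncated prefix of the original or differ somewhere in the middle.

```python
def describe_mismatch(original: bytes, decoded: bytes) -> str:
    """Summarize how the decoded output differs from the original input.

    Hypothetical helper for debugging; it is not part of pyppmd.
    """
    # Length of the common prefix shared by both byte strings.
    common = 0
    limit = min(len(original), len(decoded))
    while common < limit and original[common] == decoded[common]:
        common += 1

    if common == len(original) == len(decoded):
        return "identical"
    if common == len(decoded):
        # Every decoded byte matches, but the tail of the original is absent.
        missing = len(original) - len(decoded)
        return f"decoded output is truncated: last {missing} byte(s) missing"
    if common == len(original):
        # The decoder produced everything plus some trailing bytes.
        extra = len(decoded) - len(original)
        return f"decoded output has {extra} extra trailing byte(s)"
    return f"content differs at offset {common}"
```

In the failing test, this could be called with the contents of `10000SalesRecords.csv` and the `target.csv` written to `tmp_path`, which would indicate whether the decoder stops one byte early at the end of the stream or drops a byte elsewhere.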
Hopefully #54 fixes this issue.