geohot / twitchctw

compression = AI

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

CTW compressor. How long before we beat gzip

gzip: 3728 bytes <-- not easy to beat actually
xz:   3656 bytes


## comparing to existing CTW implementations

# https://github.com/cberzan/ctw.git ctw_baseline
This is the authors imp?
gets 3484, so CTW can do better than gzip

Though this code has tons of hacks, notably MAXCOUNTS
If you remove MAXCOUNTS it's crap, but this could be other things
That's the annoying thing about this problem, little bug = slightly worse

And I don't understand it's tree depth

# https://github.com/fumin/ctw.git
gets 4616, err, not good

# https://github.com/gkassel/pyaixi
CTW implementation seems numerically unstable...

# https://github.com/mgbellemare/SkipCTS/tree/master/python
logprob of data is 2988 bytes with 16 context, haven't tried in compressor yet

Maybe this isn't as easy as I thought...

Update
===

It is a working compressor (run ./test.sh) but it's performance is quite bad.

Maybe it's bugs? One day we'll work through the proofs in the paper.

About

compression = AI


Languages

Language:Python 99.7%Language:Shell 0.3%