AlfredoSequeida / fvid

fvid is a project that aims to encode any file as a video using 1-bit color images to survive compression algorithms for data retrieval.

Proposal: Cython Support

Theelx opened this issue

I believe the remaining optimizations for the lengthy decode process (short of a refactor that processes multiple pixels at a time) are nearly exhausted, and decoding is still only twice as fast as it originally was. So I am proposing that I write an optional Cython extension. Cython is a superset of Python that compiles to C, which in turn compiles to machine code. If you want to see what it looks like and how fast it is, check out the pybase122.pyx file in my pybase122 decoder in my starred repos. I included benchmarks for the best Python version I could write and the best Cython version I could write (with some help). You'll notice that the Cython version I wrote ended up being at least 10x faster than the Python version, and the version that I got help with is over 20x faster. That is a huge speedup. fvid won't see as large a speedup because its decode loop uses fewer math/bit operations, but I bet I can get at least a 4x speedup for the section inside the two for loops in the get_file_from_image (I think that's the name) function.
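For illustration, here is a minimal sketch of the kind of per-pixel nested loop in question. This is not fvid's actual code: the function name `bits_from_image`, the list-of-rows image layout, and the brightness threshold of 127 are all assumptions made for the example. The point is that the loop body runs once per pixel in the interpreter, which is the overhead a compiled extension would remove.

```python
def bits_from_image(pixels, width, height):
    """Walk a row-major grid of grayscale values and return a bit string.

    Hypothetical stand-in for the hot loop; fvid's real function may differ.
    """
    bits = []
    for y in range(height):
        for x in range(width):
            # Assumed rule: bright pixels decode to 1, dark pixels to 0.
            bits.append("1" if pixels[y][x] > 127 else "0")
    return "".join(bits)


# A tiny 2x2 "frame" with one dark pixel followed by three bright ones.
image = [[0, 255], [255, 255]]
print(bits_from_image(image, 2, 2))
```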

The main obstacle to including Cython is that the extension has to be recompiled after every change, and you need to pip install Cython to compile it. Luckily, it is possible to fall back to a Python version of the code if a user doesn't want to install Cython or compile it. This is why I'm proposing that I only Cythonize the slow part inside the loop, and add a check for whether the user's system supports Cython before it actually tries to run the Cython code.
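One common way to do that kind of check is to attempt the import of the compiled module and fall back on `ImportError`. The sketch below assumes a hypothetical extension module named `fvid_cython` exposing a hypothetical `decode_pixels` function; neither name is taken from fvid or the PR, and the fallback body is only a placeholder for the real decode logic.

```python
try:
    # Hypothetical compiled extension; only importable if it was built.
    from fvid_cython import decode_pixels
    USING_CYTHON = True
except ImportError:
    USING_CYTHON = False

    def decode_pixels(pixels):
        """Pure-Python fallback (placeholder logic, assumed threshold of 127)."""
        return "".join("1" if p > 127 else "0" for p in pixels)


# Callers use decode_pixels the same way regardless of which version loaded.
print("Cython extension loaded:", USING_CYTHON)
print(decode_pixels([0, 255, 255, 0]))
```

With this layout the rest of the code never needs to know which implementation it got, so users without Cython or a C compiler still get a working, if slower, decoder.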

For an example of a Cython library, google "pomegranate Cython". Pomegranate is a machine learning library that is blazingly fast because it uses Cython to compile its algorithms. It's also open-source, on GitHub, so you can check the internals.

I'm marking this as a proposal because I don't currently want to do the work if you're not comfortable with it.

Update:
I did the work because I have extra time, and I made a Cython version that decodes about 5-6x as fast as the current Python. I'll make a PR if you say you'll consider it.

@Theelgirl Absolutely! Make the PR and I will take a look at it!

Ok, making now.
Edit: Made the PR, it's ready for compatibility testing. I've only tested it with a custom compiler, not whether pip install compiles it properly, so let's hope it does.

On the Lenna image it benchmarks at about 4x faster than Python. However, fvid fails to decode the resulting mp4 with either version, producing an unusable file.bin.
Is this a known bug?
I did a dumb, the bug is the only other open issue lmao

@Theelgirl So to clarify, did decoding work?

@AlfredoSequeida Yes, using dobro's version fixed it. The problem wasn't with Cython, since it also happened on the normal version; it was something else.

Closing because the PR has been ready for a while.