phoboslab / qoi

The “Quite OK Image Format” for fast, lossless image compression

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Couple of potential improvements to the algorithm, including space-filling curves and removing (sometimes) unnecessary QOI-OP-RGB marks

AZMCode opened this issue · comments

While as of immediately now I don't have the time to code an implementation of these suggestions, here's a couple of them I thought might help the algorithm perform better. I'll take some time off tonight to begin work on these suggestions to see if they improve compression efficiency.

  • Space-Filling Curves
    • As of now, the algorithm scans LTR, TTB each row of pixels, which means that often it'll try to encode a pixel to the left of the image as difference of a pixel from the right of the image. One simple solution to this problem would be to re-arrange the pixels before encoding according to a space-filling curve. This would mean that every next pixel fed into the algorithm is guaranteed to lie nearby the previous one, diminishing the need to reencode a pixel in full. A unique space-filling curve could be calculated on the fly for each possible width and height of an image, stored on the image header.
  • Removing unnecessary QOI-OP-RGB headers.
    • By storing a specification version on the image header, we can know which bytes correspond to different encoding modes, like difference or accessing the hash table. I suggest this, combined with the alpha information on the image header, could be used to not insert the QOI-OP-RGB header at all, if the red channel of the pixel to be encoded in full doesn't interfere with any other OPs. Any byte starting with unknown starting bits for the current format version would be considered the red channel of a new fully-encoded pixel, potentially storing the pixel without any expansion in the representation.

If my explanation isn't too clear, I'll try to include a PR to better explain these ideas as code. These ideas are both meant to compress the representation of fully-encoded pixels, as well as trying to reduce their need in the first place in organic -looking images.

If these changes turn out to decrease the efficiency of the algorithm I'll gladly close the issue, just thought they might be useful.

commented

While from a logistical point of view the first suggestion might yield better coherence due to fewer pixels having a large difference from the previous one, this is usually a nightmare for data cache performance as CPUs and MMUs are optimized for linear access patterns.

Also: Both of the suggested changes are breaking changes to the format and thus unlikely to be adopted unless a completely new revision of the format is created. As it currently stands this is very unlikely. Furthermore both of the above suggestions introduce quite a bit of processing complexity which qoi currently des not have.

Both of the suggested changes are breaking changes to the format and thus unlikely to be adopted

Yes. I also just updated the README to clarify that this format will not change.

Space-Filling Curves

These have been discussed before and are quite interesting! However, so far the experiments over on qoi2-bikeshed with space-filling curves didn't yield much of an improvement in compression ratio, IIRC.

Any byte starting with unknown starting bits for the current format version would be considered the red channel of a new fully-encoded pixel

How would that work? All four possible combinations of the 2bit tag are already occupied. I.e. there is no red-color that doesn't start with any of the existing tags.

Excuse me then, I appear to have misunderstood the specification. The red-channel idea would indeed not work at all. Also thanks for the notification on space-filling curves not yielding much of an improvement. I find it kinda surprising, but numbers don't lie. If I may ask, what kinds of space-filling curves were tried? I can't seem to find anything in the repo.
Closing for now.

Btw, thanks for the quick and polite reply. Not as much can be said for many other repos. Sorry for the extraneous issue.

Here is some discussion: nigeltao/qoi2-bikeshed#6

I also remember someone tried a simple block encoding with 8x8 pixels, where it encoded the first row of 8 pixels left to right, the second row right to left etc. That gave some improvements. But I can't find the thread right now, sorry.