koaning / embetter

just a bunch of useful embeddings

Home Page: https://koaning.github.io/embetter/

Color Histograms - Additional Tricks

koaning opened this issue

This approach could work pretty well as an implementation:
https://danielmuellerkomorowska.com/2020/06/17/analyzing-image-histograms-with-scikit-image/

To do something similar to what is explained here:
https://www.pinecone.io/learn/color-histograms/

I have a couple of suggestions for possible extensions for this. The first is a simple trick: instead of building a histogram for each channel, build an eCDF for each channel and concatenate those together. Why would you do this? For one-dimensional distributions, it turns out that the Wasserstein-1 distance between two distributions is the L1 distance between their cumulative distribution functions. Doing that over each channel independently means the L1 distance over the concatenated vector is a kind of sliced-Wasserstein distance between the 3D RGB-space histograms. So there are some layers of approximation there, but the end result is that it approximates the standard metric for measuring distance between images using colour histograms.
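To make the eCDF idea concrete, here is a minimal sketch using only NumPy. It is not part of embetter: the function names `channel_ecdf_features` and `approx_color_distance`, the default of 32 bins, and the assumption of `(H, W, C)` uint8 input are all made up for illustration.

```python
import numpy as np


def channel_ecdf_features(image, n_bins=32):
    """Concatenate per-channel empirical CDFs of an (H, W, C) uint8 image.

    Hypothetical helper for illustration, not an embetter API.
    """
    image = np.asarray(image)
    features = []
    for channel in range(image.shape[-1]):
        # Bin the channel values, then accumulate into a normalised CDF.
        hist, _ = np.histogram(image[..., channel], bins=n_bins, range=(0, 256))
        features.append(np.cumsum(hist) / hist.sum())
    return np.concatenate(features)


def approx_color_distance(image_a, image_b, n_bins=32):
    """L1 distance between concatenated eCDFs, scaled by the bin width.

    Per channel this equals the Wasserstein-1 distance between the binned
    distributions; summing over channels gives the sliced-style approximation.
    """
    bin_width = 256 / n_bins
    diff = channel_ecdf_features(image_a, n_bins) - channel_ecdf_features(image_b, n_bins)
    return bin_width * np.abs(diff).sum()
```

In a nearest-neighbour setting you would store the output of `channel_ecdf_features` as the embedding and query with an L1 (Manhattan) metric, since the bin-width factor is a constant that does not change the ranking.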

The second suggestion involves a little more work. It is possible to transform colours into a perceptual colour space and do the histogramming there. The catch then is binning, but it seems reasonable to just quantize colours in that space, and a simple K-Means on a set of training pixels (from a set of training images) should do a decent job of that. That would provide adaptive histogram bins based on perceptual distances between colours for a set of training images. If that sounds interesting, I would be happy to code it up and put in a PR to add it as an alternative histogram transformer.
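A rough sketch of how that could look, assuming CIELAB as the perceptual space (via `skimage.color.rgb2lab`) and scikit-learn's `KMeans` for the quantization. The function names, the 64-colour default, and the per-image pixel subsampling are illustrative choices, not anything that exists in embetter.

```python
import numpy as np
from skimage.color import rgb2lab
from sklearn.cluster import KMeans


def to_lab_pixels(image):
    """Flatten an (H, W, 3) uint8 RGB image into an (H*W, 3) array of CIELAB pixels."""
    rgb = np.asarray(image, dtype=np.float64) / 255.0
    return rgb2lab(rgb).reshape(-1, 3)


def fit_color_codebook(train_images, n_colors=64, pixels_per_image=1000, seed=0):
    """Learn adaptive colour bins with K-Means on pixels sampled from training images."""
    rng = np.random.default_rng(seed)
    samples = []
    for image in train_images:
        pixels = to_lab_pixels(image)
        # Subsample so the clustering cost does not grow with image resolution.
        n = min(pixels_per_image, len(pixels))
        samples.append(pixels[rng.choice(len(pixels), size=n, replace=False)])
    kmeans = KMeans(n_clusters=n_colors, n_init=10, random_state=seed)
    return kmeans.fit(np.vstack(samples))


def perceptual_histogram(image, codebook):
    """Histogram of an image over the learned colour clusters, normalised to sum to 1."""
    labels = codebook.predict(to_lab_pixels(image))
    counts = np.bincount(labels, minlength=codebook.n_clusters)
    return counts / counts.sum()
```

Because Euclidean distance in CIELAB roughly tracks perceived colour difference, the K-Means cells act as bins that are adapted to the colours actually present in the training images.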

This issue should have been closed since there is already a component here 😅, that said ...

I think adding an eCDF option for the ColorHistogramEncoder could be fine, I'm just curious about applications where this would make a big difference. Do you know of any?

The perceptual color space intrigues me, but I think it'll need to be another component. I also noticed your https://github.com/lmcinnes/glasbey project, so I can imagine there are some tricks top of mind for you here. Out of curiosity: do you have an application in mind for these kinds of features? I don't mind adding new components, but I prefer to be a bit conservative and only add items that have an interesting use-case.

I don't really have any specific use cases in mind. It was more a matter of running across the ColorHistogramTransformer and immediately having ideas about slightly more advanced ways to do it. So I would anticipate that these would work better on image problems for which the current ColorHistogramTransformer already works okay. I certainly understand the desire to keep things simple -- I assumed it was best to get a sense of whether you thought an extra transformer was going to be worth having. Let's just put this on hold for now and I can circle back if I ever arrive at a good use case.

Sure thing. Will close the issue for now, but feel free to re-open.