generalize the ColorMixture concept
johnnychen94 opened this issue · comments
IIUC this package provides a pixel version of the deprecated ColorizedArray. Now I can see that this is a useful concept in fluorophore imaging. Compared to defining the Color
subtypes manually for every combination, the benefit of this ColorMixture
concept is to provide a user-friendly interface to design a new color type, which is great.
But fluorophore image is not the only field that has this concept; this can be generalized to hyperspectral imaging where there can be as many channels as the hardware supports.
Thus I'm wondering if we can 1) separate this concept into a standalone package, e.g., MixedColors.jl, and, 2) generalize this concept and allow the user library (e.g., FluorophoreColors) to define its domain-specific color types.
A draft type design for this can be:
MixedColors.jl
struct MixedColorInfo{CS, F}
cs::CS
to_rgb::F
end
struct MixedColor{T,N,INFO} <: Color{T,N}
channels::NTuple{N,T}
end
User library:
# FluorophoreColors.jl
_to_rgb(channels, cs) = sum(map(*, channels, cs))
const _FRGB_INFO = MixedColorInfo(cs, _to_rgb)
const FRGB{T,N} = MixedColor{T,N,_FRGB_INFO}
# with other Fluorophore-specific definitions, e.g., loading the definition from csv file.
...
I believe the parametrization trick on values Cs
(instead of types typeof(Cs)
) works because the types of fluorephore colors are limited and RGB{N0f8}
is internally tuple of integers. Thus here I extend it to a struct
, but I'm not sure if parametrization on the function to_rgb
works reliably; I guess it works because we also have things like map(::typeof(+), ...)
usage.
I like the idea of generalizing it to hyperspectral imaging. I think we could still call the package MixedColors while also providing lookup tables for fluorophores: it's just two fairly small CSV files, and I think creating a standalone package just for the constants isn't worth it.
I also like your design (as usual) separating the color info from the instance. From the standpoint of pixel storage, the two should be equivalent (same amount of memory, both reinterpret
able, etc), which is exactly what we need. Let's try a demo to check viability:
julia> using FluorophoreColors
julia> cs = (fluorophore_rgb["EGFP"], fluorophore_rgb["tdTomato"])
(RGB{N0f8}(0.0,0.925,0.365), RGB{N0f8}(1.0,0.859,0.0))
julia> struct MixedColorInfo{CS, F}
cs::CS
to_rgb::F
end
julia> struct MixedColor{T,N,INFO} <: Color{T,N}
channels::NTuple{N,T}
end
julia> @inline _to_rgb(channels, cs) = sum(map(*, channels, cs))
_to_rgb (generic function with 1 method)
julia> const _FRGB_INFO = MixedColorInfo(cs, _to_rgb)
MixedColorInfo{Tuple{RGB{N0f8}, RGB{N0f8}}, typeof(_to_rgb)}((RGB{N0f8}(0.0,0.925,0.365), RGB{N0f8}(1.0,0.859,0.0)), _to_rgb)
julia> const FRGB{T,N} = MixedColor{T,N,_FRGB_INFO}
FRGB{T, N} where {T, N} (alias for MixedColor{T, N, MixedColorInfo{Tuple{RGB{Normed{UInt8, 8}}, RGB{Normed{UInt8, 8}}}, typeof(_to_rgb)}((RGB{N0f8}(0.0,0.925,0.365), RGB{N0f8}(1.0,0.859,0.0)), _to_rgb)} where {T, N})
julia> @inline Base.convert(::Type{RGB}, c::MixedColor{T,N,INFO}) where {T,N,INFO} = INFO.to_rgb(c.channels, INFO.cs)
julia> c = FRGB{N0f16,2}((0.2, 0.8));
julia> sizeof(c)
4
julia> convert(RGB, c)
RGB{N0f16}(0.8,0.87214,0.07294)
julia> f(c) = convert(RGB, c)
f (generic function with 1 method)
and to compare with the current implementation of this package:
julia> ctemplate = ColorMixture{N0f16}(cs);
julia> c2 = ctemplate(c.channels...)
(0.2N0f16₁, 0.8N0f16₂)
julia> f(c2) === f(c)
true
All good so far. But there's a small performance gap that I don't yet understand:
julia> using BenchmarkTools
julia> @btime f($c)
2.957 ns (0 allocations: 0 bytes)
RGB{N0f16}(0.8,0.87214,0.07294)
julia> @btime f($c2)
2.218 ns (0 allocations: 0 bytes)
RGB{N0f16}(0.8,0.87214,0.07294)
What's odd is that in general terms the code_typed
of the two are very, very similar: AFAICT the only differences are small changes in the ordering of certain operations. (You can easily compare the two with open(filename, "w") do io; print(io, @code_typed f(args...)); end
and then using a graphical diff tool like meld
to visualize the differences.) One observation is that without the @inline
s above the gap is much larger because the type-inferred code still has some invoke
s in it. So perhaps it's a limitation of the optimizer under forced inlining?
In the end of the day I think the functionality of your proposal merits the change, but it would be lovely to eliminate this performance gap.
Most such image representation uses the struct of array layout because the relationship between different channels are usually weak. If we consider this fact, an optimized algorithm can just provide a specialization on the StructArray (or related trait types) as JuliaImages/ImageCore.jl#180
Thus in most cases, this pixel-wise conversion only serves the "making the fallback generic solution work and work quite okay" purpose. For this reason, I won't be too worried about the performance. I could be wrong, of course.
this pixel-wise conversion only serves the "making the fallback generic solution work and work quite okay" purpose
This makes me rethink: how much flexibility do we need, and how much do we want? I'm beginning to return to thinking that perhaps my slightly less-flexible design might be better, although I don't think there's a really strong pull one way or another.
First, keep in mind that this package already supports hyperspectral imaging, it just lacks the convenient name => RGB
mapping for "named" fluorophores. But you can construct ctemplate
with however many colors you need. Definitely agreed we should broaden the name of the package to reflect this reality.
Second and more fundamentally, the only real restriction here, compared to the alternative you proposed, is that my implementation only supports RGB conversions of the form sum(w .* cs)
where cs
is a list of RGB colors, whereas your implementation allows an arbitrary f
. However, folks can get that with MappedArrays
, by supplying an f(c) -> cnew
that generates a new color from c
. One advantage of not having that flexibility here is that we could display a nice summary of such an array. Currently we have
julia> A = [(rand(N0f8), rand(N0f8)) for i = 1:3, j = 1:4]
3×4 Matrix{Tuple{N0f8, N0f8}}:
(0.576, 0.345) (1.0, 0.925) (0.757, 0.843) (0.647, 0.671)
(0.871, 0.282) (0.18, 0.259) (0.204, 0.988) (0.043, 0.816)
(0.733, 0.078) (0.349, 0.439) (0.412, 0.514) (0.408, 0.757)
julia> Ac = ctemplate.(A)
3×4 Array{ColorMixture{N0f16},2} with eltype ColorMixture{N0f16, 2, (RGB{N0f16}(0.0,0.92549,0.36471), RGB{N0f16}(1.0,0.85882,0.0))}:
(0.57647₁, 0.3451₂) (1.0₁, 0.92549₂) … (0.64706₁, 0.67059₂)
(0.87059₁, 0.28235₂) (0.18039₁, 0.25882₂) (0.04314₁, 0.81569₂)
(0.73333₁, 0.07843₂) (0.34902₁, 0.43922₂) (0.40784₁, 0.75686₂)
and the summary line is a mouthful. We can customize it with Base.showarg
but the key question is what to do with the color values. A cute and actually useful form of summary might be something like this:
The nice thing is that it makes it clear that the first element is a green color channel (EGFP) and the second a longer-wavelength reporter. While it's far from a must-have, this kind of nice summary would be difficult to achieve without having a sense of per-colorchannel colors. I know yours lists the colors in cs
, but fundamentally I think your proposal is to allow an arbitrary color-synthesis function.
Second and more fundamentally, the only real restriction here, compared to the alternative you proposed, is that my implementation only supports RGB conversions of the form sum(w .* cs) where cs is a list of RGB colors, whereas your implementation allows an arbitrary f. However, folks can get that with MappedArrays, by supplying an f(c) -> cnew that generates a new color from c
This is a bit like the argument why I prefer to use StructArray
over colorview
in JuliaImages/ImageCore.jl#179. My argument here is that if we construct the pixel value with StructArray
, then it is easy to dispatch on it:
using ImageCore, StructArrays, FluorophoreColors
channels = (fluorophore_rgb["EGFP"], fluorophore_rgb["tdTomato"])
ctemplate = ColorMixture{N0f16}(channels)
img = reshape([ctemplate(rand(), rand()) for _ in 1:64], 8, 8);
data = collect(channelview(img))
img_soa = StructArray{eltype(img)}(data; dims=1)
to dispatch on it:
function f(img::AbstractArray{T}) where {T<:ColorMixture}
...
end
# optimized implementation for SOA layout
function f(img::StructArray{T}) where {T<:ColorMixture}
...
end
For this stuff, it's quite unclear what we should do with MappedArray
. I believe this is also the reason that makes you create this package.
One advantage of not having that flexibility here is that we could display a nice summary of such an array.
I think this depends on how we dispatch the Base.show
method: should we dispatch on Base.show(::IO, ::ColorMixture)
, or Base.show(::IO, ::ColorMixture{T,N,FRGB}) where {T,N}
?
Oh, I see, because FRGB
is a value so we need a helper interface to do the dispatch:
Base.show(io::IO, c::ColorMixture{T,N,info) where {T,N,info} = show_color(info, io, c)
function show_color(info::MixedColorInfo{CS, typeof(fluorophore_to_rgb)}, io::IO, c::ColorMixture) where {CS}
...
end
I guess my main point is that taking your idea even further, you don't really need the cs
field (it could be hard-coded into f
, or captured in a closure). I think the core of your proposal is to allow a generic mapping from a multi-valued color to an RGB color. Like MappedArrays, the flexibility is both a benefit (you can do whatever you want) and a disadvantage (the lack of standardization can make it harder to do other automated things). Dispatch may allow one to have the flexibility without losing functionality elsewhere, of course. So I would cast this decision as "minor" since I think we can get what we want either way.
I am already foreseeing one generalization that will be needed: in ImageView, each independent channel will need its own scaling before multiplying by the channel-assigned color, i.e. sum(w .* g .* cs)
, where g
is a tuple of "gains" that are extracted from an Observable. This is because most biological imaging these days is 16-bit, and this dynamic range is high enough that scaling to appropriate contrast ranges for computer monitors is required before sending to the display. However, g
is constant across the entire image plane.