GiovineItalia / Gadfly.jl

Crafty statistical graphics for Julia.

Home Page:http://gadflyjl.org/stable/

Changing colors does change the "stack order" in histograms

Brinkhuis opened this issue

The code below creates a nice histogram using the palette1 colors. In this plot, Very Good is stacked on top of Fair.

using CSV, DataFrames, Gadfly, ColorSchemes

if !isfile("diamonds.csv")
    println("Downloading file ...")
    download("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/diamonds.csv", "diamonds.csv")
end
diamonds = DataFrame(CSV.File("diamonds.csv"))

palette1 = ["yellow", "orange", "red", "green", "blue"]

plot(
    diamonds, 
    x = :price, 
    color = :cut, 
    Geom.histogram(bincount=50), 
    Scale.x_log10, 
    Scale.color_discrete_manual(
        palette1..., 
        levels = ["Ideal", "Premium", "Very Good", "Good", "Fair"], 
        order = [1, 2, 3, 4, 5]
    ), 
    Theme(
        background_color = "white", 
        bar_highlight = color("black")
    ), 
)

[Screenshot: stacked histogram drawn with palette1]

But when I make the exact same plot with another palette, palette2, the stack order changes too.

palette2 = get(reverse(ColorSchemes.Purples), range(0, 1, length = 5))

plot(
    diamonds, 
    x = :price, 
    color = :cut, 
    Geom.histogram(bincount=50), 
    Scale.x_log10, 
    Scale.color_discrete_manual(
        palette2..., 
        levels = ["Ideal", "Premium", "Very Good", "Good", "Fair"], 
        order = [1, 2, 3, 4, 5]
    ), 
    Theme(
        background_color = "white", 
        bar_highlight = color("black")
    ), 
)

[Screenshot: stacked histogram drawn with palette2]

In this plot Good is stacked on top of Fair.

I ran into this while trying to reproduce this seaborn plot in Julia using Gadfly. So far I have not been able to reproduce the plot with palette2, whatever combination of the levels and order arguments I pass to Scale.color_discrete_manual.

See also the same issue/question on StackOverflow.

seems to me the ideal behavior would be to have the order in which the bars are stacked match that in the legend. to achieve this, i believe Stat.histogram needs to change to reference scales[:color] instead of unique(aes.color) to pull out, in your example, the palette and order you specified with color_discrete_manual. if i understand the code correctly, care will need to be taken to iterate over the Dict of colors in the proper order.
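To illustrate the difference described above, here is a minimal sketch in plain Julia. It is not Gadfly's actual implementation: `cuts` is a hypothetical toy stand-in for the :cut column, and `palette` is an assumed Dict built only for this example. It shows that grouping by first appearance follows the data, while looking groups up through the requested levels follows the legend.

```julia
# Hypothetical toy stand-in for the :cut column; values appear in arbitrary order.
cuts = ["Premium", "Fair", "Very Good", "Ideal", "Good", "Fair", "Ideal"]

# Order requested through the scale (same levels as in the issue).
levels = ["Ideal", "Premium", "Very Good", "Good", "Fair"]

# Grouping by first appearance follows the data, not the scale.
appearance_order = unique(cuts)

# Grouping by the requested levels, keeping only levels present in the data.
scale_order = filter(in(cuts), levels)

# A Dict has no guaranteed iteration order, so index it by `levels`
# instead of iterating over its keys.
palette = Dict(zip(levels, ["yellow", "orange", "red", "green", "blue"]))
ordered_colors = [palette[l] for l in levels]

println(appearance_order)
println(scale_order)
println(ordered_colors)
```

Indexing the Dict through `levels` is one way to keep colors and stack order aligned regardless of how the Dict happens to store its keys.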

@Mattriks does this sound correct to you?

the stack order should be fixed by #1604 plus a sorting of the data with:

sort!(diamonds,
      [:cut],
      by=x->findfirst(x .== ["Ideal", "Premium", "Very Good", "Good", "Fair"]))

the explanation is that the order in the figure reflects that in the data, or at least it was meant to, and some changes in the code involving unordered Dicts were necessary to fix it.
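As a rough illustration of why sorting the data helps, here is a sketch in plain Julia on a hypothetical toy vector (`cuts` is made up for this example and no DataFrame is involved). Once the values are sorted by their position in the desired levels, the order of first appearance matches those levels, which is the order the figure is meant to reflect.

```julia
# Hypothetical toy stand-in for the :cut column.
cuts = ["Premium", "Fair", "Very Good", "Ideal", "Good", "Fair", "Ideal"]

# Desired stacking order (same levels as in the issue).
levels = ["Ideal", "Premium", "Very Good", "Good", "Fair"]

# Sort each value by its index in `levels`.
sorted_cuts = sort(cuts, by = x -> findfirst(==(x), levels))

# After sorting, the order of first appearance equals the desired order.
println(unique(sorted_cuts) == levels)
```

The `sort!` call on the DataFrame applies the same idea to the whole table, keyed on the :cut column.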

can you please check out that PR and let me know if it works for you?

thanks for the concise MWE btw, and sorry for taking so long to fix it.