Are densities from distributional incorrectly scaled?
bjoernholzhauer opened this issue · comments
The density plotted for a standard normal when I use distributions from the distributional package does not match the pdf values I calculate manually. The shape of the density looks right, but its scaling is off. Because distributional itself appears to calculate the density correctly (see the example below), I'm wondering whether this is a ggdist problem.
Simple example:
library(tidyverse)
library(distributional)
library(ggdist)
dnorm(x=0)
density(x = dist_normal(0, 1), at = 0) # Matches the base R result from the line above (0.3989423)
# Produces the plot below (seems to have the density at 0 near 0.83 or so?)
tibble(mu=0, sd=1) %>%
ggplot(aes(xdist=dist_normal(mu=mu, sigma=sd))) +
stat_halfeye()
This is with R version 4.3.1 (2023-06-16 ucrt) on Windows 10, distributional version 0.3.2, and ggdist version 3.3.0 (the latest versions on CRAN as of today).
The y axis labels will not correspond to density values because slabs in ggdist map densities onto a thickness aesthetic, which is scaled to fit between other values placed on the y axis (e.g. so that when the y axis is categorical, slabs can be displayed at each y value).
This scaling is controlled by the normalize parameter and the scale aesthetic (see here); if you want thickness values to match up with y values, you could do something like:
ggplot() +
ggdist::stat_slab(aes(xdist = distributional::dist_normal(0, 1)), scale = 1, normalize = "none")
But this disables any guarantees that the slab will fit within the space allocated to it.
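To make the rescaling concrete, here is a minimal base R sketch of the arithmetic. It is a simplified illustration, not ggdist's actual implementation: it assumes the density is divided by its maximum and then multiplied by the scale value, and it assumes scale = 0.9 is the default used by stat_halfeye(). The variable names are made up for this example.

```r
# Simplified sketch of how a density becomes a slab height.
# Assumption: thickness = density / max(density) * scale.
# This is an illustration only, not ggdist's internal code.
x <- seq(-4, 4, length.out = 501)
pdf_values <- dnorm(x)   # true standard normal density on a grid
slab_scale <- 0.9        # assumed default value of the scale aesthetic

# Rescale so the tallest point of the slab sits at `slab_scale`
thickness <- pdf_values / max(pdf_values) * slab_scale

max(pdf_values)  # peak of the true density, close to dnorm(0)
max(thickness)   # peak height of the drawn slab in y-axis units
```

Under these assumptions the slab peaks at the scale value regardless of the distribution's true peak density, which is why the y axis cannot be read as a density axis unless normalization is turned off.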
At some point labeling the thickness values will be made easier by #183, which should be a better solution to this problem.
See also #52 or the second diagram in the slabinterval vignette.
There is now a prototype implementation of a solution to this on the subguide branch. See this comment: #183 (comment)