jamesotto852 / ggdensity

An R package for interpretable visualizations of bivariate density estimates

Home Page: https://jamesotto852.github.io/ggdensity/

Export the algorithm in its own function

eliocamp opened this issue:

There's quite a lot of novel (to me, at least) implementation in this package. It would be nice for it to be accessible outside ggplot2. Perhaps consider exporting the logic behind StatHdr$compute_group() into its own function so that non-ggplot2 users can also benefit from this algorithm.

We have consolidated the relevant statistical computations and now export them as two new functions, get_hdr() and get_hdr_1d() (see #31).

Here is a quick demonstration of each function.

library("ggdensity")
#> Loading required package: ggplot2

set.seed(1)
df <- data.frame(x = rnorm(1e4), y = rnorm(1e4))

# Numerical summary of 2d HDRs with `get_hdr()`
res <- get_hdr(df, method = "kde")
str(res)
#> List of 3
#>  $ df_est:'data.frame':  10000 obs. of  5 variables:
#>   ..$ x               : num [1:10000] -3.67 -3.6 -3.52 -3.44 -3.37 ...
#>   ..$ y               : num [1:10000] -4.3 -4.3 -4.3 -4.3 -4.3 ...
#>   ..$ fhat            : num [1:10000] 1.64e-54 2.95e-52 4.38e-50 5.41e-48 5.81e-46 ...
#>   ..$ fhat_discretized: num [1:10000] 1.01e-56 1.81e-54 2.69e-52 3.32e-50 3.56e-48 ...
#>   ..$ hdr             : num [1:10000] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ breaks: Named num [1:5] 0.00194 0.00807 0.03036 0.07558 Inf
#>   ..- attr(*, "names")= chr [1:5] "99%" "95%" "80%" "50%" ...
#>  $ data  :'data.frame':  10000 obs. of  3 variables:
#>   ..$ x             : num [1:10000] -0.626 0.184 -0.836 1.595 0.33 ...
#>   ..$ y             : num [1:10000] -0.804 -1.057 -1.035 -1.186 -0.5 ...
#>   ..$ hdr_membership: num [1:10000] 0.5 0.5 0.8 0.95 0.5 0.5 0.5 0.5 0.5 0.8 ...

# Numerical summary of 1d HDRs with `get_hdr_1d()`
res_1d <- get_hdr_1d(df$x, method = "kde")
str(res_1d)
#> List of 3
#>  $ df_est:'data.frame':  512 obs. of  4 variables:
#>   ..$ x               : num [1:512] -3.67 -3.66 -3.64 -3.63 -3.61 ...
#>   ..$ fhat            : num [1:512] 0.000561 0.000596 0.00063 0.000664 0.000697 ...
#>   ..$ fhat_discretized: num [1:512] 8.21e-06 8.72e-06 9.22e-06 9.71e-06 1.02e-05 ...
#>   ..$ hdr             : num [1:512] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ breaks: Named num [1:5] 0.0154 0.0571 0.1744 0.3029 Inf
#>   ..- attr(*, "names")= chr [1:5] "99%" "95%" "80%" "50%" ...
#>  $ data  :'data.frame':  10000 obs. of  2 variables:
#>   ..$ x             : num [1:10000] -0.626 0.184 -0.836 1.595 0.33 ...
#>   ..$ hdr_membership: num [1:10000] 0.5 0.5 0.8 0.95 0.5 0.8 0.5 0.8 0.5 0.5 ...

Created on 2023-02-10 by the reprex package (v2.0.1)
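
As a rough sanity check on the output above, the `hdr_membership` column can be tabulated with base R alone. This is only a sketch: it assumes that `hdr_membership` records the smallest HDR level containing each observation, and that points outside every region are coded as 1 (an assumption based on the `hdr` column shown above, not something stated in this thread). Under those assumptions the cumulative shares should land near the nominal probabilities.

```r
library("ggdensity")

set.seed(1)
df <- data.frame(x = rnorm(1e4), y = rnorm(1e4))
res <- get_hdr(df, method = "kde")

# Share of observations assigned to each HDR level
# (assumed coding: 0.5, 0.8, 0.95, 0.99, and 1 for points outside all regions)
shares <- table(res$data$hdr_membership) / nrow(res$data)

# Cumulative shares: fraction of points inside the 50%, 80%, ... regions
coverage <- cumsum(shares)
print(coverage)
```

Because the regions are estimated from a kernel density estimate on a grid, the empirical coverage will only approximately match the labels in `names(res$breaks)`.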

For full details, see ?get_hdr and ?get_hdr_1d.
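
To illustrate use outside ggplot2, here is a minimal sketch that draws the HDR boundaries with base graphics. It assumes `df_est` is a complete rectangular grid of `x` and `y` values (which the 10000-row output above suggests) and that the finite entries of `breaks` are the density heights bounding each region; the helper names `gx`, `gy`, `z`, and `lev` are made up for this example.

```r
library("ggdensity")

set.seed(1)
df <- data.frame(x = rnorm(1e4), y = rnorm(1e4))
res <- get_hdr(df, method = "kde")

est <- res$df_est
gx <- sort(unique(est$x))
gy <- sort(unique(est$y))

# Order rows so x varies fastest, then reshape fhat into an x-by-y matrix
# (assumes every (x, y) grid combination appears exactly once)
est <- est[order(est$y, est$x), ]
z <- matrix(est$fhat, nrow = length(gx), ncol = length(gy))

# Drop the Inf sentinel; the remaining breaks are the contour heights
lev <- res$breaks[is.finite(res$breaks)]

plot(df$x, df$y, pch = ".", col = "grey60", xlab = "x", ylab = "y")
contour(gx, gy, z, levels = lev, labels = names(lev), add = TRUE)
```

Each contour line is labelled with the corresponding name from `breaks` ("99%", "95%", and so on), so the plot can be compared with what `geom_hdr()` draws for the same data.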