njtierney / naniar

Tidy data structures, summaries, and visualisations for missing data

Home Page: http://naniar.njtierney.com/

`shadow_long` throws an error when gathering variables.

siavash-babaei opened this issue

`shadow_long()` throws an error when gathering variables: `pivot_longer()` cannot combine columns of different types.

# Library for dealing with missing values 
library(naniar)

# Load `oceanbuoys` data
data("oceanbuoys")

# Impute the mean value and track the imputations
ocean_imp_mean <- oceanbuoys |>
  naniar::nabular(only_miss = TRUE) |>
  naniar::impute_mean_all() |>
  naniar::add_label_shadow()

# Gather the imputed data: Throws an error
ocean_imp_mean |>
  naniar::shadow_long(humidity, air_temp_c)
#> Error in `tidyr::pivot_longer()`:
#> ! Can't combine `year` <double> and `any_missing` <character>.
#> Backtrace:
#>      ▆
#>   1. ├─naniar::shadow_long(ocean_imp_mean, humidity, air_temp_c)
#>   2. │ ├─tidyr::pivot_longer(...)
#>   3. │ └─tidyr:::pivot_longer.data.frame(...)
#>   4. │   └─tidyr::pivot_longer_spec(...)
#>   5. │     └─vctrs::vec_ptype_common(...)
#>   6. └─vctrs (local) `<fn>`()
#>   7.   └─vctrs::vec_default_ptype2(...)
#>   8.     ├─base::withRestarts(...)
#>   9.     │ └─base (local) withOneRestart(expr, restarts[[1L]])
#>  10.     │   └─base (local) doWithOneRestart(return(expr), restart)
#>  11.     └─vctrs::stop_incompatible_type(...)
#>  12.       └─vctrs:::stop_incompatible(...)
#>  13.         └─vctrs:::stop_vctrs(...)
#>  14.           └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))

# Gather the imputed data: Works like a charm 
ocean_imp_mean |>
  tidyr::pivot_longer(
    cols      = c("humidity", "air_temp_c"),
    names_to  = "variable",
    values_to = "value",
  ) |>
  tidyr::pivot_longer(
    cols      = c("humidity_NA", "air_temp_c_NA"),
    names_to  = "variable_NA",
    values_to = "value_NA",
  )
#> # A tibble: 2,944 × 12
#>     year latitude longit…¹ sea_t…² wind_ew wind_ns sea_t…³ any_m…⁴ varia…⁵ value
#>    <dbl>    <dbl>    <dbl>   <dbl>   <dbl>   <dbl> <fct>   <chr>   <chr>   <dbl>
#>  1  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… humidi…  79.6
#>  2  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… humidi…  79.6
#>  3  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… air_te…  27.1
#>  4  1997        0     -110    27.6   -6.40    5.40 !NA     Not Mi… air_te…  27.1
#>  5  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… humidi…  75.8
#>  6  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… humidi…  75.8
#>  7  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… air_te…  27.0
#>  8  1997        0     -110    27.5   -5.30    5.30 !NA     Not Mi… air_te…  27.0
#>  9  1997        0     -110    27.6   -5.10    4.5  !NA     Not Mi… humidi…  76.5
#> 10  1997        0     -110    27.6   -5.10    4.5  !NA     Not Mi… humidi…  76.5
#> # … with 2,934 more rows, 2 more variables: variable_NA <chr>, value_NA <fct>,
#> #   and abbreviated variable names ¹​longitude, ²​sea_temp_c, ³​sea_temp_c_NA,
#> #   ⁴​any_missing, ⁵​variable

Thanks for posting this. This is a new error introduced in the latest release, and I can confirm that I get the same error:

# Library for dealing with missing values 
library(naniar)

# Load `oceanbuoys` data
data("oceanbuoys")

# Impute the mean value and track the imputations
ocean_imp_mean <- oceanbuoys |>
  naniar::nabular(only_miss = TRUE) |>
  naniar::impute_mean_all() |>
  naniar::add_label_shadow()

# Gather the imputed data: Throws an error
ocean_imp_mean |>
  naniar::shadow_long(humidity, air_temp_c)
#> Error in `tidyr::pivot_longer()`:
#> ! Can't combine `year` <double> and `any_missing` <character>.

#> Backtrace:
#>      ▆
#>   1. ├─naniar::shadow_long(ocean_imp_mean, humidity, air_temp_c)
#>   2. │ ├─tidyr::pivot_longer(...)
#>   3. │ └─tidyr:::pivot_longer.data.frame(...)
#>   4. │   └─tidyr::pivot_longer_spec(...)
#>   5. │     └─vctrs::vec_ptype_common(...)
#>   6. └─vctrs (local) `<fn>`()
#>   7.   └─vctrs::vec_default_ptype2(...)
#>   8.     ├─base::withRestarts(...)
#>   9.     │ └─base (local) withOneRestart(expr, restarts[[1L]])
#>  10.     │   └─base (local) doWithOneRestart(return(expr), restart)
#>  11.     └─vctrs::stop_incompatible_type(...)
#>  12.       └─vctrs:::stop_incompatible(...)
#>  13.         └─vctrs:::stop_vctrs(...)
#>  14.           └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))

Created on 2023-03-30 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31)
#>  os       macOS Ventura 13.2
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Hobart
#>  date     2023-03-30
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.0   2023-01-09 [1] CRAN (R 4.2.0)
#>  colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.2.0)
#>  digest        0.6.31  2022-12-11 [1] CRAN (R 4.2.0)
#>  dplyr         1.1.0   2023-01-29 [1] CRAN (R 4.2.1)
#>  evaluate      0.20    2023-01-17 [1] CRAN (R 4.2.0)
#>  fansi         1.0.4   2023-01-22 [1] CRAN (R 4.2.0)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
#>  fs            1.6.1   2023-02-06 [1] CRAN (R 4.2.0)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2       3.4.1   2023-02-10 [1] CRAN (R 4.2.0)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  gtable        0.3.1   2022-09-01 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.4   2022-12-07 [1] CRAN (R 4.2.0)
#>  knitr         1.42    2023-01-25 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.2.0)
#>  naniar      * 1.0.0   2023-02-02 [1] CRAN (R 4.2.0)
#>  pillar        1.8.1   2022-08-19 [1] CRAN (R 4.2.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.2.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.2.0)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.2.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.20    2023-01-19 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.2.0)
#>  scales        1.2.1   2022-08-20 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  styler        1.9.0   2023-01-15 [1] CRAN (R 4.2.0)
#>  tibble        3.1.8   2022-07-22 [1] CRAN (R 4.2.0)
#>  tidyr         1.3.0   2023-01-24 [1] CRAN (R 4.2.0)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.2.0)
#>  utf8          1.2.3   2023-01-31 [1] CRAN (R 4.2.0)
#>  vctrs         0.5.2   2023-01-23 [1] CRAN (R 4.2.0)
#>  visdat        0.6.0   2023-02-02 [1] local
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun          0.37    2023-01-31 [1] CRAN (R 4.2.0)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Thanks again for posting this; it will be fixed in an upcoming release.

Thank you again for this, @siavash-babaei!

This now works, but by default `value` is converted to character, as that is the safest way to make this always succeed. Alternatively, you can supply your own coercion function via `fn_value_transform` to convert the `value` column to another type. Here's an example:

library(naniar)
aq_shadow <- nabular(airquality)

shadow_long(aq_shadow)
#> # A tibble: 918 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <chr> <chr>       <fct>   
#>  1 Ozone    41    Ozone_NA    !NA     
#>  2 Solar.R  190   Solar.R_NA  !NA     
#>  3 Wind     7.4   Wind_NA     !NA     
#>  4 Temp     67    Temp_NA     !NA     
#>  5 Month    5     Month_NA    !NA     
#>  6 Day      1     Day_NA      !NA     
#>  7 Ozone    36    Ozone_NA    !NA     
#>  8 Solar.R  118   Solar.R_NA  !NA     
#>  9 Wind     8     Wind_NA     !NA     
#> 10 Temp     72    Temp_NA     !NA     
#> # ℹ 908 more rows

# then filter only on Ozone and Solar.R
shadow_long(aq_shadow, Ozone, Solar.R)
#> # A tibble: 306 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <chr> <chr>       <fct>   
#>  1 Ozone    41    Ozone_NA    !NA     
#>  2 Solar.R  190   Solar.R_NA  !NA     
#>  3 Ozone    36    Ozone_NA    !NA     
#>  4 Solar.R  118   Solar.R_NA  !NA     
#>  5 Ozone    12    Ozone_NA    !NA     
#>  6 Solar.R  149   Solar.R_NA  !NA     
#>  7 Ozone    18    Ozone_NA    !NA     
#>  8 Solar.R  313   Solar.R_NA  !NA     
#>  9 Ozone    <NA>  Ozone_NA    NA      
#> 10 Solar.R  <NA>  Solar.R_NA  NA      
#> # ℹ 296 more rows

# ensure `value` is numeric
shadow_long(aq_shadow, fn_value_transform = as.numeric)
#> # A tibble: 918 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <dbl> <chr>       <fct>   
#>  1 Ozone     41   Ozone_NA    !NA     
#>  2 Solar.R  190   Solar.R_NA  !NA     
#>  3 Wind       7.4 Wind_NA     !NA     
#>  4 Temp      67   Temp_NA     !NA     
#>  5 Month      5   Month_NA    !NA     
#>  6 Day        1   Day_NA      !NA     
#>  7 Ozone     36   Ozone_NA    !NA     
#>  8 Solar.R  118   Solar.R_NA  !NA     
#>  9 Wind       8   Wind_NA     !NA     
#> 10 Temp      72   Temp_NA     !NA     
#> # ℹ 908 more rows
shadow_long(aq_shadow, Ozone, Solar.R, fn_value_transform = as.numeric)
#> # A tibble: 306 × 4
#>    variable value variable_NA value_NA
#>    <chr>    <dbl> <chr>       <fct>   
#>  1 Ozone       41 Ozone_NA    !NA     
#>  2 Solar.R    190 Solar.R_NA  !NA     
#>  3 Ozone       36 Ozone_NA    !NA     
#>  4 Solar.R    118 Solar.R_NA  !NA     
#>  5 Ozone       12 Ozone_NA    !NA     
#>  6 Solar.R    149 Solar.R_NA  !NA     
#>  7 Ozone       18 Ozone_NA    !NA     
#>  8 Solar.R    313 Solar.R_NA  !NA     
#>  9 Ozone       NA Ozone_NA    NA      
#> 10 Solar.R     NA Solar.R_NA  NA      
#> # ℹ 296 more rows

Created on 2023-05-01 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21)
#>  os       macOS Ventura 13.2
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Los_Angeles
#>  date     2023-05-01
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  cli           3.6.1      2023-03-23 [1] CRAN (R 4.3.0)
#>  colorspace    2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  digest        0.6.31     2022-12-11 [1] CRAN (R 4.3.0)
#>  dplyr         1.1.2      2023-04-20 [1] CRAN (R 4.3.0)
#>  evaluate      0.20       2023-01-17 [1] CRAN (R 4.3.0)
#>  fansi         1.0.4      2023-01-22 [1] CRAN (R 4.3.0)
#>  fastmap       1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  fs            1.6.2      2023-04-25 [1] CRAN (R 4.3.0)
#>  generics      0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2       3.4.2      2023-04-03 [1] CRAN (R 4.3.0)
#>  glue          1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  gtable        0.3.3      2023-03-21 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.5      2023-03-23 [1] CRAN (R 4.3.0)
#>  knitr         1.42       2023-01-25 [1] CRAN (R 4.3.0)
#>  lifecycle     1.0.3      2022-10-07 [1] CRAN (R 4.3.0)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
#>  naniar      * 1.0.0.9000 2023-05-01 [1] local
#>  pillar        1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  purrr         1.0.1      2023-01-10 [1] CRAN (R 4.3.0)
#>  R.cache       0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3   1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo          1.25.0     2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils       2.12.2     2022-11-11 [1] CRAN (R 4.3.0)
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  reprex        2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang         1.1.0      2023-03-14 [1] CRAN (R 4.3.0)
#>  rmarkdown     2.21       2023-03-26 [1] CRAN (R 4.3.0)
#>  rstudioapi    0.14       2022-08-22 [1] CRAN (R 4.3.0)
#>  scales        1.2.1      2022-08-20 [1] CRAN (R 4.3.0)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  styler        1.9.1      2023-03-04 [1] CRAN (R 4.3.0)
#>  tibble        3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyr         1.3.0      2023-01-24 [1] CRAN (R 4.3.0)
#>  tidyselect    1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
#>  utf8          1.2.3      2023-01-31 [1] CRAN (R 4.3.0)
#>  vctrs         0.6.2      2023-04-19 [1] CRAN (R 4.3.0)
#>  visdat        0.6.0      2023-02-02 [1] CRAN (R 4.3.0)
#>  withr         2.5.0      2022-03-03 [1] CRAN (R 4.3.0)
#>  xfun          0.39       2023-04-20 [1] CRAN (R 4.3.0)
#>  yaml          2.3.7      2023-01-23 [1] CRAN (R 4.3.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
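
For readers wondering why coercing `value` to character is the safe default, the type clash can be reproduced with plain tidyr. This is a minimal sketch on a hypothetical toy data frame (`toy` and its columns are made up for illustration), and it is not necessarily how naniar implements the fix internally. It uses the `values_transform` argument of `tidyr::pivot_longer()`, and also shows that converting back with `as.numeric` turns non-numeric strings such as the `any_missing` labels into `NA`.

```r
library(tidyr)

# Hypothetical toy data: one numeric and one character column, mirroring
# the `year` <double> / `any_missing` <character> clash in the error above.
toy <- data.frame(
  id = 1:2,
  year = c(1997, 1998),
  any_missing = c("Not Missing", "Missing"),
  stringsAsFactors = FALSE
)

# Pivoting both columns into a single `value` column fails, because
# double and character have no common type.
res <- try(
  pivot_longer(
    toy,
    cols      = c(year, any_missing),
    names_to  = "variable",
    values_to = "value"
  ),
  silent = TRUE
)
pivot_failed <- inherits(res, "try-error")

# Coercing every pivoted column to character first always succeeds.
toy_long <- pivot_longer(
  toy,
  cols             = c(year, any_missing),
  names_to         = "variable",
  values_to        = "value",
  values_transform = list(value = as.character)
)

# Converting back to numeric yields NA for the non-numeric strings
# (R would also emit a coercion warning, suppressed here).
value_num <- suppressWarnings(as.numeric(toy_long$value))
```

So `fn_value_transform = as.numeric` is a reasonable choice only when every gathered column is numeric, as in the `airquality` example above.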