business-science / modeltime.gluonts

GluonTS Deep Learning with Modeltime

Home Page: https://business-science.github.io/modeltime.gluonts/

Error when id column has an 'ID' role during tuning

tonyk7440 opened this issue · comments

I was trying to figure out how to do some parameter tuning with GluonTS, and deep_ar() specifically. When I plugged the model into some code I had previously used, I hit errors that confused me for a few hours.

I finally found that the problem was related to setting the role of a column at the recipe creation stage, so I thought I'd share, since doing this is common with other modeltime models.

Reprex below

library(parsnip)
#> Warning: package 'parsnip' was built under R version 4.0.5
library(modeltime)
#> Warning: package 'modeltime' was built under R version 4.0.5
library(dials)
#> Loading required package: scales
library(recipes)
#> Loading required package: dplyr
#> Warning: package 'dplyr' was built under R version 4.0.5
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> 
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#> 
#>     step
library(tune)
#> Warning: package 'tune' was built under R version 4.0.5
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
library(workflows)
#> Warning: package 'workflows' was built under R version 4.0.5
library(timetk)
library(yardstick)
#> Warning: package 'yardstick' was built under R version 4.0.5
#> For binary classification, the first factor level is assumed to be the event.
#> Use the argument `event_level = "second"` to alter this as needed.
library(modeltime.gluonts)


ex_ts_cv <- time_series_cv(
  data = m750,
  initial = "10 years",
  assess = "2 years",
  skip = "2 years",
  cumulative = FALSE,
  slice_limit = 2
)
#> Using date_var: date


spec_deepar <- deep_ar(
  id                    = "id",
  freq                  = "M",
  prediction_length     = 12,
  lookback_length       = 24,
  epochs                = tune()
) %>%
  set_engine("gluonts_deepar") %>%
  set_mode(mode = "regression")

deepar_grid_spec <- grid_latin_hypercube(
  parameters(
    epochs(c(2, 4))
  ),
  size = 2
)

deepar_grid_spec
#> # A tibble: 2 x 1
#>   epochs
#>    <int>
#> 1      4
#> 2      2

# Create recipe that works
recipe_spec_ok <- recipe(value ~ id + date, data = m750)

# Construct workflow
deepar_wflw <- workflow() %>%
  add_recipe(recipe_spec_ok) %>%
  add_model(spec_deepar)

# Tune
deepar_tune_res <- deepar_wflw %>%
  tune_grid(
    resamples = ex_ts_cv,
    grid = deepar_grid_spec,
    metrics = metric_set(mae, mape, smape, mase, rmse, rsq),
    control = control_grid(verbose = TRUE)
  )
#> Warning: package 'tibble' was built under R version 4.0.5
#> Warning: package 'rsample' was built under R version 4.0.5
#> Warning: package 'rlang' was built under R version 4.0.5
#> Warning: package 'vctrs' was built under R version 4.0.5
#> Warning: package 'reticulate' was built under R version 4.0.5
#> i Slice1: preprocessor 1/1
#> v Slice1: preprocessor 1/1
#> i Slice1: preprocessor 1/1, model 1/2
#> v Slice1: preprocessor 1/1, model 1/2
#> i Slice1: preprocessor 1/1, model 1/2 (predictions)
#> i Slice1: preprocessor 1/1, model 2/2
#> v Slice1: preprocessor 1/1, model 2/2
#> i Slice1: preprocessor 1/1, model 2/2 (predictions)
#> i Slice2: preprocessor 1/1
#> v Slice2: preprocessor 1/1
#> i Slice2: preprocessor 1/1, model 1/2
#> v Slice2: preprocessor 1/1, model 1/2
#> i Slice2: preprocessor 1/1, model 1/2 (predictions)
#> i Slice2: preprocessor 1/1, model 2/2
#> v Slice2: preprocessor 1/1, model 2/2
#> i Slice2: preprocessor 1/1, model 2/2 (predictions)

# Try recipe that returns an error
recipe_spec_not_ok <- recipe(value ~ id + date, data = m750) %>%
  update_role(id, new_role = "ID")

# Construct workflow
deepar_wflw <- workflow() %>%
  add_recipe(recipe_spec_not_ok) %>%
  add_model(spec_deepar)

# Tune - error
deepar_tune_res <- deepar_wflw %>%
  tune_grid(
    resamples = ex_ts_cv,
    grid = deepar_grid_spec,
    metrics = metric_set(mae, mape, smape, mase, rmse, rsq),
    control = control_grid(verbose = TRUE)
  )
#> i Slice1: preprocessor 1/1
#> v Slice1: preprocessor 1/1
#> i Slice1: preprocessor 1/1, model 1/2
#> x Slice1: preprocessor 1/1, model 1/2: Error: Column not found: id = 'id'. Make su...
#> i Slice1: preprocessor 1/1, model 2/2
#> x Slice1: preprocessor 1/1, model 2/2: Error: Column not found: id = 'id'. Make su...
#> i Slice2: preprocessor 1/1
#> v Slice2: preprocessor 1/1
#> i Slice2: preprocessor 1/1, model 1/2
#> x Slice2: preprocessor 1/1, model 1/2: Error: Column not found: id = 'id'. Make su...
#> i Slice2: preprocessor 1/1, model 2/2
#> x Slice2: preprocessor 1/1, model 2/2: Error: Column not found: id = 'id'. Make su...
#> Warning: All models failed. See the `.notes` column.

Created on 2021-08-18 by the reprex package (v2.0.1)
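
To see what actually failed, the warning above points at the `.notes` column of the tuning results. A minimal sketch for pulling those messages out, assuming the `deepar_tune_res` object from the failing run above (the layout of `.notes` differs a bit across tune versions, so the unnested column names may vary):

# Inspect the errors captured by tune_grid() for each resample
deepar_tune_res %>%
  dplyr::select(id, .notes) %>%
  tidyr::unnest(.notes)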

I experienced the same issue. Writing my own dials parameter functions made no difference. Any luck with this?
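
A possible workaround, not confirmed by the package maintainers: `deep_ar(id = "id")` appears to look the id column up by name in the data passed to the engine, and columns demoted to a non-standard role like "ID" are presumably separated from the predictors before they reach it, which would explain the "Column not found: id = 'id'" error. Keeping the default "predictor" role on `id` (as in `recipe_spec_ok` above) avoids the failure. A minimal sketch, reusing the objects from the reprex:

# Keep the default "predictor" role on `id` so deep_ar() can find the column;
# other preprocessing steps can be added, as long as none of them calls
# update_role(id, new_role = "ID")
recipe_spec_fixed <- recipe(value ~ id + date, data = m750)

deepar_wflw_fixed <- workflow() %>%
  add_recipe(recipe_spec_fixed) %>%
  add_model(spec_deepar)

# Tuning then proceeds exactly as in the working case above
deepar_tune_res_fixed <- deepar_wflw_fixed %>%
  tune_grid(
    resamples = ex_ts_cv,
    grid      = deepar_grid_spec,
    metrics   = metric_set(mae, mape, smape, mase, rmse, rsq),
    control   = control_grid(verbose = TRUE)
  )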