business-science / modeltime.h2o

Forecasting with H2O AutoML. Use the H2O Automatic Machine Learning algorithm as a backend for Modeltime Time Series Forecasting.

Home Page: https://business-science.github.io/modeltime.h2o/


Error in refit_tbl

ichsan2895 opened this issue

I followed this tutorial:

https://www.business-science.io/code-tools/2021/03/15/introducing-modeltime-h2o.html

but the following code throws an error:

refit_tbl %>%
  modeltime_forecast(
    new_data    = future_prepared_tbl,
    actual_data = data_prepared_tbl,
    keep_data   = TRUE
  )

Error: Problem with `filter()` input `..1`.
x object '.key' not found
i Input `..1` is `.model_desc == "ACTUAL" | .key == "prediction"`.

My software versions:
Windows 10 Education 64 bit
R = 3.6.3 (64 bit)
H2O = 3.32.0.1
modeltime = 0.5.1
modeltime.h2o = 0.1.1

I just ran the tutorial. I've increased the number of models slightly to improve results. It seems to run OK for me.

library(tidymodels)
library(modeltime.h2o)
library(tidyverse)
library(timetk)

data_tbl <- walmart_sales_weekly %>%
  select(id, Date, Weekly_Sales)

splits <- time_series_split(data_tbl, assess = "3 month", cumulative = TRUE)

recipe_spec <- recipe(Weekly_Sales ~ ., data = training(splits)) %>%
  step_timeseries_signature(Date) 

train_tbl <- training(splits) %>% bake(prep(recipe_spec), .)
test_tbl  <- testing(splits) %>% bake(prep(recipe_spec), .)

h2o.init(
  nthreads = -1,
  ip       = 'localhost',
  port     = 54321
)
#>  Connection successful!
#> 
#> R is connected to the H2O cluster: 
#>     H2O cluster uptime:         3 minutes 24 seconds 
#>     H2O cluster timezone:       America/New_York 
#>     H2O data parsing timezone:  UTC 
#>     H2O cluster version:        3.32.0.1 
#>     H2O cluster version age:    6 months and 18 days !!! 
#>     H2O cluster name:           H2O_started_from_R_mdancho_rvk435 
#>     H2O cluster total nodes:    1 
#>     H2O cluster total memory:   7.96 GB 
#>     H2O cluster total cores:    12 
#>     H2O cluster allowed cores:  12 
#>     H2O cluster healthy:        TRUE 
#>     H2O Connection ip:          localhost 
#>     H2O Connection port:        54321 
#>     H2O Connection proxy:       NA 
#>     H2O Internal Security:      FALSE 
#>     H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 
#>     R Version:                  R version 4.0.2 (2020-06-22)
#> Warning in h2o.clusterInfo(): 
#> Your H2O cluster version is too old (6 months and 18 days)!
#> Please download and install the latest version from http://h2o.ai/download/

# Optional - Turn off progress indicators during training runs
h2o.no_progress()


model_spec <- automl_reg(mode = 'regression') %>%
  set_engine(
    engine                     = 'h2o',
    max_runtime_secs           = 15, 
    max_runtime_secs_per_model = 15,
    max_models                 = 10,
    nfolds                     = 5,
    exclude_algos              = c("DeepLearning"),
    verbosity                  = NULL,
    seed                       = 786
  ) 

model_fitted <- model_spec %>%
  fit(Weekly_Sales ~ ., data = train_tbl)


modeltime_tbl <- modeltime_table(
  model_fitted
) 

modeltime_tbl
#> # Modeltime Table
#> # A tibble: 1 x 3
#>   .model_id .model   .model_desc                 
#>       <int> <list>   <chr>                       
#> 1         1 <fit[+]> H2O AUTOML - STACKEDENSEMBLE

modeltime_tbl %>%
  modeltime_calibrate(test_tbl) %>%
  modeltime_forecast(
    new_data    = test_tbl,
    actual_data = data_tbl,
    keep_data   = TRUE
  ) %>%
  group_by(id) %>%
  plot_modeltime_forecast(
    .facet_ncol = 2, 
    .interactive = FALSE
  )

data_prepared_tbl <- bind_rows(train_tbl, test_tbl)

future_tbl <- data_prepared_tbl %>%
  group_by(id) %>%
  future_frame(.length_out = "1 year") %>%
  ungroup()
#> .date_var is missing. Using: Date

future_prepared_tbl <- bake(prep(recipe_spec), future_tbl)

refit_tbl <- modeltime_tbl %>%
  modeltime_refit(data_prepared_tbl)

refit_tbl %>%
  modeltime_forecast(
    new_data    = future_prepared_tbl,
    actual_data = data_prepared_tbl,
    keep_data   = TRUE
  ) %>%
  group_by(id) %>%
  plot_modeltime_forecast(
    .facet_ncol  = 2,
    .interactive = FALSE
  )
#> Converting to H2OFrame...
#> Warning: Expecting the following names to be in the data frame: .conf_hi, .conf_lo. 
#> Proceeding with '.conf_interval_show = FALSE' to visualize the forecast without confidence intervals.
#> Alternatively, try using `modeltime_calibrate()` before forecasting to add confidence intervals.

Created on 2021-04-27 by the reprex package (v1.0.0)

Session Info
> devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.0.2 (2020-06-22)
 os       OS X  11.2.3                
 system   x86_64, darwin17.0          
 ui       RStudio                     
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/New_York            
 date     2021-04-27                  

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────
 package       * version    date       lib source                           
 assertthat      0.2.1      2019-03-21 [1] CRAN (R 4.0.2)                   
 backports       1.2.1      2020-12-09 [1] CRAN (R 4.0.2)                   
 bit             4.0.4      2020-08-04 [1] CRAN (R 4.0.2)                   
 bit64           4.0.5      2020-08-30 [1] CRAN (R 4.0.2)                   
 bitops          1.0-6      2013-08-17 [1] CRAN (R 4.0.2)                   
 broom         * 0.7.5      2021-02-19 [1] CRAN (R 4.0.2)                   
 bslib           0.2.4      2021-01-25 [1] CRAN (R 4.0.2)                   
 cachem          1.0.4      2021-02-13 [1] CRAN (R 4.0.2)                   
 callr           3.5.1      2020-10-13 [1] CRAN (R 4.0.2)                   
 cellranger      1.1.0      2016-07-27 [1] CRAN (R 4.0.2)                   
 class           7.3-18     2021-01-24 [1] CRAN (R 4.0.2)                   
 cli             2.3.1      2021-02-23 [1] CRAN (R 4.0.2)                   
 clipr           0.7.1      2020-10-08 [1] CRAN (R 4.0.2)                   
 codetools       0.2-18     2020-11-04 [1] CRAN (R 4.0.2)                   
 colorspace      2.0-0      2020-11-11 [1] CRAN (R 4.0.2)                   
 crayon          1.4.1      2021-02-08 [1] CRAN (R 4.0.2)                   
 curl            4.3        2019-12-02 [1] CRAN (R 4.0.1)                   
 data.table      1.14.0     2021-02-21 [1] CRAN (R 4.0.2)                   
 DBI             1.1.1      2021-01-15 [1] CRAN (R 4.0.2)                   
 dbplyr          2.1.0      2021-02-03 [1] CRAN (R 4.0.2)                   
 desc            1.3.0      2021-03-05 [1] CRAN (R 4.0.2)                   
 devtools        2.3.2      2020-09-18 [1] CRAN (R 4.0.2)                   
 dials         * 0.0.9.9000 2020-10-13 [1] Github (tidymodels/dials@2b79300)
 DiceDesign      1.9        2021-02-13 [1] CRAN (R 4.0.2)                   
 digest          0.6.27     2020-10-24 [1] CRAN (R 4.0.2)                   
 dplyr         * 1.0.5      2021-03-05 [1] CRAN (R 4.0.2)                   
 ellipsis        0.3.1      2020-05-15 [1] CRAN (R 4.0.2)                   
 evaluate        0.14       2019-05-28 [1] CRAN (R 4.0.1)                   
 fansi           0.4.2      2021-01-15 [1] CRAN (R 4.0.2)                   
 farver          2.1.0      2021-02-28 [1] CRAN (R 4.0.2)                   
 fastmap         1.1.0      2021-01-25 [1] CRAN (R 4.0.2)                   
 forcats       * 0.5.1      2021-01-27 [1] CRAN (R 4.0.2)                   
 foreach         1.5.1      2020-10-15 [1] CRAN (R 4.0.2)                   
 fs              1.5.0      2020-07-31 [1] CRAN (R 4.0.2)                   
 furrr           0.2.2      2021-01-29 [1] CRAN (R 4.0.2)                   
 future          1.21.0     2020-12-10 [1] CRAN (R 4.0.2)                   
 generics        0.1.0      2020-10-31 [1] CRAN (R 4.0.2)                   
 ggplot2       * 3.3.3      2020-12-30 [1] CRAN (R 4.0.2)                   
 glmnet        * 4.1-1      2021-02-21 [1] CRAN (R 4.0.2)                   
 globals         0.14.0     2020-11-22 [1] CRAN (R 4.0.2)                   
 glue            1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
 gower           0.2.2      2020-06-23 [1] CRAN (R 4.0.2)                   
 GPfit           1.0-8      2019-02-08 [1] CRAN (R 4.0.2)                   
 gtable          0.3.0      2019-03-25 [1] CRAN (R 4.0.2)                   
 h2o           * 3.32.0.1   2020-10-17 [1] CRAN (R 4.0.2)                   
 hardhat         0.1.5      2020-11-09 [1] CRAN (R 4.0.2)                   
 haven           2.3.1      2020-06-01 [1] CRAN (R 4.0.2)                   
 highr           0.8        2019-03-20 [1] CRAN (R 4.0.2)                   
 hms             1.0.0      2021-01-13 [1] CRAN (R 4.0.2)                   
 htmltools       0.5.1.1    2021-01-22 [1] CRAN (R 4.0.2)                   
 httr            1.4.2      2020-07-20 [1] CRAN (R 4.0.2)                   
 igraph          1.2.6      2020-10-06 [1] CRAN (R 4.0.2)                   
 infer         * 0.5.4      2021-01-13 [1] CRAN (R 4.0.2)                   
 ipred           0.9-11     2021-03-12 [1] CRAN (R 4.0.2)                   
 iterators       1.0.13     2020-10-15 [1] CRAN (R 4.0.2)                   
 job             0.1        2021-04-27 [1] Github (lindeloev/job@f687bf9)   
 jquerylib       0.1.3      2020-12-17 [1] CRAN (R 4.0.2)                   
 jsonlite        1.7.2      2020-12-09 [1] CRAN (R 4.0.2)                   
 kknn          * 1.3.1      2016-03-26 [1] CRAN (R 4.0.2)                   
 knitr           1.31       2021-01-27 [1] CRAN (R 4.0.2)                   
 labeling        0.4.2      2020-10-20 [1] CRAN (R 4.0.2)                   
 lattice         0.20-41    2020-04-02 [1] CRAN (R 4.0.2)                   
 lava            1.6.9      2021-03-11 [1] CRAN (R 4.0.2)                   
 lhs             1.1.1      2020-10-05 [1] CRAN (R 4.0.2)                   
 lifecycle       1.0.0      2021-02-15 [1] CRAN (R 4.0.2)                   
 listenv         0.8.0      2019-12-05 [1] CRAN (R 4.0.2)                   
 lubridate     * 1.7.10     2021-02-26 [1] CRAN (R 4.0.2)                   
 magrittr        2.0.1      2020-11-17 [1] CRAN (R 4.0.2)                   
 MASS            7.3-53.1   2021-02-12 [1] CRAN (R 4.0.2)                   
 Matrix        * 1.3-2      2021-01-06 [1] CRAN (R 4.0.2)                   
 memoise         2.0.0      2021-01-26 [1] CRAN (R 4.0.2)                   
 modeldata     * 0.1.0      2020-10-22 [1] CRAN (R 4.0.2)                   
 modelr          0.1.8      2020-05-19 [1] CRAN (R 4.0.2)                   
 modeltime     * 0.5.1.9000 2021-04-15 [1] local                            
 modeltime.h2o * 0.1.1.9000 2021-04-05 [1] local                            
 munsell         0.5.0      2018-06-12 [1] CRAN (R 4.0.2)                   
 nnet            7.3-15     2021-01-24 [1] CRAN (R 4.0.2)                   
 parallelly      1.24.0     2021-03-14 [1] CRAN (R 4.0.2)                   
 parsnip       * 0.1.5      2021-01-19 [1] CRAN (R 4.0.2)                   
 pillar          1.5.1      2021-03-05 [1] CRAN (R 4.0.2)                   
 pkgbuild        1.2.0      2020-12-15 [1] CRAN (R 4.0.2)                   
 pkgconfig       2.0.3      2019-09-22 [1] CRAN (R 4.0.2)                   
 pkgload         1.2.0      2021-02-23 [1] CRAN (R 4.0.2)                   
 plyr            1.8.6      2020-03-03 [1] CRAN (R 4.0.2)                   
 prettyunits     1.1.1      2020-01-24 [1] CRAN (R 4.0.2)                   
 pROC            1.17.0.1   2021-01-13 [1] CRAN (R 4.0.2)                   
 processx        3.4.5      2020-11-30 [1] CRAN (R 4.0.2)                   
 prodlim         2019.11.13 2019-11-17 [1] CRAN (R 4.0.2)                   
 progressr       0.7.0      2020-12-11 [1] CRAN (R 4.0.2)                   
 ps              1.6.0      2021-02-28 [1] CRAN (R 4.0.2)                   
 purrr         * 0.3.4      2020-04-17 [1] CRAN (R 4.0.2)                   
 R6              2.5.0      2020-10-28 [1] CRAN (R 4.0.2)                   
 Rcpp            1.0.6      2021-01-15 [1] CRAN (R 4.0.2)                   
 RcppParallel    5.0.3      2021-02-24 [1] CRAN (R 4.0.2)                   
 RCurl           1.98-1.2   2020-04-18 [1] CRAN (R 4.0.2)                   
 readr         * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                   
 readxl          1.3.1      2019-03-13 [1] CRAN (R 4.0.2)                   
 recipes       * 0.1.15     2020-11-11 [1] CRAN (R 4.0.2)                   
 remotes         2.2.0      2020-07-21 [1] CRAN (R 4.0.2)                   
 reprex          1.0.0      2021-01-27 [1] CRAN (R 4.0.2)                   
 rlang         * 0.4.10     2020-12-30 [1] CRAN (R 4.0.2)                   
 rmarkdown       2.7        2021-02-19 [1] CRAN (R 4.0.2)                   
 rpart         * 4.1-15     2019-04-12 [1] CRAN (R 4.0.2)                   
 rprojroot       2.0.2      2020-11-15 [1] CRAN (R 4.0.2)                   
 rsample       * 0.0.9      2021-02-17 [1] CRAN (R 4.0.2)                   
 rstudioapi      0.13       2020-11-12 [1] CRAN (R 4.0.2)                   
 rvest           1.0.0      2021-03-09 [1] CRAN (R 4.0.2)                   
 sass            0.3.1      2021-01-24 [1] CRAN (R 4.0.2)                   
 scales        * 1.1.1      2020-05-11 [1] CRAN (R 4.0.2)                   
 sessioninfo     1.1.1      2018-11-05 [1] CRAN (R 4.0.2)                   
 shape           1.4.5      2020-09-13 [1] CRAN (R 4.0.2)                   
 slider          0.1.5      2020-07-21 [1] CRAN (R 4.0.2)                   
 StanHeaders     2.21.0-7   2020-12-17 [1] CRAN (R 4.0.2)                   
 stringi         1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                   
 stringr       * 1.4.0      2019-02-10 [1] CRAN (R 4.0.2)                   
 styler          1.3.2      2020-02-23 [1] CRAN (R 4.0.2)                   
 survival        3.2-9      2021-03-14 [1] CRAN (R 4.0.2)                   
 testthat        3.0.2      2021-02-14 [1] CRAN (R 4.0.2)                   
 tibble        * 3.1.0      2021-02-25 [1] CRAN (R 4.0.2)                   
 tidymodels    * 0.1.2      2020-11-22 [1] CRAN (R 4.0.2)                   
 tidyr         * 1.1.3      2021-03-03 [1] CRAN (R 4.0.2)                   
 tidyselect      1.1.0      2020-05-11 [1] CRAN (R 4.0.2)                   
 tidyverse     * 1.3.0      2019-11-21 [1] CRAN (R 4.0.2)                   
 timeDate        3043.102   2018-02-21 [1] CRAN (R 4.0.2)                   
 timetk        * 2.6.1      2021-02-18 [1] local                            
 tune          * 0.1.3      2021-02-28 [1] CRAN (R 4.0.2)                   
 usethis         2.0.1      2021-02-10 [1] CRAN (R 4.0.2)                   
 utf8            1.2.1      2021-03-12 [1] CRAN (R 4.0.2)                   
 vctrs         * 0.3.6.9000 2021-02-19 [1] Github (r-lib/vctrs@9af59e9)     
 warp            0.2.0      2020-10-21 [1] CRAN (R 4.0.2)                   
 withr           2.4.1      2021-01-26 [1] CRAN (R 4.0.2)                   
 workflows     * 0.2.2      2021-03-10 [1] CRAN (R 4.0.2)                   
 workflowsets  * 0.0.1      2021-03-18 [1] CRAN (R 4.0.2)                   
 xfun            0.22       2021-03-11 [1] CRAN (R 4.0.2)                   
 xml2            1.3.2      2020-04-23 [1] CRAN (R 4.0.2)                   
 xts             0.12.1     2020-09-09 [1] CRAN (R 4.0.2)                   
 yaml            2.2.1      2020-02-01 [1] CRAN (R 4.0.2)                   
 yardstick     * 0.0.8      2021-03-28 [1] CRAN (R 4.0.2)                   
 zoo             1.8-9      2021-03-09 [1] CRAN (R 4.0.2)                   

[1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Hello, I recently found where the problem is.

In this example, we must select just 3 columns:
data_tbl <- My_DF %>% select(id, Date, Value)

If we don't select these 3 columns (so that all columns of My_DF end up in data_tbl), the error I reported above appears:

Error: Problem with `filter()` input `..1`.
x object '.key' not found
i Input `..1` is `.model_desc == "ACTUAL" | .key == "prediction"`.

To reproduce it, take this line from the tutorial:
data_tbl <- walmart_sales_weekly %>% select(id, Date, Weekly_Sales)
and remove the %>% select(id, Date, Weekly_Sales) part. The forecast then fails with the error above.

Is this a bug or a feature?
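
For anyone hitting the same error, the column-selection workaround might look roughly like the sketch below. My_DF here is a small made-up data frame standing in for the real data, and the Temperature and Note columns are hypothetical examples of "extra" columns. The sketch only shows the selection step and a quick check of what gets dropped; it does not explain why the extra columns lead to the missing .key column.

```r
library(dplyr)

# Hypothetical stand-in for My_DF: two series plus columns the model does not need
My_DF <- tibble(
  id          = factor(rep(c("A", "B"), each = 3)),
  Date        = rep(as.Date("2021-01-01") + 7 * 0:2, times = 2),
  Value       = c(10, 12, 11, 20, 19, 23),
  Temperature = c(40, NA, 42, 38, 41, NA),  # extra column (assumed), contains NAs
  Note        = "unused"                    # extra column (assumed)
)

# Keep only the series id, the date and the target before splitting and baking
data_tbl <- My_DF %>%
  select(id, Date, Value)

# List what was dropped, so the omission is deliberate rather than accidental
dropped_cols <- setdiff(names(My_DF), names(data_tbl))
dropped_cols
```

The resulting data_tbl can then go through time_series_split(), the recipe and the rest of the tutorial unchanged.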

Will need to look into this.