GMModel Tutorial Bug
ericho-bbai opened this issue · comments
In the Global Model example, section 2.2 (https://github.com/facebookresearch/Kats/blob/main/tutorials/kats_205_globalmodel.ipynb), `test_TSs` was generated in the same way as `train_TSs`, with the same start time. Is this a typo? Wouldn't there be data leakage, since `test_TSs` is essentially a subset of `train_TSs`?
Also, why does GMModel take a list of `TimeSeriesData`? If we have one time series, are we supposed to create a list of `TimeSeriesData` via the expanding window method?
Thanks for the questions!
For 1): when training a GM, say we feed in a TS with end date '2022-02-01'. Based on the model settings, only the data until '2022-01-01' is fed into the NN, and the data between '2022-01-02' and '2022-02-01' is used to compute the loss. In the prediction stage, the model takes data until '2022-02-01' to forecast dates after '2022-02-01'. In other words, the GM never sees information after '2022-02-01', so there is no concern about data leakage. In a real use case (and in our evaluation), `test_TSs` can be unseen time series (i.e., not appearing in `train_TSs`).
2) It should be able to directly take a `TimeSeriesData` object. :)
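To make the windowing in 1) concrete, here is a rough sketch in plain Python. It does not use the Kats API: `daily_range` and `split_for_training` are hypothetical helpers, and the 31-day loss window is an assumption chosen only to match the dates above (the real window depends on the model settings).

```python
from datetime import date, timedelta


def daily_range(start, end):
    """Return every calendar day from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def split_for_training(timestamps, loss_window):
    """Split one series into the part fed to the NN and the part used only for the loss.

    Hypothetical helper for illustration; loss_window is an assumed setting.
    """
    if not 0 < loss_window < len(timestamps):
        raise ValueError("loss_window must be between 1 and len(timestamps) - 1")
    return timestamps[:-loss_window], timestamps[-loss_window:]


# A daily series whose last observation is 2022-02-01.
timestamps = daily_range(date(2021, 11, 1), date(2022, 2, 1))

# Training: the last 31 days are held back and used only to compute the loss.
model_input, loss_target = split_for_training(timestamps, loss_window=31)
print("NN input ends:", model_input[-1])
print("loss window:", loss_target[0], "to", loss_target[-1])

# Prediction: the whole series is the input, and every forecast date is
# strictly after the last observed timestamp.
forecast_dates = [timestamps[-1] + timedelta(days=i) for i in range(1, 8)]
print("first forecast date:", forecast_dates[0])
```

Both stages stop at the series' last timestamp, so passing the same series in `train_TSs` and `test_TSs` does not expose the model to the values it is asked to forecast.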