Oracen-zz / MIDAS

Multiple imputation utilising denoising autoencoder for approximate Bayesian inference

Using MIDAS with time series or sequence data

fbertran opened this issue

Greetings,

I would like to know whether it is possible to impute time series or multilevel (especially two-level) data with MIDAS.
If not, do you plan to extend your algorithm to that setting? The micemd R package does this kind of imputation for multilevel data (Audigier, V. et al. (2017), arXiv:1702.00971), and some LSTM-based encoders try to cope with time series.
Best.

Hi there,

Yes, this is listed as a future feature. Adding the requisite recurrent cells significantly complicates optimisation, so for now we've avoided doing so. LSTM is the obvious choice, but GRUs seem more stable. The problem is structuring the data so that training can be done efficiently without constant IO between CPU and GPU. This is trivial if you're building an LSTM net for your own dataset, but far harder for me, writing code that has to work on any dataset.

For the moment, it is usually enough to pass manually lagged data into the additional_data argument, and MIDAS can generally learn the relationship to past and future variables. This is kind of like manually adding a polynomial of time.
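A minimal sketch of what "manually lagged data" could look like, using pandas only. The helper name `add_lags`, the example columns (`unit`, `t`, `y`) and the choice of one lag and one lead are all assumptions made for illustration; none of this is part of MIDAS. The resulting frame is the kind of thing one might then pass to the additional_data argument mentioned above.

```python
import pandas as pd

def add_lags(df, columns, lags=(1,), leads=(1,), group_col=None):
    """Build lagged and lead copies of `columns` (hypothetical helper)."""
    columns = list(columns)
    # Shift within each unit when group_col is given, so that values
    # from one series never leak into a neighbouring series.
    source = df.groupby(group_col)[columns] if group_col is not None else df[columns]
    shifted = {}
    for k in lags:
        past = source.shift(k)        # value observed k steps earlier
        for col in columns:
            shifted[f"{col}_lag{k}"] = past[col]
    for k in leads:
        future = source.shift(-k)     # value observed k steps later
        for col in columns:
            shifted[f"{col}_lead{k}"] = future[col]
    return pd.DataFrame(shifted, index=df.index)

# Toy panel: two units observed at three time points, with gaps in y.
df = pd.DataFrame({
    "unit": ["a", "a", "a", "b", "b", "b"],
    "t": [0, 1, 2, 0, 1, 2],
    "y": [1.0, None, 3.0, 10.0, 11.0, None],
})
df = df.sort_values(["unit", "t"])    # rows must be in time order before shifting
lagged = add_lags(df, ["y"], lags=(1,), leads=(1,), group_col="unit")
```

Note that the lagged columns are themselves missing at the edges of each series and wherever the source value is missing, so they may need their own treatment before being used as auxiliary inputs.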

As for multilevel data, again, there is no explicit handling of this at this time. As far as I'm aware, very little work has been done on multilevel modelling with neural networks. I'd recommend concatenating the relevant hierarchical variables to the data, or passing them into additional_data. This ought to work a little better in future when I add feature embeddings for categorical variables.
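One way the hierarchical variables might be prepared, sketched with pandas only. The helper name `group_features` and the example columns (`school`, `score`) are hypothetical, and adding group-level means alongside the group indicators is an assumption on top of the suggestion above, not something MIDAS does itself.

```python
import pandas as pd

def group_features(df, group_col, value_cols):
    """Encode level-two membership as plain numeric columns (hypothetical helper)."""
    # One indicator column per group, so the network can tell clusters apart.
    dummies = pd.get_dummies(df[group_col], prefix=group_col, dtype=float)
    # Group means of the observed values, repeated on every row of the group.
    means = (
        df.groupby(group_col)[list(value_cols)]
        .transform("mean")            # missing values are skipped by default
        .add_suffix("_group_mean")
    )
    return pd.concat([dummies, means], axis=1)

# Toy two-level data: pupils nested in schools, one score missing.
data = pd.DataFrame({
    "school": ["s1", "s1", "s2", "s2"],
    "score": [60.0, None, 80.0, 90.0],
})
feats = group_features(data, "school", ["score"])
```

With many groups the indicator columns become wide and sparse, which is presumably where the feature embeddings mentioned above would help.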

Regards,

Alex