hamrel-cxu / EnbPI

Official code for: Conformal prediction interval for dynamic time-series (conference, ICML 21 Long Presentation) AND Conformal prediction for time-series (journal, IEEE TPAMI)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Ensemble batch prediction intervals (EnbPI)

Table of Contents

News and Notes

Open-source implementation

  • We are excited that the work has been integrated as a part of
    1. MAPIE, which is a scikit-learn-compatible module for predictive inference.
    2. Fortuna by Amazon AWS.
    3. PUNCC by the Artificial and Natural Intelligence Toulouse Institute.
    4. functime.ai for production-ready time series models.
    5. ConformalPrediction for Trustworthy AI in Julia.

Extension Works and Ideas

How to use

  • Required Dependency:
    • Basic modules: numpy, pandas, sklearn, scipy, matplotlib, seaborn.
    • Additional modules: statsmodels for implementing ARIMA, keras for building neural network and recurrent neural networks, and pyod for competing anomaly detection methods.
  • General Info and Tests: This work reproduces all experiments in Conformal Prediction Interval for Dynamic Time-series (Xu et al. 2021a). In particular,
    • tests_paper.ipynb provides an illustration of how to generate the main figures (Figure 1-4) in the paper. The code contents are nearly identical to those in tests_paper.py.
    • tests_paper+supp.py reproduces all figures, including additional ones found in the Appendix. It is written in Jupyter notebook format, so that they are meant to be executed line by line.
  • EnbPI implementation:
    • PI_class_EnbPI.py implements the class that contains EnbPI (line), Jackknife+-after-bootstrap (line, paper), Split/Inductive Conformal (line, paper), and Weighted Inductive Conformal (line, paper). We used ARIMA as another competing method.
    • Because conditional coverage (Figure 3) and anomaly detection (Figure 4) require problem-specific modifications, the code changes are not contained in "PI_class_EnbPI.py" but in their respective sections within tests_paper.py/tests_paper+supp.py.
  • Other Function Files:
    • utils_EnbPI.py primarily contain plotting functions for all figures except Figure 4.
    • PI_class_ECAD.py implements ECAD based on EnbPI (see [Xu et al. 2021a, Section 8.5, Algorithm 2]) and utils_ECAD.py contains helpers for anomaly detection.
  • Additional Files
    • The Data repository contains all dataset in our paperexcept the money laundry one for Figure 4 (due to size limit).
    • The Results repository is provided for your convenience to reproduce plots, since some experiments by neural network/recurrent neural networks can take some time to execute. It contains all .csv results files on all dataset.
  • Broad Usage: To wrap EnbPI around other regression models and/or use on other data, one should:
    • If other regression models: Make sure the model has methods .fit(X_train, Y_train) to train a predictor and .predict(X_predict) to make predictions on new data. Most models in sklearn or deep learning models built by keras/pytorch are capable of doing so.
    • If other data: We have assumed that all our datasets are save as pandas.DataFrame and convertible to numpy.array. However, such assumptions are purely computational. Please feel free to adjust the data formate as long as it can be processed by regression models of choice.

Poster Talk and Slides

  • The poster for our work is available via this link.
  • We are fortunate to pre-record a long presentation and give an oral presentation at the Proceedings of the 38th International Conference on Machine Learning (ICML 2021). The long presentation is available on Slideslive and the oral presentation will be given at the conference once the date is finalized.
  • The slide for the talk is available here.

FAQ

  1. Encountering "NotImplementedError: Cannot convert a symbolic Tensor (lstm_2/strided_slice:0) to a numpy array" when using RNN as the regression model:
  • See this github answer to resolve the problem, primarily due to numpy & python version issues.

References

  • Xu, Chen and Yao Xie (2023). Sequential Predictive Conformal Inference for Time Series. Proceedings of the 40th International Conference on Machine Learning, PMLR 202, 2023
  • Xu, Chen and Yao Xie (2023). Conformal prediction for time-series. Journal version, IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • Xu, Chen and Yao Xie (2021). Conformal prediction interval for dynamic time-series. Conference version, The Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021.
  • Xu, Chen and Yao Xie (2022). Conformal prediction set for time-series. ICML 2022 Distribution-Free Uncertainty Quantification workshop.
  • Xu, Chen and Yao Xie (2021). Conformal Anomaly Detection on Spatio-Temporal Observations with Missing Data. ICML 2021 Distribution-Free Uncertainty Quantification workshop.

About

Official code for: Conformal prediction interval for dynamic time-series (conference, ICML 21 Long Presentation) AND Conformal prediction for time-series (journal, IEEE TPAMI)

License:MIT License


Languages

Language:Jupyter Notebook 74.7%Language:Python 25.3%