NaquibAlam / M5_Forecasting_Accuracy_kaggle

It contains the code and data for the M5 Forecasting - Accuracy competition on Kaggle.

M5_Forecasting_Accuracy_kaggle

It contains the code and data for the M5 Forecasting - Accuracy competition on Kaggle. Competition details and data are available here: https://www.kaggle.com/c/m5-forecasting-accuracy/overview

store_and_week_wise_lgbm_v1.ipynb

  • This solution builds a separate model for each of the 10 stores and each of the 4 forecast weeks (days 1-7, 8-14, 15-21, 22-28), for a total of 40 models per train_train_day_x (different validation periods are used for robust evaluation and hyper-parameter tuning).
  • The following features are used:
    • General base features
    • General price based features
    • General calendar and time based features
    • Lag and rolling mean/std features
    • Target encoding features for categorical variables
  • LightGBM with Tweedie loss is used for modeling.
  • More implementation details can be found here: https://www.kaggle.com/c/m5-forecasting-accuracy/discussion/163216
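
The store-and-week layout described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the column names (`id`, `store_id`, `d`, `sales`), the helper names, the rolling windows, and every LightGBM parameter value except the Tweedie objective are assumptions. It also assumes that each week's lag features are shifted by the last day of that week, so that they would still be known when forecasting any day in the week.

```python
import pandas as pd

# The 10 M5 stores and the 4 forecast weeks (first day, last day).
STORES = ["CA_1", "CA_2", "CA_3", "CA_4", "TX_1", "TX_2", "TX_3",
          "WI_1", "WI_2", "WI_3"]
WEEKS = {1: (1, 7), 2: (8, 14), 3: (15, 21), 4: (22, 28)}

# Assumed values; only the tweedie objective is stated in the README.
LGB_PARAMS = {
    "objective": "tweedie",
    "tweedie_variance_power": 1.1,
    "learning_rate": 0.05,
    "verbose": -1,
}


def model_grid():
    """One (store, week) pair per model: 10 stores x 4 weeks = 40 models."""
    return [(store, week) for store in STORES for week in WEEKS]


def add_lag_features(df, week, windows=(7, 28)):
    """Add lag and rolling mean/std features for one forecast week.

    Assumes `d` is an integer day index and `sales` is the target.
    """
    shift = WEEKS[week][1]  # e.g. 14 for week 2
    out = df.sort_values(["id", "d"]).copy()
    grouped = out.groupby("id")["sales"]
    out[f"lag_{shift}"] = grouped.shift(shift)
    for w in windows:
        out[f"rmean_{shift}_{w}"] = grouped.transform(
            lambda s: s.shift(shift).rolling(w).mean())
        out[f"rstd_{shift}_{w}"] = grouped.transform(
            lambda s: s.shift(shift).rolling(w).std())
    return out


def train_one(df, store, week, features, num_boost_round=200):
    """Train the model for a single (store, week) pair."""
    import lightgbm as lgb  # imported lazily; only needed for training

    part = add_lag_features(df[df["store_id"] == store], week)
    part = part.dropna(subset=features)
    train_set = lgb.Dataset(part[features], label=part["sales"])
    return lgb.train(LGB_PARAMS, train_set, num_boost_round=num_boost_round)
```

Looping `train_one` over `model_grid()` would produce the 40 models for one validation period; repeating that loop per validation period gives the full set.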

simple-lgbm-groupkfold-cv.ipynb

  • This notebook explores how the GroupKFold CV strategy in scikit-learn can be used for hyper-parameter tuning on time-series data.
  • No hyper-parameter tuning is done in this notebook; GroupKFold CV is only used to validate the model's performance, but the same methodology can be used for hyper-parameter tuning.
  • You can learn more about GroupKFold CV, and how it reduces the possibility of leakage in time-series CV, from the Markdown section of the notebook.
  • A custom objective function and validation metric are used, which work as a proxy for WRMSSE, the competition's evaluation metric.
  • LightGBM with the regression (default) loss is used for modeling.
  • The data for this notebook are available at:
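
As a rough illustration of the GroupKFold validation described above, the sketch below keeps all rows of one group in the same fold. Several things here are assumptions rather than the notebook's choices: the groups are taken to be calendar weeks, the data are synthetic, and `Ridge` stands in for LightGBM so the example only needs scikit-learn (any estimator with `fit`/`predict` could be passed instead). The `rmsse` helper is the per-series scaled error from the competition's metric definition, not the notebook's custom proxy.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge
from sklearn.model_selection import GroupKFold


def rmsse(y_true, y_pred, y_train):
    """Root mean squared scaled error for a single series.

    The squared forecast error is scaled by the mean squared one-step
    difference of the training history.
    """
    scale = np.mean(np.diff(y_train) ** 2)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2) / scale))


def group_kfold_rmse(model, X, y, groups, n_splits=3):
    """Return one validation RMSE per GroupKFold split."""
    scores = []
    splitter = GroupKFold(n_splits=n_splits)
    for train_idx, valid_idx in splitter.split(X, y, groups):
        # A group (here: a week) never appears on both sides of a split.
        assert not set(groups[train_idx]) & set(groups[valid_idx])
        fitted = clone(model).fit(X[train_idx], y[train_idx])
        pred = fitted.predict(X[valid_idx])
        scores.append(float(np.sqrt(np.mean((y[valid_idx] - pred) ** 2))))
    return scores


# Synthetic stand-in data: 84 days, grouped into 12 weeks of 7 days.
rng = np.random.default_rng(0)
n_days = 84
X = rng.normal(size=(n_days, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=n_days)
groups = np.arange(n_days) // 7

scores = group_kfold_rmse(Ridge(alpha=1.0), X, y, groups)
print(scores)
```

GroupKFold only guarantees that groups stay intact; it does not order folds in time, so how the groups are defined decides how much temporal leakage is actually avoided.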


Languages

Language:Jupyter Notebook 100.0%