Store-Item-Demand-Forecasting

Mission statement:

A data science project for analysing and forecasting item demand in stores. The data consists of multiple time series: there are 500 store-item combinations, and we analyse each series and forecast its future values.

Dataset: Store Item Demand Forecasting Challenge (Kaggle)

Procedure

We use train.csv, which contains 5 years of daily sales (2013 to 2017) for 50 items in 10 stores. Sales are given for each item in each store, i.e. 500 series of 5 years each. Sales values range from about 2 to 50.
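As a quick sanity check on the dataset size described above, the expected number of series and rows can be worked out from the store, item, and date ranges. This is only a sketch: it assumes one row per store, item, and calendar day, covering 1 January 2013 through 31 December 2017 with no gaps, and the names below are illustrative.

```python
from datetime import date

# Counts taken from the description above.
N_STORES = 10
N_ITEMS = 50


def days_between(first: date, last: date) -> int:
    """Number of calendar days from first to last, both inclusive."""
    return (last - first).days + 1


# Assumption: daily data with no missing days between these two dates.
n_series = N_STORES * N_ITEMS
n_days = days_between(date(2013, 1, 1), date(2017, 12, 31))
n_rows = n_series * n_days

print(f"{n_series} series x {n_days} days = {n_rows} rows")
```

Comparing `n_rows` against the length of the loaded file is a cheap way to detect missing days or duplicated rows before any modelling.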

Data Visualization

Store-wise and Item-wise sales arranged according to maximum sales

From the figures, we can see that each store and each item has a trend and a seasonality component.

Day-wise and Month-wise and Year-wise sales

From the figures, we can see that sales increase from year to year.
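The aggregates behind day-wise, month-wise, and year-wise plots of this kind can be sketched with pandas as below. The column names `date`, `store`, `item`, and `sales` are assumed to match train.csv, the helper names are hypothetical, and the small frame is synthetic stand-in data rather than the real dataset.

```python
import numpy as np
import pandas as pd


def add_date_parts(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with year, month and day-of-week columns."""
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    out["year"] = out["date"].dt.year
    out["month"] = out["date"].dt.month
    out["day_of_week"] = out["date"].dt.dayofweek  # Monday = 0
    return out


def total_sales_by(df: pd.DataFrame, key: str) -> pd.Series:
    """Sum sales over all rows that share the same value of key."""
    return df.groupby(key)["sales"].sum()


# Synthetic stand-in: one store, one item, constant sales per year.
dates = pd.date_range("2013-01-01", "2014-12-31", freq="D")
demo = pd.DataFrame({
    "date": dates,
    "store": 1,
    "item": 1,
    "sales": np.where(dates.year == 2013, 10, 12),
})

parts = add_date_parts(demo)
yearly = total_sales_by(parts, "year")
monthly = total_sales_by(parts, "month")
weekday = total_sales_by(parts, "day_of_week")
print(yearly)
```

On the real data, the same calls would be applied to the frame read from train.csv, and each resulting series could be passed to a plotting routine.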

Item-wise and Store-wise sales

Feature selection:

Categorical Embedding

The task of entity embedding is to map discrete values to a multi-dimensional space in which values with similar function output lie close to each other. In other words, similar values of a categorical variable end up near one another in the embedding space. Once all categorical variables are represented by entity embeddings, the embedding layers and the inputs of any continuous variables are concatenated. The merged layer is treated like a normal input layer in a neural network, and other layers can be built on top of it.

We added embedding layers for year, month of the year, week, and day of the week, all extracted from the date feature, as well as embedding layers for stores and items. We then concatenated all the embedding layers, which resulted in 62 unique features after eliminating the redundancy.
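The lookup-and-concatenate step described above can be illustrated with plain NumPy. This is a minimal sketch, not the project's model: the cardinalities and embedding widths are hypothetical (so the output width is not the 62 features mentioned above), only four of the variables are shown, and the tables are randomly initialised, whereas in a real network they would be learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical number of categories and embedding width per variable.
CARDINALITY = {"store": 10, "item": 50, "month": 12, "day_of_week": 7}
EMBED_DIM = {"store": 4, "item": 8, "month": 4, "day_of_week": 3}

# One embedding table per categorical variable: row i is the vector for category i.
tables = {
    name: rng.normal(scale=0.05, size=(CARDINALITY[name], EMBED_DIM[name]))
    for name in CARDINALITY
}


def embed_and_concat(codes: dict, tables: dict) -> np.ndarray:
    """Look up each variable's vectors and concatenate them feature-wise.

    codes maps a variable name to a zero-based integer array of shape (batch,).
    The result has shape (batch, sum of embedding widths).
    """
    parts = [tables[name][codes[name]] for name in tables]
    return np.concatenate(parts, axis=1)


batch = {
    "store": np.array([0, 9]),
    "item": np.array([4, 49]),
    "month": np.array([0, 11]),
    "day_of_week": np.array([2, 6]),
}
features = embed_and_concat(batch, tables)
print(features.shape)  # (2, 19): 4 + 8 + 4 + 3 columns
```

The concatenated matrix plays the role of the merged input layer; dense, recurrent, or convolutional layers would consume it from there.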

Train, Test and Validation sets:

We used 2017 as our test data and 2013-2016 as our train data. The train data has 730,500 samples, and the test data has 182,500 samples. For validation, we used a leave-6-out strategy, in which 6 months are held out as validation data and the remaining 42 months are used for training, for each set of samples. We fine-tuned the results by holding out a different 6-month period in each year, which gives 8 sets in total for 2013-2016.
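The split sizes and the eight validation sets can be reproduced with a short sketch using only the standard library. It assumes daily data for 500 series and assumes that each 6-month validation window is a calendar half-year (January to June or July to December); the text above does not specify the exact window boundaries, and the function names are illustrative.

```python
from datetime import date, timedelta

N_SERIES = 500  # 10 stores x 50 items


def daily_dates(first: date, last: date) -> list:
    """All calendar days from first to last, both inclusive."""
    n = (last - first).days + 1
    return [first + timedelta(days=i) for i in range(n)]


all_days = daily_dates(date(2013, 1, 1), date(2017, 12, 31))
train_days = [d for d in all_days if d.year <= 2016]
test_days = [d for d in all_days if d.year == 2017]


def half_year_folds(years) -> list:
    """One (year, half) key per validation window; half is 1 or 2."""
    return [(year, half) for year in years for half in (1, 2)]


def split_fold(days: list, fold: tuple) -> tuple:
    """Split days into (fit, validation) for the given (year, half) window."""
    year, half = fold

    def in_validation(d: date) -> bool:
        return d.year == year and ((d.month <= 6) == (half == 1))

    fit = [d for d in days if not in_validation(d)]
    val = [d for d in days if in_validation(d)]
    return fit, val


folds = half_year_folds(range(2013, 2017))
fit_days, val_days = split_fold(train_days, folds[0])

print(len(train_days) * N_SERIES, len(test_days) * N_SERIES, len(folds))
```

Each fold keeps 2017 untouched, so the test year is never used for tuning.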

Deep models

In this category, after concatenating all the embedding layers, we applied Neural Networks, Long Short-Term Memory (LSTM), Temporal Convolutional Network (TCN), a Hybrid model (TCN + LSTM), and an LSTM Autoencoder.