Heide-B / Rossman_Predictions

Rossman Predictions

This repository contains a sales prediction API built on the Rossman dataset. The relevant files are:

  • Processing and Model Development - Main notebook used to process the data and train a model
  • App.py - Contains the API
  • pred_model2.joblib - Contains the pickled trained model; referenced in App.py
  • preprocess.py - Contains the preprocessing functions applied to each new POST request; referenced in App.py
  • Store.csv - Contains data of the 1,115 Rossman Stores; referenced in App.py
  • Procfile, setup.sh, requirements.txt - Files needed to deploy the API on Heroku
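
As a rough illustration of how store-level data from Store.csv could be combined with an incoming request before prediction, here is a minimal sketch. It is not the code in App.py or preprocess.py: the function names, the sample rows, and the column names (Store, StoreType, Assortment, CompetitionDistance) are assumptions made for the example.

```python
import csv
import io

# Hypothetical excerpt standing in for Store.csv (assumed columns and values).
SAMPLE_STORE_CSV = """Store,StoreType,Assortment,CompetitionDistance
1,c,a,1270
2,a,a,570
"""


def load_store_lookup(csv_file):
    """Index store rows by integer store ID for fast lookup per request."""
    return {int(row["Store"]): row for row in csv.DictReader(csv_file)}


def enrich_request(request_fields, store_lookup):
    """Merge store-level attributes into the fields of one incoming request."""
    store_id = int(request_fields["Store"])
    if store_id not in store_lookup:
        raise KeyError(f"Unknown store ID: {store_id}")
    merged = dict(store_lookup[store_id])
    merged.update(request_fields)  # values sent in the request win on conflict
    return merged


store_lookup = load_store_lookup(io.StringIO(SAMPLE_STORE_CSV))
features = enrich_request({"Store": 2, "Promo": 1}, store_lookup)
print(features)
```

In a real service the lookup would be built once at startup from the file on disk, so each request only pays for a dictionary access.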

To develop this API, six steps were performed. These are explained in detail in the main notebook.

  1. Data cleaning - null values, encoding, data type formatting
  2. Feature Engineering and Reduction - Generated 4 new features and performed PCA to determine the relevant features
  3. Model Selection - Tested 5 models, using adjusted R-squared and RMSE as the main metrics
  4. Pickle and Export - Exported the selected trained model (CatBoost) and the preprocessing functions
  5. API Development - A simple API was developed with an initial landing page and a /predict endpoint that accepts a JSON POST request and returns the predicted sales
  6. Deployment - The API was deployed to Heroku at https://mynt-hbalcera.herokuapp.com/predict; testing was done using Postman
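
For readers who prefer a script over Postman, the following sketch shows one way to build a JSON POST request for the /predict endpoint using only the Python standard library. The payload field names and values are assumptions based on the public Rossmann dataset columns; the schema the API actually expects is defined in App.py and preprocess.py and may differ. The helper name `build_predict_request` is also hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the deployment step above.
API_URL = "https://mynt-hbalcera.herokuapp.com/predict"


def build_predict_request(payload, url=API_URL):
    """Build (but do not send) a JSON POST request for the prediction endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload; field names are assumed, not confirmed by the repository.
example_payload = {
    "Store": 1,
    "DayOfWeek": 5,
    "Date": "2015-07-31",
    "Open": 1,
    "Promo": 1,
    "StateHoliday": "0",
    "SchoolHoliday": 1,
}

request = build_predict_request(example_payload)

# Sending is left commented out because it needs the deployment to be live:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```

Keeping request construction separate from sending makes it easy to inspect the body and headers before calling the live service.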

For any inquiries, kindly reach out to heidemlbalcera@gmail.com or https://www.linkedin.com/in/heidemae-balcera-sci/. This project is part of an application to Mynt.

Languages

Jupyter Notebook 99.4%, Python 0.5%, Shell 0.1%