HassanSalamB / Kickstarter_Campaign_Project

ds-modeling-pipeline

Skeleton project for building a simple model with Python scripts. This is the simplest way to do it: we train a simple model in a Jupyter notebook, where we select only some features and do minimal cleaning. The result is then moved into simple Python scripts.

The data used is the coffee quality dataset.

Requirements:

  • pyenv with Python: 3.8.5

Environment

Same procedure as last time...

Use the requirements file in this repo to create a new environment.

make setup

# or

pyenv local 3.8.5
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Usage

To train the model, store the test data in the data folder, and store the model in the models folder, run:

#activate env
source .venv/bin/activate

python train.py  

To check that predict works on the test set you created, run:

python predict.py models/linear_regression_model.sav data/X_test.csv data/y_test.csv
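The command above passes three positional arguments: the stored model, the test features, and the test targets. As a rough illustration of how a script could handle them, here is a minimal sketch using only the standard library. It is not the actual predict.py from this repo. It assumes the .sav file was written with pickle, that both CSV files have a header row and contain only numeric columns (no index column), and that the loaded model exposes a scikit-learn style predict method. The function names parse_args, read_numeric_csv, mean_squared_error and main are hypothetical.

```python
import argparse
import csv
import pickle


def parse_args(argv=None):
    """Parse the three positional arguments used in the command above."""
    parser = argparse.ArgumentParser(
        description="Evaluate a stored model on a test set."
    )
    parser.add_argument("model_path", help="pickled model file (assumed format)")
    parser.add_argument("x_test_path", help="CSV file with the test features")
    parser.add_argument("y_test_path", help="CSV file with the test targets")
    return parser.parse_args(argv)


def read_numeric_csv(path):
    """Read a CSV with a header row into a list of rows of floats."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return [[float(value) for value in row] for row in reader]


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def main(argv=None):
    args = parse_args(argv)

    # Assumption: the model was stored with pickle.dump during training.
    with open(args.model_path, "rb") as f:
        model = pickle.load(f)

    X_test = read_numeric_csv(args.x_test_path)
    # Assumption: the target file has a single column.
    y_test = [row[0] for row in read_numeric_csv(args.y_test_path)]

    # Assumption: predict returns one number per input row.
    predictions = [float(p) for p in model.predict(X_test)]
    mse = mean_squared_error(y_test, predictions)
    print(f"MSE on test set: {mse:.4f}")
    return mse
```

To use the sketch as a script, call main() at the bottom of the file. Only unpickle model files you created yourself, since loading a pickle can execute arbitrary code.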

Limitations

Development libraries are part of the production environment. Normally these would be kept separate, as the production code should be as slim as possible.

About

License:MIT License

