From Data Science to MLOps workshop
For this workshop we are going to work with the following dataset:
https://kaggle.com/c/house-prices-advanced-regression-techniques/overview (Predict Prices)
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.
With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.
- EDA
- Feature Engineering
- Modeling
- Pipelines
- Deployment with Flask
First we need to create a virtual environment for the project, so that every dependency is tracked in one place. It is also useful to pin an explicit version of Python.
Install the package for creating a virtual environment:
$ pip install virtualenv
Create a new virtual environment:
$ virtualenv venv
Activate the virtual environment:
$ source venv/bin/activate
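To confirm that the activation worked and that the interpreter is recent enough, a small check like the one below can be run from the same terminal. This is only a sketch: the function names and the minimum version (3, 8) are assumptions, not something the project prescribes.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and the
    # stdlib venv module) set sys.base_prefix to the original installation.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def python_is_at_least(minimum=(3, 8)) -> bool:
    """Return True when the interpreter version is >= minimum (assumed value)."""
    return sys.version_info[:2] >= tuple(minimum)


print("inside a virtual environment:", in_virtualenv())
print("python version ok:", python_is_at_least())
```

If the first line prints False, the environment was probably not activated in the current shell.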
With the virtual environment active, we can install the dependencies listed in requirements.txt:
$ pip install -r requirements.txt
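Since the point of the virtual environment is to keep track of every dependency, it can help to verify that what is installed matches what requirements.txt pins. The sketch below assumes the file uses simple `name==version` lines; anything else (ranges, `-r` includes, options) is skipped. The helper names are hypothetical.

```python
from importlib import metadata
from pathlib import Path


def parse_pins(text: str) -> dict:
    """Extract {package: version} from 'name==version' lines, ignoring the rest."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        name = name.split("[", 1)[0].strip()      # drop extras such as pkg[extra]
        version = version.split(";", 1)[0].strip()  # drop environment markers
        pins[name] = version
    return pins


def find_mismatches(pins: dict) -> dict:
    """Return {package: (wanted, installed)} for packages that differ or are missing."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems


requirements = Path("requirements.txt")
if requirements.exists():
    print(find_mismatches(parse_pins(requirements.read_text())))
```

An empty dictionary means every pinned package is installed at the pinned version.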
After all the dependencies are installed, we can run the script in code/train.py. This script takes the input data and outputs a trained model and a pipeline for our web service.
$ python code/train.py
Finally we can test our web application by running:
$ python app.py
Now that the web application runs locally, we can use the Dockerfile to build an image that runs it inside a container:
$ docker build . -t from_ds_to_mlops
And now we can test our application using Docker:
$ docker run -p 8000:8000 from_ds_to_mlops
With the container running, test the service from the terminal using the calls in tests/example_calls.txt.
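A request to the running service can also be sent from Python using only the standard library. The sketch below is illustrative: port 8000 comes from the docker run command, but the host, the `/predict` path, the JSON payload shape and the feature values are all assumptions, so adjust them to match the calls in tests/example_calls.txt.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # port taken from the docker run command
PREDICT_PATH = "/predict"           # hypothetical endpoint, check app.py

# Illustrative payload; the fields the service actually expects may differ.
EXAMPLE_HOUSE = {"LotArea": 8450, "OverallQual": 7, "YearBuilt": 2003}


def build_request(features, base_url=BASE_URL, path=PREDICT_PATH):
    """Build a JSON POST request without sending it."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, timeout=10.0):
    """Send the request and decode the JSON response (requires a running service)."""
    with urllib.request.urlopen(build_request(features), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container up, calling `predict(EXAMPLE_HOUSE)` sends the request; if the endpoint or payload differs from these assumptions, the service will likely answer with an HTTP error.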