Elicilla / DSx

Hands-on tutorials demonstrating the concepts of prediction engineering, feature engineering, and automation in data science.

DSx

Hands-on tutorials demonstrating the concepts of prediction engineering, feature engineering, and automation in data science. In a series of notebooks, we show how to build predictive models from raw data within a day, all using open source software.

Open source tools used

  • pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
  • Featuretools is a DARPA-sponsored open source library that enables data scientists to automatically extract features from time-varying data.
  • scikit-learn is a free software machine learning library for the Python programming language.
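
As a rough illustration of how these tools fit together, the sketch below holds a small feature table in a pandas DataFrame and fits a scikit-learn classifier on it. The column names (`trip_distance`, `passenger_count`, `long_trip`) and all values are made up for this example and are not taken from the notebooks; Featuretools is left out to keep the sketch short.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature table; values are invented for illustration only.
table = pd.DataFrame({
    "trip_distance": [0.5, 1.2, 3.4, 7.8, 0.9, 5.5, 2.1, 9.3],
    "passenger_count": [1, 2, 1, 3, 1, 2, 1, 4],
    "long_trip": [0, 0, 1, 1, 0, 1, 0, 1],  # target column
})

# Split the table into a feature matrix and a target vector.
X = table[["trip_distance", "passenger_count"]]
y = table["long_trip"]

# Fit a small random forest; random_state makes the run repeatable.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# Predict on the same rows, only to show the call shape.
predictions = model.predict(X)
```

In practice the model would be evaluated on rows held out from training; that step is omitted here for brevity.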

Concepts to learn

  • Prediction engineering
  • Feature engineering
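
The two concepts can be sketched with pandas alone. Here prediction engineering is taken to mean turning raw data into labeled training examples relative to a cutoff time, and feature engineering to mean computing model inputs from data before that cutoff. The transaction table, the cutoff date, the spending threshold, and the helper names `make_labels` and `make_features` are all assumptions made for this example, not code from the notebooks.

```python
import pandas as pd

# Hypothetical raw transaction log; values are invented for illustration.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "timestamp": pd.to_datetime([
        "2020-01-05", "2020-01-20", "2020-02-10",
        "2020-01-15", "2020-02-20", "2020-01-25",
    ]),
    "amount": [20.0, 35.0, 50.0, 10.0, 5.0, 80.0],
})

cutoff = pd.Timestamp("2020-02-01")  # assumed cutoff time


def make_labels(df, cutoff, threshold=40.0):
    """Prediction engineering: did the customer spend >= threshold after the cutoff?"""
    future = df[df["timestamp"] >= cutoff]
    spend = future.groupby("customer_id")["amount"].sum()
    # Customers with no purchases after the cutoff get a spend of 0.
    customers = df["customer_id"].unique()
    return (spend.reindex(customers, fill_value=0.0) >= threshold).rename("label")


def make_features(df, cutoff):
    """Feature engineering: aggregate only data observed before the cutoff."""
    past = df[df["timestamp"] < cutoff]
    feats = past.groupby("customer_id")["amount"].agg(["count", "sum", "mean"])
    return feats.add_prefix("amount_")


labels = make_labels(transactions, cutoff)
features = make_features(transactions, cutoff)

# One row per customer: features from the past, label from the future.
training_table = features.join(labels, how="inner")
```

Restricting features to rows before the cutoff keeps information from the label window out of the model inputs.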

Notebooks

  • NYC-Taxi-Dataset - Learn feature engineering
  • Retail-Dataset - Learn prediction engineering

Installation

Linux

sh install_linux.sh
source venv/bin/activate
pip install -r requirements.txt
jupyter notebook

Mac

sh install_osx.sh
source venv/bin/activate
pip install -r requirements.txt
jupyter notebook
