kcompher / debugging-data-science

Materials and resources for Debugging Data Science Live Training

Debugging Data Science Live Training (part 1 and part 2)

This repository contains the exercises and data for the Debugging Data Science Live Training, a hands-on guide to applying machine learning in the wild. Through an end-to-end data science example, we will walk through defining an appropriate problem, building and evaluating a model, and improving its performance with a variety of more advanced techniques. The focus is on debugging the problems that arise during model training and on overcoming them to improve the effectiveness of the model.

Please do not hesitate to reach out to me directly via email at jondinu@gmail.com or on Twitter @jonathandinu.

If you find any errors in the code or materials, please open a GitHub issue in this repository.

Prerequisites

  • Experience with an object-oriented programming language, e.g., Python (all code demos during the training will be in Python).
  • Familiarity with the basics of supervised machine learning.
  • A working knowledge of the scientific Python libraries (pandas and scikit-learn) is helpful but not required.

Course Set-up

  1. Download the appropriate Python 3.7 Anaconda Distribution for your operating system: https://www.anaconda.com/distribution/
  2. In a Terminal: git clone https://github.com/hopelessoptimism/debugging-data-science.git
  3. cd debugging-data-science
  4. conda env create -f environment.yml
  5. conda activate debugging-data
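
Once the environment is activated, a quick sanity check along the following lines can confirm that the scientific libraries are visible to the interpreter. This is a minimal sketch and not part of the course materials: the `check_modules` helper and the `REQUIRED_MODULES` list are illustrative names, and the list assumes that environment.yml installs pandas and scikit-learn (the two libraries named in the prerequisites).

```python
import importlib.util
import sys

# Assumption: these are the import names the notebooks rely on. The
# authoritative list is whatever environment.yml actually installs.
REQUIRED_MODULES = ["pandas", "sklearn"]


def check_modules(names):
    """Return {module_name: bool} saying whether each module can be found."""
    # find_spec locates a top-level module without importing it and
    # returns None when the module is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

If a module is reported as missing, the most likely cause is that the conda environment from step 5 is not the active one in the current terminal.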

Recommended Preparation

Recommended Follow-up

Data
