Data Science Notebooks
Data science Python notebooks—a collection of Jupyter notebooks on machine learning, deep learning, statistical inference, data analysis and visualization.
This repo contains Python Jupyter notebooks I created to experiment with and learn the core libraries essential for working with data in Python, to work through exercises, assignments, and coursework, and to explore subjects I find interesting, such as machine learning and deep learning. Familiarity with Python as a language is assumed.
The essential core libraries that I will be focusing on for working with data are NumPy, Pandas, Matplotlib, PyTorch, TensorFlow, Keras, Caffe, scikit-learn, spaCy, NLTK, Gensim, and related packages.
Table of Contents
- Data Science Notebooks
- Table of Contents
- How to Use this Repo
- About
- Software
- Deep Learning
- Projects
- DL Assignments, Exercises or Course Works
- fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2018 (v2): Oct - Dec 2017
- fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2019 (v3): Oct - Dec 2018
- fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2017 (v1): Feb - Apr 2017
- fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2018 (v2): Mar - May 2018
- Machine Learning
- Libraries or Frameworks
- Kaggle Competitions
- License
How to Use this Repo
- Run the code using the Jupyter notebooks available in this repository's notebooks directory.
- Launch a live notebook server with these notebooks using Binder.
About
The notebooks were written and tested with Python 3.6, though other Python 3.x versions should work in nearly all cases.
See index.ipynb for an index of the notebooks available.
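To see what is available without opening index.ipynb, a short script can enumerate the notebook files. This is a minimal sketch, assuming the notebooks sit under the notebooks directory mentioned above; the function name list_notebooks is hypothetical and not part of this repo.

```python
from pathlib import Path


def list_notebooks(root="notebooks"):
    """Return sorted, root-relative paths of all .ipynb files under root."""
    root_path = Path(root)
    return sorted(
        path.relative_to(root_path).as_posix()
        for path in root_path.rglob("*.ipynb")
        # Skip the autosave copies Jupyter keeps in .ipynb_checkpoints.
        if ".ipynb_checkpoints" not in path.parts
    )


if __name__ == "__main__":
    for name in list_notebooks():
        print(name)
```

Run it from the repository root; pass a different directory to list_notebooks if your checkout is laid out differently.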
Software
The code in the notebooks was tested with Python 3.6; most (but not all) of it should also work correctly with other Python 3.x versions.
The packages I used to run the code in the notebooks are listed in requirements.txt. Note that some of these exact version numbers may not be available on your platform, so you may have to tweak them for your own use. To install the requirements using conda, run the following at the command line:
$ conda install --file requirements.txt
To create a stand-alone environment named DSN with Python 3.6 and all the required package versions, run the following:
$ conda create -n DSN python=3.6 --file requirements.txt
You can read more about using conda environments in the Managing Environments section of the conda documentation.
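Because some pinned versions may not be available on your platform, it can help to see which pins differ from what you have installed before tweaking them. The sketch below assumes requirements.txt uses simple name=version or name==version lines, which may not match the actual file. The helper names parse_pins and find_mismatches are hypothetical, and the installed versions are passed in as a plain dict rather than queried from the environment.

```python
import re

# Matches "name=1.2.3" (conda style) or "name==1.2.3" (pip style).
# A trailing conda build string such as "=py36_0" is ignored.
_PIN = re.compile(r"^([A-Za-z0-9_.\-]+)\s*={1,2}\s*([^\s=#]+)")


def parse_pins(text):
    """Map package name -> pinned version for each pinned line in text."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _PIN.match(line)
        if match:  # unpinned or range specs (e.g. "pandas", "x>=1") are skipped
            pins[match.group(1).lower()] = match.group(2)
    return pins


def find_mismatches(pins, installed):
    """Return {name: (pinned, installed_or_None)} where the two differ."""
    return {
        name: (version, installed.get(name))
        for name, version in pins.items()
        if installed.get(name) != version
    }
```

For example, find_mismatches(parse_pins(open("requirements.txt").read()), installed) reports each pin that is missing from, or differs in, an installed dict you supply.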
Deep Learning
Projects
Notebook | Description |
---|---|
Deep Painterly Harmonization | Implement Deep Painterly Harmonization paper in PyTorch |
Language modelling in Malay language for downstream NLP tasks | Implement Universal Language Model Fine-tuning for Text Classification (ULMFiT) in PyTorch |
Not Hotdog AI Camera mobile app | Asia virtual study group project for fast.ai deep learning part 1, v3 course. Ship a convolutional neural network on Android/iOS with PyTorch and Android Studio/Xcode |
DL Assignments, Exercises or Course Works
fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2018 (v2): Oct - Dec 2017
Notebook | Description |
---|---|
lesson1, lesson1-vgg, lesson1-rxt50, keras_lesson1 | Lesson 1 - Recognizing Cats and Dogs |
lesson2-image_models | Lesson 2 - Improving Your Image Classifier |
lesson3-rossman | Lesson 3 - Understanding Convolutions |
lesson4-imdb | Lesson 4 - Structured Time Series and Language Models |
lesson5-movielens | Lesson 5 - Collaborative Filtering; Inside the Training Loop |
lesson6-rnn, lesson6-sgd | Lesson 6 - Interpreting Embeddings; RNNs from Scratch |
lesson7-cifar10, lesson7-CAM | Lesson 7 - ResNets from Scratch |
fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2019 (v3): Oct - Dec 2018
Deep Learning Part 1: 2019 Edition
fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2017 (v1): Feb - Apr 2017
Deep Learning Part 2: 2017 Edition
fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2018 (v2): Mar - May 2018
Deep Learning Part 2: 2018 Edition
Machine Learning
ML Assignments, Exercises or Course Works
Andrew Ng's "Machine Learning" class on Coursera
fast.ai's machine learning course
- Lesson 1 - Random Forest
- Lesson 2 - Random Forest Interpretation
- Lesson 3 - Random Forest Foundations
- Lesson 4 - MNIST SGD
- Lesson 5 - Natural Language Processing (NLP)
Libraries or Frameworks
NumPy
Notebook | Description |
---|---|
NumPy in 10 minutes | Introduction to NumPy for deep learning in 10 minutes |
PyTorch
WIP
TensorFlow
Notebook | Description |
---|---|
Guide to TensorFlow Keras on TPUs MNIST | Guide to TensorFlow + Keras on TPU v2 for free on Google Colab |
Keras
WIP
Pandas
WIP
Matplotlib
WIP
Kaggle Competitions
Notebook | Description |
---|---|
planet_cv | Planet: Understanding the Amazon from Space—use satellite data to track the human footprint in the Amazon rainforest |
Rossmann | Rossmann Store Sales—forecast sales using store, promotion, and competitor data |
fish | The Nature Conservancy Fisheries Monitoring—Can you detect and classify species of fish? |
License
This repository contains a variety of content: some developed by Cedric Chee, and some from third parties. The third-party content is distributed under the license provided by those parties.
I am providing code and resources in this repository to you under an open source license. Because this is my personal repository, the license you receive to my code and resources is from me and not my employer.
The content developed by Cedric Chee is distributed under the following license:
Code
The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.
Text
The text content of the notebooks is released under the CC-BY-NC-ND license. Read more at Creative Commons.