# Data Science Notebooks

Data science Python notebooks—a collection of Jupyter notebooks on machine learning, deep learning, statistical inference, data analysis and visualization.

This repo contains Python Jupyter notebooks I have created to experiment with and learn the core libraries for working with data in Python, to work through exercises, assignments, and coursework, and to explore subjects that I find interesting, such as machine learning and deep learning. Familiarity with Python as a language is assumed.

The essential core libraries that I will be focusing on for working with data are NumPy, Pandas, Matplotlib, PyTorch, TensorFlow, Keras, Caffe, scikit-learn, spaCy, NLTK, Gensim, and related packages.

## Table of Contents

- Data Science Notebooks
  - Table of Contents
  - How to Use this Repo
  - About
  - Software
  - Deep Learning
    - Projects
    - DL Assignments, Exercises or Course Works
      - fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2018 (v2): Oct - Dec 2017
      - fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2019 (v3): Oct - Dec 2018
      - fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2017 (v1): Feb - Apr 2017
      - fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2018 (v2): Mar - May 2018
  - Machine Learning
  - Libraries or Frameworks
  - Kaggle Competitions
  - License

## How to Use this Repo

- Run the code using the Jupyter notebooks available in this repository's notebooks directory.
- Launch a live notebook server with these notebooks using Binder.

## About

The notebooks were written and tested with Python 3.6, though other Python 3.x versions should work in nearly all cases.

See index.ipynb for an index of the notebooks available.

## Software

The code in the notebooks was tested with Python 3.6; most (but not all) of it should also work correctly with other Python 3.x versions.

The packages I used to run the code in the notebooks are listed in requirements.txt. Note that some of these exact version numbers may not be available on your platform; you may have to tweak them for your own use. To install the requirements using conda, run the following at the command line:

`$ conda install --file requirements.txt`

To create a stand-alone environment named DSN with Python 3.6 and all the required package versions, run the following:

`$ conda create -n DSN python=3.6 --file requirements.txt`

You can read more about using conda environments in the Managing Environments section of the conda documentation.
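Because some pinned versions may need tweaking on your platform, it can help to see which installed packages differ from the pins. The following is a minimal sketch, not part of this repository. It assumes requirements.txt contains pins of the form `name=version` (conda) or `name==version` (pip), and it uses `importlib.metadata`, which is only in the standard library from Python 3.8 onward, so it will not run inside a Python 3.6 environment. The function names `parse_pins` and `compare_with_installed` are hypothetical helpers introduced for illustration.

```python
import re
from importlib import metadata  # standard library in Python 3.8+ (assumption)

# Matches pins such as "numpy=1.14.3" (conda) or "numpy==1.14.3" (pip).
# A conda build string ("=py36_0") after the version is ignored.
PIN = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*={1,2}\s*([^\s=#]+)")


def parse_pins(lines):
    """Return {package: pinned_version} for every exactly pinned line."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0]  # drop comments
        match = PIN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def compare_with_installed(pins):
    """Map each package to (pinned, installed); installed is None if missing."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (pinned, installed)
    return report
```

To use it, open requirements.txt, pass the file object to `parse_pins`, and pass the result to `compare_with_installed`; any entry whose two values differ is a candidate for tweaking. Note that the distribution name conda uses for a package may not always match the name `importlib.metadata` expects, so a `None` result is a hint to check by hand rather than proof that the package is missing.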

## Deep Learning

### Projects

Notebook | Description |
---|---|
Deep Painterly Harmonization | Implement Deep Painterly Harmonization paper in PyTorch |
Language modelling in Malay language for downstream NLP tasks | Implement Universal Language Model Fine-tuning for Text Classification (ULMFiT) in PyTorch |
Not Hotdog AI Camera mobile app | Asia virtual study group project for fast.ai deep learning part 1, v3 course. Ship a convolutional neural network on Android/iOS with PyTorch and Android Studio/Xcode |

### DL Assignments, Exercises or Course Works

#### fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2018 (v2): Oct - Dec 2017

Notebook | Description |
---|---|
lesson1, lesson1-vgg, lesson1-rxt50, keras_lesson1 | Lesson 1 - Recognizing Cats and Dogs |
lesson2-image_models | Lesson 2 - Improving Your Image Classifier |
lesson3-rossman | Lesson 3 - Understanding Convolutions |
lesson4-imdb | Lesson 4 - Structured Time Series and Language Models |
lesson5-movielens | Lesson 5 - Collaborative Filtering; Inside the Training Loop |
lesson6-rnn, lesson6-sgd | Lesson 6 - Interpreting Embeddings; RNNs from Scratch |
lesson7-cifar10, lesson7-CAM | Lesson 7 - ResNets from Scratch |

#### fast.ai's Deep Learning Part 1: Practical Deep Learning for Coders 2019 (v3): Oct - Dec 2018

Deep Learning Part 1: 2019 Edition

#### fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2017 (v1): Feb - Apr 2017

Deep Learning Part 2: 2017 Edition

#### fast.ai's Deep Learning Part 2: Cutting Edge Deep Learning for Coders 2018 (v2): Mar - May 2018

Deep Learning Part 2: 2018 Edition

## Machine Learning

### ML Assignments, Exercises or Course Works

#### Andrew Ng's "Machine Learning" class on Coursera

#### fast.ai's machine learning course

- Lesson 1 - Random Forest
- Lesson 2 - Random Forest Interpretation
- Lesson 3 - Random Forest Foundations
- Lesson 4 - MNIST SGD
- Lesson 5 - Natural Language Processing (NLP)

## Libraries or Frameworks

### NumPy

Notebook | Description |
---|---|
NumPy in 10 minutes | Introduction to NumPy for deep learning in 10 minutes |

### PyTorch

*WIP*

### TensorFlow

Notebook | Description |
---|---|
Guide to TensorFlow Keras on TPUs MNIST | Guide to TensorFlow + Keras on TPU v2 for free on Google Colab |

### Keras

*WIP*

### Pandas

*WIP*

### Matplotlib

*WIP*

## Kaggle Competitions

Notebook | Description |
---|---|
planet_cv | Planet: Understanding the Amazon from Space - use satellite data to track the human footprint in the Amazon rainforest |
Rossmann | Rossmann Store Sales - forecast sales using store, promotion, and competitor data |
fish | The Nature Conservancy Fisheries Monitoring - Can you detect and classify species of fish? |

## License

This repository contains a variety of content: some developed by Cedric Chee, and some from third parties. The third-party content is distributed under the license provided by those parties.

*I am providing code and resources in this repository to you under an open source license. Because this is my personal repository, the license you receive to my code and resources is from me and not my employer.*

The content developed by Cedric Chee is distributed under the following license:

### Code

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

### Text

The text content of the notebooks is released under the CC-BY-NC-ND license. Read more at Creative Commons.