jshinm / Hands-On-Machine-Learning-with-Scikit-Learn-Keras-and-TensorFlow

This project was completed in partial fulfillment of the requirement for EN.601.509(50) advised by Dr. Joshua Vogelstein

Introduction

The purpose of this assignment was to demonstrate coding competency and a comprehensive understanding of the scikit-learn and TensorFlow APIs in various bioinformatics applications. The coding exercises follow Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd Edition) and are presented in Jupyter notebooks. This repository also contains my own study notes, equations from the book, and compilations of helpful coding examples for NumPy, pandas, and Matplotlib.

Derived Projects

The following short projects are derived from use of the scikit-learn API (latest first).

  1. Alzheimer's disease genetic risk factor estimation from genetic variant fingerprints
  2. Classification of genetic mutations from word-based clinical evidence
  3. HIV status prediction from quantitative antibody profiles consisting of 11 antigen panels

How I studied the book

I first went through the book without going deeply into the mathematical background or the details of the code. As I read, I kept my own study notes in a Jupyter notebook to record what I thought was important, and I still update them as I revisit the book. After that, I carefully perused each chapter's code and took it apart to understand what each command does. On the side, I bolstered my understanding by working through the mathematical derivations behind the statistical operations, with the help of Intro to Probability by Anderson and Valko. Lastly, the projects I listed above apply what I learned from this book.

Edit: I am writing sample code to demonstrate the complete list of scikit-learn APIs used in the book. This part is still a work in progress. (5/22/2020)

My thoughts

This is definitely a great reference for anyone who wants to introduce ML into their workflow. In particular, it has been a fabulous introduction to TensorFlow 2, which gives users more freedom than TensorFlow 1 did. Whether you are training a simple regressor or a large MLP on a huge dataset, this book guides you through the mathematical complexity so that you can build your own ML models with little friction. It has become my go-to guide for doing ML with scikit-learn and TF, and it is simply my favorite book on ML. My other favorite book in the field of AI is Deep Learning (Adaptive Computation and Machine Learning series) by Goodfellow, Bengio, and Courville.

Table of Contents

I. The Fundamentals of Machine Learning

  1. The Machine Learning Landscape
  2. End-to-End Machine Learning Project
  3. Classification
  4. Training Models
  5. Support Vector Machines
  6. Decision Trees
  7. Ensemble Learning and Random Forests
  8. Dimensionality Reduction
  9. Unsupervised Learning Techniques

II. Neural Networks and Deep Learning

  1. Introduction to Artificial Neural Networks with Keras
  2. Training Deep Neural Networks
  3. Custom Models and Training with TensorFlow
  4. Loading and Preprocessing Data with TensorFlow
  5. Deep Computer Vision Using Convolutional Neural Networks
  6. Processing Sequences Using RNNs and CNNs
  7. Natural Language Processing with RNNs and Attention
  8. Representation Learning and Generative Learning Using Autoencoders and GANs
  9. Reinforcement Learning
  10. Training and Deploying TensorFlow Models at Scale


Languages

Jupyter Notebook: 100.0%