nmarwen / machine_learning_refined

Notes, examples, and Python demos for the textbook "Machine Learning Refined" (Cambridge University Press).


Machine Learning Refined: Notes, Exercises, and Jupyter notebooks

Publisher: Cambridge University Press

First edition: November 2016
Second edition: January 2020


Table of contents

- A little sampler first
- What is in this book?
- Who is this book for?
- What is in the repo?
- Notes
- Installation
- Creators

A little sampler first

(Back to top)

Many machine learning concepts - like the convergence of an algorithm, or the evolution of a model from underfitting all the way to overfitting - are best illustrated and intuited using animations rather than static figures. You'll find a large number of such interactive widgets - which you can also modify yourself - throughout this book and repo. Here are just a few examples:

- Cross-validation (regression)
- Cross-validation (two-class classification)
- Cross-validation (multi-class classification)
- K-means clustering
- Feature normalization
- Normalized gradient descent
- Rotation
- Convexification
- Dogification!
- A nonlinear transformation
- Weighted classification
- The moving average
- Batch normalization
- Logistic regression
- Polynomials vs. NNs vs. Trees (regression)
- Polynomials vs. NNs vs. Trees (classification)
- Changing gradient descent's steplength (1d)
- Changing gradient descent's steplength (2d)
- Convex combination of two functions
- Taylor series approximation
- Feature selection via regularization
- Secant planes
- Function approximation with a neural network
- A regression tree




What is in this book?

(Back to top)

We believe that understanding machine learning is impossible without having a firm grasp of its underlying mathematical machinery. But we also believe that the bulk of learning the subject takes place when learners "get their hands dirty" and code things up for themselves. That's why in this book we discuss both how to derive machine learning models mathematically and how to implement them from scratch (using the numpy, matplotlib, and autograd libraries) - and yes, this includes multi-layer neural networks as well!
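As a small taste of this approach, here is a minimal hypothetical sketch (not code taken from the book) that fits a least squares linear regression by gradient descent, with autograd supplying the gradient of the cost:

```python
import autograd.numpy as np   # thinly wrapped numpy that autograd can differentiate through
from autograd import grad

# hypothetical toy dataset: inputs x and linear outputs y
x = np.linspace(-1, 1, 50)
y = 3.0 * x + 1.0

# least squares cost: average of (w_0 + w_1 * x_p - y_p)^2 over all points
def least_squares(w):
    return np.mean((w[0] + w[1] * x - y) ** 2)

# autograd hands us the gradient function automatically
gradient = grad(least_squares)

# a basic gradient descent loop with a fixed steplength
w = np.array([0.0, 0.0])
alpha = 0.5
for _ in range(200):
    w = w - alpha * gradient(w)

print(w)  # converges to (approximately) the true parameters [1.0, 3.0]
```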


Who is this book for?

(Back to top)

This text aims to bridge the existing gap between practicality and rigor in machine learning education, in a market saturated with books that are either mathematically rigorous but not practical, or vice versa. Conventional textbooks usually place little to no emphasis on coding, leaving readers struggling to put what they have learned into practice. On the other hand, the more hands-on books on the market typically lack rigor, leaving machine learning a 'black box' to the reader.

If you're looking for a practical yet rigorous treatment of machine learning, then this book is for you.


What is in the repo?

(Back to top)

1. Interactive HTML notes

These notes - listed in the Notes section below - served as an early draft for the second edition of the text. You can also find them in the notes directory. Here's an example:


2. Accompanying Jupyter notebooks (used to create the HTML notes)

Feel free to take a peek under the hood, tweak the models, explore new datasets, etc. Here's an example:


3. Coding exercises (1st edition)

In the exercises directory you can find starter wrappers, in Python and MATLAB, for the coding exercises from the first edition of the text. Exercises for the second edition will be added soon.


Notes

(Back to top)

Chapter 2: Zero order / derivative free optimization

2.1 Introduction
2.2 Zero order optimality conditions
2.3 Global optimization
2.4 Local optimization techniques
2.5 Random search methods
2.6 Coordinate search and descent

Chapter 3: First order optimization methods

3.1 Introduction
3.2 The first order optimality condition
3.3 The anatomy of lines and hyperplanes
3.4 The anatomy of first order Taylor series approximations
3.5 Automatic differentiation and autograd
3.6 Gradient descent
3.7 Two problems with the negative gradient direction
3.8 Momentum acceleration
3.9 Normalized gradient descent procedures
3.10 Advanced first order methods
3.11 Mini-batch methods
3.12 Conservative steplength rules

Chapter 4: Second order optimization methods

4.1 Introduction
4.2 The anatomy of quadratic functions
4.3 Curvature and the second order optimality condition
4.4 Newton's method
4.5 Two fundamental problems with Newton's method
4.6 Quasi-Newton methods

Chapter 5: Linear regression

5.1 Introduction
5.2 Least squares regression
5.3 Least absolute deviations
5.4 Regression metrics
5.5 Weighted regression
5.6 Multi-output regression

Chapter 6: Linear two-class classification

6.1 Introduction
6.2 Logistic regression and the cross-entropy cost
6.3 Logistic regression and the softmax cost
6.4 The perceptron
6.5 Support vector machines
6.6 Categorical labels
6.7 Comparing two-class schemes
6.8 Quality metrics
6.9 Weighted two-class classification

Chapter 7: Linear multi-class classification

7.1 Introduction
7.2 One-versus-All classification
7.3 The multi-class perceptron
7.4 Comparing multi-class schemes
7.5 The categorical cross-entropy cost
7.6 Multi-class quality metrics

Chapter 8: Unsupervised learning

8.1 Introduction
8.2 Spanning sets and vector algebra
8.3 Learning proper spanning sets
8.4 The linear Autoencoder
8.5 The classic PCA solution
8.6 Recommender systems
8.7 K-means clustering
8.8 Matrix factorization techniques

Chapter 9: Principles of feature selection and engineering

9.1 Introduction
9.2 Histogram-based features
9.3 Standard normalization and feature scaling
9.4 Imputing missing values
9.5 PCA-sphering
9.6 Feature selection via boosting
9.7 Feature selection via regularization

Chapter 10: Introduction to nonlinear learning

10.1 Introduction
10.2 Nonlinear regression
10.3 Nonlinear multi-output regression
10.4 Nonlinear two-class classification
10.5 Nonlinear multi-class classification
10.6 Nonlinear unsupervised learning

Chapter 11: Principles of feature learning

11.1 Introduction
11.2 Universal approximators
11.3 Universal approximation of real data
11.4 Naive cross-validation
11.5 Efficient cross-validation via boosting
11.6 Efficient cross-validation via regularization
11.7 Testing data
11.8 Which universal approximator works best in practice?
11.9 Bagging cross-validated models
11.10 K-folds cross-validation
11.11 When feature learning fails
11.12 Conclusion

Chapter 12: Kernels

12.1 Introduction
12.2 The variety of kernel-based learners
12.3 The kernel trick
12.4 Kernels as similarity measures
12.5 Scaling kernels

Chapter 13: Fully connected networks

13.1 Introduction
13.2 Fully connected networks
13.3 Optimization issues
13.4 Activation functions
13.5 Backpropagation
13.6 Batch normalization
13.7 Early-stopping

Chapter 14: Tree-based learners

14.1 Introduction
14.2 Varieties of tree-based learners
14.3 Regression trees
14.4 Classification trees
14.5 Gradient boosting
14.6 Random forests
14.7 Cross-validating individual trees


Installation

(Back to top)

To successfully run the Jupyter notebooks contained in this repo, we highly recommend downloading the Anaconda Python 3 distribution. Many of these notebooks also employ the automatic differentiation library autograd, which can be installed by running the following command in your terminal:

  pip install autograd

With minor adjustments, users can also run these notebooks using JAX, the GPU/TPU-accelerated successor to autograd.
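As a quick sanity check that autograd installed correctly, the following illustrative snippet (not part of the notebooks) differentiates a simple function and compares the result against the known derivative:

```python
import autograd.numpy as np
from autograd import grad

# the derivative of sin evaluated at 1.0 should match cos(1.0)
dsin = grad(np.sin)
print(dsin(1.0))    # ~0.5403
print(np.cos(1.0))  # ~0.5403
```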


Creators

(Back to top)

This repository is in active development by Jeremy Watt and Reza Borhani - please do not hesitate to reach out with comments, questions, typos, etc.
