jsharpna / DavisSML

UC Davis Statistics 208 : Statistical Machine Learning

A Course on the Principles of Statistical Machine Learning with Examples in Python

Machine learning is the study of how to get computers to learn and improve automatically with experience. Experience comes in the form of data, improvement is measured with respect to some performance metric, and learning is done by a learning algorithm. There are always computational constraints, such as the architecture, computation time, bandwidth limitations, and so on. So we can state the goal more precisely: to construct learning algorithms that use data to improve with respect to a performance metric, and do so under computational constraints.

We will focus on the principles of statistical machine learning for the two prediction problems, regression and classification. Conspicuously absent are most Bayesian methodology and advanced topics such as reinforcement learning. This course is not a broad overview of all of machine learning, but rather a tour of its key ideas as told through these prediction tasks. Students often tell me something along the lines of "I thought machine learning was about [insert random methodology here]". Machine learning is a field, like physical chemistry or creative literature. It is not defined by a couple of methods or a single task, and it cannot be taught in a single quarter. With that said, I want this course to lay the foundation for a rich understanding of machine learning.

Instructions: The lectures are mostly Jupyter notebooks. To follow along with the slides, run the following command in the lecture folder.

jupyter nbconvert lecture[# here].ipynb --to slides --post serve
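
For example, assuming the notebook for the first lecture is named lecture1.ipynb (the actual filenames in the repository may differ), the command would be:

jupyter nbconvert lecture1.ipynb --to slides --post serve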

Due to COVID-19, I have recorded all of my lectures and am uploading them to YouTube. They will be linked as they become available and are organized into playlists, each corresponding to a single lecture (1-2 hours of content). I have structured the lectures so that there are exercises you can do on your own as you listen. Most exercises appear at the end of a video and are answered at the beginning of the next, but sometimes I will ask you to pause the video. After I pose a question, take the time to complete the exercise, then continue to hear the answer.

Lecture Notes

Introduction to Machine Learning

Principles: Bias-Variance, training and testing, losses, OLS and KNN
Reading: ESL Chapter 2

Setup (Videos): Data science workflow installations, first project repository
Lecture 1 (Notebook, Videos): Introduction to machine learning
Lecture 2 (Notebook, Videos): Model selection and the bias-variance tradeoff
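
To make this first unit concrete, here is a minimal sketch (using NumPy and scikit-learn, which are assumptions and not prescribed by the course materials) that fits OLS and k-nearest-neighbors regression on simulated data and compares their held-out error:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

# simulate a nonlinear regression problem
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.3 * rng.standard_normal(200)

# hold out half the data to estimate test error
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for name, model in [("OLS", LinearRegression()),
                    ("5-NN", KNeighborsRegressor(n_neighbors=5))]:
    model.fit(X_tr, y_tr)
    print(name, "test MSE:", mean_squared_error(y_te, model.predict(X_te)))

Varying the number of neighbors k traces out the bias-variance tradeoff: small k gives low bias but high variance, and large k the reverse.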

Regression (beyond Ordinary Least Squares)

Principles: Convex relaxation, computational intractability in subset selection
Reading: ESL Chapter 3, Boyd Chapter 1

Lecture 3 (Notebook, Videos): OLS, matrix decompositions, subset selection and ridge regression
Lecture 4 (Notebook, Videos): Convex optimization, first-order methods
Lecture 5 (Notebook, Videos): The Lasso
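
As a rough illustration of the penalized regression ideas in this unit, the following sketch (scikit-learn and the chosen penalty levels are assumptions, not course code) fits ridge and lasso to data generated from a sparse linear model; the L1 penalty drives most lasso coefficients exactly to zero:

import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(1)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]          # only the first three features matter
y = X @ beta + 0.5 * rng.standard_normal(n)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print("ridge nonzero coefficients:", int(np.sum(np.abs(ridge.coef_) > 1e-3)))
print("lasso nonzero coefficients:", int(np.sum(np.abs(lasso.coef_) > 1e-3)))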

Classification

Principles: Surrogate losses, generative and discriminative methods
Reading: ESL Chapter 4

Lecture 6 (Notebook, Videos): Generative methods, naive Bayes, discriminant analysis, ROC and PR curves
Lecture 7 (Notebook, Videos): Logistic regression, support vector machines, surrogate losses
Lecture 8 (Notebook, Videos): Online learning, stochastic gradient descent, the perceptron
Lecture 9 (Notebooks, Videos): Multiclass classification, TensorFlow
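
The classification lectures compare linear classifiers that differ mainly in their surrogate loss. A minimal sketch in that spirit (toy data and hyperparameters via scikit-learn are assumptions, not the course's own code):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# three linear classifiers, three surrogate losses
models = {
    "logistic regression (log loss)": LogisticRegression(max_iter=1000),
    "linear SVM (hinge loss)": LinearSVC(),
    "perceptron (via SGD)": SGDClassifier(loss="perceptron"),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    print(name, "test accuracy:", clf.score(X_te, y_te))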

Unsupervised Learning and HMMs

Principles: HMMs, Clustering, Dimension Reduction
Reading: ESL Chapter 14

Lecture 10 (Notebooks, Videos): Clustering, dimension reduction
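
A minimal unsupervised-learning sketch (scikit-learn and the iris data are assumptions, not course code) that touches both topics of Lecture 10, projecting with PCA and then clustering with k-means:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

X, _ = load_iris(return_X_y=True)
X2 = PCA(n_components=2).fit_transform(X)   # reduce to two dimensions
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X2)
print("cluster sizes:", [int((labels == k).sum()) for k in range(3)])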

Non-linear methods

Principles: basis expansion, kernel trick, bagging, boosting, neural nets
Reading: ESL Chapter 5, 7, 8

Lecture 11 (Notebook, Videos): Basis expansion and the kernel trick
Lecture 12 (Notebook, Videos): Neural networks
Lecture 13 (Notebook, Videos): Bagging and boosting
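
Two of the non-linear methods in this unit in one short sketch (scikit-learn, with toy data and arbitrary settings; not the course's own code): an RBF-kernel SVM and a boosted tree ensemble on a dataset that no linear classifier separates well:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("RBF-kernel SVM", SVC(kernel="rbf", gamma=1.0)),
                  ("gradient boosting", GradientBoostingClassifier())]:
    clf.fit(X_tr, y_tr)
    print(name, "test accuracy:", clf.score(X_te, y_te))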

Deep Learning

Lecture 14 (Notebook, Videos): Convolutional nets
Lecture 15 (Notebook, Videos): Recurrent neural nets
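
For orientation, here is a tiny convolutional network written with the Keras API in TensorFlow (an assumption based on the earlier TensorFlow lecture; the architecture is purely illustrative, not the one used in Lecture 14):

import tensorflow as tf

# a small convnet for 28x28 grayscale images with 10 classes
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()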

About

License: MIT License


Languages

Jupyter Notebook 83.8%, HTML 15.7%, Prolog 0.5%, Python 0.0%