rovivor / DSCI_571_sup-learn-1

DSCI 571: Supervised Learning I

Introduction to supervised machine learning, with a focus on classification. Decision trees, SVM, Naive Bayes, and basic machine learning concepts such as generalization error and overfitting.

Schedule

Course structure: this course will be delivered as a flipped classroom, which means you are expected to watch the lecture videos before class. While you're not expected to understand everything from watching a video once, you are expected to come to class with a basic familiarity with the material; we won't be starting again from scratch in class. During the lecture time itself, we will work on practical examples in Python. Because watching the videos adds 1-2 hours of work per week, we will aim to make the labs shorter than in other courses.

Some context: these videos are from Mike's undergraduate machine learning course, CPSC 340, which contains a lot of the same material. They were filmed in January-April 2018. You can find the accompanying slides, and some supplementary readings, here. By the end of MDS we will cover roughly everything in CPSC 340, although often skipping over the implementation details. On the other hand, we cover machine learning topics that do not appear in CPSC 340, like time series data and natural language data. In MDS we also benefit greatly from the statistical perspective of "stat stream" courses. Finally, we do a little bit on how to implement ML algorithms from scratch in DSCI 572.

Video timings: video links have start times embedded in them; start watching from that point. End times are given in the schedule below when you are not supposed to watch the whole video. I recommend watching the videos at 1.25x speed.

| # | Date | Topic | To watch before class |
|---|------|-------|-----------------------|
| 1 | 2018-11-14 | Intro to supervised learning, decision trees | Decision tree video; "Decision Stump: Rule Search (Attempt 3)" at times 31:40-36:35 is optional. |
| 2 | 2018-11-19 | Fundamentals of learning, cross-validation | Fundamentals of learning video and the part of the KNN video up to 29:00 on cross-validation. |
| 3 | 2018-11-21 | KNN, loess, feature preprocessing | The rest of the KNN video + first part of naive Bayes video up to 16:20. |
| 4 | 2018-11-26 | Naive Bayes, evaluation metrics | The rest of the Naive Bayes video, first part of ensemble methods video up to 14:40. |
| 5 | 2018-11-28 | Continuing with Lecture 4 | Understanding ROC/AUC |
| 6 | 2018-12-03 | Ensemble methods, random forests | The rest of the ensemble methods video, first part of clustering video up to 8:30. |
| 7 | 2018-12-05 | Linear classifiers | Part of the linear classifier prediction video, first part of linear classifier training video up to 7:00. Note: the part in the first video about the Perceptron Algorithm is optional, though it's somewhat relevant to DSCI 572 and also has historical significance. Note: unlike the previous videos, here there is a jump from the previous lecture (ensembles, CPSC 340 lecture 7) to this one (linear classifiers, CPSC 340 lecture 18), which may cause a bit of chaos. You will hear words like "regularization" that you may not understand (coming in DSCI 573); when this happens, don't panic. |
| 8 | 2018-12-10 | Kernels, DSCI 571 review, Blocks 4-6 roadmap, outliers | This lecture is optional; you will not be tested on content from it. |

Reference Material

Books

  • Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig.
  • Artificial Intelligence 2E: Foundations of Computational Agents (2017) by David Poole and Alan Mackworth (of UBC!). Freely available online at https://artint.info/2e/html/ArtInt2e.html.
  • Introduction to Machine Learning with Python: A Guide for Data Scientists by Andreas C. Mueller and Sarah Guido.
  • A Course in Machine Learning by Hal Daumé III (also relevant for DSCI 572, 573, 575, 563)

Online courses

Short posts/articles

Misc

  • Metacademy (sort of like a concept map for machine learning, with suggested resources)
  • Machine Learning 101 (slides by Jason Mayes, engineer at Google)
