10718: Machine Learning in Practice

Previous Versions:

Spring 2020
Fall 2020

Fall 2021: Tues & Thurs, 4:40-6:00pm (MM A14), Lab Section: Wednesday 4:40-6:00pm (MM A14)

Important

All content will be on GitHub in this repo, including the schedule and tech setup instructions.
All assignments will be posted on and submitted through Canvas.
Class communication and announcements will be primarily through Slack.

Class Description

This is a project-based course designed to provide students training and experience in solving real-world problems using machine learning, exploring the interface between research and practice, with a particular focus on topics in fairness and explainability.

The goal of this course is to give students exposure to, and experience with, the nuances of applying machine learning to real-world problems, where common assumptions (like iid and stationarity) break down, and with the growing need for (and limitations of) approaches to improve the fairness and explainability of these applications. Through project assignments, lectures, discussions, and readings, students will learn about and experience building machine learning systems for real-world problems, as well as applying and evaluating the utility of proposed methods for enhancing the interpretability and fairness of machine learning models. Through the course, students will develop skills in problem scoping and formulation; getting, storing, linking, and working with messy data; making ML pipeline design choices appropriate for the problem at hand; exploring approaches for model selection and model interpretability; understanding and mitigating algorithmic bias and disparities; and evaluating the impact of deployed models.

DRAFT SYLLABUS

People

Instructors

Rayid Ghani	Kit Rodolfa
GHC 8023 Office Hours: Tue 12-1, Wed 2-3	GHC 8018 Office Hours: Wed 11-12, Thu 1:30-2:30

Infrastructure Assistants

Infrastructure Assistants are responsible for managing the compute infrastructure and helping with logging in, scaling infrastructure, and connection issues.

Riyaz Panjwani	Abhishek Parikh
Office Hours: Mon 12-1, Fri 12-1 by GHC 8th Fl. Printer	Office Hours: Mon 11-12, Fri 2-3 by GHC 8th Fl. Printer

Grading

Note that this course is being offered pass/fail. Each time you ask how an action you take will affect your grade, your grade will be lowered.

Weekly project update assignments (10%)

Midterm take-home exam (20%)

Write-up on interpretability findings (15%)

Write-up on fairness findings (15%)

Group presentation (5%)

Future research or project proposal (15%)

Quizzes on readings and concepts (5%)

Class attendance and participation (10%)

Submitting weekly check-in and feedback forms (5%)

Schedule

See the syllabus for much more detail, including information about group projects, grading, and helpful optional readings.

Week	Dates	Topic	Required Readings	Assignments
1	Tu: Aug 31	Class Intro and Overview
1	Th: Sep 2	ML Project Scoping	ML Project Scoping Guide	Project Team Selection
2	Tu: Sep 7	Getting, Storing, and Linking Data	Optional readings on github
2	Th: Sep 9	Analytical Formulation / Baselines	List on github
3	Tu: Sep 14	Model Selection Methodology	Readings on github	Project Assignment 1: Formulation and Baseline (due Monday)
3	Th: Sep 16	Performance Metrics	Readings on github
4	Tu: Sep 21	Feature Engineering and Imputation	Readings on github	Project Assignment 2: Validation setup, with initial pipeline, train and validation set(s), and baseline implemented (due Monday)
4	Th: Sep 23	Hands-on Session for ML Pipeline review
5	Tu: Sep 28	Models/hyperparameters in practice		Project Assignment 3: list of features and some subset implemented (due Monday)
5	Th: Sep 30	Temporal Model Selection	Readings on github
6	Tu: Oct 5	Module 1 Review: Applied ML - End to End Pipelines		Project Assignment 4: modeling results (due Monday)
6	Th: Oct 7	Mid-term - no class
7	Tu: Oct 12	ML Ethics Issues Overview	Readings on github
7	Th: Oct 14	No Class - Mid-semester break
8	Tu: Oct 19	Review of modeling results		Updated model results assignment, including model selection (due Monday)
8	Th: Oct 21	Working session: model debugging and updates
9	Tu: Oct 26	Interpretability: Intro, Overview, and Taxonomy; Understanding the Models	Readings on github	Due Monday: Revisions to update 5 results
9	Th: Oct 28	Team Presentations and Discussion on Interpretability Methods: Inherently Interpretable (GA2Ms, RiskSLIM, etc.)	Readings on github
10	Tu: Nov 2	Team Presentations and Discussion on Interpretability Methods: Post-Hoc Local/Feature-based (LIME, SHAP, MAPLE)	Readings on github
10	Th: Nov 4	Team Presentations and Discussion on Interpretability Methods: Other methods (counterfactual, example-based, etc.)	Readings on github	Interpretability Writeup Due on Friday
11	Tu: Nov 9	Fairness in ML Overview	Readings on github
11	Th: Nov 11	Team Presentations and Discussion on Fairness Methods: Pre-processing (removing sensitive attribute, sampling)	Readings on github
12	Tu: Nov 16	Team Presentations and Discussion on Fairness Methods: In-processing (Zafar, Celis, fairlearn, etc.)	Readings on github
12	Th: Nov 18	Team Presentations and Discussion on Fairness Methods: Post-processing (Hardt, LA, etc.)	Readings on github
13	Tu: Nov 23	Module 3 Review: ML Fairness - Methods, Empirical Results, Gaps
13	Th: Nov 25	No Class - Thanksgiving holiday
14	Tu: Nov 30	Field Trials and Causality	Readings on github	Bias Writeup Due
14	Th: Dec 2	Wrap-Up: Key points to take away from the semester
15	Tu: Dec 7	No Class - Finals Week
15	Th: Dec 9	No Class - Finals Week		Final Writeup Due

Module II & III Presentation/Discussant Assignments

MODULE 2 –– INTERPRETABILITY

date	method/approach	presenting group	discussant group
Thu, Oct 28	GAMs	effective_analytics	team_16
Thu, Oct 28	RiskSLIM	team_15	team_6
Tue, Nov 2	LIME	team_14	lucky_13
Tue, Nov 2	SHAP	team_12	quack
Tue, Nov 2	MAPLE	wuhu	team_4
Thu, Nov 4	DiCE	team_5	taaab
Thu, Nov 4	ProtoDash	k_means_girls	hmm

MODULE 3 –– FAIRNESS

date	method/approach	presenting group	discussant group
Tue, Nov 16	Zafar In-Processing	team_rocket	team_14
Tue, Nov 16	Celis In-Processing	lucky_13	team_12
Tue, Nov 16	FairLearn In-Processing	team_6	team_15
Thu, Nov 18	Model Selection	hmm	k_means_girls
Thu, Nov 18	Decoupled Classifiers	taaab	team_5
Thu, Nov 18	Score Adjustments	team_4	team_rocket
Tue, Nov 23	Removing Sensitive Features	team_16	effective_analytics
Tue, Nov 23	Resampling - Under and Over	quack	wuhu
