fjxmlzn / MLinPractice

Repository for ML in Practice Course at CMU (10-718)

10718: Machine Learning in Practice

Fall 2021: Tues & Thurs, 4:40-6:00pm (MM A14), Lab Section: Wednesday 4:40-6:00pm (MM A14)

Important

  • All content, including the schedule and tech setup instructions, will be on GitHub in this repo
  • All assignments will be posted on and submitted through Canvas
  • Class communication and announcements will be primarily through Slack

Class Description

This is a project-based course designed to provide students training and experience in solving real-world problems using machine learning, exploring the interface between research and practice, with a particular focus on topics in fairness and explainability.

The goal of this course is to give students exposure to, and experience with, the nuances of applying machine learning to real-world problems, where common assumptions (such as i.i.d. data and stationarity) break down, and with the growing need for (and limitations of) approaches to improve the fairness and explainability of these applications. Through project assignments, lectures, discussions, and readings, students will learn about and gain experience building machine learning systems for real-world problems, and will apply and evaluate the utility of proposed methods for enhancing the interpretability and fairness of machine learning models. Over the course, students will develop skills in: problem scoping and formulation; getting, storing, linking, and working with messy data; making ML pipeline design choices appropriate for the problem at hand; exploring approaches for model selection and model interpretability; understanding and mitigating algorithmic bias and disparities; and evaluating the impact of deployed models.

DRAFT SYLLABUS

People

Instructors

Rayid Ghani
GHC 8023
Office Hours: Tue 12-1, Wed 2-3

Kit Rodolfa
GHC 8018
Office Hours: Wed 11-12, Thu 1:30-2:30

Infrastructure Assistants

Infrastructure Assistants are responsible for managing the compute infrastructure and helping with logging in, scaling infrastructure, and connection issues.

Riyaz Panjwani
Office Hours: Mon 12-1, Fri 12-1
by GHC 8th Fl. Printer

Abhishek Parikh
Office Hours: Mon 11-12, Fri 2-3
by GHC 8th Fl. Printer

Grading

Note that this course is being offered pass/fail. Each time you ask how an action you take will affect your grade, your grade will be lowered.

Weekly project update assignments (10%)

Midterm take-home exam (20%)

Write-up on interpretability findings (15%)

Write-up on fairness findings (15%)

Group presentation (5%)

Future research or project proposal (15%)

Quizzes on readings and concepts (5%)

Class attendance and participation (10%)

Submitting weekly check-in and feedback forms (5%)
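As a quick check on the arithmetic, the component weights listed above sum to 100%. The sketch below shows one way a weighted total could be computed from them. It assumes each component is scored on a 0-100 scale; the dictionary keys and the `weighted_total` helper are hypothetical names for illustration, not part of the course materials, and the course itself is pass/fail.

```python
# Component weights (percent of the final grade), copied from the list above.
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "weekly_project_updates": 10,
    "midterm_take_home": 20,
    "interpretability_writeup": 15,
    "fairness_writeup": 15,
    "group_presentation": 5,
    "research_or_project_proposal": 15,
    "quizzes": 5,
    "attendance_participation": 10,
    "checkin_feedback_forms": 5,
}


def weighted_total(scores):
    """Combine per-component scores (assumed 0-100 each) into a 0-100 total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# The weights should account for the whole grade.
assert sum(WEIGHTS.values()) == 100

if __name__ == "__main__":
    # Hypothetical scores: the same mark on every component.
    example_scores = {name: 80 for name in WEIGHTS}
    print(weighted_total(example_scores))
```

Raising an error on a missing component, rather than silently treating it as zero, is a design choice of this sketch.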

Schedule

See the syllabus for much more detail, including information about group projects, grading, and helpful optional readings.

Week | Dates | Topic | Required Readings | Assignments
1 | Tu: Aug 31 | Class Intro and Overview | |
1 | Th: Sep 2 | ML Project Scoping | ML Project Scoping Guide | Project Team Selection
2 | Tu: Sep 7 | Getting, Storing, and Linking Data | Optional readings on GitHub |
2 | Th: Sep 9 | Analytical Formulation / Baselines | List on GitHub |
3 | Tu: Sep 14 | Model Selection Methodology | Readings on GitHub | Project Assignment 1: Formulation and Baseline (due Monday)
3 | Th: Sep 16 | Performance Metrics | Readings on GitHub |
4 | Tu: Sep 21 | Feature Engineering and Imputation | Readings on GitHub | Project Assignment 2: Validation set up; initial pipeline with train and validation set(s) and baseline implemented (due Monday)
4 | Th: Sep 23 | Hands-on Session for ML Pipeline review | |
5 | Tu: Sep 28 | Models/hyperparameters in practice | | Project Assignment 3: list of features and some subset implemented (due Monday)
5 | Th: Sep 30 | Temporal Model Selection | Readings on GitHub |
6 | Tu: Oct 5 | Module 1 Review: Applied ML - End to End Pipelines | | Project Assignment 4: modeling results (due Monday)
6 | Th: Oct 7 | Mid-term - no class | |
7 | Tu: Oct 12 | ML Ethics Issues Overview | Readings on GitHub |
7 | Th: Oct 14 | No Class - Mid-semester break | |
8 | Tu: Oct 19 | Review of modeling results | | Updated model results assignment (+ model selection) (due Monday)
8 | Th: Oct 21 | Working session: model debugging and updates | |
9 | Tu: Oct 26 | Interpretability: Intro and Overview, taxonomy; Understanding the Models | Readings on GitHub | Revisions to update 5 results (due Monday)
9 | Th: Oct 28 | Team Presentations and Discussion on Interpretability Methods: Inherently Interpretable (GA2Ms, RiskSLIM, etc.) | Readings on GitHub |
10 | Tu: Nov 2 | Team Presentations and Discussion on Interpretability Methods: Post-Hoc Local/Feature-based (LIME, SHAP, MAPLE) | Readings on GitHub |
10 | Th: Nov 4 | Team Presentations and Discussion on Interpretability Methods: Other methods (counterfactual, example-based, etc.) | Readings on GitHub | Interpretability Writeup (due Friday)
11 | Tu: Nov 9 | Fairness in ML Overview | Readings on GitHub |
11 | Th: Nov 11 | Team Presentations and Discussion on Fairness Methods: Pre-processing (removing sensitive attribute, sampling) | Readings on GitHub |
12 | Tu: Nov 16 | Team Presentations and Discussion on Fairness Methods: In-processing (Zafar, Celis, fairlearn, etc.) | Readings on GitHub |
12 | Th: Nov 18 | Team Presentations and Discussion on Post-Processing: Hardt, LA, etc. | Readings on GitHub |
13 | Tu: Nov 23 | Module 3 Review: ML Fairness - Methods, Empirical Results, Gaps | |
13 | Th: Thanksgiving | Thanksgiving holiday | |
14 | Tu: Nov 30 | Field Trials and Causality | Readings on GitHub | Bias Writeup Due
14 | Th: Dec 2 | Wrap-Up: Key points to take away from the semester | |
15 | Tu: Dec 7 | No Class - Finals Week | |
15 | Th: Dec 9 | No Class - Finals Week | | Final Writeup Due

Module II & III Presentation/Discussant Assignments

MODULE 2 –– INTERPRETABILITY

date | method/approach | presenting group | discussant group
Thu, Oct 28 | GAMs | effective_analytics | team_16
Thu, Oct 28 | RiskSLIM | team_15 | team_6
Tue, Nov 2 | LIME | team_14 | lucky_13
Tue, Nov 2 | SHAP | team_12 | quack
Tue, Nov 2 | MAPLE | wuhu | team_4
Thu, Nov 4 | DiCE | team_5 | taaab
Thu, Nov 4 | ProtoDash | k_means_girls | hmm

MODULE 3 –– FAIRNESS

date | method/approach | presenting group | discussant group
Tue, Nov 16 | Zafar In-Processing | team_rocket | team_14
Tue, Nov 16 | Celis In-Processing | lucky_13 | team_12
Tue, Nov 16 | FairLearn In-Processing | team_6 | team_15
Thu, Nov 18 | Model Selection | hmm | k_means_girls
Thu, Nov 18 | Decoupled Classifiers | taaab | team_5
Thu, Nov 18 | Score Adjustments | team_4 | team_rocket
Tue, Nov 23 | Removing Sensitive Features | team_16 | effective_analytics
Tue, Nov 23 | Resampling - Under and Over | quack | wuhu

License: MIT License

