glouppe / dats0001-foundations-of-data-science

Materials for DATS0001 Foundations of Data Science, ULiège

DATS0001 Foundations of Data Science

Materials for DATS0001 Foundations of Data Science, ULiège, Fall 2023.

Agenda

Date Topic
September 18 No class
September 25 Course syllabus
Lecture 1: Introduction
nb01: Build, compute, critique, repeat [notebook]
Reading: Blei, Build, Compute, Critique, Repeat, 2014 [Section 1]
Reading: Box, Science and Statistics, 1976
October 2 Lecture 2: Data
nb02a: Tables [notebook]
nb02b: JAX [notebook]
nb02c: Data wrangling [notebook]
Reading: Harris et al, Array programming with NumPy, 2020
October 9 Lecture 3: Visualization
nb03a: Plots [notebook]
nb03b: Data visualization principles [notebook]
nb03c: High-dimensional data [notebook]
Reading: Rougier et al, Ten Simple Rules for Better Figures, 2014
Reading: Rougier, Scientific Visualization: Python+Matplotlib, 2022
October 16 No class
October 23 Lecture 4: Bayesian modeling
nb04: Latent variable models [notebook, sidenotes (LVMs), sidenotes (Probabilistic PCA)]
Reading: Gelman et al, Bayesian workflow, 2020 [Sections 1 and 2]
Reading: Blei, Build, Compute, Critique, Repeat, 2014 [Sections 2 and 3]
October 30 No class
November 6 Lecture 5: Markov Chain Monte Carlo
nb05: Markov Chain Monte Carlo [notebook] [sidenotes]
Reading: Gelman et al, Bayesian Data Analysis, 3rd, 2021 [Chapter 11]
November 13 Lecture 6: Expectation-Maximization
nb06: Expectation-Maximization [notebook] [sidenotes]
Reading: Dempster et al, Maximum Likelihood from Incomplete Data via EM, 1977
November 20 Lecture 7: Variational inference
nb07: ADVI [notebook] [sidenotes]
Reading: Kucukelbir et al, Automatic Differentiation Variational Inference, 2016
November 27 Lecture 8: Model criticism
nb08a: Model checking [notebook]
nb08b: Model comparison [notebook]
Reading: Gelman et al, Bayesian Data Analysis, 3rd, 2021 [Chapters 6 and 7]
December 4 Lecture 9: Wrap-up case study
nb09: Space Shuttle Challenger disaster [notebook]
Reading: Cam Davidson-Pilon, Bayesian Methods for Hackers, 2015 [Chapter 2]

Homeworks

  • Homework 1: Exploration of solar power data and weather data (due by November 6)
  • Homework 2: Modeling photovoltaic power production (due by December 1)
  • Homework 3: Improving and comparing forecasting models (due by December 15)
  • Exam-at-home: TBD

Homeworks must be submitted on GitHub Classroom. Follow the links sent by email to register for each homework.

