jlaneve / Applied-Bayesian-Modeling-Summer-Reading-Group

This repo is the landing page for a student-run reading group on introductory Bayesian modeling.


Applied-Bayesian-Modeling-Summer-Reading-Group

FAQ

What is this?

This is the landing page for an introductory-level, student-run reading group on Bayesian modeling, which will be held during June and July of 2021. The purpose of the group is to bring together students interested in learning about Bayesian methods who might not encounter them otherwise, and to have a bit of fun while doing so. The group will be application-driven rather than theory-focused, meaning that we will be more concerned with learning how to apply Bayesian methods and software tools to real-world problems than with working through all of the intricate mathematical details. While this is a "reading" group, we will be running quite a bit of code, since this material is best learned by running the examples yourself.

Our reading material will consist of a blend of books, Jupyter Notebooks, blog posts, and software documentation pages. To get an idea of some of the material we will cover, check out Richard McElreath's Statistical Rethinking and Cameron Davidson-Pilon's Bayesian Methods for Hackers. In terms of software, we will be using the popular Python library PyMC for our computational examples but might also discuss some of the other tools that are out there.

What are some examples of what I'll learn to do?

An excellent set of examples of Bayesian models can be found in the examples section of the PyMC documentation (PyMC is what we'll be using to implement models). At a very high level, suppose you have some observed data and a statistical model for the process you think generates that data. Your model will depend on various parameters (e.g., if the data is linear, you would have a parameter for the slope, the intercept, and possibly the variance of the observational noise). Bayesian inference answers the question: given my model and my prior beliefs about the parameter values, what are the most likely values of my model parameters conditional on my observed data?
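For instance, a Bayesian linear regression like the one just described might look roughly like this in PyMC. This is only an illustrative sketch (the simulated data, the priors, and the variable names are our own choices here, not taken from any of the readings):

```python
import numpy as np
import pymc3 as pm  # in newer releases the package is imported as `import pymc as pm`

# Simulate some hypothetical linear data: y = 2x + 1 plus noise
rng = np.random.default_rng(42)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.3, size=50)

with pm.Model() as linear_model:
    # Priors encode our beliefs about the parameters before seeing the data
    slope = pm.Normal("slope", mu=0, sigma=10)
    intercept = pm.Normal("intercept", mu=0, sigma=10)
    noise = pm.HalfNormal("noise", sigma=1)

    # Likelihood: how we think the observed data is generated given the parameters
    pm.Normal("obs", mu=slope * x + intercept, sigma=noise, observed=y)

    # Draw samples from the posterior distribution of slope, intercept, and noise
    trace = pm.sample(1000, tune=1000)
```

The samples in `trace` describe which parameter values are most plausible given the priors and the observed data.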

Here's a quick example. Suppose you observe a series of points along a sample path of a geometric Brownian motion (don't worry if you don't know what that is!), such as shown below.

The geometric Brownian motion model depends on two parameters, namely mu and sigma. So, given our data observations above, what can we say about the values of mu and sigma? After we place some prior distribution on these parameters, we can use Bayesian inference to get a sense of what the most likely values for mu and sigma are.

The histograms above show our posterior distributions over these two parameters, given our model, prior, and data. An interesting thing to note about Bayesian models is that they are generative: once fit, it is straightforward to simulate "fake" data that mimics our observations, data that could very well have been what we actually observed had randomness played out differently.
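Here is a rough sketch of how this whole workflow might look in PyMC for the geometric Brownian motion example. Everything here (the simulated data, the priors, and the variable names) is an illustrative assumption rather than the exact code we'll work through in the group:

```python
import numpy as np
import pymc3 as pm  # in newer releases the package is imported as `import pymc as pm`

# Simulate a hypothetical GBM sample path to stand in for the observed data
rng = np.random.default_rng(0)
dt, true_mu, true_sigma = 1 / 252, 0.1, 0.4
log_returns = rng.normal((true_mu - 0.5 * true_sigma**2) * dt,
                         true_sigma * np.sqrt(dt), size=500)

with pm.Model() as gbm_model:
    # Priors over the two GBM parameters
    mu = pm.Normal("mu", mu=0, sigma=1)
    sigma = pm.HalfNormal("sigma", sigma=1)

    # Under a GBM, log-returns over a time step dt are i.i.d. normal
    pm.Normal("obs",
              mu=(mu - 0.5 * sigma**2) * dt,
              sigma=sigma * np.sqrt(dt),
              observed=log_returns)

    # Posterior samples for mu and sigma
    trace = pm.sample(1000, tune=1000)

    # Because the model is generative, we can also simulate new "fake" datasets
    # that mimic our observations
    ppc = pm.sample_posterior_predictive(trace)
```

Plotting the `mu` and `sigma` samples in `trace` gives posterior histograms like the ones described above, and `ppc` holds the simulated "fake" datasets.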

Still not sure if you're interested? Check out some recordings of the talks given at PyMCon 2020 or this podcast on Bayesian statistics for some more examples.

Why is this?

Many students don't encounter Bayesian methods in their coursework, and even if they do, they are often introduced only to the basic theory rather than to how to go about building models themselves. Bayesian modeling can be challenging to get started with on your own, and this group is meant to equip beginners with what they need to know to pursue whatever projects or applications interest them.

Who is this?

This group will be led by Jonathan Lindbloom (myself) and Julian LaNeve; participants will largely be students from our own SMU academic bubbles, but we are open to others who are interested. We are shooting for a group of about 10 people.

When/where is this?

We will meet (virtually) one evening per week (day TBD) for approximately 1.5 hours to discuss the reading for each week and some computational examples. We will begin each meeting with a brief synopsis of the material to make sure everyone gets the gist of what was covered, and then open the floor for discussion. Participants are encouraged to spend time outside of the meetings and readings playing around with code themselves, and technical support / office hours will be available to help.

Target Audience?

To get the most out of this group, participants should have some prior programming experience (not necessarily in Python) and should have taken at least an introductory probability/statistics class. Aside from that, this group is really aimed at complete beginners to Bayesian statistics, so not much other background is necessary.

Projects?

Participants will be highly encouraged to plan and conduct their own original project alongside the reading group. Guidelines for this are open-ended, and participants can make their projects as involved as they would like - you'll get out of the project what you put into it. Jonathan will be available to meet and help with planning/implementing your project if desired. Depending on how the projects turn out, we may try to assemble them all into a notebook or a report.

Outline

This is a working outline of what we'll cover each week (currently very incomplete). SR stands for the Statistical Rethinking book and BMH for the Bayesian Methods for Hackers book.

| Week | Topic | Readings | Examples |
| --- | --- | --- | --- |
| #1 | Introduction. Bayesian vs. Frequentist, weighted coin-flipping, Bayesian linear regression, hypothesis testing. | SR Ch. 1; BMH Ch. 1 | BEST |
| #2 | Ingredients for a Model. Introduction to PyMC, probabilistic programming, priors. | SR Ch. 2 | TBA |
| #3 | Looking Under-the-Hood. Monte Carlo methods, MCMC. | TBA | MCMC Gallery |
| #4 | Multivariate Models. Multivariate Gaussians, inverse Wishart distribution. | TBA | Multivariate Normal Models |
| #5 | Hierarchical Models. | TBA | Radon Regression Example; Hierarchical Binomial (Rat Tumors); Industry Weighted Average Cost of Capital (WACC) |
| #6 | Time-Series Models. | TBA | Stochastic Volatility |
| #7 | Gaussian Process Models. | TBA | CO2 at Mauna Loa Part 1, Part 2; Latent Variable Implementation; Marginal Likelihood Implementation |
| #8 | Model Criticism. | TBA | TBA |

About


License: MIT License