iamzehan / think_bayes

A personal endeavor towards Bayesian Statistics

Bayesian Statistics in Python



⚠ If you are new to this kind of statistics, it is advised to also use pen and paper to keep track of the equations, or some of them might make your head hurt.

This repository contains the study materials, personal remarks, and experimental aspects of the book 'Think Bayes: Bayesian Statistics in Python (2nd Edition)' by Allen B. Downey.


Chapter 1: Probability

Covers the basics of probability: conjunctions, conditionals, and the laws of probability. The reader is taken through the chapter with an example titled 'Linda the Banker', based on the famous experiment by Tversky and Kahneman. The chapter takes a single dataset and examines different aspects of it under different probabilistic rules.

Keywords: conjunction, conditional, commutative, probability.
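The conjunction and conditional rules from the chapter can be sketched on a toy dataset; the booleans below are hypothetical stand-ins for the book's survey data, not its actual values:

```python
# Toy stand-in for the chapter's dataset: one row per respondent.
banker = [True, True, False, True, False]
feminist = [True, False, True, True, True]

def prob(flags):
    """Fraction of True values: the empirical probability."""
    return sum(flags) / len(flags)

def conjunction(a, b):
    """P(A and B): both conditions hold for the same row."""
    return prob([x and y for x, y in zip(a, b)])

def conditional(a, given):
    """P(A | B): restrict to the rows where the condition holds."""
    return prob([x for x, g in zip(a, given) if g])

print(prob(banker), conjunction(banker, feminist))  # 0.6 0.4
```

The conjunction can never exceed either of its marginals, which is exactly the intuition the Linda experiment shows people getting wrong.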


Chapter 2: Bayes' Theorem

This chapter continues from Bayes' Theorem, which was described briefly in the former. Three main problems are used as examples to further explain Bayes' Theorem:

  1. The Cookie Problem: using Bayes' original theorem.
  2. The Dice Problem: using Bayes tables and diachronic Bayes.
  3. The Monty Hall Problem: using Bayes tables and diachronic Bayes.

Keywords: prior, likelihood, posterior, total probability, normalization.
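The Bayes-table mechanics from the Cookie Problem (Bowl 1 holds 30 vanilla and 10 chocolate cookies, Bowl 2 holds 20 of each) can be sketched with plain dicts; the book builds the same table with pandas:

```python
# Prior: each bowl is equally likely to be the one we drew from.
prior = {'Bowl 1': 0.5, 'Bowl 2': 0.5}
# Likelihood of drawing a vanilla cookie from each bowl.
likelihood = {'Bowl 1': 30 / 40, 'Bowl 2': 20 / 40}

# Multiply prior by likelihood, then normalize by the total probability.
unnorm = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnorm.values())
posterior = {h: p / total for h, p in unnorm.items()}

print(posterior['Bowl 1'])  # 0.6
```

The normalizing constant `total` is the total probability of the data, the same quantity the chapter's Bayes tables compute in their last column.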


Chapter 3: Distributions

A Python library called empiricaldist, and its class Pmf (Probability Mass Function), are used extensively in this chapter. An empirical distribution is based on observed data, as opposed to the theoretical distributions devised in the examples from the earlier chapters. Several problems from Chapter 2, e.g. the Cookie Problem and the Dice Problem, are revisited using this representation, which goes on to show how theoretical inference differs from, yet is intricately connected to, the ever-changing data. A variant of the Cookie Problem named '101 Bowls' initializes prior hypotheses across 101 bowls and then updates the posterior probability as cookies are drawn, based on the outcome(s).

Keywords: Empirical Distributions, Normal Distribution, Uniform Distribution, Probability Mass Function.
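The 101 Bowls update can be sketched without the empiricaldist dependency; a plain list of probabilities shows the same Pmf-style mechanics, assuming (as in the book) that bowl i contains i% vanilla cookies:

```python
hypos = list(range(101))            # bowl i holds i% vanilla cookies
prior = [1 / 101] * 101             # uniform prior over the bowls

def update(probs, likelihoods):
    """Multiply by the likelihoods and renormalize (a Pmf-style update)."""
    unnorm = [p * l for p, l in zip(probs, likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

vanilla = [i / 100 for i in hypos]  # likelihood of a vanilla draw per bowl
posterior = update(prior, vanilla)

# After one vanilla cookie, bowl 100 is the most probable hypothesis.
print(max(range(101), key=lambda i: posterior[i]))  # 100
```

With empiricaldist the same update is `pmf *= likelihood; pmf.normalize()`; the sketch above just spells out what that shorthand does.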


Chapter 4: Estimating Proportions

The reader is posed with a problem titled 'The Euro Problem', which is tackled using:

  • Binomial Distribution,
  • Bayesian Estimation,
  • Triangle Prior and
  • the Binomial Likelihood Function.

The binom class from the scipy.stats library is used for the binomial distribution. This chapter doesn't entirely answer the Euro Problem; rather, it introduces the binomial distribution and two types of priors, and notes the similarities between their outcomes despite the differences.

Keywords: Binomial Distribution, Triangle Prior, Proportions.
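A grid-based Euro Problem update can be sketched as follows, using the book's data of 140 heads in 250 spins; the book computes the likelihood with scipy.stats.binom, while math.comb keeps this dependency-free:

```python
from math import comb

hypos = [x / 100 for x in range(101)]  # candidate values of P(heads)
prior = [1.0] * len(hypos)             # uniform prior (unnormalized)

k, n = 140, 250                        # heads observed in the book's data
# Binomial likelihood of k heads in n spins for each hypothesis.
likelihood = [comb(n, k) * p**k * (1 - p)**(n - k) for p in hypos]

unnorm = [pr * lk for pr, lk in zip(prior, likelihood)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

# The posterior mode sits at the observed proportion 140/250 = 0.56.
print(hypos[max(range(len(hypos)), key=lambda i: posterior[i])])  # 0.56
```

Swapping the uniform prior for the chapter's triangle prior changes the posterior only slightly, which is the similarity the chapter points out.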


Chapter 5: Estimating Counts

In this chapter the reader is posed with a problem titled 'The Train Problem'. This chapter answers the following questions:

  • What do we do when we have limited information but infinite possibilities?
  • How does one choose prior probabilities based on a limited amount of data?
  • How does one estimate the most probable count when the chances of finding out the real count are scant?

All of these questions are answered in this chapter through the problem stated above. The German Tank Problem is also referred to alongside it.

Keywords: MMSE (Minimum Mean Squared Error), Power Law Prior, Informative Prior, Uninformative Prior.
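The Train Problem update can be sketched on the book's setup: a uniform prior over fleet sizes 1 to 1000, after seeing locomotive number 60. If the company has N locomotives, seeing number 60 has probability 1/N when N ≥ 60 and is impossible otherwise:

```python
hypos = range(1, 1001)
# Likelihood of observing train 60 under each fleet-size hypothesis.
likelihood = [1 / n if n >= 60 else 0 for n in hypos]

# The uniform prior cancels out, so normalizing the likelihood
# directly gives the posterior.
total = sum(likelihood)
posterior = [l / total for l in likelihood]

# Posterior mean: the estimate that minimizes mean squared error (MMSE).
mean = sum(n * p for n, p in zip(hypos, posterior))
print(round(mean, 1))  # about 333.4
```

The posterior mode is 60 itself, but the mean near 333 is the chapter's MMSE estimate; a power law prior pulls it down toward smaller fleets.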


Chapter 6: Odds and Addends

This chapter covers two topics. Odds: introduces Bayes's rule in odds form and the likelihood ratio; a crime problem titled 'Oliver's Blood' is solved. Addends: shows how to add several distributions, and the Central Limit Theorem is introduced with an example. One more problem, titled 'Gluten Sensitivity', is solved.

Keywords: odds, likelihood ratio, addends
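The odds-form arithmetic of 'Oliver's Blood' can be sketched with the problem's standard numbers (blood type O in 60% of the population, AB in 1%; Oliver is type O and two samples, one O and one AB, are found):

```python
# If Oliver left one sample, the other person must be the AB donor.
like_guilty = 0.01
# Otherwise two unknown people left them, in either order: 2 * 0.6 * 0.01.
like_not_guilty = 2 * 0.6 * 0.01

# The likelihood ratio (Bayes factor) multiplies the prior odds.
bayes_factor = like_guilty / like_not_guilty
print(round(bayes_factor, 3))  # 0.833

# Bayes's rule in odds form: posterior odds = prior odds * Bayes factor.
prior_odds = 1.0                      # even prior odds, for illustration
posterior_odds = prior_odds * bayes_factor
posterior_prob = posterior_odds / (1 + posterior_odds)
```

A Bayes factor below 1 means the evidence is weakly exculpatory, the mildly surprising conclusion the chapter draws.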


Chapter 7: Minimum, Maximum and Mixture

In this chapter the cumulative distribution function (Cdf) is introduced, along with the interchangeability between the Pmf and Cdf representations. Other topics include choosing the minimum, maximum, or a mixture of distributions to find the optimal results of inference.

Keywords: cdf, cumsum(), mixtures, minimum, maximum
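The Pmf↔Cdf round trip and the distribution of a maximum can be sketched with a fair six-sided die (a stand-in example, not the book's code):

```python
from itertools import accumulate

pmf = [1 / 6] * 6                   # P(roll = k) for k = 1..6
cdf = list(accumulate(pmf))         # cumulative sums give the Cdf

# Back from Cdf to Pmf: successive differences.
pmf_again = [cdf[0]] + [b - a for a, b in zip(cdf, cdf[1:])]

# Cdf of the maximum of 3 independent rolls: the max is <= k only if
# ALL rolls are <= k, so the Cdf gets raised to the 3rd power.
cdf_max3 = [c**3 for c in cdf]

# Probability the maximum equals 6: difference of adjacent Cdf values.
print(round(cdf_max3[5] - cdf_max3[4], 3))  # 0.421
```

The minimum works symmetrically through the complementary Cdf, and a mixture is just a weighted sum of Pmfs.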


Chapter 8: Poisson Processes

This chapter introduces three new distributions: the Poisson, gamma, and exponential distributions. The reader is taken through them with a problem titled 'The World Cup Problem', which is about soccer games in the World Cup and tries to figure out the probabilities of winning and losing by applying the different distributions.

Keywords: pdf, alpha, gamma(), lambda, poisson().
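The grid-update core of the World Cup model can be sketched as follows: goals per game are Poisson(lam), and lam itself gets a prior. The book uses a gamma prior with mean around 1.4 goals per game; an exponential prior with the same mean (a special case of the gamma) is used here as a simple stand-in:

```python
from math import exp, factorial

lams = [i / 10 for i in range(1, 101)]      # candidate goal rates
prior = [exp(-lam / 1.4) for lam in lams]   # assumed exponential prior

def poisson_pmf(k, lam):
    """P(k goals) under a Poisson distribution with rate lam."""
    return lam**k * exp(-lam) / factorial(k)

# Update after observing a team score 4 goals in one game.
unnorm = [p * poisson_pmf(4, lam) for p, lam in zip(prior, lams)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

# Posterior mean goal rate, pulled between the prior mean and the data.
mean = sum(lam * p for lam, p in zip(lams, posterior))
print(round(mean, 2))
```

Comparing two teams' posterior rate distributions, and pushing them back through the Poisson, is how the chapter gets at win/loss probabilities.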


P.S. This is a documentation of a personal journey.
