aloctavodia / Statistical-Rethinking-with-Python-and-PyMC3

Python/PyMC3 port of the examples in "Statistical Rethinking: A Bayesian Course with Examples in R and Stan" by Richard McElreath

Proposal for Code 5.55 (Ch 5)

sarajcev opened this issue

It seems to me that Code 5.55, as it presently stands, does not reflect the "index variable" approach introduced in the book. Hence, I propose the following code:

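# Assumes `import pymc3 as pm` and the DataFrame `d` from the notebook,
# with an integer-coded `clade_id` column.
with pm.Model() as m5_16_alt:
    # One intercept per clade
    a = pm.Normal('a', mu=0.6, sd=10, shape=len(d['clade_id'].unique()))
    # Index variable: each row picks the intercept of its own clade
    mu = pm.Deterministic('mu', a[d['clade_id'].values])
    sigma = pm.Uniform('sigma', lower=0, upper=10)
    kcal_per_g = pm.Normal('kcal_per_g', mu=mu, sd=sigma, observed=d['kcal.per.g'])
    trace_5_16_alt = pm.sample(1000, tune=1000)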

The proposed code adds a shape parameter to the variable a, so that there is one intercept per clade, and uses the index variable the way the author of the book intended. Is my reasoning on this correct?
The summary produces the following output (excerpt):

a:

  Mean             SD               MC Error         89% HPD interval
  -------------------------------------------------------------------
  
  0.544            0.044            0.001            [0.482, 0.620]
  0.713            0.044            0.001            [0.641, 0.781]
  0.788            0.054            0.002            [0.709, 0.882]
  0.506            0.059            0.002            [0.407, 0.593]

sigma:

  Mean             SD               MC Error         89% HPD interval
  -------------------------------------------------------------------
  
  0.131            0.019            0.001            [0.101, 0.159]

Compare this output with the book (page 159).
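
For readers new to the index-variable idea, here is a minimal sketch in plain Python of what the lookup a[d['clade_id'].values] does conceptually. The labels, the intercept values, and the names clades, levels and code_of are made up for illustration and are not taken from the notebook or the real data; the model itself does the same lookup in vectorized form.

```python
# Hypothetical toy data: one clade label per observation (not the real dataset).
clades = ["Ape", "Ape", "New World Monkey", "Old World Monkey", "Strepsirrhine", "Ape"]

# Map each distinct label to an integer code 0..K-1 (sorted for a stable order).
levels = sorted(set(clades))
code_of = {label: k for k, label in enumerate(levels)}
clade_id = [code_of[label] for label in clades]

# One intercept per clade; this list plays the role of `a` with shape=len(levels).
# The values are invented for illustration only.
a = [0.55, 0.71, 0.79, 0.51]

# Index-variable lookup: each observation picks the intercept of its own clade.
mu = [a[k] for k in clade_id]

print(levels)    # the distinct clades, in code order
print(clade_id)  # one integer code per observation
print(mu)        # one mean per observation
```

The point of the sketch is that a has one entry per clade while mu has one entry per observation, which is why the shape argument is needed on a.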

Good catch!

Additionally, you could omit the definition of the deterministic variable and pass the indexed intercepts directly as mu:

# Assumes `import pymc3 as pm` and the DataFrame `d` from the notebook.
with pm.Model() as m5_16_alt:
    a = pm.Normal('a', mu=0.6, sd=10, shape=len(d['clade_id'].unique()))
    sigma = pm.Uniform('sigma', lower=0, upper=10)
    # The indexed intercepts are passed straight to mu; no Deterministic is stored
    kcal_per_g = pm.Normal('kcal_per_g', mu=a[d['clade_id'].values], sd=sigma, observed=d['kcal.per.g'])
    trace_5_16_alt = pm.sample(1000, tune=1000)

Would you be willing to send a PR to fix this? If so, could you please send it to this repository?

I have sent a PR with a fix to the mentioned repository. I should also mention that this port of this wonderful book from R to Python is of great value. I appreciate it!