aloctavodia / BAP

Bayesian Analysis with Python (Second Edition)

Home Page: https://www.amazon.com/dp/B07HHBCR9G

Repository from GitHub: https://github.com/aloctavodia/BAP

Regression with spatial autocorrelation example

radugrosu opened this issue

The Poisson regression with spatial similarity example on page 267 uses a GP with a Gaussian kernel:

cov = η * pm.gp.cov.ExpQuad(1, ls=ℓ)
gp = pm.gp.Latent(cov_func=cov)
f = gp.prior('f', X=islands_dist_sqr) 

I interpret this to mean that the similarity between two islands is judged by comparing their distances to every other island (which is fine for this small data set). I'm surprised, though, that the squared distances are used as features: the ExpQuad kernel would square their differences yet again before scaling, summing and exponentiating. Also, the plotting code that uses the posterior samples of the kernel parameters seems to interpret the inputs to the kernel function as distances (not squared distances):
np.median(trace_η) * np.exp(-np.median(trace_ℓ) * xrange**2).

In short, it seems that the model is consistent with something like
f = gp.prior('f', X=islands_dist)
rather than with
f = gp.prior('f', X=islands_dist_sqr)
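
To make the difference between the two choices of X concrete, here is a minimal NumPy sketch. It assumes the usual exponentiated quadratic form exp(-||x - x'||² / (2ℓ²)) and uses made-up 2-D coordinates for four "islands" (not the book's data); the helper `exp_quad` is hypothetical and only mimics what the kernel does to the rows of X.

```python
import numpy as np

def exp_quad(X, ls):
    """Exponentiated quadratic kernel: exp(-||x - x'||^2 / (2 * ls^2))."""
    # Squared Euclidean distance between every pair of rows of X.
    sq_diff = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_diff / (2 * ls ** 2))

# Hypothetical coordinates for four islands (illustration only).
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])

# Pairwise distance matrix and its element-wise square.
islands_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
islands_dist_sqr = islands_dist ** 2

# Each row of X is one island's feature vector, so the kernel compares rows.
K_dist = exp_quad(islands_dist, ls=1.0)          # features are distances
K_dist_sqr = exp_quad(islands_dist_sqr, ls=1.0)  # features are squared distances
```

With X=islands_dist the kernel squares differences of distances; with X=islands_dist_sqr it squares differences of already squared distances, so the two covariance matrices differ for the same ℓ.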

An additional question: why not add an extra scaling factor (γ, say) to modulate the influence of geography, i.e. μ = pm.math.exp(α + γ * f[index] + β * log_pop)? Is it because the zero-mean assumption on the GP already allows it to produce small values easily, if needed?

Thanks, and please accept my apologies if I misunderstood the text.

Hi @radugrosu, thank you for your comments. You are right: it should be f = gp.prior('f', X=islands_dist), since you want to use the distances and not their squared values.

Regarding the extra scaling factor, notice that you already have η. Anyway, as you said, the zero-mean assumption on the GP is good enough as a default.
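
The overlap between γ and η can be checked numerically. The sketch below assumes a zero-mean GP with covariance η·K and a positive scalar γ; the inputs, the kernel values, and the numbers chosen for `eta` and `gamma` are all hypothetical. It compares γ·f, where f is drawn from GP(0, η·K), with a draw from GP(0, γ²·η·K) built from the same standard normal vector.

```python
import numpy as np

# Hypothetical 1-D inputs and an exponentiated quadratic kernel matrix,
# with a small jitter on the diagonal for numerical stability.
x = np.linspace(0, 1, 5)[:, None]
K = np.exp(-(x - x.T) ** 2 / (2 * 0.5 ** 2)) + 1e-8 * np.eye(5)

eta, gamma = 2.0, 3.0  # illustrative values, not posterior estimates

# Cholesky factors of the two covariance matrices.
L_eta = np.linalg.cholesky(eta * K)
L_scaled = np.linalg.cholesky(gamma ** 2 * eta * K)

# The same standard normal draw pushed through both factors.
z = np.random.default_rng(0).standard_normal(5)
f = L_eta @ z            # draw from N(0, eta * K)
f_scaled = L_scaled @ z  # draw from N(0, gamma^2 * eta * K)

# Scaling f by gamma is equivalent to scaling eta by gamma^2.
same = np.allclose(gamma * f, f_scaled)
```

Under these assumptions γ·f and a GP with amplitude γ²·η are the same object, so the likelihood only informs the product γ²·η; any separation between the two would come from their priors.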