ValueError: Logp function returned error Initialization of first point failed
ricardoV94 opened this issue · comments
import numpy as np
import nutpie
import pymc as pm

with pm.Model() as m:
    proposed_pay = pm.MutableData("proposed_pay", np.array([50_000, 200_000], dtype="float64"))
    accepted = pm.MutableData("accepted", np.array([0, 1], dtype="int64"))
    mean = pm.Normal("mu", mu=100_000, sigma=25_000)
    std = pm.Gamma("std", mu=25_000, sigma=5_000)
    # Acceptance probability is 1 - CDF of Normal(mean, std) at the proposed pay
    p_accept = 1 - pm.logcdf(pm.Normal.dist(mean, std), proposed_pay).exp()
    p_accept = pm.Deterministic("p_accept", p_accept)
    llike = pm.Bernoulli("llike", p=p_accept, observed=accepted)

cm = nutpie.compile_pymc_model(m)
nutpie.sample(cm)
I get these weird warnings from pytensor/numba
/tmp/tmps3yx0tdg:1: NumbaWarning: Cannot cache compiled function "numba_funcified_fgraph" as it uses dynamic globals (such as ctypes pointers and large global arrays)
def numba_funcified_fgraph(scalar_variable, scalar_variable_1, scalar_variable_7, scalar_variable_11, scalar_variable_3, scalar_variable_5, scalar_variable_15, scalar_variable_19):
site-packages/nutpie/compile_pymc.py:362: NumbaWarning: Cannot cache compiled function "numba_funcified_fgraph" as it uses dynamic globals (such as ctypes pointers and large global arrays)
return inner(x, *_shared_tuple)
And then a ValueError
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [43], in <cell line: 17>()
14 llike = pm.Bernoulli("llike", p=p_accept, observed=accepted)
16 cm = nutpie.compile_pymc_model(m)
---> 17 nutpie.sample(cm)
site-packages/nutpie/sample.py:101, in sample(compiled_model, draws, tune, chains, seed, num_try_init, save_warmup, store_divergences, progress_bar, init_mean, store_unconstrained, **kwargs)
98 if init_mean is None:
99 init_mean = np.zeros(compiled_model.n_dim)
--> 101 sampler = lib.PyParallelSampler(
102 compiled_model.logp_func_maker,
103 init_mean,
104 settings,
105 n_chains=chains,
106 n_draws=draws,
107 seed=seed,
108 n_try_init=num_try_init,
109 )
111 expand_draw = compiled_model.expand_draw_fn
113 def do_sample():
ValueError: Logp function returned error Initialization of first point failed
I think this is a consequence of the initial point choice in nutpie. It currently tries to initialize all parameters in the transformed space in (-1, 1), but in that range we just can't find any valid points. The problem goes away if we change the scaling of the variables (i.e. we measure everything in multiples of 10_000):
with pm.Model() as m:
    proposed_pay = pm.MutableData("proposed_pay", np.array([5.0, 20.0], dtype="float64"))
    accepted = pm.MutableData("accepted", np.array([0, 1], dtype="int64"))
    mean = pm.Normal("mu", mu=10.0, sigma=2.5)
    std = pm.Gamma("std", mu=2.5, sigma=0.5)
    p_accept = 1 - pm.logcdf(pm.Normal.dist(mean, std), proposed_pay).exp()
    p_accept = pm.Deterministic("p_accept", p_accept)
    llike = pm.Bernoulli("llike", p=p_accept, observed=accepted)

cm = nutpie.compile_pymc_model(m)
tr = nutpie.sample(cm)
It would be better to initialize at draws from the prior though...
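To illustrate why no valid point exists in (-1, 1) for the original model, here is a rough sketch that evaluates only the Bernoulli likelihood term (no priors, no Jacobian) on a grid of unconstrained values. The helper names are made up for this illustration, and the assumption that "std" is log-transformed reflects PyMC's default for positive variables; this is not nutpie's actual initialization code.

```python
import math


def normal_cdf(x, mu, std):
    """CDF of Normal(mu, std) at x, via the complementary error function."""
    return 0.5 * math.erfc(-(x - mu) / (std * math.sqrt(2.0)))


def bernoulli_loglike(mu, log_std, pays, accepted):
    """Log-likelihood with p_accept = 1 - CDF, mirroring the model's formula."""
    std = math.exp(log_std)  # assumed log transform for the positive "std"
    total = 0.0
    for pay, y in zip(pays, accepted):
        p_accept = 1.0 - normal_cdf(pay, mu, std)
        p = p_accept if y == 1 else 1.0 - p_accept
        # A probability of exactly 0 for an observed outcome gives logp = -inf
        total += math.log(p) if p > 0.0 else -math.inf
    return total


def count_finite(pays, accepted, n=21):
    """Count grid points in (-1, 1)^2 with a finite log-likelihood."""
    grid = [-0.99 + 1.98 * i / (n - 1) for i in range(n)]
    return sum(
        math.isfinite(bernoulli_loglike(mu, log_std, pays, accepted))
        for mu in grid
        for log_std in grid
    )


accepted = [0, 1]
n_original = count_finite([50_000.0, 200_000.0], accepted)
n_rescaled = count_finite([5.0, 20.0], accepted)
print(n_original, n_rescaled)
```

With the original scaling, every grid point puts the proposed pay tens of thousands of standard deviations above the mean, so the CDF rounds to exactly 1 in float64 and the observed acceptance gets probability 0. After rescaling, at least part of the (-1, 1) box yields a finite value.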
So the difference is that in PyMC we do transformed(moment) ± 1, and here we do 0 ± 1?
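As a small sketch of what the two schemes imply for "std" in the first model, assuming a log transform and taking the prior mean of 25_000 as the moment (both are assumptions made for illustration, not values read out of either library):

```python
import numpy as np

rng = np.random.default_rng(0)
jitter = rng.uniform(-1.0, 1.0, size=1000)  # same jitter for both schemes

# Scheme "0 ± 1": jitter around zero in unconstrained (log) space
std_zero_centered = np.exp(0.0 + jitter)

# Scheme "transformed(moment) ± 1": jitter around log(moment);
# the moment of 25_000 is an assumption for this sketch
std_moment_centered = np.exp(np.log(25_000.0) + jitter)

print(std_zero_centered.min(), std_zero_centered.max())
print(std_moment_centered.min(), std_moment_centered.max())
```

Centering at zero confines the starting "std" to roughly (1/e, e), far from the scale of the data, whereas centering at the transformed moment keeps it within a factor of e of 25_000.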
Could we use the PyMC initial point /moment logic if it's more stable?
I think so. There already is an option to set a mean, we'd only have to plug in the moment there.
A workaround at the PyMC level can be as short as 4 lines, but the latest nutpie release was before the init_mean kwarg was added.
To fix this at the nutpie level, the init_means could be added as another attribute on the compiled_model object.
Today I'll try to work around this with local hotfixes, but I should be able to make a PR for this by the end of the week.
This is the workaround for the PyMC level, however it didn't help for my model 🤔
compiled_model = nutpie.compile_pymc_model(model)
# Pass transformed, concatenated initial values until nutpie does it itself
initial_point = model.initial_point()
initial_means = np.concatenate(
    [initial_point[model.rvs_to_values[var].name].flatten() for var in model.free_RVs]
)
idata = nutpie.sample(
    compiled_model,
    draws=draws,
    tune=tune,
    chains=chains,
    target_accept=target_accept,
    init_mean=initial_means,
    seed=_get_seeds_per_chain(random_seed, 1)[0],
    progress_bar=progressbar,
    **nuts_sampler_kwargs,
)
The point must be on the unconstrained space, but I think this will instead use values on the constrained space?
The pmodel.initial_point() dictionary has only the unconstrained ones, e.g. noise_log__.
So unless I'm mixing up the definitions these should be right, no?
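For reference, a minimal sketch of what the flattening in the workaround does, using a hand-written dictionary as a stand-in for the initial point. The keys, values, and the "coeffs" entry are hypothetical and chosen only for illustration; a real model's dictionary will differ.

```python
import numpy as np

# Hypothetical stand-in for the initial point dictionary: keys are names of
# transformed value variables, values live on the unconstrained space
initial_point = {
    "ls_log__": np.array(np.log(0.5)),
    "noise_log__": np.array(np.log(0.05)),
    "coeffs": np.zeros(3),  # made-up vector-valued variable without a transform
}
value_names = ["ls_log__", "noise_log__", "coeffs"]  # assumed free-variable order

# Flatten each entry and concatenate into one vector, as in the workaround
init_mean = np.concatenate(
    [np.asarray(initial_point[name]).flatten() for name in value_names]
)

print(init_mean.shape)
# Mapping a log-transformed entry back recovers the constrained value
print(np.exp(init_mean[1]))
```

The suffix log__ signals that the stored number is the log of the constrained value, so the entries can be concatenated directly as long as their order matches the order the sampler expects.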
Can confirm that this fix does not work, in general. Also tried with random jitter around initial point. Perhaps this is not the issue?
Here's a rather minimal example to reproduce the issue:
import io
import numpy as np
import pandas as pd
import pymc as pm
import pytensor.tensor as pt
def build_model(df_data, *, hsgp: bool):
    with pm.Model(
        coords={
            "records": np.arange(len(df_data)),
        }
    ) as pmodel:
        # Store data
        X = pm.ConstantData("X", df_data.x.to_numpy(), dims="records")
        Y = pm.ConstantData("Y", df_data.y.to_numpy(), dims="records")
        Y_std = pm.ConstantData("Y_std", pt.std(Y).eval())
        Y_mean = pm.ConstantData("Y_mean", pt.mean(Y).eval())

        # Model the (normalized) latent trend
        ls = pm.LogNormal("ls", mu=np.log(0.5), sigma=0.2)
        noise = pm.HalfNormal("noise", sigma=0.05)
        cov = noise**2 * pm.gp.cov.ExpQuad(1, ls=ls)
        mean = pm.gp.mean.Constant(Y_mean)
        if hsgp:
            gp = pm.gp.HSGP(m=[30], c=4.0, cov_func=cov, mean_func=mean)
        else:
            gp = pm.gp.Latent(cov_func=cov, mean_func=mean)
        ylatent = gp.prior("ylatent", X[:, None], dims="records")

        # Connect to observations
        pm.Normal("L", mu=ylatent, sigma=Y_std / 3, observed=Y, dims="records")

    # Keep a handle on the GP
    pmodel.gp = gp
    return pmodel
def analyze(df_data, *, build_kwargs, sample_kwargs):
    pmodel = build_model(df_data, **build_kwargs)
    with pmodel:
        idata = pm.sample(
            chains=4, tune=2000,
            target_accept=0.9, random_seed=1234,
            **sample_kwargs,
        )
    return idata
df_data = pd.read_csv(io.StringIO("""
,x,y
0,6.5,0.03847670954287112
1,7.0,0.040795546149772384
2,7.5,0.04005530626829538
3,6.5,0.03800804967481005
4,7.0,0.042606645754122346
5,7.5,0.03962986979767001
6,6.5,0.03975987684954445
7,7.0,0.042854077804484525
8,7.5,0.0427959406500711
9,6.5,0.0376618863654496
10,7.0,0.043800640042141875
11,7.5,0.04280278855102723
"""), index_col=0)
analyze(
    df_data,
    build_kwargs=dict(hsgp=True),
    sample_kwargs=dict(nuts_sampler="nutpie"),
)
@michaelosthege This does run on my machine, maybe this was fixed with pymc-devs/pytensor#343?
I do see some divergences, but that happens with both nutpie and the default sampler.
Confirmed that a new env with PyMC 5.5.0 and PyTensor 2.12.3 fixed this MRE on my machine too.
I'll re-run my benchmarking notebooks next week to see if it fixed all instances I had run into.
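When comparing environments like this, it can help to print the installed versions alongside the result. A small sketch using only the standard library (the helper name is made up here):

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("pymc", "pytensor", "nutpie"):
    print(name, installed_version(name))
```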
Shall we close this and re-open if needed?
I guess I'll close this issue then, but feel free to reopen (or open a new one) if this comes up again...