Bayesian Model Sampling Fails for Some Networks
AbeleMM opened this issue
Description
All forms of Bayesian Model Sampling (`forward_sample`, `likelihood_weighted_sample`, `rejection_sample`) seem to fail in some cases for some networks (e.g., pathfinder).
Environment
- pgmpy 0.1.22
- Python 3.10
- Windows & Linux
Steps to reproduce
```python
from pgmpy.sampling import BayesianModelSampling
from pgmpy.utils import get_example_model

BayesianModelSampling(get_example_model("pathfinder")).forward_sample(size=50, seed=1)
```
Expected behaviour
Sampling completes successfully.
Actual behaviour
The execution errors out with the following traceback:
```
pgmpy\sampling\Sampling.py", line 120, in forward_sample
    sampled[node] = sample_discrete_maps(
pgmpy\utils\mathext.py", line 181, in sample_discrete_maps
    samples[weight_indices == weight_index] = np.random.choice(
  File "mtrand.pyx", line 972, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities are not non-negative
```
Possible fix
The issue seems to be caused by the following line in the `_adjusted_weights` method (line 85 in 1e73ff2), which adds the normalization error to the last weight: `weights[-1] += error`. If `error` is negative and `weights[-1]` has a very small value, the addition can make the latter negative too.

Replacing it with the following could be a solution that should also preserve existing behaviour when the issue does not arise: `weights[len(weights) - 1 - np.argmax(weights[::-1] > -error)] += error`
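A minimal sketch of the failure mode and the proposed replacement; the `adjusted_weights` helper below is a simplified, hypothetical stand-in for pgmpy's `_adjusted_weights` (not its actual code), and the weight values are made up for illustration:

```python
import numpy as np

def adjusted_weights(weights, fixed=True):
    """Hypothetical stand-in for pgmpy's _adjusted_weights: nudge `weights`
    in place so they sum to exactly 1, assuming the rounding error is
    already known to be tiny."""
    error = 1 - np.sum(weights)
    if fixed:
        # Proposed fix: add the error to the last element strictly greater
        # than -error, i.e. the last element able to absorb it. Whenever
        # weights[-1] can absorb it, this matches the old behaviour.
        weights[len(weights) - 1 - np.argmax(weights[::-1] > -error)] += error
    else:
        # Current behaviour: always add the error to the last element.
        weights[-1] += error
    return weights

# Hypothetical weights whose floating-point sum slightly exceeds 1, so
# error is about -1e-8 while weights[-1] is only 5e-9.
w = np.array([0.5, 0.5 + 5e-9, 5e-9])
print(adjusted_weights(w.copy(), fixed=False))  # last entry becomes negative
print(adjusted_weights(w.copy(), fixed=True))   # all entries stay non-negative
```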
@AbeleMM Thanks for reporting the issue. The possible fix that you have mentioned (if I am understanding it correctly) is to add the error to the largest value in the array? Wouldn't a simpler way to do that be something like `weights[np.argmax(weights)] += error`? Would you like to open a PR fixing this?
The proposed fix adds `error` to the last array element greater than `-error`, to maintain existing behaviour (adding to `weights[-1]`) as much as possible. A different element only gets selected if the original operation would yield a nonpositive value.

Using `weights[np.argmax(weights)] += error` would also solve the issue and is more readable, although it deviates slightly from the existing implementation. Note, however, that both methods still (reasonably) assume the existence of some element that can entirely "absorb" a negative `error`.
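For comparison, a sketch of the simpler variant applied to the same hypothetical weights as above (again, not pgmpy's actual code):

```python
import numpy as np

# Hypothetical weights whose floating-point sum slightly exceeds 1.
weights = np.array([0.5, 0.5 + 5e-9, 5e-9])
error = 1 - np.sum(weights)  # about -1e-8

# Simpler variant: add the error to the largest weight, which can always
# absorb it as long as the error is known to be sufficiently small.
weights[np.argmax(weights)] += error

assert (weights >= 0).all() and np.isclose(np.sum(weights), 1.0)
```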
I would be happy to create a PR for the issue; please let me know whether there is a preference for which version to use.
@AbeleMM The idea behind adding the error term is to have the weights such that it sums to 1. It doesn't really matter which weight we add the error to. And since the checks make sure that the error is sufficiently small, I think just adding it to the largest element should always work and is a cleaner solution. What do you think? If you agree, please open a PR. Thanks :)