pgmpy / pgmpy

Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.

Home Page: https://pgmpy.org/

Bayesian Model Sampling Fails for Some Networks

AbeleMM opened this issue

Description

All forms of Bayesian Model Sampling (forward_sample, likelihood_weighted_sample, rejection_sample) seem to fail in some cases for some networks (e.g., pathfinder).

Environment

  • pgmpy 0.1.22
  • Python 3.10
  • Windows & Linux

Steps to reproduce

from pgmpy.sampling import BayesianModelSampling
from pgmpy.utils import get_example_model

# Forward sampling from the "pathfinder" example network raises a ValueError.
BayesianModelSampling(get_example_model("pathfinder")).forward_sample(size=50, seed=1)

Expected behaviour

Sampling completes successfully.

Actual behaviour

The execution errors out with the following traceback:

  pgmpy\sampling\Sampling.py", line 120, in forward_sample
    sampled[node] = sample_discrete_maps(
  pgmpy\utils\mathext.py", line 181, in sample_discrete_maps
    samples[weight_indices == weight_index] = np.random.choice(
  File "mtrand.pyx", line 972, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities are not non-negative
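
The numpy error is easy to trigger in isolation: np.random.choice rejects any probability vector that contains a negative entry (the values below are made up and unrelated to pathfinder's actual CPDs):

import numpy as np

# Sums to 1, but the negative entry raises the same ValueError as above.
np.random.choice(3, p=[0.6, 0.5, -0.1])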

Possible fix

The issue seems to be caused by the following line in the _adjusted_weights method:

weights[-1] += error

If error is negative and weights[-1] has a very small value, the addition can make the latter negative too.

Replacing it with the following could fix the issue while preserving the existing behaviour whenever the problem does not arise:

weights[len(weights) - 1 - np.argmax(weights[::-1] > -error)] += error
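
For illustration, here is a minimal, self-contained sketch (the weight values are made up, and the rest of _adjusted_weights is not reproduced) of how the current line can drive a weight negative and how the proposed replacement avoids it:

import numpy as np

# Made-up weights whose sum slightly exceeds 1 after floating-point rounding,
# with a tiny value in the last position.
weights = np.array([0.5, 0.50000001, 1e-10])
error = 1 - weights.sum()  # negative here, about -1e-8

# Current behaviour: the whole error lands on the last element.
naive = weights.copy()
naive[-1] += error
print(naive[-1] < 0)  # True: the last weight has gone negative

# Proposed fix: add the error to the last element that can absorb it,
# i.e. the last element strictly greater than -error.
fixed = weights.copy()
fixed[len(fixed) - 1 - np.argmax(fixed[::-1] > -error)] += error
print((fixed >= 0).all(), np.isclose(fixed.sum(), 1.0))  # True True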

@AbeleMM Thanks for reporting the issue. The possible fix that you have mentioned (if I am understanding it correctly) is to add the error to the largest value in the array? Wouldn't a simpler way to do that be something like weights[np.argmax(weights)] += error? Would you like to open a PR fixing this?

The proposed fix adds the error to the last array element greater than -error, so as to preserve the existing behaviour (adding to weights[-1]) as much as possible. A different element is only selected when the original operation would yield a nonpositive value.

Using weights[np.argmax(weights)] += error would also solve the issue and is more readable, though it deviates slightly from the existing implementation. Note, however, that both methods still (reasonably) assume the existence of some element that can entirely "absorb" a negative error.
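
As a quick check, the made-up weights from the sketch above also come out non-negative and normalised under the simpler variant:

import numpy as np

weights = np.array([0.5, 0.50000001, 1e-10])
error = 1 - weights.sum()  # negative, about -1e-8

weights[np.argmax(weights)] += error  # the largest element absorbs the error
print((weights >= 0).all(), np.isclose(weights.sum(), 1.0))  # True True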

I would be happy to create a PR for the issue; please let me know whether there is a preference for which version to use.

@AbeleMM The idea behind adding the error term is to have the weights sum to 1. It doesn't really matter which weight we add the error to. And since the checks make sure that the error is sufficiently small, I think just adding it to the largest element should always work and is a cleaner solution. What do you think? If you agree, please open a PR. Thanks :)