probcomp / Gen.jl

A general-purpose probabilistic programming system with programmable inference

Home Page: https://gen.dev

out of bounds samples in piecewise_uniform

aldopareja opened this issue · comments

I'm not sure what I'm doing wrong, but I'm getting an error with this code:

using Gen

@gen function model()
  x ~ uniform_continuous(0.0,1.0)
  v ~ normal(0.0,1.0)
end

@gen function fail_piecewise_uniform(prev_trace, offset::Float64, addr::Symbol)
  num_bins::Int = 10
  bin_size = 1/num_bins
  
  idx = [categorical([0.5,0.5])  
           for i = 1:num_bins]
  probs = getindex([1.0,100.0], idx)
  probs ./= sum(probs)

  bounds = collect(0.0:bin_size:1.0)
  anchored_bounds = bounds .- offset
  out = {addr} ~ piecewise_uniform(anchored_bounds, probs)
  if out + offset > bounds[end]
    throw(DomainError(out + offset, "greater"))
  end
end

function inference()
  # [Gen.mh(trace, select(:offset, :out)) for _ in 1:10000]
  traces = [Gen.simulate(model, ()) for i=1:10000]
  for t in traces
    selection = select(:x,:v)
    t, _ = Gen.mh(t,fail_piecewise_uniform, (0.0, :x))
    offset = Gen.get_choice(t,:x).retval
    t, _ = Gen.mh(t,fail_piecewise_uniform, (offset, :v))
  end
end

inference()

It fails at the throw(DomainError(...)) line. It's not supposed to sample anything above the bounds after correcting for the offset, but it does. I don't know how to debug this.

Any pointers or help?

@aldopareja A few breadcrumbs:

  1. The code fails stochastically:
[ Info: 11
[ Info: (0.5540822686247387, 6, 0.5, 0.6)
[ Info: (0.5540822686247387, 0.0)
[ Info: (0.33623835364936516, 0.0)
[ Info: (0.5403092485648128, 9, 0.4637616463506349, 0.5637616463506349)
[ Info: (0.5403092485648128, 0.33623835364936516)
[ Info: (3.003332763079517, 0.33623835364936516)

This is a segment of output from running inference() -- the top number is the index of the loop iteration in inference() -- if you run this repeatedly, it fails at different indices.

  2. It's not the call to piecewise_uniform(...)

I'm printing this out on each loop iteration (this is the Tuple{Float64, Int, Float64, Float64} in the info printout) -- it is always within the correct bounds as specified for piecewise_uniform.

  3. out is modified after sampling.

First, we sample it (within the correct bounds) from piecewise_uniform -- then (without another sampling step), it changes.

Ultimately, this change causes the error. If we look at the implementation of Gen.mh when you provide a custom proposal, you'll notice a few things:

function metropolis_hastings(
        trace, proposal::GenerativeFunction, proposal_args::Tuple;
        check=false, observations=EmptyChoiceMap())
    # TODO add a round trip check
    model_args = get_args(trace)
    argdiffs = map((_) -> NoChange(), model_args)
    proposal_args_forward = (trace, proposal_args...,)
    
    # First call here!
    (fwd_choices, fwd_weight, _) = propose(proposal, proposal_args_forward)
    (new_trace, weight, _, discard) = update(trace,
        model_args, argdiffs, fwd_choices)
    proposal_args_backward = (new_trace, proposal_args...,)
    
    # Second call here!
    (bwd_weight, _) = assess(proposal, proposal_args_backward, discard)
    
    alpha = weight - fwd_weight + bwd_weight
    check && check_observations(get_choices(new_trace), observations)
    if log(rand()) < alpha
        # accept
        return (new_trace, true)
    else
        # reject
        return (trace, false)
    end
end

To properly evaluate the accept/reject ratio -- we have to evaluate any discarded choices from the model trace under the proposal. You've specified v ~ normal(0.0, 1.0) -- which can be sampled unconstrained outside the bounds of the piecewise_uniform.

I don't fully understand the out + offset logic -- but this is how your error occurs.

Note that if you cannot evaluate the discard under the proposal (e.g., the log density returns -Inf), the kernel is not valid: the proposal does not cover the full support of the projected address space under the model.
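
You can check this directly -- a sketch assuming Gen is loaded; per the note above, the log density outside the bounds is -Inf:

```julia
using Gen

bounds = collect(0.0:0.1:1.0)   # support is [0, 1]
probs  = fill(0.1, 10)

# Inside the support, the log density is finite...
inside = Gen.logpdf(piecewise_uniform, 0.5, bounds, probs)

# ...but the discarded model choice v ~ normal(0, 1) can land outside
# [0, 1], and assess then evaluates the proposal density right there:
outside = Gen.logpdf(piecewise_uniform, 1.7, bounds, probs)   # -Inf
```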

Hope this helps! Please close if it does.

Thank you for the fast response!

In a nutshell, if I subtract a value from the bounds of a uniform and then add it back after sampling, the result should still be within the original bounds. I hope this code makes it a bit clearer:

new_bounds = bounds - offset
sample ~ piecewise_uniform(new_bounds, probs)
sample + offset <= bounds[end] #should always be true

See the problem? Or is there something I'm missing?
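
For what it's worth, the invariant does hold whenever out is a fresh sample from the shifted interval -- a pure-Julia sketch using rand() as a stand-in for piecewise_uniform:

```julia
offset = 0.3
bounds = collect(0.0:0.1:1.0)
anchored = bounds .- offset

for _ in 1:10_000
    # fresh uniform sample on [anchored[1], anchored[end]]
    out = anchored[1] + rand() * (anchored[end] - anchored[1])
    # holds up to floating-point rounding
    @assert out + offset <= bounds[end] + 1e-12
end
```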

Some other answers to your questions:

> I don't fully understand the out + offset logic -- but this is how your error occurs.

I'm using a similar logic on a different inference problem where I first need the value of a piecewise uniform to condition the value of the second.

> To properly evaluate the accept/reject ratio -- we have to evaluate any discarded choices from the model trace under the proposal. You've specified v ~ normal(0.0, 1.0) -- which can be sampled unconstrained outside the bounds of the piecewise_uniform.

I understand this is not a valid kernel since the proposal doesn't cover the range of the prior, but this is not the case in my particular use case.

> I understand this is not a valid kernel since the proposal doesn't cover the range of the prior, but this is not the case in my particular use case.

The way the out of bounds error arises is because the sample choice from normal(0.0, 1.0) has to be evaluated under the proposal -- the support of that normal(0.0, 1.0) is all of R, so sometimes you're going to get values that are outside of the bounds of your piecewise_uniform.

The assess call in Gen.mh above fixes out to that normal sample in order to evaluate the weight.

Even if the kernel is not valid and that's okay for you, this issue will still hit that code branch, as far as I can tell.

Re -- your change above does not fix this issue, because assess goes "let me fix this random choice to whatever v was under the model" -- of course, because that v is outside the bounds of piecewise_uniform, the log weight goes to -Inf. Then, the out value propagates down and falls right outside the bounds into your error branch.
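
You can reproduce that failure path in isolation (a sketch assuming Gen is loaded and fail_piecewise_uniform is defined as above; 2.5 is a hypothetical model value outside the proposal's support). assess constrains the address to the discarded value instead of sampling, so the tilde expression returns it and the DomainError branch fires:

```julia
using Gen

# Constrain the proposal's address :v to a value the model could have
# produced under normal(0, 1) but that lies outside [0, 1].
constraints = choicemap((:v, 2.5))

# fail_piecewise_uniform never reads its first argument, so `nothing` works here.
try
    Gen.assess(fail_piecewise_uniform, (nothing, 0.0, :v), constraints)
catch err
    err isa DomainError   # the `out + offset > bounds[end]` branch
end
```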

EDIT: yes it works now with the right support.

I just changed the code to make the proposal valid. And btw, how does @info work?

using Gen

@gen function model()
  x ~ uniform_continuous(0.0,1.0)
  v ~ uniform_continuous(0.0-x,1.0-x)
end

@gen function fail_piecewise_uniform(prev_trace, offset::Float64, addr::Symbol)
  num_bins::Int = 10
  bin_size = 1/num_bins
  
  idx = [categorical([0.5,0.5])  
           for i = 1:num_bins]
  probs = getindex([1.0,100.0], idx)
  probs ./= sum(probs)

  bounds = collect(0.0:bin_size:1.0)
  anchored_bounds = bounds .- offset
  out = {addr} ~ piecewise_uniform(anchored_bounds, probs)
  if out + offset > bounds[end]
    throw(DomainError(out + offset, "greater"))
  end
end

function inference()
  # [Gen.mh(trace, select(:offset, :out)) for _ in 1:10000]
  traces = [Gen.simulate(model, ()) for i=1:10000]
  for t in traces
    selection = select(:x,:v)
    t, _ = Gen.mh(t,fail_piecewise_uniform, (0.0, :x))
    offset = Gen.get_choice(t,:x).retval
    t, _ = Gen.mh(t,fail_piecewise_uniform, (offset, :v))
  end
end

inference()

@aldopareja Using @info will basically give you a small [ Info: ...] box when you run the code.

You can do things like @info 5 or @info some_data in your code.

I use it similar to printf debugging.
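
For example (a small sketch; @info is available in Base, and the Logging standard library controls how messages are displayed):

```julia
x = 42
pair = (0.55, 6, 0.5, 0.6)

@info x                        # prints: [ Info: 42
@info pair                     # prints the tuple in an [ Info: ...] box
@info "bounds check" x pair    # attaches the variables to the message
```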