SteinDiscrepancy.jl

What's a Stein discrepancy?

To improve the efficiency of Monte Carlo estimation, practitioners are turning to biased Markov chain Monte Carlo procedures that trade off asymptotic exactness for computational speed. The reasoning is sound: a reduction in variance due to more rapid sampling can outweigh the bias introduced. However, the inexactness creates new challenges for sampler and parameter selection, since standard measures of sample quality like effective sample size do not account for asymptotic bias. To address these challenges, we introduced new computable quality measures, based on Stein's method, that quantify the maximum discrepancy between sample and target expectations over large classes of test functions. We call these measures Stein discrepancies.
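In symbols (a sketch of the definition from the papers below; the notation Q for the sample distribution and G for the test function class follows Measuring Sample Quality with Stein's Method), the Langevin Stein discrepancy of a sample distribution Q from a target with density p is

$$
S(Q, \mathcal{T}_p, \mathcal{G}) = \sup_{g \in \mathcal{G}} \bigl| \mathbb{E}_{Q}[(\mathcal{T}_p g)(X)] \bigr|,
\qquad
(\mathcal{T}_p g)(x) = \langle \nabla \log p(x), g(x) \rangle + \langle \nabla, g(x) \rangle,
$$

where, under mild conditions, the Langevin Stein operator satisfies Stein's identity E_p[(T_p g)(X)] = 0, so the discrepancy only requires the gradient of the log density of p (normalizing constants cancel). The graph and kernel Stein discrepancies implemented here correspond to different choices of the test function class G.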

For a more detailed explanation, take a peek at the latest papers:

Measuring Sample Quality with Diffusions, Measuring Sample Quality with Kernels.

These build on previous work from

Measuring Sample Quality with Stein's Method

and its companion paper

Multivariate Stein Factors for a Class of Strongly Log-concave Distributions.

These latter two papers offer a gentler introduction, describing how the Stein discrepancy bounds standard probability metrics like the Wasserstein distance.

Where has it been used?

Since its introduction in Measuring Sample Quality with Stein's Method, the Stein discrepancy has been incorporated into a variety of applications including:

  1. Hypothesis testing
  2. Variational inference
  3. Importance sampling
  4. Training generative adversarial networks (GANs)
  5. Training variational autoencoders (VAEs)
  6. Sample quality measurement

So how do I use it?

This software has been tested on Julia v0.5. This release implements two classes of Stein discrepancies: graph Stein discrepancies and kernel Stein discrepancies.

Graph Stein discrepancies

Computing the graph Stein discrepancy requires solving a linear program (LP), so you'll need an LP solver installed to use this software. We use JuMP (Julia for Mathematical Programming) to interface with these solvers; any of the LP solvers supported by JuMP will do just fine.
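For example, the open-source Clp solver (the one selected via solver = "clp" in the examples below) can be installed from the Julia prompt:

# install the Clp LP solver and its Julia interface
# (any JuMP-supported LP solver works equally well)
Pkg.add("Clp")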

Once you have an LP solver installed, computing our measure is easy. Below we'll first show how to compute the Langevin graph Stein discrepancy for a univariate Gaussian target:

# import the gsd function (graph Stein discrepancy)
using SteinDiscrepancy: gsd
# define the grad log density of univariate Gaussian (we always expect vector inputs!)
function gradlogp(x::Array{Float64,1})
    -x
end
# generates 100 points from N(0,1)
X = randn(100)
# can be a string or a JuMP solver
solver = "clp"
result = gsd(points=X, gradlogdensity=gradlogp, solver=solver)
graph_stein_discrepancy = result.objectivevalue[1]

Here's another example that computes the Langevin graph Stein discrepancy for a bivariate uniform sample (since this target distribution has bounded support, the support bounds become part of the input parameters):

# do the necessary imports
using SteinDiscrepancy: gsd
# define the grad log density of bivariate uniform target
function gradlogp(x::Array{Float64,1})
    zeros(size(x))
end
# generates 100 points from Unif([0,1]^2)
X = rand(100,2)
# can be a string or a JuMP solver
solver = "clp"
result = gsd(points=X,
             gradlogdensity=gradlogp,
             solver=solver,
             supportlowerbounds=zeros(2),
             supportupperbounds=ones(2))
discrepancy = vec(result.objectivevalue)

The variable discrepancy here contains the Stein discrepancy along each coordinate. The final discrepancy is just the sum of this vector.
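For instance, continuing the example above:

# the final discrepancy is the sum of the per-coordinate values
graph_stein_discrepancy = sum(discrepancy)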

Kernel Stein discrepancies

Computing the kernel Stein discrepancies does not require an LP solver. In lieu of a solver, you will need to specify a kernel function. Many common kernels are already implemented in src/kernels, but if yours is missing, feel free to inherit from the SteinKernel type and roll your own, as sketched below.
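As a purely illustrative sketch (the type name and field below are hypothetical), a custom kernel begins as a new subtype of SteinKernel; the evaluation methods it must implement should be modeled on the existing kernels in src/kernels:

using SteinDiscrepancy: SteinKernel

# hypothetical custom kernel subtype (Julia v0.5 syntax);
# implement the same evaluation methods as the kernels in src/kernels
type MyCustomKernel <: SteinKernel
    beta::Float64  # hypothetical shape parameter
end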

With a kernel in hand, computing the kernel Stein discrepancy is easy:

# import the kernel stein discrepancy function and kernel to use
using SteinDiscrepancy: SteinInverseMultiquadricKernel, ksd
# define the grad log density of standard normal target
function gradlogp(x::Array{Float64,1})
    -x
end
# grab sample
X = randn(500, 3)
# create the kernel instance
kernel = SteinInverseMultiquadricKernel()
# compute the KSD2
result = ksd(points=X, gradlogdensity=gradlogp, kernel=kernel)
# get the final ksd
kernel_stein_discrepancy = sqrt(result.discrepancy2)

We should note that to compute the KSD for a target with compact support, you must define a SteinKernel that imposes this property (rather than relying on the supportlowerbounds and supportupperbounds arguments). See SteinGaussianRectangularDomainKernel in the src/kernels directory for an example.

Summary of the Code

All code is available in the src directory of the repo, and many examples of computing the Stein discrepancy can be found in the test directory.

Contents of src

  • discrepancy - Code for computing Stein discrepancy
  • kernels - Commonly used kernels

Compiling Code in discrepancy/spanner directory

Our C++ code should be compiled automatically when the package is built. However, if this doesn't work for some reason, you can issue the following commands to compile the code in src/discrepancy/spanner:

cd <PACKAGE_DIR>/src/discrepancy/spanner
make
make clean

The last step isn't necessary, but it will remove some superfluous files. If you want to kill everything made in the build process, just run

make distclean
