Multinomial sampling is very slow

jamaltas opened this issue a year ago

While profiling a program I wrote, I found that the limiting factor was the multinomial sampling:

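use rand::{distributions::Distribution, rngs::SmallRng, SeedableRng};
use statrs::distribution::Multinomial;

let mut rng = SmallRng::from_entropy();
let result = Multinomial::new(&weights, 100000).unwrap();
let counts = result.sample(&mut rng); // this call dominates the runtime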

Specifically, the bottleneck is the sample call; the cost of the SmallRng call is small in comparison.

I rewrote this portion of my code in Python using numpy.random.multinomial and saw a ~400x speedup. It appears that the C code numpy calls uses an implementation that chains many binomial draws together, whereas the statrs implementation samples via a CDF.

Are there any plans to change this?
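For reference, chaining binomial draws is usually based on the standard conditional decomposition of the multinomial distribution. The notation here (n trials, k categories with probabilities p_1, ..., p_k, and counts X_1, ..., X_k) is introduced for illustration and is not taken from either library's source:

$$
X_1 \sim \mathrm{Binomial}(n,\ p_1), \qquad
X_i \mid X_1, \dots, X_{i-1} \sim \mathrm{Binomial}\!\left(n - \sum_{j<i} X_j,\ \frac{p_i}{1 - \sum_{j<i} p_j}\right) \quad (i = 2, \dots, k-1), \qquad
X_k = n - \sum_{j<k} X_j
$$

This needs only k - 1 binomial draws regardless of n. If a CDF-based sampler instead draws one category per trial (an assumption about the implementation, not something confirmed here), its cost grows linearly with n, which would be consistent with a large gap at n = 100000.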

Vinzent Steinberg · Answer 1 · Fri Feb 10 2023 02:03:30 GMT+0800 (China Standard Time)

Did you compile the Rust code with optimizations enabled?

jamaltas · Answer 2 · Fri Feb 10 2023 03:18:16 GMT+0800 (China Standard Time)

Yep. It appears that the compiled C code numpy uses employs an algorithm known as BTPE, which is a significantly faster binomial sampler when p*n > 30, and that is my use case.

Is there any interest in an implementation of this algorithm?

Orion Yeung · Answer 3 · Thu May 02 2024 05:52:36 GMT+0800 (China Standard Time)

I think this could be useful, but I don't have the background for it.

I did notice that rand_distr::Binomial uses the BTPE algorithm. Would you know how to extend BTPE to the multinomial case, starting from an implementation for the binomial?

More broadly, perhaps we should expose the rand_distr versions of sample when available, for performance. It relies only on num_traits, and it's already in our dependency tree via nalgebra.
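On the question of going from a binomial sampler to a multinomial one, a minimal sketch of the usual conditional-binomial chaining might look like the following. This is not the statrs or numpy implementation. The names multinomial_via_binomials, XorShift64 and demo_counts are hypothetical, and the naive O(n) binomial inside demo_counts is only a stand-in so the example compiles without external crates. It assumes the weights are finite and non-negative.

```rust
/// Draw multinomial counts by chaining conditional binomial draws.
/// `binomial(n, p)` is any sampler returning a Binomial(n, p) variate;
/// a BTPE-based sampler could be plugged in here.
fn multinomial_via_binomials<F>(weights: &[f64], n: u64, mut binomial: F) -> Vec<u64>
where
    F: FnMut(u64, f64) -> u64,
{
    let mut counts = vec![0u64; weights.len()];
    // The last category with positive weight absorbs whatever is left over.
    let last = match weights.iter().rposition(|&w| w > 0.0) {
        Some(i) => i,
        None => return counts,
    };
    let mut remaining_n = n;
    let mut remaining_w: f64 = weights[..=last].iter().sum();
    for i in 0..last {
        if remaining_n == 0 || remaining_w <= 0.0 {
            break;
        }
        // Probability of category i, conditional on not being in 0..i.
        let p = (weights[i] / remaining_w).clamp(0.0, 1.0);
        let x = binomial(remaining_n, p).min(remaining_n);
        counts[i] = x;
        remaining_n -= x;
        remaining_w -= weights[i];
    }
    counts[last] = remaining_n;
    counts
}

/// Tiny xorshift generator, used only to keep the sketch dependency-free.
/// The seed must be nonzero.
struct XorShift64(u64);

impl XorShift64 {
    /// Returns a value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Demo: 1000 trials over three categories, using a naive binomial
/// (one Bernoulli trial per unit of n) as a placeholder sampler.
fn demo_counts(seed: u64) -> Vec<u64> {
    let mut rng = XorShift64(seed);
    multinomial_via_binomials(&[0.2, 0.3, 0.5], 1_000, |n, p| {
        (0..n).filter(|_| rng.next_f64() < p).count() as u64
    })
}
```

With rand_distr in the dependency tree, the closure could presumably wrap rand_distr::Binomial::new(n, p) and sample from it, so that each of the k - 1 draws benefits from BTPE when n*p is large. Taking the sampler as a closure keeps the chaining logic independent of which binomial implementation is used.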