Multinomial sampling is very slow
jamaltas opened this issue · comments
Looking into a program I wrote I found the limiting factor was the multinomial sampling:
let mut rng = SmallRng::from_entropy();
let result = Multinomial::new(&weights, 100000).unwrap();
let counts = result.sample(&mut rng);
Specifically, the time goes into the sampling call in the code above; the SmallRng call is small in comparison.
I rewrote this portion of my code in Python using numpy.random.multinomial and saw a ~400x speedup. It appears the C code numpy calls into uses an implementation that chains many binomial calls together, whereas the statrs implementation uses a CDF.
Are there any plans to change this?
Yep. It appears the compiled C that numpy uses employs an algorithm known as BTPE, which is a significantly faster binomial sampler when p*n > 30, and that is my use case.
Is there any interest in an implementation of this algorithm?
I think this could be useful, but I don't have the background for it.
I did notice that rand_distr::Binomial uses the BTPE algorithm. Would you know how to extend BTPE for multinomial from an implementation for binomial?
More broadly, perhaps we should expose rand_distr's versions of sample when available, for performance. It relies only on num_traits and it's in our dependency tree from nalgebra.
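On the question of getting from a binomial sampler to a multinomial one, here is a minimal sketch of the "chain of binomial calls" idea mentioned earlier in the thread. It is not numpy's or statrs's actual code. Each category gets a binomial draw over the trials still unassigned, with probability equal to its weight divided by the total weight of the categories not yet processed. The function name `multinomial_via_binomials` and the `binomial` closure parameter are hypothetical; the binomial sampler is passed in so the sketch does not assume a particular crate version.

```rust
/// Sketch: draw multinomial counts by chaining conditional binomial draws.
///
/// `binomial(n, p)` is a caller-supplied sampler (hypothetical parameter)
/// that returns one draw from Binomial(n, p).
/// Returns `None` if the weights are empty, contain a negative or
/// non-finite value, or are all zero.
fn multinomial_via_binomials<F>(weights: &[f64], n: u64, mut binomial: F) -> Option<Vec<u64>>
where
    F: FnMut(u64, f64) -> u64,
{
    if weights.iter().any(|w| !w.is_finite() || *w < 0.0) {
        return None;
    }
    // Index of the last category that can receive any trials.
    let last = weights.iter().rposition(|&w| w > 0.0)?;

    // tail[i] = weights[i] + weights[i + 1] + ... (summed right to left,
    // which avoids the cancellation a running subtraction could cause).
    let mut tail = vec![0.0f64; weights.len()];
    let mut acc = 0.0;
    for i in (0..weights.len()).rev() {
        acc += weights[i];
        tail[i] = acc;
    }

    let mut counts = vec![0u64; weights.len()];
    let mut remaining = n;
    for (i, &w) in weights.iter().enumerate().take(last) {
        if remaining == 0 {
            break;
        }
        // Probability of category i, conditional on not being in 0..i.
        let p = (w / tail[i]).min(1.0);
        // Guard against a sampler that returns more than it was given.
        let draw = binomial(remaining, p).min(remaining);
        counts[i] = draw;
        remaining -= draw;
    }
    // Whatever is left belongs to the last positive-weight category.
    counts[last] = remaining;
    Some(counts)
}
```

With rand_distr in scope, the closure could be something like `|n, p| Binomial::new(n, p).unwrap().sample(&mut rng)`, so each step would use whichever algorithm rand_distr::Binomial implements. This costs one binomial draw per category rather than work proportional to the number of trials, which is presumably where the speedup for large n comes from.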