cheng-zhao / prand

C library for generating random numbers with multiple streams

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

prand: parallel random number generator

GitHub Codacy grade

Table of Contents

Introduction

This is an efficient library written in C, for generating pseudo random numbers with multiple streams. In particular, for a given random sequence, multiple streams can be generated from user-defined equally spaced starting points. This is to ensure that for a parallel version of a program, the starting points for different processors/threads can be chosen in a way that the random number sequences are exactly the same as in a serial version of the same program.

For instance, to generate the random sequence {r1, r2, …, rn×m}, one can use n threads starting from rj×m+1, for j = 0, 1, …, n−1, and generate only m numbers for each thread. This is equivalent to generating all (n × m) numbers in sequence with a single stream.

To this end, efficient jump-ahead algorithms are implemented in this library, to permit arbitrary choices of the starting points of random streams, with relatively low computational costs. And this is hopefully useful for Monte-Carlo simulations that are massively parallelised and require strict reproducibility, especially if the same number of processors/threads is not always available.

This library is compliant with the ISO C99 standard, thread-safe, and does not rely on any external libraries. It is written by Cheng Zhao (赵成), and is distributed under the MIT license.

[TOC]

Compilation and linking

On most unix-like systems, the library can be simply compiled with

$ make
$ make install

By default a static library libprand.a is created in the lib subfolder, and a header file prand.h is copied to the include subfolder, of the current working directory. One can change the PREFIX entry in Makefile to customise the installation path of the library.

To link the library with a program, one has to add the -lprand flag for the compilation. And if this library is not installed in the default path for system libraries, the -I and -L options are also necessary for specifying the path to the header and library files. An example of the Makefile for linking prand is provided in the example folder.

[TOC]

Getting started

Random number generator initialisation

Before calling any random number generation functions, the interface of the random number generator has to be initialised by

prand_t *prand_init(const prand_rng_enum type, const uint64_t seed,
    const unsigned int nstream, const uint64_t step, int *err)

The arguments are:

  • type: a pre-defined enumerator indicating the random number generation algorithm;
  • seed: a positive integer for initialising the random number generator;
  • nstream: number of streams that can be sampled in parallel;
  • step: the length of number sequences between adjacent streams;
  • err: a variable for storing the error status, see Error handling.

If seed is set to 0, then a warning is generated, and the default random seed for the given generator will be used. And if a generator accepts only 32-bit seeds, the supplied seed value will be truncated, i.e., reduced modulo 232. Once the interface is initialised with multiple streams, the i-th (i = 0, 1, …, nstream−1) stream starts generating numbers from ri×step+1 of the sequence {r1, r2, …}.

The implemented random number generation algorithms, as well as the corresponding seed values and maximum allowed lengths for a single jump ahead operation are listed below.

Algorithm Enumerator for type seed Maximum step*
MRG32k3a[1] PRAND_RNG_MRG32K3A 32-bit 263−1
Mersenne Twister 19937[2] PRAND_RNG_MT19937 32-bit 263−1

* This limitation is only for a single jump ahead operation. The total skipped length can be larger than this value if jumping ahead for multiple times (see Revising random states).

All the random number generation routines rely on the interface created by this prand_init function. As an example, the interface can be initialised as follows

int err = 0;
prand_t *rng = prand_init(PRAND_RNG_MRG32K3A, 1, 8, 10000, &err);

[TOC]

Sampling a uniform distribution

If there is only one stream, one can generate a new random integer by

long x = rng->get(rng->state);                    /* for single stream */

Here rng is the interface initialised in the previous section. While if there are multiple streams, generating a random integer for the i-th stream should be done with

long x = rng->get(rng->state_stream[i]);          /* for multiple streams */

The minimum and maximum integer values that can be generated are stored in rng->min and rng->max, respectively. These values can be used with the rng->get function to sample uniform numbers in a custom range. For instance, we have provided the function for generating a double-precision floating-point number uniformly distributed in the range [0, 1):

double z = rng->get_double(rng->state);           /* for single stream */
double z = rng->get_double(rng->state_stream[i]); /* for multiple streams */

And these functions evaluate essentially z = x / (rng->max + 1).

Similarly, the following functions are for sampling a uniform double-precision floating-point number in the range (0, 1):

double z = rng->get_double_pos(rng->state);           /* for single stream */
double z = rng->get_double_pos(rng->state_stream[i]); /* for multiple streams */

They are equivalent to z = (x + 1) / (rng->max + 2).

[TOC]

Sampling a Gaussian distribution

It is also a common practice to generate random numbers that follows a Gaussian probability distribution. There are several different ways of sampling Gaussian random numbers from a uniform distribution[3]. However, to ensure the reproducibility with multiple streams, rejection methods are not appropriate, including the fast and robust Ziggurat method[4]. Instead, we recommend the simple Box–Muller transform method[5] for generating random numbers following the Gaussian distribution.

[TOC]

Revising random states

It is sometimes necessary to jump ahead after having sampled some numbers, either for a specific stream or all streams. This can be done with the following functions provided with the random number generator interface:

rng->jump(void *state, const uint64_t step, int *err);
rng->jump_all(rng, const uint64_t step, int *err);

With rng->jump only the stream associated with state is altered. And state can be either rng->state or rng->state_stream[i], depending on how many streams have been initialised. While for rng->jump_all, the first argument must be the interface itself. And once it is called, all the streams will move forward with the length step.

The states of the interface can also be reset, with another random seed and/or jump ahead with a new step size from the initial starting point of the sequence:

rng->reset(void *state, const uint64_t seed, const uint64_t step, int *err);
rng->reset_all(rng, const uint64_t seed, const uint64_t step, int *err);

Here, rng->reset resets the status of a stream indicated by state, with a jump ahead step counted from the starting point of the random number sequence initialised by seed. And rng->reset_all is equivalent of re-initialising the random number generator interface without releasing and re-allocating memory, albeit it is not possible to change the generation algorithm and number of streams with this function.

[TOC]

Releasing memory

Once the random number generator is not needed anymore, the interface has to be deconstructed to release the allocated memory, by simply calling

void prand_destroy(prand_t *rng);

[TOC]

Error handling

An integer err has to be supplied to some of the functions for storing error status. And two macros, PRAND_IS_ERROR(err) and PRAND_IS_WARN(err), are defined for checking whether there is an error or warning message, respectively. Furthermore, the following function returns a string describing the problem in more detail:

const char *prand_errmsg(const int err);

[TOC]

Examples

An example for the usage of this library is provided in the example folder.

It initialises two versions of the random number generator with the same seed, one with a single stream and the other with multiple streams. And the first floating-point number generated by the multiple streams are compared with the ones generated in sequence by the single stream version.

[TOC]

Random number generator algorithms

Algorithm Reference Period Range of integers Jump ahead algorithm
MRG32k3a [1] 2191 [0, 232−209] [6]
Mersenne Twister 19937 [2] 219937−1 [0, 232−1] [7]

[TOC]

Acknowledgements

This library benefits from the following open-source projects:

[TOC]

References

[1] L’Ecuyer, 1999, Good Parameters and Implementations for Combined Multiple Recursive Random Number Generators, Operations Research, 47(1):159–164

[2] Matsumoto & Nishimura, 1998, Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator, ACM Trans. Model. Comput. Simul., 8(1):3–30 (home page)

[3] Thomas, Luk, Leong & Villasenor, 2007, Gaussian Random Number Generators, ACM Comput. Surv., 39(4):11–es

[4] Marsaglia & Tsang, 2000, The Ziggurat Method for Generating Random Variables, Journal of Statistical Software, 5(8):1–7

[5] Box & Muller, 1958, A Note on the Generation of Random Normal Deviates, Ann. Math. Statist., 29(2):610–611

[6] Bradley, du Toit, Tong, Giles & Woodhams, 2011, Parallelization Techniques for Random Number Generators, GPU Computing Gems Emerald Edition, Morgan Kaufmann, 231–246

[7] Haramoto, Matsumoto & L'Ecuyer, 2008, A Fast Jump Ahead Algorithm for Linear Recurrences in a Polynomial Space, Sequences and Their Applications – SETA 2008, Springer Berlin Heidelberg, 290–298

[TOC]

About

C library for generating random numbers with multiple streams

License:MIT License


Languages

Language:C 99.9%Language:Makefile 0.1%