Uniform and non-uniform pseudo random numbers generators for high dimensional applications

Régis LEBRUN

EADS Innovation Works

January 28th 2010

©EADS IW 2010 BigMC seminar, January 28th 2010

Outline

1 Pseudo-random numbers generator (PRNG)

2 Uniform PRNGs

3 Non-uniform PRNGs

Pseudo-random numbers generator (PRNG)

Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.

– John Von NEUMANN (1951)

cited in The Art of Computer Programming, Donald E. KNUTH

Stream of independent, U(0, 1) random variables

Uniformity and independence

The basic building block of stochastic simulation software is the uniform random number generator. We are looking for a routine capable of producing independent, identically distributed realizations of the uniform distribution over $[0, 1]$. We want to do it algorithmically and deterministically!

What is a pseudo-random number generator?

Definition, from [1]

A (pseudo)random number generator is a structure $\mathcal{G} = (S, s_0, T, U, G)$ where:

$S$ is a finite set of states,

$s_0 \in S$ is the initial state (or seed),

the mapping $T : S \to S$ is the transition function,

$U$ is a finite set of output symbols,

and $G : S \to U$ is the output function.

The state of the generator evolves according to the recurrence $s_n = T(s_{n-1})$, and the generator outputs the numbers $u_n = G(s_n)$, called the random numbers.
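
The definition translates almost literally into code. The following is a minimal sketch, not a reference implementation: the name `prng_stream` and the toy parameters (16 states, multiplier 5, increment 3) are illustrative assumptions, not taken from [1].

```python
from itertools import islice
from typing import Callable, Iterator, TypeVar

S = TypeVar("S")  # type of a state
O = TypeVar("O")  # type of an output symbol


def prng_stream(s0: S, T: Callable[[S], S], G: Callable[[S], O]) -> Iterator[O]:
    """Yield u_1, u_2, ... with s_n = T(s_{n-1}) and u_n = G(s_n)."""
    s = s0
    while True:
        s = T(s)      # transition function: advance the state
        yield G(s)    # output function: map the state to an output symbol


# Toy instance (hypothetical parameters): 16 states, outputs in {0, 1/16, ..., 15/16}.
toy_outputs = list(islice(prng_stream(0, lambda s: (5 * s + 3) % 16, lambda s: s / 16), 3))
```

Nothing in this structure is random: the whole stream is fixed by the seed `s0`.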

Some comments

Nothing about randomness in this definition!

No condition on the distribution of $(u_n)_{n \in \mathbb{N}}$!

The sequence will be periodic, with a period $P \le \mathrm{card}\, S$.

Such a device is supposed to produce a stream of numbers that "looks like" a sequence of realizations of iid uniform random variables, i.e. that is not statistically distinguishable from a truly random sequence.

The attempt to define what a random sequence is fills more than 30 pages in [2]...

What is a good pseudo-random number generator?

Uniformity and independence

Uniformity over $[0, 1]$ cannot be achieved on a computer due to the finite precision of number representation, but the uniform distribution over $\Omega = \{0/m, 1/m, \dots, m/m\}$ can be (quite well) approximated through the equidistribution property over $\Omega$.

Independence between realizations cannot be achieved due to the deterministic nature of the recurrence in the generator, but it can be (quite well) approximated thanks to the mixing property of $T$. This last point can be made more rigorous using copula theory.

We will illustrate the potential pitfalls in designing a PRNG through the example of Linear Congruential Generators, and more precisely Multiplicative Congruential Generators.

Empirical validation of PRNGs

Statistical tests

A PRNG has to be extensively checked before being used in serious simulations. The tests must check:

The uniformity of the PRNG output: Kolmogorov-Smirnov, Chi-square, gap tests.

The independence of the PRNG output: serial test, spectral test, autocorrelation.

These tests can be classified into:

Empirical tests: one tests a property on a large sample produced by the PRNG.

Analytical tests: one checks a property on the whole set $U$ of possible outputs of the PRNG.
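
As an illustration of the empirical kind of test, here is a small sketch of a chi-square uniformity statistic and a lag-1 autocorrelation estimate. The function names, the number of bins and the choice of Python's built-in generator as the stream under test are assumptions made for the example.

```python
import random


def chi_square_statistic(sample, bins=10):
    """Chi-square statistic of the sample against the uniform distribution on [0, 1)."""
    counts = [0] * bins
    for x in sample:
        counts[min(int(x * bins), bins - 1)] += 1
    expected = len(sample) / bins
    return sum((c - expected) ** 2 / expected for c in counts)


def lag1_autocorrelation(sample):
    """Empirical correlation between consecutive outputs."""
    n = len(sample)
    mean = sum(sample) / n
    num = sum((sample[i] - mean) * (sample[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in sample)
    return num / den


rng = random.Random(2010)  # Python's built-in Mersenne Twister, arbitrary seed
sample = [rng.random() for _ in range(10_000)]
chi2 = chi_square_statistic(sample)   # compare with a chi-square law with 9 degrees of freedom
rho1 = lag1_autocorrelation(sample)   # should be close to 0 for an independent stream
```

A single pair of statistics proves little: serious test batteries repeat such checks over many samples, bin counts and lags.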

Linear congruential generators (LCG)

Definition

Let $m > 0$ (the modulus), $a > 0$ (the multiplier) and $c \ge 0$ (the additive constant) be integers. The linear congruential generator associated with the triplet $(m, a, c)$ is given by:

$S = \{0, \dots, m-1\}$ for $c > 0$ and $S = \{1, \dots, m-1\}$ for $c = 0$

$s_n = T(s_{n-1}) = (a\, s_{n-1} + c) \bmod m$

$U = \{0, 1/m, \dots, 1 - 1/m\}$ for $c > 0$ and $U = \{1/m, \dots, 1 - 1/m\}$ for $c = 0$

$u_n = G(s_n) = s_n / m$

When $c = 0$, these generators are called multiplicative congruential generators. They are often used due to the extra gain in speed.
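
A direct transcription of this definition could look like the sketch below. The class name `LCG`, the helper `period` and the tiny parameter sets used to exercise it are assumptions chosen so that the period can be found by brute force.

```python
class LCG:
    """Linear congruential generator s_n = (a * s_{n-1} + c) mod m, u_n = s_n / m."""

    def __init__(self, m: int, a: int, c: int, seed: int):
        self.m, self.a, self.c = m, a, c
        self.state = seed % m

    def next_state(self) -> int:
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def random(self) -> float:
        return self.next_state() / self.m


def period(m: int, a: int, c: int, seed: int) -> int:
    """Length of the cycle reached from the seed (brute force, small m only)."""
    gen = LCG(m, a, c, seed)
    seen = {}
    step = 0
    while gen.state not in seen:
        seen[gen.state] = step
        gen.next_state()
        step += 1
    return step - seen[gen.state]
```

For instance, `period(16, 5, 3, 0)` visits all 16 states, while the multiplicative variant `period(16, 5, 0, 1)` only cycles through 4 of them.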

Properties

It is the first class of PRNGs for which a mathematical theory has been built,

They are fast,

They are easy to implement,

They are (still) widespread (rand(), drand48()...)

But they have serious weaknesses:

Short period in typical implementations ($m = 2^{32}$);

Poor distribution in high dimension;

Lower order bits have a shorter period than the full numbers (so don't use $y_n = s_n \bmod M$ if you want integers in $\{0, \dots, M-1\}$ from integers in $\{0, \dots, m-1\}$).
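
The weakness of the low order bits is easy to observe when $m$ is a power of 2. The sketch below assumes $m = 2^{32}$ and borrows the multiplier and increment of the "quick and dirty" generator of Numerical Recipes; these constants and the function name are assumptions for illustration only.

```python
A = 1664525        # multiplier of the "quick and dirty" generator in Numerical Recipes
C = 1013904223     # additive constant of the same generator


def low_bits_period(k: int, seed: int = 12345) -> int:
    """Period of the k lowest bits of the state, for k <= 32 and m = 2**32.

    The k lowest bits of a*s + c only depend on the k lowest bits of s, so they
    follow their own LCG modulo 2**k.
    """
    mask = (1 << k) - 1
    start = seed & mask
    s, n = (A * start + C) & mask, 1
    while s != start:
        s = (A * s + C) & mask
        n += 1
    return n
```

With these parameters the lowest bit simply alternates (`low_bits_period(1)` is 2) and the 3 lowest bits repeat every 8 draws, which is why `s % M` with a small `M` should be avoided.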

Multiplicative congruential generators (MCG)

How to choose $m$ and $a$?

If $m = 2^M$ with $M \ge 3$, the maximal period is $2^{M-2}$, attained for all odd seeds if $a \bmod 8 \in \{3, 5\}$.

The period $m - 1$ is attained only if $m$ is prime. In this case, the period divides $m - 1$, and is $m - 1$ if and only if $a$ is a primitive root modulo $m$, that is, $a \not\equiv 0 \pmod m$ and $a^{(m-1)/p} \not\equiv 1 \pmod m$ for each prime factor $p$ of $m - 1$.

More to read in [2].
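
The primitive root criterion can be checked mechanically. The sketch below uses trial division, which is enough for 32-bit moduli; the function names are assumptions, and the example pair $m = 2^{31} - 1$, $a = 16807$ is the classical "minimal standard" choice, which is not discussed in the slides.

```python
def prime_factors(n: int) -> set:
    """Distinct prime factors of n by trial division (fine for 32-bit sizes)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(a: int, m: int) -> bool:
    """True if a is a primitive root modulo the prime m, i.e. the MCG has period m - 1."""
    if a % m == 0:
        return False
    return all(pow(a, (m - 1) // p, m) != 1 for p in prime_factors(m - 1))


full = has_full_period(16807, 2**31 - 1)
```

The function assumes that `m` is prime and does not verify it.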

Problem with high dimension

Suppose that one wants to sample the hypercube $[0, 1]^d$ uniformly. The natural way to do this is to consider the points $(u_{dn}, u_{dn+1}, \dots, u_{d(n+1)-1})$.

One can generate at most $m$ (or $m - 1$) points in $[0, 1]^d$, which will necessarily be very sparsely distributed in high dimension.

In the case of LCGs or MCGs, these points turn out to be distributed according to a very regular pattern: a lattice.

Bad high dimension properties: Marsaglia's theorem, from [3]

Let $d > 0$ be a positive integer. If $\alpha_0, \dots, \alpha_{d-1}$ is any choice of integers such that:

$$\alpha_0 + \alpha_1 a + \dots + \alpha_{d-1} a^{d-1} \equiv 0 \pmod m \qquad (1)$$

then all points of the form $(u_n, \dots, u_{n+d-1})$ will lie in the set of parallel hyperplanes defined by the equations:

$$\alpha_0 x_0 + \dots + \alpha_{d-1} x_{d-1} = 0, \pm 1, \pm 2, \dots \qquad (2)$$

There are at most $|\alpha_0| + \dots + |\alpha_{d-1}|$ of these hyperplanes which intersect $]0, 1[^d$, and there is always a choice of $\alpha_0, \dots, \alpha_{d-1}$ such that all of the points fall in fewer than $(d!\, m)^{1/d}$ hyperplanes.
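
The theorem can be observed on a concrete generator. The sketch below uses RANDU ($m = 2^{31}$, $a = 65539$, $c = 0$) in dimension $d = 3$ with the hand-picked coefficients $\alpha = (9, -6, 1)$; the seed, the sample size and the variable names are assumptions for illustration.

```python
M = 2**31
A = 65539                 # RANDU multiplier, 2**16 + 3
ALPHA = (9, -6, 1)        # 9 - 6*A + A**2 = (A - 3)**2 = 2**32, a multiple of M


def randu_states(seed: int, count: int) -> list:
    states, s = [], seed
    for _ in range(count):
        s = (A * s) % M
        states.append(s)
    return states


states = randu_states(seed=1, count=10_000)
# sigma_n = 9*u_n - 6*u_{n+1} + u_{n+2}, computed exactly with integers
# as (9*s_n - 6*s_{n+1} + s_{n+2}) / M.
numerators = [9 * states[n] - 6 * states[n + 1] + states[n + 2] for n in range(len(states) - 2)]
plane_indices = {num // M for num in numerators if num % M == 0}
```

Every `sigma_n` is an integer, so all the triplets fall on the few parallel planes indexed by `plane_indices`, at most $|9| + |-6| + |1| = 16$ of them.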

Bad high dimension properties: upper bound $(d!\, m)^{1/d}$ on the number of hyperplanes

               d = 3    d = 4   d = 5   d = 6   d = 7   d = 8   d = 9   d = 10
m = 2^23         369      119      63      42      32      27      24       22
m = 2^32        2953      566     220     120      80      60      48       41
m = 2^52      300079    18131    3520    1216     582     340     227      166
m = 2^64     4801279   145055   18578    4866    1910     963     573      382

$2^{23}$ for float, $2^{32}$ for uint32_t, $2^{52}$ for double, $2^{64}$ for uint64_t
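
These figures appear to be the integer part of the bound $(d!\, m)^{1/d}$ of Marsaglia's theorem, which is an interpretation checked on a few entries rather than something the slide states. A short sketch to recompute them, with illustrative names:

```python
from math import factorial


def hyperplane_bound(m: int, d: int) -> int:
    """Integer part of (d! * m)**(1/d)."""
    return int((factorial(d) * m) ** (1.0 / d))


# One row per modulus 2**bits, one column per dimension d = 3, ..., 10.
table = {bits: [hyperplane_bound(2**bits, d) for d in range(3, 11)] for bits in (23, 32, 52, 64)}
```

Even with a 64-bit modulus, fewer than 400 hyperplanes are enough to carry all the points in dimension 10.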

Demonstration (1/3)

To demonstrate Marsaglia's theorem, we need the following result from number theory:

Theorem [4, Theorem 449] Let $\xi_0, \dots, \xi_{d-1}$ be $d$ linear forms $\xi_i = \sum_{j=0}^{d-1} k_{i,j} t_j$ with real coefficients, such that:

$$\Delta = \begin{vmatrix} k_{0,0} & \cdots & k_{0,d-1} \\ \vdots & & \vdots \\ k_{d-1,0} & \cdots & k_{d-1,d-1} \end{vmatrix} \neq 0 \qquad (3)$$

There exist integers $t_0, \dots, t_{d-1}$, not all 0, for which $|\xi_0| + \dots + |\xi_{d-1}| \le (d!\, |\Delta|)^{1/d}$.

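Demonstration (2/3)

1 If $\alpha_0 + \alpha_1 a + \dots + \alpha_{d-1} a^{d-1} \equiv 0 \pmod m$, then $\sigma = \alpha_0 u_n + \dots + \alpha_{d-1} u_{n+d-1}$ is an integer for all $n$, due to the fact that $u_{n+k} = a^k s_n / m - [a^k s_n / m]$, where $[\cdot]$ denotes the integer part, so:

$$\sigma = \overbrace{s_n (\alpha_0 + \alpha_1 a + \dots + \alpha_{d-1} a^{d-1}) / m}^{\in \mathbb{Z}} - \overbrace{(\alpha_0 [s_n / m] + \dots + \alpha_{d-1} [a^{d-1} s_n / m])}^{\in \mathbb{Z}}$$

2 Thus, the points $(u_n, \dots, u_{n+d-1})$ must lie in one of the hyperplanes $\alpha_0 x_0 + \dots + \alpha_{d-1} x_{d-1} = 0, \pm 1, \pm 2, \dots$;

3 The number of such hyperplanes which intersect $]0, 1[^d$ is at most $|\alpha_0| + \dots + |\alpha_{d-1}|$ by inspection: $\sum_{i \in I_1} \alpha_i < \alpha_0 x_0 + \dots + \alpha_{d-1} x_{d-1} < \sum_{i \in I_2} \alpha_i$ where $I_1 = \{i \mid \alpha_i < 0\}$ and $I_2 = \{i \mid \alpha_i > 0\}$;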

Demonstration (3/3)

4 There exists a set $\alpha_0, \dots, \alpha_{d-1}$, not all zero, with $|\alpha_0| + \dots + |\alpha_{d-1}| \le (d!\, m)^{1/d}$, because solving (1) under this constraint amounts to showing that there exist integers $t_0, \dots, t_{d-1}$, not all zero, such that:

$$|m t_0 - a t_1| + |t_1 - a t_2| + \dots + |t_{d-2} - a t_{d-1}| + |t_{d-1}| \le (d!\, m)^{1/d}$$

by writing that

$$(\alpha_0, \dots, \alpha_{d-1}) = (t_0, \dots, t_{d-1}) \begin{pmatrix} m & 0 & \cdots & \cdots & 0 \\ -a & 1 & \ddots & & \vdots \\ 0 & \ddots & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -a & 1 \end{pmatrix}$$

and it is true as stated in the preliminary theorem (the determinant of the matrix is $m$).

Statistical tests

The gap test

This test is one of the most basic ones. It aims at testing whether a PRNG associated with a measure $\mu$ produces outputs $U$ that evenly cover the support of $\mu$. More precisely:

Let $\rho = 1/\mathrm{card}\, U \ll 1$ be the theoretical optimal resolution of a PRNG;

Find $\mu(I^*) = \max_{I = ]s_i, s_j[,\ I \cap U = \emptyset} \mu(I)$, where $s_i, s_j \in U$;

Check that $\mu(I^*) = O(\rho)$: the PRNG does not leave significant gaps in the support of $\mu$.

One can note that an LCG with full period is perfect with regard to this test, as it has a gap value of $\rho$, or $\rho/(1 - \rho)$ for MCGs.

The spectral test

The spectral test is specifically designed to detect non-uniformity in the multidimensional distribution of $U$ in $[0, 1]^d$. It was designed after a period of massive use of very poorly designed LCGs (such as the famous RANDU shipped with IBM computers from the 1960s). The test is defined by:

Compute the quantity:

$$\nu_d = \min \left\{ \|\alpha\|_2 \ \middle|\ \alpha \in \mathbb{Z}^d \setminus \{0\},\ \sum_{i=0}^{d-1} \alpha_i a^i \equiv 0 \pmod m \right\} \qquad (4)$$

for $d = 2, \dots, d_{max}$. The quantity $\log_2 \nu_d$ can be interpreted as the number of bits of accuracy of the LCG in dimension $d$: $1/\nu_d$ is the largest gap between adjacent hyperplanes in a family of parallel hyperplanes that covers the $d$-dimensional points generated by the LCG.

Compute the normalized quantity:

$$\mu_d = \frac{\pi^{d/2}\, \nu_d^d}{\Gamma(d/2 + 1)\, m} \qquad (5)$$

The test is successful if $\mu_d > 0.1$ for $d = 2, \dots, 6$.

This test has been experimentally shown to be very effective in separating the good PRNGs from the bad ones.
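
For small dimensions, $\nu_d$ can be found by exhaustive search. The sketch below is a brute-force illustration, not the efficient algorithm described in [2]: the function names and the search bound are assumptions, and the result is exact only when the bound is at least $\nu_d$. It is applied to RANDU in dimension 3.

```python
from itertools import product
from math import gamma, pi


def nu_squared(m: int, a: int, d: int, bound: int) -> int:
    """Squared length of the shortest nonzero integer vector alpha with
    sum(alpha_i * a**i) = 0 mod m, searching alpha_1..alpha_{d-1} in [-bound, bound].

    Brute force: the result is exact only if bound >= nu_d.
    """
    best = m * m  # alpha = (m, 0, ..., 0) is always a solution
    powers = [pow(a, i, m) for i in range(1, d)]
    for tail in product(range(-bound, bound + 1), repeat=d - 1):
        if not any(tail):
            continue
        r = -sum(t * p for t, p in zip(tail, powers)) % m
        alpha0 = r - m if r > m // 2 else r   # representative of smallest absolute value
        best = min(best, alpha0 * alpha0 + sum(t * t for t in tail))
    return best


def mu(m: int, nu_sq: int, d: int) -> float:
    """Normalized figure of merit of equation (5)."""
    return pi ** (d / 2) * nu_sq ** (d / 2) / (gamma(d / 2 + 1) * m)


nu3_sq = nu_squared(2**31, 65539, d=3, bound=11)
mu3 = mu(2**31, nu3_sq, d=3)
```

The vector $(9, -6, 1)$ has squared length 118, so `nu3_sq` cannot exceed 118, and `mu3` falls far below the 0.1 threshold: RANDU fails the test in dimension 3.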

Other related PRNGs

LCGs and derived not better than MRGs

Marsaglia's theorem can be extended to more general LCGs, and even if the details change, the basic fact remains: LCGs are bad for high dimensional applications unless one is ready to use a HUGE value for $m$ ($m > 10^{1500}$), for which one can work in dimension $d = 100$ without too much care.

Generators that chain several LCGs suffer from the same defect: it is just a way to adjust $m$ to higher values, but starting from $m = 2^{32}$ one cannot attain sufficient resolution without an obscene number of steps. Indeed, the combination of $k$ LCGs with moduli $m_1, \dots, m_k$ has a maximum period of $P = \frac{(m_1 - 1) \cdots (m_k - 1)}{2^{k-1}} \simeq 2^{32(k-1)}$, which needs $k = 157$ different generators to attain the needed resolution!
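
The count of 157 generators can be recovered with exact integer arithmetic. This small check takes the approximation $P \simeq 2^{32(k-1)}$ of the slide at face value and looks for the first $k$ whose period exceeds $10^{1500}$; the variable names are illustrative.

```python
target = 10**1500            # resolution needed for d = 100 according to the slide
k = 1
while 2 ** (32 * (k - 1)) <= target:
    k += 1
```

The loop stops at `k = 157`, since $32 \times 156 = 4992$ bits is the first multiple of 32 above $\log_2 10^{1500} \approx 4983$.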

Better than LCGs

SIMD-oriented Fast Mersenne Twister (SFMT), from [5]

Three of the most important properties of a PRNG are:

a huge period;

good equidistribution properties even in high dimension;

fast generation of random numbers.

The SIMD-oriented Fast Mersenne Twister fulfills these needs with a period of $2^{19937} - 1 \simeq 10^{6001}$ (which can easily be extended to $2^{216091} - 1 \simeq 10^{65050}$) and a provable equidistribution in dimension up to 382 (resp. 4077) as a double-precision floating-point uniform random number generator.

SFMT details

An efficient Linear Feedback Shift Register generator

The state space $S$ of this generator is $(\mathbb{F}_2^w)^N$; the transition function $T$ is linear and very sparse, and can be seen as an $N$-term linear recurrence over $\mathbb{F}_2^w$. An element $s$ in $S$ is seen as an $N$-dimensional vector $(x_0, \dots, x_{N-1})$ over $\mathbb{F}_2^w$; each $x_k$ is called a register, and the linear transformation takes the form:

$$((x_n)_0, \dots, (x_n)_{N-1}) = ((x_{n-1})_1, \dots, (x_{n-1})_{N-1}, z) \quad \text{with } z = g((x_{n-1})_0, \dots, (x_{n-1})_{N-1}) \qquad (6)$$

with $g$ a sparse linear function. For SFMT, $w = 128$ in order to store the $x_k$ in the SIMD (Single Instruction, Multiple Data) registers of modern CPUs, and the function $g$ is such that:

$$g(x_0, \dots, x_{N-1}) = x_0 A + x_M B + x_{N-2} C + x_{N-1} D \qquad (7)$$

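The parameters $M$, $N$, $A$, $B$, $C$ and $D$ have been chosen in order to have a period multiple of $2^{19937} - 1$ and to provide equidistribution in high dimension, which leads to:

$N = 156$ and $M = 122$,

$xA = (x \overset{128}{\ll} 8) \oplus x$,

$xB = (x \overset{32}{\gg} 11) \wedge (\mathrm{BFFFFFF6\ BFFAFFFF\ DDFECB7F\ DFFFFFEF})$,

$xC = (x \overset{128}{\gg} 8)$,

$xD = (x \overset{32}{\ll} 18)$,

where $\overset{128}{\ll}$ and $\overset{128}{\gg}$ shift the whole 128-bit register, $\overset{32}{\ll}$ and $\overset{32}{\gg}$ shift each of its four 32-bit words separately, $\oplus$ is the bitwise exclusive or and $\wedge$ is the bitwise and with a hexadecimal mask.

On a SIMD-capable processor, the above computations can be implemented in a very concise and very efficient way.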
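
To make the recurrence concrete, here is a sketch of the function $g$ on Python integers used as 128-bit registers. It assumes the reading of [5] in which $xA$ combines the shifted register with an exclusive or and $xB$ applies the mask with a bitwise and; the helper names and the word ordering inside the integer are choices made for the example, so this is not claimed to be bit-compatible with the reference implementation.

```python
MASK128 = (1 << 128) - 1
MSK = 0xBFFFFFF6_BFFAFFFF_DDFECB7F_DFFFFFEF   # mask of xB, most significant word first


def shift_words(x: int, shift: int) -> int:
    """Shift each 32-bit word of a 128-bit value: left if shift > 0, right if shift < 0."""
    out = 0
    for lane in range(4):
        w = (x >> (32 * lane)) & 0xFFFFFFFF
        w = (w << shift) & 0xFFFFFFFF if shift > 0 else w >> -shift
        out |= w << (32 * lane)
    return out


def g(x0: int, xM: int, xN2: int, xN1: int) -> int:
    """z = x0*A + xM*B + x(N-2)*C + x(N-1)*D, the sum over F_2 being an exclusive or."""
    xA = ((x0 << 8) & MASK128) ^ x0        # 128-bit left shift by 8, xor x
    xB = shift_words(xM, -11) & MSK        # per-word right shift by 11, then mask
    xC = xN2 >> 8                          # 128-bit right shift by 8
    xD = shift_words(xN1, 18)              # per-word left shift by 18
    return xA ^ xB ^ xC ^ xD
```

Only shifts, masks and exclusive ors are involved, which is what makes $g$ linear over $\mathbb{F}_2$ and cheap on SIMD hardware.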

Corner test: realizations in $[0, a]^2$

[Figures: six plots of $10^4$ realizations falling in the corner $[0, a]^2$, for $a = 0.1$, $0.05$, $0.01$, $0.005$, $0.001$ and $0.0005$.]

From uniform to non-uniform

A lot of methods

CDF inversion

Rejection (with or without squeeze)

Transformation (with the ratio of uniforms as a special case)

Mix of all of that...

The objective of these methods is to express a random variable $X$ of a given non-uniform distribution as the output of an algorithm that uses one or more iid variables $U_1, \dots, U_N$ uniformly distributed over $[0, 1]$. For a broad overview of these methods, see [6].

CDF inversion

Principle and good properties

If $U$ is uniformly distributed over $[0, 1]$ and if $F$ is a CDF, $X = F^{-1}(U)$ is a random variable distributed according to $F$. Here, $F^{-1}(u) = \inf \{ t \in \mathbb{R} \mid F(t) \ge u \}$ is the generalized inverse of $F$.

This well-known method has some interesting properties:

It needs only one uniform realization per non-uniform realization;

It has the same dependence structure as the underlying stream of uniform realizations;

It has the same gap test value as the underlying stream of uniform realizations.
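
As a minimal illustration, the exponential distribution has the closed-form inverse $F^{-1}(u) = -\log(1 - u)/\lambda$. The function name, the rate and the seed below are assumptions made for the example.

```python
import math
import random


def exponential_by_inversion(u: float, rate: float) -> float:
    """F^{-1}(u) for F(t) = 1 - exp(-rate * t), t >= 0, with u in [0, 1)."""
    return -math.log1p(-u) / rate


rng = random.Random(2010)
sample = [exponential_by_inversion(rng.random(), rate=2.0) for _ in range(20_000)]
sample_mean = sum(sample) / len(sample)   # should be close to 1 / rate = 0.5
```

Exactly one uniform draw is consumed per realization, and the map is monotone, so the ordering of the uniform stream is preserved.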

Drawbacks

The resulting generator is rather slow for most of the usual distributions (normal, gamma etc.);

The realizations are truncated to a rather small interval due to the finite precision of computer arithmetic: for the standard normal distribution, $[-5.17, 5.17]$ for floats and $[-8.13, 8.13]$ for doubles.
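
These bounds can be recovered for the standard normal distribution under one assumption, not stated on the slide: that the smallest positive uniform output is $2^{-23}$ for floats and $2^{-52}$ for doubles. The sketch uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

std_normal = NormalDist()
# Assumption: the smallest positive uniform output is 2**-23 (float) or 2**-52 (double).
float_bound = -std_normal.inv_cdf(2.0**-23)
double_bound = -std_normal.inv_cdf(2.0**-52)
```

Under this assumption, the two values agree with the quoted ranges to two decimal places.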

Rejection with squeeze

Principle

Let $f$ be the PDF of the distribution we want to sample from, and suppose that there exist a PDF $g$ (the rejection distribution) which is easy to sample from and fast to evaluate, a function $h$ (the squeeze function) very cheap to evaluate, and a real $\lambda > 0$ such that:

$$\forall x \in \mathrm{supp}(f), \quad h(x) \le f(x) \le \lambda g(x) \qquad (8)$$

Then, the algorithm is:

1 Generate $Y$ according to $g$;

2 Generate $U$ uniformly over $[0, \lambda g(Y)]$;

3 If $U < h(Y)$, accept $Y$ as a realization of $f$;

4 Else if $U < f(Y)$, accept $Y$ as a realization of $f$;

5 Else go back to step 1.
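
The five steps can be sketched on a simple case chosen for illustration: the half-normal distribution, sampled from an exponential rejection distribution. The target is taken unnormalized, $f(x) = \exp(-x^2/2)$, with envelope $\lambda g(x) = \exp(1/2 - x)$ and squeeze $h(x) = 1 - x^2/2$, so that $h \le f \le \lambda g$ on $x \ge 0$. These choices and the function name are assumptions, not taken from the slides.

```python
import math
import random


def half_normal(rng: random.Random) -> float:
    """Sample |Z|, Z standard normal, by rejection from Exp(1) with a squeeze."""
    while True:
        y = rng.expovariate(1.0)                  # step 1: Y ~ g
        u = rng.random() * math.exp(0.5 - y)      # step 2: U uniform on [0, lambda*g(Y)]
        if u < 1.0 - 0.5 * y * y:                 # step 3: cheap squeeze test
            return y
        if u < math.exp(-0.5 * y * y):            # step 4: full test with f
            return y
        # step 5: rejected, draw a new candidate


rng = random.Random(2010)
sample = [half_normal(rng) for _ in range(20_000)]
sample_mean = sum(sample) / len(sample)   # E|Z| = sqrt(2/pi), about 0.798
```

The squeeze accepts most candidates near 0 without evaluating the exponential in step 4.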

Properties

This algorithm uses at least two realizations of the underlying uniform PRNG, and its efficiency is measured in terms of the mean number of iterations needed to produce one non-uniform realization. The squeeze function is there only to avoid, most of the time, the potentially expensive test in step 4.

The attainable range is extended compared to the CDF inversion method, as one has to compare functions that take values close to 0 instead of 1. For the normal distribution generated using exponential tails, one can attain the range $[-13.47, 13.47]$ for floats and $[-37.62, 37.62]$ for doubles.

The gap test can be severely degraded if the rejection distribution is far from $f$: this method changes the frequency of the realizations of $g$, not their values, so if they are not well distributed with respect to $f$ it can lead to significant gaps.

Transformation method

Principle

A random variable $X$ that one wants to sample can sometimes be expressed as a function of one or several other independent random variables $\Xi_1, \dots, \Xi_N$, not necessarily identically distributed, but that can all be easily sampled:

$$X = \phi(\Xi_1, \dots, \Xi_N) \qquad (9)$$

The simulation method is then straightforward: sample each of the $\Xi_i$, then compute a realization of $X$ using $\phi$.

Properties

Many possibilities to express $X$ as a function of the $\Xi_i$;

Straightforward implementation once the decomposition has been found;

Can be expensive in terms of calls to the uniform PRNG for one realization of $X$, even if it is sometimes possible to get several iid realizations of $X$ from one realization of the $\Xi_i$, using different functions.

The effective range of the resulting PRNG can be limited. For example, for the normal distribution, the Box-Muller transformation leads to the range $[-5.64, 5.64]$ for floats and $[-8.49, 8.49]$ for doubles.

The gap test can be severely degraded if the underlying uniform PRNG has a specific structure that interacts with the structure of $\phi$.
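
The Box-Muller transformation mentioned above is a typical instance: two uniforms give two independent normal deviates. The sketch below also recomputes the quoted ranges under the assumption, not stated on the slide, that the smallest positive uniform is $2^{-23}$ for floats and $2^{-52}$ for doubles; names and seed are illustrative.

```python
import math
import random


def box_muller(u1: float, u2: float) -> tuple:
    """Two independent standard normal deviates from two independent uniforms, u1 in ]0, 1]."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


# Largest attainable |X| if the smallest positive uniform is 2**-23 (float) or 2**-52 (double),
# which is an assumption about the underlying uniform PRNG.
float_bound = math.sqrt(-2.0 * math.log(2.0**-23))
double_bound = math.sqrt(-2.0 * math.log(2.0**-52))

rng = random.Random(2010)
sample = [x for _ in range(10_000) for x in box_muller(1.0 - rng.random(), rng.random())]
```

Here one pair of uniforms yields two realizations of $X$, one through the cosine and one through the sine.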

LCGs and the ratio of uniforms method

The ratio of uniforms method

This method is a mix between the transformation method and the rejection method, which leads to fast and generic non-uniform PRNGs.

Let $(U, V)$ be uniformly distributed over $C = \{(u, v) \mid 0 \le u \le \sqrt{c\, f(v/u)}\}$ for some constant $c > 0$; then $X = V/U$ has density $f$. If $f$ has sub-quadratic tails, $C$ is bounded and can be enclosed in a rectangle $[0, u^+] \times [v^-, v^+]$. It is then possible to sample $(U, V)$ by the rejection method, and it is very often possible to derive efficient squeeze functions.
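
For the standard normal distribution the method takes a few lines. The sketch below assumes $f(x) = \exp(-x^2/2)$ and $c = 1$, for which $C$ fits in the rectangle $[0, 1] \times [-\sqrt{2/e}, \sqrt{2/e}]$; it omits squeeze functions, and the names and seed are illustrative.

```python
import math
import random


def normal_ratio_of_uniforms(rng: random.Random) -> float:
    """Standard normal deviate by the ratio of uniforms, without squeeze."""
    v_max = math.sqrt(2.0 / math.e)
    while True:
        u = 1.0 - rng.random()                     # uniform on ]0, 1]
        v = (2.0 * rng.random() - 1.0) * v_max     # uniform on [-v_max, v_max]
        if v * v <= -4.0 * u * u * math.log(u):    # same as u <= sqrt(f(v/u))
            return v / u


rng = random.Random(2010)
sample = [normal_ratio_of_uniforms(rng) for _ in range(20_000)]
```

The acceptance test is the condition $u \le \sqrt{f(v/u)}$ rewritten so that only one logarithm is evaluated.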

Gap defect when combining LCGs and the ratio of uniforms method, from [7]

If $(U, V)$ is obtained using consecutive outputs of an LCG$(m, a, c)$, then, as these points are contained within a lattice, the possible values for the ratio $V/U$ inherit a very special structure. It is then possible to show that:

Th. 1 If $C = [0, u^+] \times [-v^+, v^+]$, then there exists a gap $g$ such that $\mu(g) \ge \frac{1}{\sqrt{m}} \sqrt{\frac{\sqrt{3}}{32}}$.

Th. 2 If $C = [0, u^+] \times [v^-, v^+]$, then there exists a gap $g$ such that $\mu(g) \ge \frac{1}{\sqrt{m}} \sqrt{\frac{\sqrt{3}}{16} \frac{q_2}{q_2^2 + 1}}$, where $q_2 = |e_1| / |e_2|$ is the Beyer quotient of the lattice, $(e_1, e_2)$ being the Minkowski-reduced basis of the lattice.

These lower bounds are much worse than the expected $O(1/m)$!

Gap for usual distributions

We consider the Cauchy, normal and exponential distributions for LCGs with $m \simeq 2^{32}$ and $q_2$ varying from 0.1 to 1. We compute the lower bound of $\mu(g)$ scaled by $10^6$:

q_2     Cauchy   Normal   Exponential
0.1      4.52     2.08      2.98
0.5      4.52     4.17      5.58
1.0      4.52     4.65      6.12

If you are dealing with (not so) rare events, don't use the combination LCG/ratio of uniforms!

Fast and high-quality normal PRNG

The ziggurat method (Marsaglia, Doornik, see [8])

The ziggurat method is an innovative use of the rejection method with squeeze. Its key characteristics are:

it can sample symmetric univariate continuous random variables;

it generates random deviates quickly and with high quality.

The main idea is to use rejection with squeeze with respect to horizontal histograms, with a specific treatment of the tail of the distribution.

We detail this method in the case of the standard normal distribution.

How to build the ziggurat?

Restrict the PDF $f$ to $x \ge 0$;

Cover its graph with horizontal rectangular bands;

The bottom band also contains the tail of the PDF;

All these bands have the same area $V$;

Number these bands from 1 (top) to $N$ (bottom);

Consider the inner horizontal histogram as an inner squeeze function.

How to use it?

1 Generate a band number $i$ uniformly distributed over $\{1, \dots, N\}$;

2 If $i < N$, generate $u_i$ uniformly in $[0, x_i]$, where $x_i$ is the length of the covering band, else generate $u_i$ uniformly in $[0, V / f(x_N)]$;

3 If $u_i \le x_{i-1}$ (with $x_0 = 0$), accept $u_i$;

4 Else:

4.1 if $i = N$, generate $u_i$ in the tail, e.g. using Marsaglia's method;

4.2 else use simple rejection in the wedge.

Generating realizations from the tail of the normal distribution, Marsaglia's method

To generate a random variate with density proportional to $\exp(-x^2/2)\, 1_{x \ge a}$, do:

1 Generate $U_1$ and $U_2$, two iid random variables uniform on $]0, 1]$;

2 Define $X = \frac{1}{a} \log(U_1)$ and $Y = \log(U_2)$;

3 If $X^2 < -2Y$, return $a - X$ (note that $X \le 0$), else go back to step 1.

The pairs $(X, Y)$ obtained after a successful step 3 have a joint density of the form:

$$p_{XY}(x, y) = K a \exp(a x)\, 1_{x<0}\, \exp(y)\, 1_{y<0}\, 1_{x^2 + 2y < 0} \qquad (10)$$

from which we deduce the marginal density of $X$ and can check that $a - X$ has the correct density.
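
The three steps can be sketched as follows, returning $a - X$ with $X \le 0$ so that the result lies above $a$. The function name, the cut-off $a = 3$ and the seed are assumptions made for the example.

```python
import math
import random


def normal_tail(a: float, rng: random.Random) -> float:
    """Standard normal deviate conditioned on being >= a (a > 0)."""
    while True:
        u1 = 1.0 - rng.random()      # step 1: uniforms on ]0, 1]
        u2 = 1.0 - rng.random()
        x = math.log(u1) / a         # step 2: x <= 0
        y = math.log(u2)             #         y <= 0
        if x * x < -2.0 * y:         # step 3: accept and shift beyond the cut-off
            return a - x


rng = random.Random(2010)
sample = [normal_tail(3.0, rng) for _ in range(20_000)]
sample_mean = sum(sample) / len(sample)   # E[Z | Z > 3] is about 3.283
```

Each attempt costs two uniforms and two logarithms, and no evaluation of the normal density is needed.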

The acceptance ratio is $\rho = \frac{\sqrt{2\pi}}{2NV}$, so we use large values for $N$, such as $N = 128$, for which $\rho \simeq 98.78\%$.

With this setting, $a \simeq 3.4451$ and the acceptance ratio of Marsaglia's method for generating normal random deviates in the tail is 98.05%.

Further material

Does a bad PRNG matter?

There is no need for a perfect PRNG to obtain convergence; most of the time it suffices to have low discrepancy;

The problem lies more in justifying the convergence of the algorithms and quantifying the confidence interval (e.g. there is no central limit theorem for low discrepancy sequences);

The only very serious defect is the lack of space filling in high dimension: one can completely miss a region of interest and thus severely underestimate a quantity of interest.

References

1 Jerry Banks (ed.), "Handbook of Simulation: Modelling, Estimation and Control", Wiley, 1998.

2 Donald E. Knuth, "The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed.", Addison-Wesley, 1998.

3 George Marsaglia, "Random numbers fall mainly in the planes", Proc. N.A.S., 1968.

4 G. H. Hardy, E. M. Wright, "An Introduction to the Theory of Numbers, 4th ed.", Oxford University Press, 1975.

5 Mutsuo Saito, Makoto Matsumoto, "SIMD-oriented Fast Mersenne Twister: a 128-bit Pseudorandom Number Generator", research report, Dep. of Math, Hiroshima University, 2006.

6 Luc Devroye, "Non-Uniform Random Variate Generation", Springer, 1986.

7 Wolfgang Hörmann, "A Note on the Quality of Random Variates Generated by the Ratio of Uniforms Method", ACM Transactions on Modeling and Computer Simulation, Vol. 4, No. 1.

8 Jurgen A. Doornik, "An Improved Ziggurat Method to Generate Normal Random Samples", working paper, http://www.doornik.com/research.html.

Conclusion

Numerous communities in numerical applied probability

The world of uniform PRNG makers (those in the state of sin): arithmetically inclined, they know CPU details and the IEEE 754 standard;

The world of non-uniform PRNG makers (they trust the uniform PRNGs provided by the first community): elementary probability and analysis background, sometimes also members of the first community, but not so frequently;

The world of applied probability problem solvers (you!): strong knowledge of probability, in contact with the previous community, but when it comes to implementing an algorithm, they sometimes believe that the basic rand() function is trustworthy ;-) !

What should be remembered?

Never use rand() for anything other than initializing the position of monsters in a Dungeons & Dragons video game!

Check that the uniform PRNG has a sound theoretical background and a successful track record in real-life applications.

Check that there is no bad interaction between the uniform PRNG and the non-uniform PRNG!

Adapt the precision of your uniform PRNG to the precision of your computation: it is nonsense to use double precision in the high-level algorithms if the uniform PRNG gives only float output!

Many thanks for your attention !

Your questions and comments are welcome !
