CPE 619 Testing Random-Number Generators Aleksandar Milenković The LaCASA Laboratory Electrical and Computer Engineering Department The University of Alabama in Huntsville http://www.ece.uah.edu/~milenka http://www.ece.uah.edu/~lacasa

Post on 31-Dec-2015


Page 1: CPE 619 Testing Random-Number Generators

CPE 619 Testing Random-Number Generators

Aleksandar Milenković

The LaCASA Laboratory

Electrical and Computer Engineering Department

The University of Alabama in Huntsville

http://www.ece.uah.edu/~milenka

http://www.ece.uah.edu/~lacasa

Page 2: CPE 619 Testing Random-Number Generators

2

Overview

Chi-square test
Kolmogorov-Smirnov test
Serial-correlation test
Two-level tests
k-dimensional uniformity or k-distributivity
Serial test
Spectral test

Page 3: CPE 619 Testing Random-Number Generators

3

Testing Random-Number Generators

Goal: To ensure that the random-number generator produces a random stream
Plot histograms
Plot quantile-quantile plots
Use other tests
Passing a test is necessary but not sufficient
Pass ≠ Good
Fail ⇒ Bad
New tests ⇒ Old generators fail the test
Tests can be adapted for other distributions

Page 4: CPE 619 Testing Random-Number Generators

4

Chi-Square Test

Most commonly used test
Can be used for any distribution
Prepare a histogram of the observed data
Compare observed frequencies with theoretical frequencies:

$D = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$

k = Number of cells
$o_i$ = Observed frequency for the ith cell
$e_i$ = Expected frequency for the ith cell
D = 0 ⇒ Exact fit
D has a chi-square distribution with k-1 degrees of freedom
Compare D with $\chi^2_{[1-\alpha;\,k-1]}$
Pass at the α level of significance if D is less than this value
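The comparison of observed and expected frequencies can be sketched in a few lines of Python. This is only an illustration: the cell counts are made up, and the critical value 6.251 (the 0.90 quantile of chi-square with 3 degrees of freedom) is taken from a standard table rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Return D = sum((o_i - e_i)^2 / e_i) over all k cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def passes_chi_square(observed, expected, critical_value):
    """Pass if D is below the tabulated chi-square quantile for k-1 degrees of freedom."""
    return chi_square_statistic(observed, expected) < critical_value


# Hypothetical counts: k = 4 equal cells, 400 observations of a U(0, 1) stream.
observed = [90, 110, 105, 95]
expected = [100.0] * 4
d = chi_square_statistic(observed, expected)
print(d, passes_chi_square(observed, expected, critical_value=6.251))
```

The critical value is passed in as a parameter so that the sketch does not depend on any statistics library.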

Page 5: CPE 619 Testing Random-Number Generators

5

Example 27.1

1000 random numbers with x0 = 1

Observed difference D = 10.380
Observed is less than the table value ⇒ Accept IID U(0, 1)

Page 6: CPE 619 Testing Random-Number Generators

6

Chi-Square for Other Distributions

Errors in cells with a small $e_i$ affect the chi-square statistic more
Best when the $e_i$'s are equal
⇒ Use an equi-probable histogram with variable cell sizes
Combine adjoining cells so that the new cell probabilities are approximately equal
The number of degrees of freedom should be reduced to k-r-1 (in place of k-1), where r is the number of parameters estimated from the sample
Designed for discrete distributions and for large sample sizes only
⇒ Lower significance for finite sample sizes and continuous distributions
If a cell has fewer than 5 observations, combine it with neighboring cells

Page 7: CPE 619 Testing Random-Number Generators

7

Kolmogorov-Smirnov Test

Developed by A. N. Kolmogorov and N. V. Smirnov
Designed for continuous distributions
The difference between the observed CDF (cumulative distribution function) $F_o(x)$ and the expected CDF $F_e(x)$ should be small

Page 8: CPE 619 Testing Random-Number Generators

8

Kolmogorov-Smirnov Test

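K+ = maximum observed deviation above the expected CDF: $K^+ = \sqrt{n}\,\max_x [F_o(x) - F_e(x)]$
K- = maximum observed deviation below the expected CDF: $K^- = \sqrt{n}\,\max_x [F_e(x) - F_o(x)]$
$K^+ < K_{[1-\alpha;n]}$ and $K^- < K_{[1-\alpha;n]}$ ⇒ Pass at the α level of significance
Don't use max/min of $F_e(x_i) - F_o(x_i)$
Use $F_e(x_{i+1}) - F_o(x_i)$ for K-
For U(0, 1): $F_e(x) = x$
$F_o(x) = j/n$, where $x > x_1, x_2, \ldots, x_{j-1}$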

Page 9: CPE 619 Testing Random-Number Generators

9

Example 27.2

30 Random numbers using a seed of x0=15:

The numbers are: 14, 11, 2, 6, 18, 23, 7, 21, 1, 3, 9, 27, 19, 26, 16, 17, 20, 29, 25, 13, 8, 24, 10, 30, 28, 22, 4, 12, 5, 15.

Page 10: CPE 619 Testing Random-Number Generators

10

Example 27.2 (cont’d)

The normalized numbers obtained by dividing the sequence by 31 are: 0.45161, 0.35484, 0.06452, 0.19355, 0.58065, 0.74194, 0.22581, 0.67742, 0.03226, 0.09677, 0.29032, 0.87097, 0.61290, 0.83871, 0.51613, 0.54839, 0.64516, 0.93548, 0.80645, 0.41935, 0.25806, 0.77419, 0.32258, 0.96774, 0.90323, 0.70968, 0.12903, 0.38710, 0.16129, 0.48387.

Page 11: CPE 619 Testing Random-Number Generators

11

Example 27.2 (cont’d)

$K_{[0.9;n]}$ value for n = 30 and α = 0.1 is 1.0424
Observed < Table ⇒ Pass
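The K-S computation for this example can be sketched as follows. The generator formula did not survive in the transcript; the listed sequence is consistent with x_n = 3 x_{n-1} mod 31 starting from x0 = 15, so the code assumes that form. The function and variable names are illustrative.

```python
import math


def ks_statistics_uniform(samples):
    """Return (K+, K-) for testing samples against U(0, 1), where Fe(x) = x."""
    xs = sorted(samples)
    n = len(xs)
    # K+: largest amount by which the observed CDF j/n exceeds the expected CDF x_j.
    k_plus = math.sqrt(n) * max((j / n) - x for j, x in enumerate(xs, start=1))
    # K-: largest amount by which the expected CDF x_j exceeds the observed CDF (j-1)/n.
    k_minus = math.sqrt(n) * max(x - (j - 1) / n for j, x in enumerate(xs, start=1))
    return k_plus, k_minus


def lcg_sequence(seed, count, multiplier=3, modulus=31):
    """Assumed generator x_n = 3 * x_{n-1} mod 31 (inferred from the listed numbers)."""
    values, x = [], seed
    for _ in range(count):
        x = (multiplier * x) % modulus
        values.append(x)
    return values


numbers = lcg_sequence(seed=15, count=30)
normalized = [x / 31 for x in numbers]
k_plus, k_minus = ks_statistics_uniform(normalized)
K_TABLE = 1.0424  # K[0.9; 30] as quoted in the example
print(round(k_plus, 4), round(k_minus, 4), k_plus < K_TABLE and k_minus < K_TABLE)
```

Both statistics come out well below the table value, which agrees with the "Pass" conclusion of the example.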

Page 12: CPE 619 Testing Random-Number Generators

12

Chi-square vs. K-S Test

Page 13: CPE 619 Testing Random-Number Generators

13

Serial-Correlation Test

Nonzero covariance ⇒ Dependence. The inverse is not true
$R_k$ = Autocovariance at lag k = $\mathrm{Cov}[x_n, x_{n+k}]$
For large n, $R_k$ is normally distributed with a mean of zero and a variance of 1/[144(n-k)]
100(1-α)% confidence interval for the autocovariance is:

$R_k \mp \frac{z_{1-\alpha/2}}{12\sqrt{n-k}}$

For k ≥ 1: Check if the CI includes zero
For k = 0: $R_0$ = variance of the sequence, expected to be 1/12 for IID U(0, 1)
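A minimal sketch of the serial-correlation check follows. It assumes the usual estimator of $R_k$ for U(0, 1) data, which averages the products of deviations from the known mean 1/2, and it uses Python's built-in generator as a stand-in stream rather than the generator from the slides.

```python
import math
import random
from statistics import NormalDist


def autocovariance(u, k):
    """Estimate R_k = Cov[x_n, x_{n+k}] for U(0, 1) data, using the known mean 1/2."""
    n = len(u)
    return sum((u[i] - 0.5) * (u[i + k] - 0.5) for i in range(n - k)) / (n - k)


def autocovariance_ci(u, k, alpha=0.10):
    """Return the 100(1-alpha)% interval R_k -/+ z / (12 * sqrt(n - k))."""
    r_k = autocovariance(u, k)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z / (12 * math.sqrt(len(u) - k))
    return r_k - half_width, r_k + half_width


rng = random.Random(1)  # stand-in generator, not the one from the slides
u = [rng.random() for _ in range(10_000)]
for k in range(1, 4):
    low, high = autocovariance_ci(u, k)
    print(k, low <= 0.0 <= high)  # True means the lag-k covariance is insignificant
print(autocovariance(u, 0))  # should be close to 1/12 for IID U(0, 1)
```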

Page 14: CPE 619 Testing Random-Number Generators

14

Example 27.3: Serial Correlation Test

10,000 random numbers with x0=1:

Page 15: CPE 619 Testing Random-Number Generators

15

Example 27.3 (cont’d)

All confidence intervals include zero ⇒ All covariances are statistically insignificant at 90% confidence.

Page 16: CPE 619 Testing Random-Number Generators

16

Two-Level Tests

If the sample size is too small, the test results may apply locally, but not globally to the complete cycle
Similarly, a global test may not apply locally
⇒ Use two-level tests
Use a chi-square test on n samples of size k each, and then use a chi-square test on the set of n chi-square statistics so obtained
⇒ Chi-square on chi-square test
Similarly, K-S on K-S
Can also use this to find a "nonrandom" segment of an otherwise random sequence

Page 17: CPE 619 Testing Random-Number Generators

17

k-Distributivity

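k-Dimensional Uniformity
Chi-square test ⇒ uniformity in one dimension
Given two real numbers $a_1$ and $b_1$ between 0 and 1 such that $b_1 > a_1$:

$P(a_1 \le u_n < b_1) = b_1 - a_1$

This is known as the 1-distributivity property of $u_n$
The 2-distributivity is a generalization of this property in two dimensions:

$P(a_1 \le u_{n-1} < b_1 \text{ and } a_2 \le u_n < b_2) = (b_1 - a_1)(b_2 - a_2)$

For all choices of $a_1, b_1, a_2, b_2$ in [0, 1], $b_1 > a_1$ and $b_2 > a_2$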

Page 18: CPE 619 Testing Random-Number Generators

18

k-Distributivity (cont’d)

k-distributed if:

$P(a_1 \le u_n < b_1, \ldots, a_k \le u_{n+k-1} < b_k) = (b_1 - a_1) \cdots (b_k - a_k)$

For all choices of $a_i, b_i$ in [0, 1], with $b_i > a_i$, i = 1, 2, ..., k
A k-distributed sequence is always (k-1)-distributed. The inverse is not true
Tests:
1. Serial test
2. Spectral test
3. Visual test for 2 dimensions: Plot successive overlapping pairs of numbers

Page 19: CPE 619 Testing Random-Number Generators

19

Example 27.4

Tausworthe sequence generated by:

The sequence is k-distributed for k up to $\lceil q/l \rceil$, that is, k = 1.

In two dimensions: Successive overlapping pairs (xn, xn+1)

Page 20: CPE 619 Testing Random-Number Generators

20

Example 27.5

Consider the polynomial:

Better 2-distributivity than Example 27.4

Page 21: CPE 619 Testing Random-Number Generators

21

Serial Test

Goal: To test for uniformity in two dimensions or higher
In two dimensions, divide the space between 0 and 1 into $K^2$ cells of equal area

[Figure: the unit square divided into $K^2$ cells, with axes $x_n$ and $x_{n+1}$]

Page 22: CPE 619 Testing Random-Number Generators

22

Serial Test (cont’d)

Given {x1, x2, …, xn}, use n/2 non-overlapping pairs (x1, x2), (x3, x4), … and count the points in each of the $K^2$ cells
Expected = $n/(2K^2)$ points in each cell
Use the chi-square test to find the deviation of the actual counts from the expected counts
The degrees of freedom in this case are $K^2 - 1$
For k dimensions: use k-tuples of non-overlapping values
k-tuples must be non-overlapping
Overlapping ⇒ the numbers of points in the cells are not independent ⇒ the chi-square test cannot be used
In a visual check, one can use overlapping or non-overlapping tuples
In the spectral test, overlapping tuples are used
Given n numbers, there are n-1 overlapping pairs and n/2 non-overlapping pairs
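The two-dimensional serial test can be sketched as below. The stream comes from Python's built-in generator purely as a stand-in, and the resulting statistic still has to be compared with a tabulated chi-square quantile for $K^2 - 1$ degrees of freedom.

```python
import random


def serial_test_statistic(numbers, cells_per_axis):
    """Chi-square statistic for non-overlapping pairs on a K x K grid of the unit square."""
    K = cells_per_axis
    pairs = [(numbers[i], numbers[i + 1]) for i in range(0, len(numbers) - 1, 2)]
    counts = [[0] * K for _ in range(K)]
    for x, y in pairs:
        # min() guards against a value of exactly 1.0 landing outside the grid.
        counts[min(int(x * K), K - 1)][min(int(y * K), K - 1)] += 1
    expected = len(pairs) / (K * K)  # n / (2 K^2) points per cell
    d = sum((c - expected) ** 2 / expected for row in counts for c in row)
    return d, K * K - 1  # statistic and degrees of freedom


rng = random.Random(2008)  # stand-in generator
stream = [rng.random() for _ in range(20_000)]
d, dof = serial_test_statistic(stream, cells_per_axis=10)
print(d, dof)  # compare d with the tabulated chi-square quantile for dof
```

Stepping through the list two at a time is what keeps the pairs non-overlapping, so the cell counts stay independent.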

Page 23: CPE 619 Testing Random-Number Generators

23

Spectral Test

Goal: To determine how densely the k-tuples {x1, x2, …, xk} can fill up the k-dimensional hyperspace
The k-tuples from an LCG fall on a finite number of parallel hyper-planes
Successive pairs would lie on a finite number of lines
In three dimensions, successive triplets lie on a finite number of planes

Page 24: CPE 619 Testing Random-Number Generators

24

Example 27.6: Spectral Test

Plot of overlapping pairs

All points lie on three straight lines.

Or:

Page 25: CPE 619 Testing Random-Number Generators

25

Example 27.6 (cont’d)

In three dimensions, the points $(x_n, x_{n-1}, x_{n-2})$ for the above generator would lie on five planes given by:

Obtained by adding the following to equation

Note that $k + k_1$ will be an integer between 0 and 4.

Page 26: CPE 619 Testing Random-Number Generators

26

Spectral Test (More)

Marsaglia (1968): Successive k-tuples obtained from an LCG fall on, at most, $(k!\,m)^{1/k}$ parallel hyper-planes, where m is the modulus used in the LCG.

Example: For $m = 2^{32}$, fewer than 2,953 hyper-planes will contain all 3-tuples, fewer than 566 hyper-planes will contain all 4-tuples, and fewer than 41 hyper-planes will contain all 10-tuples. This is a weakness of LCGs.

Spectral Test: Determine the maximum distance between adjacent hyper-planes.

Larger distance ⇒ worse generator
In some cases, it can be done by complete enumeration
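The hyper-plane counts quoted in the example follow directly from Marsaglia's bound. A short sketch, with an illustrative function name:

```python
import math


def marsaglia_bound(k, m):
    """Upper bound (k! * m)^(1/k) on the number of hyper-planes holding all k-tuples."""
    return (math.factorial(k) * m) ** (1.0 / k)


m = 2 ** 32
for k in (3, 4, 10):
    print(k, math.floor(marsaglia_bound(k, m)))
```

The bound drops quickly as the dimension k grows, which is why higher-dimensional tuples from an LCG look so sparse.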

Page 27: CPE 619 Testing Random-Number Generators

27

Example 27.7

Compare the following two generators:

Using a seed of x0=15, first generator:

Using the same seed in the second generator:

Page 28: CPE 619 Testing Random-Number Generators

28

Example 27.7 (cont’d)

Every number between 1 and 30 occurs once and only once

Both sequences will pass the chi-square test for uniformity

Page 29: CPE 619 Testing Random-Number Generators

29

Example 27.7 (cont’d)

First Generator:

Page 30: CPE 619 Testing Random-Number Generators

30

Example 27.7 (cont’d)

Three straight lines of positive slope or ten lines of negative slope
Since the distance between the lines of positive slope is larger, consider only the lines with positive slope
The distance between two parallel lines $y = ax + c_1$ and $y = ax + c_2$ is given by

$\frac{|c_2 - c_1|}{\sqrt{1 + a^2}}$

The distance between the above lines is 9.80
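The distance between adjacent lines can be checked numerically. The generator definitions for this example did not survive in the transcript, so the sketch assumes the positive-slope lines have slope 3 and intercepts 31 apart, which is what a generator of the form x_n = 3 x_{n-1} mod 31 would produce. That assumption reproduces the quoted value of 9.80.

```python
import math


def parallel_line_distance(a, c1, c2):
    """Distance between the parallel lines y = a*x + c1 and y = a*x + c2."""
    return abs(c2 - c1) / math.sqrt(1 + a * a)


# Assumption: adjacent lines are x_n = 3*x_{n-1} - 31*k for consecutive integers k.
print(round(parallel_line_distance(a=3, c1=0, c2=-31), 2))
```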

Page 31: CPE 619 Testing Random-Number Generators

31

Example 27.7 (cont’d)

Second Generator:

Page 32: CPE 619 Testing Random-Number Generators

32

Example 27.7 (cont’d)

All points fall on seven straight lines of positive slope or six straight lines of negative slope.
Considering lines with negative slopes:
The distance between lines is 5.76
The second generator has a smaller maximum distance and, hence, the second generator has a better 2-distributivity
The set with a larger distance may not always be the set with fewer lines

Page 33: CPE 619 Testing Random-Number Generators

33

Example 27.7 (cont’d)

Either overlapping or non-overlapping k-tuples can be used
With overlapping k-tuples, we have k times as many points, which makes the graph visually more complete. The number of hyper-planes and the distance between them are the same with either choice.
With the serial test, only non-overlapping k-tuples should be used
For generators with a large m and for higher dimensions, finding the maximum distance becomes quite complex. See Knuth (1981)

Page 34: CPE 619 Testing Random-Number Generators

34

Summary

Chi-square test is a one-dimensional test
Designed for discrete distributions and large sample sizes
K-S test is designed for continuous variables
Serial-correlation test for independence
Two-level tests find local non-uniformity
k-dimensional uniformity = k-distributivity, tested by the spectral test or the serial test

Page 35: CPE 619 Testing Random-Number Generators

Random Variate Generation

Page 36: CPE 619 Testing Random-Number Generators

36

Overview

Inverse transformation
Rejection
Composition
Convolution
Characterization

Page 37: CPE 619 Testing Random-Number Generators

37

Random-Variate Generation

General techniques
Only a few techniques may apply to a particular distribution
Look up the distribution in Chapter 29

Page 38: CPE 619 Testing Random-Number Generators

38

Inverse Transformation

Used when $F^{-1}$ can be determined either analytically or empirically

[Figure: CDF F(x) plotted against x, with u on the vertical axis running from 0 to 1.0]

Page 39: CPE 619 Testing Random-Number Generators

39

Proof

Page 40: CPE 619 Testing Random-Number Generators

40

Example 28.1

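For exponential variates with rate λ:

$F(x) = 1 - e^{-\lambda x} = u \;\Rightarrow\; x = -\frac{1}{\lambda}\ln(1-u)$

If u is U(0,1), 1-u is also U(0,1)
Thus, exponential variates can be generated by:

$x = -\frac{1}{\lambda}\ln(u)$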
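A minimal sketch of inverse transformation for exponential variates follows. It assumes the distribution is parameterized by a rate, so the mean is 1/rate; the function name and the `rng` parameter are illustrative.

```python
import math
import random


def exponential_variate(rate, rng=random):
    """Inverse transformation: x = -ln(u) / rate with u uniform on (0, 1]."""
    u = 1.0 - rng.random()  # random() is in [0, 1), so 1 - random() avoids ln(0)
    return -math.log(u) / rate


rng = random.Random(619)
sample = [exponential_variate(rate=2.0, rng=rng) for _ in range(50_000)]
print(sum(sample) / len(sample))  # should be near the mean 1 / rate = 0.5
```

Using 1 - u in place of u is harmless for the reason given on the slide: both are U(0,1).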

Page 41: CPE 619 Testing Random-Number Generators

41

Example 28.2

The packet sizes (trimodal) probabilities:

The CDF for this distribution is:

Page 42: CPE 619 Testing Random-Number Generators

42

Example 28.2 (cont’d)

The inverse function is:

Note: The CDF is continuous from the right ⇒ the value on the right of the discontinuity is used ⇒ The inverse function is continuous from the left ⇒ u = 0.7 ⇒ x = 64
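Inverting a step CDF amounts to a table lookup. The probability table for this example did not survive in the transcript, so the sizes and probabilities below are illustrative values, chosen only to agree with the stated case u = 0.7 ⇒ x = 64.

```python
import random
from bisect import bisect_left
from itertools import accumulate

# Illustrative values only: packet sizes in bytes and their probabilities.
SIZES = [64, 128, 512]
PROBABILITIES = [0.7, 0.1, 0.2]
CUMULATIVE = list(accumulate(PROBABILITIES))


def packet_size(u):
    """Inverse of the step CDF; bisect_left makes it continuous from the left."""
    index = bisect_left(CUMULATIVE, u)
    return SIZES[min(index, len(SIZES) - 1)]  # guard against rounding in the last sum


print(packet_size(0.7), packet_size(0.75), packet_size(0.95))
rng = random.Random(28)
print([packet_size(rng.random()) for _ in range(10)])
```

`bisect_left` returns the first cell whose cumulative probability is at least u, so a value of u exactly on a step boundary maps to the lower size.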

Page 43: CPE 619 Testing Random-Number Generators

43

Applications of the Inverse-Transformation Technique

Page 44: CPE 619 Testing Random-Number Generators

44

Rejection

Can be used if a pdf g(x) exists such that c g(x) majorizes the pdf f(x) ⇒ $c\,g(x) \ge f(x)\ \forall x$
Steps:
1. Generate x with pdf g(x)
2. Generate y uniform on [0, c g(x)]
3. If y ≤ f(x), then output x and return. Otherwise, repeat from step 1
⇒ Continue rejecting the random variates x and y until y ≤ f(x)
Efficiency = how closely c g(x) envelopes f(x)
Large area between c g(x) and f(x) ⇒ Large percentage of (x, y) generated in steps 1 and 2 are rejected
If generation of g(x) is complex, this method may not be efficient

Page 45: CPE 619 Testing Random-Number Generators

45

Example 28.3

Beta(2,4) density function:

$f(x) = 20x(1-x)^3, \quad 0 \le x \le 1$

Bounded inside a rectangle of height 2.11
Steps:
1. Generate x uniform on [0, 1]
2. Generate y uniform on [0, 2.11]
3. If y ≤ $20x(1-x)^3$, then output x and return. Otherwise, repeat from step 1
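The rejection steps for the Beta(2,4) density can be sketched as follows, with the uniform density on [0, 1] playing the role of g(x) and c = 2.11. Function names and the seed are illustrative.

```python
import random


def beta_2_4_pdf(x):
    """Beta(2,4) density f(x) = 20 x (1 - x)^3 on [0, 1]."""
    return 20.0 * x * (1.0 - x) ** 3


def beta_2_4_variate(rng=random, height=2.11):
    """Rejection with a uniform g(x) on [0, 1] and a bounding rectangle of the given height."""
    while True:
        x = rng.random()              # step 1: candidate drawn from g(x)
        y = rng.uniform(0.0, height)  # step 2: uniform height under c * g(x)
        if y <= beta_2_4_pdf(x):      # step 3: accept if the point lies under f(x)
            return x


rng = random.Random(45)
sample = [beta_2_4_variate(rng) for _ in range(20_000)]
print(sum(sample) / len(sample))  # Beta(2,4) has mean 2 / (2 + 4) = 1/3
```

Since the area under f(x) is 1 and the rectangle has area 2.11, a little under half of the candidate points are accepted on average.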

Page 46: CPE 619 Testing Random-Number Generators

46

Composition

Can be used if CDF F(x) = weighted sum of n other CDFs:

$F(x) = \sum_{i=1}^{n} p_i F_i(x)$

Here, $p_i \ge 0$, $\sum_{i=1}^{n} p_i = 1$, and the $F_i$'s are distribution functions
n CDFs are composed together to form the desired CDF ⇒ hence the name of the technique
The desired CDF is decomposed into several other CDFs ⇒ also called decomposition
Can also be used if the pdf f(x) is a weighted sum of n other pdfs:

$f(x) = \sum_{i=1}^{n} p_i f_i(x)$

Page 47: CPE 619 Testing Random-Number Generators

47

Steps:
1. Generate a random integer I such that:

$P(I = i) = p_i$

This can easily be done using the inverse-transformation method.
2. Generate x with the ith pdf $f_i(x)$, where i = I, and return.

Page 48: CPE 619 Testing Random-Number Generators

48

Example 28.4

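Laplace pdf:

$f(x) = \frac{1}{2a} e^{-|x|/a}$

Composition of two exponential pdf's
Generate $u_1 \sim U(0,1)$ and $u_2 \sim U(0,1)$
If $u_1 < 0.5$, return $x = -a \ln u_2$; otherwise return $x = a \ln u_2$
The inverse-transformation technique is better for Laplace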
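The composition steps for the Laplace distribution can be sketched as below, treating the density as an equal-weight mixture of a positive and a negative exponential half with scale a. Names and the seed are illustrative.

```python
import math
import random


def laplace_variate(a, rng=random):
    """Composition of two exponential halves, each chosen with probability 0.5."""
    u1 = rng.random()
    u2 = 1.0 - rng.random()  # in (0, 1], so ln(u2) is finite
    if u1 < 0.5:
        return -a * math.log(u2)  # positive half
    return a * math.log(u2)       # negative half


rng = random.Random(48)
sample = [laplace_variate(a=2.0, rng=rng) for _ in range(50_000)]
print(sum(sample) / len(sample))                  # near 0 by symmetry
print(sum(abs(x) for x in sample) / len(sample))  # near a, the mean of |x|
```

This uses two uniform numbers per variate, whereas inverting the Laplace CDF needs only one, which is one way to read the remark that inverse transformation is better here.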

Page 49: CPE 619 Testing Random-Number Generators

49

Convolution

Sum of n variables: $x = y_1 + y_2 + \cdots + y_n$
Generate n random variates $y_i$ and sum
For sums of two variables, pdf of x = convolution of the pdfs of $y_1$ and $y_2$. Hence the name
Although no convolution is performed in generation
If pdf or CDF = Sum ⇒ Composition
Variable x = Sum ⇒ Convolution

Page 50: CPE 619 Testing Random-Number Generators

50

Convolution: Examples

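Erlang-k = $\sum_{i=1}^{k}$ Exponential$_i$
Binomial(n, p) = $\sum_{i=1}^{n}$ Bernoulli(p) ⇒ Generate n U(0,1), return the number of RNs less than p
$\chi^2(\nu) = \sum_{i=1}^{\nu} N(0,1)^2$
$\Gamma(a, b_1) + \Gamma(a, b_2) = \Gamma(a, b_1 + b_2)$ ⇒ Non-integer value of b = integer + fraction
$\sum_{i=1}^{n}$ Any = Normal ⇒ $\sum$ U(0,1) = Normal
$\sum_{i=1}^{m}$ Geometric = Pascal
$\sum_{i=1}^{2}$ Uniform = Triangular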
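Two of the sums listed above, sketched in Python. The Erlang-k variate is parameterized here by the mean of each exponential stage, which is an assumption of this sketch and not notation from the slides.

```python
import math
import random


def binomial_variate(n, p, rng=random):
    """Sum of n Bernoulli(p) trials: count the U(0, 1) numbers that fall below p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def erlang_variate(k, mean_per_stage, rng=random):
    """Sum of k independent exponential variates, each with the given mean."""
    return sum(-mean_per_stage * math.log(1.0 - rng.random()) for _ in range(k))


rng = random.Random(50)
binomials = [binomial_variate(10, 0.3, rng) for _ in range(20_000)]
erlangs = [erlang_variate(3, 2.0, rng) for _ in range(20_000)]
print(sum(binomials) / len(binomials))  # near n * p = 3
print(sum(erlangs) / len(erlangs))      # near k * mean_per_stage = 6
```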

Page 51: CPE 619 Testing Random-Number Generators

51

Characterization

Use special characteristics of distributions ⇒ characterization
Exponential inter-arrival times ⇒ Poisson number of arrivals ⇒ Continuously generate exponential variates until their sum exceeds T and return the number of variates generated before the sum exceeded T as the Poisson variate.
The ath smallest number in a sequence of a+b-1 U(0,1) uniform variates has a β(a, b) distribution.
The ratio of two unit normal variates is a Cauchy(0, 1) variate.
A chi-square variate with even degrees of freedom $\chi^2(\nu)$ is the same as a gamma variate $\gamma(2, \nu/2)$.
If $x_1$ and $x_2$ are two gamma variates γ(a,b) and γ(a,c), respectively, the ratio $x_1/(x_1+x_2)$ is a beta variate β(b,c).
If x is a unit normal variate, $e^{\mu + \sigma x}$ is a lognormal(μ, σ) variate.
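The Poisson characterization can be sketched as follows. The arrival rate and interval length T are illustrative parameters. The exponential variate that pushes the running sum past T is not counted, so the returned value is the number of arrivals that fall inside the interval.

```python
import math
import random


def poisson_variate(rate, T=1.0, rng=random):
    """Count exponential inter-arrival times that fit within the interval [0, T]."""
    arrivals, elapsed = 0, 0.0
    while True:
        elapsed += -math.log(1.0 - rng.random()) / rate  # next inter-arrival time
        if elapsed > T:
            return arrivals  # the arrival that overshoots T is not counted
        arrivals += 1


rng = random.Random(51)
sample = [poisson_variate(rate=4.0, T=1.0, rng=rng) for _ in range(20_000)]
print(sum(sample) / len(sample))  # near rate * T = 4
```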

Page 52: CPE 619 Testing Random-Number Generators

52

Summary

Is the pdf a sum of other pdfs? Yes ⇒ Use composition
Is the CDF a sum of other CDFs? Yes ⇒ Use composition
Is the CDF invertible? Yes ⇒ Use inversion

Page 53: CPE 619 Testing Random-Number Generators

53

Summary (cont’d)

Does a majorizing function exist? Yes ⇒ Use rejection
Is the variate related to other variates? Yes ⇒ Use characterization
Is the variate a sum of other variates? Yes ⇒ Use convolution
No ⇒ Use empirical inversion

Page 54: CPE 619 Testing Random-Number Generators

54

Homework #6

Submit answers to exercise 27.1
Submit answers to exercise 27.4
Due: Monday, April 7, 2008, 12:45 PM
Submit a hard copy to instructor