sms notes 5th unit 8th sem

8/8/2019 SMS NOTES 5th UNIT 8th Sem


Chapter 7

Random-Number Generation

Properties of Random Numbers

Random numbers should be uniformly distributed and independent.

Uniformity: If we divide (0,1) into n equal intervals, then we expect the number

of observations in each sub-interval to be N/n where N is the total number of

observations.

Independence: The probability of observing a value in any sub-interval is not

influenced by any previous value drawn.

Each random number, Ri, must be independently drawn from a uniform distribution with pdf:

$$f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

Generation of Pseudo-Random Numbers

"Pseudo", because generating numbers using a known method removes the potential for true randomness.

Pseudo-random numbers serve as model random numbers for simulation purposes.

Goal: To produce a sequence of numbers in [0,1] that simulates, or imitates, the ideal properties of random numbers (RN).

Potential issues:

Non-uniformity

Discrete valued, not continuously valued

Inaccurate mean

Inaccurate variance

Dependence

Autocorrelation between numbers

Runs of numbers with skewed values, with respect to previous numbers or mean

value


Important considerations in RN routines:

Fast

Portable to different computers

Have sufficiently long cycle

Replicable

Closely approximate the ideal statistical properties of uniformity and independence.

Techniques for Generating Random Numbers

Linear Congruential Method (LCM).

Combined Linear Congruential Generators (CLCG).

Random-Number Streams.

Linear Congruential Method

To produce a sequence of integers X1, X2, ... between 0 and m-1 by following a recursive relationship:

$$X_{i+1} = (aX_i + c) \bmod m, \quad i = 0, 1, 2, \ldots$$

X0: seed

a: constant multiplier

c: increment

m: modulus

c = 0: multiplicative congruential method; otherwise: mixed congruential method

The selection of the values for a, c, m, and X0 drastically affects the statistical properties and the cycle length.

The random integers are generated on [0, m-1]; to convert the integers to random numbers:

$$R_i = \frac{X_i}{m}, \quad i = 1, 2, \ldots$$

Example: X0 = 27, a = 17, c = 43, m = 100

Xi+1 = (aXi + c) mod m = (17Xi + 43) mod 100

X1 = (17*27 + 43) mod 100 = 502 mod 100 = 2

R1 = X1/m = 2/100 = .02

X2 = (17*2 + 43) mod 100 = 77 mod 100 = 77

R2 = X2/m = 77/100 = .77
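The recursion and the worked example can be checked with a few lines of Python. This is only a minimal sketch: the function name `lcg` and its argument names are illustrative choices, not part of the notes.

```python
def lcg(seed, a, c, m, count):
    """Return `count` pairs (X_i, R_i) from X_{i+1} = (a*X_i + c) mod m."""
    x = seed
    pairs = []
    for _ in range(count):
        x = (a * x + c) % m        # next integer, always in [0, m-1]
        pairs.append((x, x / m))   # R_i = X_i / m
    return pairs


# Parameters of the worked example: X0 = 27, a = 17, c = 43, m = 100.
example = lcg(seed=27, a=17, c=43, m=100, count=3)
for x, r in example:
    print(x, r)
```

The first two pairs reproduce X1 = 2, R1 = .02 and X2 = 77, R2 = .77 from the example; the third value continues the same recursion.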

Characteristics of a Good Generator [LCM]

Maximum Density

Such that the values assumed by Ri, i = 1, 2, ..., leave no large gaps on [0,1]

Problem: Instead of continuous, each Ri is discrete

Solution: a very large integer for modulus m

Approximation appears to be of little consequence

Maximum Period

To achieve maximum density and avoid cycling.

Achieved by: proper choice of a, c, m, and X0.

Most digital computers use a binary representation of numbers

Speed and efficiency are aided by choosing a modulus, m, that is (or is close to) a power of 2.

Combined Linear Congruential Generators [Techniques]

Reason: Longer period generator is needed because of the increasing complexity of simulated systems.

Approach: Combine two or more multiplicative congruential generators.

Let Xi,1, Xi,2, ..., Xi,k be the ith output from k different multiplicative congruential generators.

The jth generator:

Has prime modulus mj and multiplier aj, and its period is mj - 1

Produces integers Xi,j that are approximately uniform on the integers in [1, mj - 1]

Wi,j = Xi,j - 1 is approximately uniform on the integers in [0, mj - 2]

Suggested form:

$$X_i = \left( \sum_{j=1}^{k} (-1)^{j-1} X_{i,j} \right) \bmod (m_1 - 1), \qquad R_i = \begin{cases} X_i / m_1, & X_i > 0 \\ (m_1 - 1)/m_1, & X_i = 0 \end{cases}$$

The maximum possible period is:

$$P = \frac{(m_1 - 1)(m_2 - 1) \cdots (m_k - 1)}{2^{k-1}}$$

Example: For 32-bit computers, L'Ecuyer [1988] suggests combining k = 2 generators with m1 = 2,147,483,563, a1 = 40,014, m2 = 2,147,483,399 and a2 = 40,692. The algorithm becomes:

Step 1: Select seeds

X1,0 in the range [1, 2,147,483,562] for the 1st generator

X2,0 in the range [1, 2,147,483,398] for the 2nd generator.

Step 2: For each individual generator,

X1,j+1 = 40,014 X1,j mod 2,147,483,563

X2,j+1 = 40,692 X2,j mod 2,147,483,399.

Step 3: Xj+1 = (X1,j+1 - X2,j+1) mod 2,147,483,562.

Step 4: Return

$$R_{j+1} = \begin{cases} X_{j+1} / 2{,}147{,}483{,}563, & X_{j+1} > 0 \\ 2{,}147{,}483{,}562 / 2{,}147{,}483{,}563, & X_{j+1} = 0 \end{cases}$$

Step 5: Set j = j+1, go back to step 2.

Combined generator has period: (m1 - 1)(m2 - 1)/2 ≈ 2 x 10^18
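Steps 1 to 5 can be sketched in Python as follows. The function name `clcg` and the seed values in the usage line are assumptions made for illustration; the moduli and multipliers are the ones quoted in Step 2.

```python
M1, A1 = 2_147_483_563, 40_014   # first generator: modulus, multiplier
M2, A2 = 2_147_483_399, 40_692   # second generator: modulus, multiplier


def clcg(seed1, seed2, count):
    """Return `count` random numbers from the combined two-generator scheme."""
    x1, x2 = seed1, seed2
    numbers = []
    for _ in range(count):
        x1 = (A1 * x1) % M1               # Step 2, generator 1
        x2 = (A2 * x2) % M2               # Step 2, generator 2
        x = (x1 - x2) % (M1 - 1)          # Step 3 (Python's % is never negative here)
        # Step 4: map to (0, 1), with the special case for x == 0
        numbers.append(x / M1 if x > 0 else (M1 - 1) / M1)
    return numbers


# Seeds are arbitrary picks inside the ranges required by Step 1.
print(clcg(12345, 67890, 3))
```

Python integers do not overflow, so the products in Step 2 need no special handling; a fixed-width 32-bit implementation would need a decomposition of the multiplication to stay in range.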

Random-Number Streams [Techniques]

The seed for a linear congruential random-number generator:

Is the integer value X0 that initializes the random-number sequence.

Any value in the sequence can be used to seed the generator.

A random-number stream:

Refers to a starting seed taken from the sequence X0, X1, ..., XP.

If the streams are b values apart, then stream i could be defined by the starting seed:

$$S_i = X_{b(i-1)}, \quad i = 1, 2, \ldots, \lfloor P/b \rfloor$$

Older generators: b = 10^5; newer generators: b = 10^37.

A single random-number generator with k streams can act like k distinct virtual random-number generators.

To compare two or more alternative systems:

It is advantageous to dedicate portions of the pseudo-random number sequence to the same purpose in each of the simulated systems.
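A small sketch of how stream seeds spaced b values apart could be computed by stepping a linear congruential generator. The helper name `stream_seeds` is hypothetical, and the toy parameters (the X0 = 27, a = 17, c = 43, m = 100 generator from earlier in these notes, with a spacing of only 2) are chosen purely so the result can be checked by hand.

```python
def stream_seeds(x0, a, c, m, spacing, streams):
    """Return [S_1, ..., S_k], where S_i is the value `spacing*(i-1)` steps after x0."""
    seeds = []
    x = x0
    for _stream in range(streams):
        seeds.append(x)
        for _step in range(spacing):   # advance the generator `spacing` steps
            x = (a * x + c) % m
    return seeds


seeds = stream_seeds(x0=27, a=17, c=43, m=100, spacing=2, streams=3)
print(seeds)
```

With this toy generator the sequence repeats after four values, so the third seed coincides with the first. That is the reason the spacing b has to stay well below the period P of the generator actually used.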

Tests for Random Numbers

Two categories:

Testing for uniformity:

H0: Ri ~ U[0,1]

H1: Ri ≁ U[0,1]

Failure to reject the null hypothesis, H0, means that evidence of non-uniformity has not been detected.

Testing for independence:

H0: Ri ~ independently

H1: Ri ≁ independently

Failure to reject the null hypothesis, H0, means that evidence of dependence has not been detected.

Level of significance α, the probability of rejecting H0 when it is true: α = P(reject H0 | H0 is true)

When to use these tests:

If a well-known simulation language or random-number generator is used, it is probably unnecessary to test

If the generator is not explicitly known or documented, e.g., spreadsheet programs or symbolic/numerical calculators, tests should be applied to many sample numbers.

Types of tests:

Theoretical tests: evaluate the choices of m, a, and c without actually generating

any numbers

Empirical tests: applied to actual sequences of numbers produced. Our emphasis.

Frequency Tests [Tests for RN]

Test of uniformity

Two different methods:

Kolmogorov-Smirnov test

Chi-square test

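Kolmogorov-Smirnov Test [Frequency Test]

Compares the continuous cdf, F(x), of the uniform distribution with the empirical cdf, SN(x), of the N sample observations.

We know:

$$F(x) = x, \quad 0 \le x \le 1$$

If the sample from the RN generator is R1, R2, ..., RN, then the empirical cdf, SN(x), is:

$$S_N(x) = \frac{\text{number of } R_1, R_2, \ldots, R_N \text{ which are} \le x}{N}$$

Based on the statistic: D = max |F(x) - SN(x)|

The sampling distribution of D is known (a function of N, tabulated in Table A.8).

A more powerful test, recommended.

Example: Suppose 5 generated numbers are 0.44, 0.81, 0.14, 0.05, 0.93.

Step 1: Arrange R(i) from smallest to largest:

R(i)              0.05   0.14   0.44   0.81   0.93
i/N               0.20   0.40   0.60   0.80   1.00
i/N - R(i)        0.15   0.26   0.16   -      0.07
R(i) - (i-1)/N    0.05   -      0.04   0.21   0.13

Step 2: D+ = max {i/N - R(i)} = 0.26 and D- = max {R(i) - (i-1)/N} = 0.21

Step 3: D = max(D+, D-) = 0.26

Step 4: For α = 0.05, the critical value is Dα = 0.565 > D

Hence, H0 is not rejected.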
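The statistic for this five-number example can be computed with a short Python sketch. The function name `ks_statistic` is an illustrative choice; the critical value 0.565 in the last line is the tabulated value quoted in the example for N = 5 and α = 0.05, not something the code derives.

```python
def ks_statistic(sample):
    """Return (D_plus, D_minus, D) for the null hypothesis R_i ~ U[0, 1]."""
    ordered = sorted(sample)                      # R(1) <= R(2) <= ... <= R(N)
    n = len(ordered)
    d_plus = max((i + 1) / n - r for i, r in enumerate(ordered))   # max of i/N - R(i)
    d_minus = max(r - i / n for i, r in enumerate(ordered))        # max of R(i) - (i-1)/N
    return d_plus, d_minus, max(d_plus, d_minus)


d_plus, d_minus, d = ks_statistic([0.44, 0.81, 0.14, 0.05, 0.93])
print(round(d_plus, 2), round(d_minus, 2), round(d, 2))
print("reject H0" if d > 0.565 else "do not reject H0")
```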

Chi-square test [Frequency Test]

Chi-square test uses the sample statistic:

$$\chi_0^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

where n is the number of classes, Oi is the observed number in the ith class, and Ei is the expected number in the ith class.

The statistic approximately follows the chi-square distribution with n-1 degrees of freedom (the critical values are tabulated in Table A.6)

For the uniform distribution, Ei, the expected number in each class, is:

$$E_i = \frac{N}{n}, \quad \text{where } N \text{ is the total number of observations}$$

Valid only for large samples, e.g. N >= 50
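A minimal Python sketch of the chi-square statistic for equal-width classes on [0,1]. The function name `chi_square_statistic` and the two test samples are made up for illustration; the resulting value would still have to be compared against the tabulated critical value for n-1 degrees of freedom.

```python
def chi_square_statistic(sample, n_classes):
    """Chi-square statistic for U[0, 1] using `n_classes` equal-width classes."""
    observed = [0] * n_classes
    for r in sample:
        # Class index of r; the min() keeps r == 1.0 in the last class.
        k = min(int(r * n_classes), n_classes - 1)
        observed[k] += 1
    expected = len(sample) / n_classes            # E_i = N / n for every class
    return sum((o - expected) ** 2 / expected for o in observed)


evenly_spread = [(k + 0.5) / 100 for k in range(100)]   # 10 values in each class
all_in_one_class = [0.05] * 100                         # extreme non-uniformity
print(chi_square_statistic(evenly_spread, 10))
print(chi_square_statistic(all_in_one_class, 10))
```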

Tests for Autocorrelation [Tests for RN]

Testing the autocorrelation between every m numbers (m is a.k.a. the lag), starting with the ith number

The autocorrelation $\rho_{im}$ between numbers: Ri, Ri+m, Ri+2m, ..., Ri+(M+1)m

M is the largest integer such that $i + (M+1)m \le N$

Hypothesis:

$$H_0: \rho_{im} = 0, \text{ if numbers are independent}$$

$$H_1: \rho_{im} \ne 0, \text{ if numbers are dependent}$$

If the values are uncorrelated:

For large values of M, the distribution of the estimator of $\rho_{im}$, denoted $\hat\rho_{im}$, is approximately normal.

Test statistic is:

$$Z_0 = \frac{\hat\rho_{im}}{\hat\sigma_{\hat\rho_{im}}}$$

Z0 is distributed normally with mean = 0 and variance = 1, and:

$$\hat\rho_{im} = \frac{1}{M+1} \left[ \sum_{k=0}^{M} R_{i+km} R_{i+(k+1)m} \right] - 0.25, \qquad \hat\sigma_{\hat\rho_{im}} = \frac{\sqrt{13M+7}}{12(M+1)}$$

If $\rho_{im}$ > 0, the subsequence has positive autocorrelation

High random numbers tend to be followed by high ones, and vice versa.

If $\rho_{im}$ < 0, the subsequence has negative autocorrelation

Low random numbers tend to be followed by high ones, and vice versa.

Example:

Test whether the 3rd, 8th, 13th, and so on, numbers in the output on p. 265 are autocorrelated.

Here, α = 0.05, i = 3, m = 5, N = 30, and M = 4

$$\hat\rho_{35} = \frac{1}{4+1} \left[ (0.23)(0.28) + (0.28)(0.33) + (0.33)(0.27) + (0.27)(0.05) + (0.05)(0.36) \right] - 0.25 = -0.1945$$

$$\hat\sigma_{\hat\rho_{35}} = \frac{\sqrt{13(4)+7}}{12(4+1)} = 0.128$$

$$Z_0 = \frac{-0.1945}{0.1280} \approx -1.52$$

From Table A.3, z0.025 = 1.96. Since |Z0| < 1.96, the hypothesis of independence is not rejected.

Shortcomings:

The test is not very sensitive for small values of M, particularly when the numbers being tested are on the low side.

Problem when "fishing" for autocorrelation by performing numerous tests:

If α = 0.05, there is a probability of 0.05 of rejecting a true hypothesis.

If 10 independent sequences are examined,

The probability of finding no significant autocorrelation, by chance alone, is 0.95^10 = 0.60.

Hence, the probability of detecting significant autocorrelation when it does not exist = 40%
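The autocorrelation test can be sketched in Python as below. The function name `autocorrelation_test` is illustrative. Because the full 30-number output from p. 265 is not reproduced in these notes, the usage line feeds in only the six subsequence values R3, R8, ..., R28 from the example and treats them as a sequence with i = 1 and lag m = 1, which gives the same M = 4 and the same products.

```python
import math


def autocorrelation_test(numbers, i, m):
    """Return (rho_hat, sigma_hat, z0) for start index i (1-based) and lag m."""
    n = len(numbers)
    big_m = (n - i) // m - 1                 # largest M with i + (M+1)m <= N
    products = sum(
        numbers[i - 1 + k * m] * numbers[i - 1 + (k + 1) * m]
        for k in range(big_m + 1)
    )
    rho_hat = products / (big_m + 1) - 0.25
    sigma_hat = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    return rho_hat, sigma_hat, rho_hat / sigma_hat


subsequence = [0.23, 0.28, 0.33, 0.27, 0.05, 0.36]
rho_hat, sigma_hat, z0 = autocorrelation_test(subsequence, i=1, m=1)
print(round(rho_hat, 4), round(sigma_hat, 3), round(z0, 2))
print("reject H0" if abs(z0) > 1.96 else "do not reject H0")

# The multiple-testing caution: chance of at least one false alarm in 10 tests.
print(round(1 - 0.95 ** 10, 2))
```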

Summary

In this chapter, we described:

Generation of random numbers

Testing for uniformity and independence

Caution:

Even generators that have been used for years, some of which are still in use, have been found to be inadequate.

This chapter provides only the basics.

Also, even if generated numbers pass all the tests, some underlying pattern might have gone undetected.

CHAPTER 8: Random Variate Generation

Inverse-transform Technique

For cdf function: r = F(x)

Generate r from uniform (0,1)

Find x:

$$x = F^{-1}(r)$$

[Figure: the cdf curve r = F(x), with a value r1 mapped through the curve to x1 = F^{-1}(r1)]

Exponential Distribution

Exponential cdf:

$$r = F(x) = 1 - e^{-\lambda x}, \quad x \ge 0$$

To generate X1, X2, X3, ...:

$$X_i = F^{-1}(R_i) = -\frac{1}{\lambda} \ln(1 - R_i)$$

Example: Generate 200 variates Xi with distribution exp(λ = 1)

Generate 200 Rs with U(0,1) and apply the equation above to obtain the Xs.

[Figure: histogram of the generated Xs]

Check: Does the random variable X1 have the desired distribution?

$$P(X_1 \le x_0) = P(R_1 \le F(x_0)) = F(x_0)$$
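The exponential case can be tried out with a few lines of Python. This is a sketch only: the function name `exponential_variate` and the seed 12345 are arbitrary choices, and the sample mean is printed merely as a rough plausibility check against the theoretical mean 1/λ = 1.

```python
import math
import random


def exponential_variate(rate, r):
    """Inverse-transform step: X = -(1/rate) * ln(1 - R) for R in [0, 1)."""
    return -math.log(1.0 - r) / rate


rng = random.Random(12345)                      # fixed seed so the run is repeatable
xs = [exponential_variate(1.0, rng.random()) for _ in range(200)]
sample_mean = sum(xs) / len(xs)
print(round(sample_mean, 3))
```

`random.random()` returns values in [0, 1), so 1 - R is never zero and the logarithm is always defined.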

Other Distributions

Examples of other distributions for which inverse cdf works are:

1. Uniform distribution

2. Weibull distribution

3. Triangular distribution

Empirical Continuous Distribution

When theoretical distribution is not applicable

To collect empirical data:

Resample the observed data

Interpolate between observed data points to fill in the gaps

For a small sample set (size n):

Arrange the data from smallest to largest: $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$

Assign the probability 1/n to each interval $x_{(i-1)} \le x \le x_{(i)}$

$$X = \hat{F}^{-1}(R) = x_{(i-1)} + a_i \left( R - \frac{i-1}{n} \right)$$

where

$$a_i = \frac{x_{(i)} - x_{(i-1)}}{i/n - (i-1)/n} = \frac{x_{(i)} - x_{(i-1)}}{1/n}$$

Example: Suppose the data collected for 100 broken-widget repair times are:

i   Interval (Hours)    Frequency   Relative Frequency   Cumulative Frequency, ci   Slope, ai
1   0.25 ≤ x ≤ 0.5      31          0.31                 0.31                       0.81
2   0.5 ≤ x ≤ 1.0       10          0.10                 0.41                       5.0
3   1.0 ≤ x ≤ 1.5       25          0.25                 0.66                       2.0
4   1.5 ≤ x ≤ 2.0       34          0.34                 1.00                       1.47

Consider R1 = 0.83:

c3 = 0.66 < R1 < c4 = 1.00

X1 = x(4-1) + a4 (R1 - c(4-1)) = 1.5 + 1.47(0.83 - 0.66) = 1.75
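The table lookup and interpolation for the repair-time example can be sketched in Python as follows. The names `endpoints`, `cumulative` and `empirical_variate` are illustrative; the numbers are the interval endpoints and cumulative frequencies from the table, and the slope is recomputed from them rather than read from the rounded slope column.

```python
import bisect

endpoints = [0.25, 0.5, 1.0, 1.5, 2.0]        # x_0 .. x_4 (interval boundaries, hours)
cumulative = [0.0, 0.31, 0.41, 0.66, 1.00]    # c_0 .. c_4 (cumulative frequencies)


def empirical_variate(r):
    """Interpolate within the interval i that satisfies c_(i-1) < r <= c_i."""
    i = bisect.bisect_left(cumulative, r, lo=1)          # smallest i >= 1 with c_i >= r
    i = min(i, len(cumulative) - 1)
    slope = (endpoints[i] - endpoints[i - 1]) / (cumulative[i] - cumulative[i - 1])
    return endpoints[i - 1] + slope * (r - cumulative[i - 1])


print(round(empirical_variate(0.83), 2))
```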



Discrete Distribution

All discrete distributions can be generated via inverse-transform technique

Method: numerically, table-lookup procedure, algebraically, or a formula
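The table-lookup procedure can be sketched in Python as below. The function name `discrete_variate` is illustrative, and the probabilities 0.5, 0.3, 0.2 for the values 0, 1, 2 are made-up placeholder values used only to exercise the lookup.

```python
def discrete_variate(values, probabilities, r):
    """Return the first value whose cumulative probability is at least r."""
    cumulative = 0.0
    for value, p in zip(values, probabilities):
        cumulative += p
        if r <= cumulative:
            return value
    return values[-1]   # guard against rounding in the cumulative sum


values = [0, 1, 2]
probabilities = [0.5, 0.3, 0.2]   # placeholder probabilities (an assumption)
print([discrete_variate(values, probabilities, r) for r in (0.2, 0.73, 0.95)])
```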

Example: Suppose the number of shipments, x, on the loading dock of IHW company is 0, 1, or 2. Internal consultants have been asked to improve the efficiency of the loading and hauling operation.

Data - Probability distribution:

F (x) is given by:

0, x



Step 1. Generate R ~ U[0,1]

Step 2a. If R >= 1/4, accept X = R.

Step 2b. If R < 1/4, reject R and return to Step 1.

Step 3: If another RV is required, return to Step 1

R does not have the desired distribution, but R conditioned (R') on the event {R >= 1/4} does.

Efficiency: Depends heavily on the ability to minimize the number of rejections.
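The accept/reject loop of Steps 1 to 3 can be sketched in Python as follows. The function name `uniform_above` is illustrative, and the threshold is passed as a parameter; the value 1/4 in the usage lines is an assumption made for the illustration. The function also returns the number of trials so the efficiency remark can be checked empirically.

```python
import random


def uniform_above(threshold, rng):
    """Return (X, trials): X is R conditioned on the event R >= threshold."""
    trials = 0
    while True:
        trials += 1
        r = rng.random()          # Step 1: generate R ~ U[0, 1)
        if r >= threshold:        # Step 2a: accept
            return r, trials
        # Step 2b: reject R and go around the loop again


rng = random.Random(1)
results = [uniform_above(0.25, rng) for _ in range(5000)]
mean_trials = sum(trials for _, trials in results) / len(results)
print(round(mean_trials, 2))      # should be near 1 / (1 - 0.25)
```

Each trial is accepted with probability 1 - threshold, so the expected number of trials per variate is 1/(1 - threshold); a higher rejection probability directly raises the cost.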



For efficient generation purposes, relation (5.30) is usually simplified by first using Equation (5.3), $A_i = -(1/\alpha) \ln R_i$, to obtain $\sum_{i=1}^{n} -(1/\alpha) \ln R_i$ ... $e^{-\alpha}$ in step 3, then n is rejected and the generation process must proceed through at least one more trial.

How many random numbers will be required, on the average, to generate one Poisson variate, N? If N = n, then n+1 random numbers are required, so the average number is given by

$$E(N + 1) = \alpha + 1$$

which is quite large if the mean, α, of the Poisson distribution is large.
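The count of n+1 random numbers per variate can be checked with a sketch of the usual product-of-uniforms procedure for Poisson variates. That this is the procedure the text's "step 3" refers to is an assumption; the function name `poisson_variate` and the mean α = 4 are illustrative choices.

```python
import math
import random


def poisson_variate(alpha, rng):
    """Return (n, used): a Poisson(alpha) variate and the count of random numbers used."""
    limit = math.exp(-alpha)
    n = 0
    product = rng.random()        # product of the first random number
    used = 1
    while product >= limit:       # current n is rejected: one more trial is needed
        n += 1
        product *= rng.random()
        used += 1
    return n, used


rng = random.Random(2024)
draws = [poisson_variate(4.0, rng) for _ in range(20000)]
mean_n = sum(n for n, _ in draws) / len(draws)
mean_used = sum(used for _, used in draws) / len(draws)
print(round(mean_n, 2), round(mean_used, 2))   # near alpha and alpha + 1
```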

Example 4:



NSPP: Non-stationary Poisson Process

It is a Poisson arrival process with an arrival rate that varies with time

Idea behind thinning:

1. Generate a stationary Poisson arrival process at the fastest rate, λ* = max λ(t).

2. But accept only a portion of arrivals, thinning out just enough to get the desired time-varying rate

Example: Generate a random variate for a NSPP

[Flowchart: generate E ~ Exp(λ*); set t = t + E; test a condition on R to decide whether the arrival at t is kept]

Procedures:

Data: Arrival Rates

Step 1. λ* = max λ(t) = 1/5, t = 0 and i = 1.

Step 2. For random number R = 0.2130,

E = -5 ln(0.213) = 7.73

t = 7.73

Step 3. Generate R = 0.8830

λ(7.73)/λ* = (1/15)/(1/5) = 1/3

Since R > 1/3, do not generate the arrival

Step 2. For random number R = 0.5530,

E = -5 ln(0.553) = 2.96

t = 7.73 + 2.96 = 10.69

Step 3. Generate R = 0.0240

λ(10.69)/λ* = (1/15)/(1/5) = 1/3

Since R < 1/3, generate the arrival at t = 10.69
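The thinning procedure can be sketched in Python as below. The function names and the piecewise-constant rate in `example_rate` are made up for illustration and are not the arrival-rate data of the example; only the maximum rate 1/5 and the value 1/15 are borrowed from it. The candidate gaps use ln(1 - R) instead of ln(R) so that the logarithm is defined even when the generator returns exactly 0.

```python
import math
import random


def nspp_arrivals(rate, max_rate, horizon, rng):
    """Arrival times up to `horizon` of a non-stationary Poisson process, by thinning."""
    t = 0.0
    arrivals = []
    while True:
        # Candidate gap E ~ Exp(max_rate); 1 - random() lies in (0, 1].
        t += -math.log(1.0 - rng.random()) / max_rate
        if t > horizon:
            return arrivals
        # Keep the candidate with probability rate(t) / max_rate.
        if rng.random() <= rate(t) / max_rate:
            arrivals.append(t)


def example_rate(t):
    """Made-up rate function: 1/15 before t = 60, 1/5 afterwards."""
    return 1 / 15 if t < 60 else 1 / 5


arrivals = nspp_arrivals(example_rate, max_rate=1 / 5, horizon=120.0, rng=random.Random(7))
print(len(arrivals), [round(t, 2) for t in arrivals[:3]])
```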



Consider two standard normal random variables, Z1 and Z2, plotted as a point in the plane:

In polar coordinates:

Z1 = B cos φ

Z2 = B sin φ

B² = Z1² + Z2² ~ chi-square distribution with 2 degrees of freedom, which is an exponential distribution with mean 2. Hence,

$$B = (-2 \ln R)^{1/2}$$

The radius B and angle φ are mutually independent.

2. Approach for normal(μ, σ²):

Generate Zi ~ N(0,1)

Xi = μ + σ Zi

3. Approach for lognormal(μ, σ²):

Generate X ~ N(μ, σ²)

Yi = e^Xi
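The three approaches can be sketched together in Python. The function names are illustrative, and taking the angle as 2π times a second uniform random number is an assumption of this sketch (the standard choice when the angle is uniform on the circle and independent of the radius).

```python
import math
import random


def standard_normal_pair(r1, r2):
    """Map two uniforms (r1 in (0, 1], r2 in [0, 1)) to two N(0, 1) variates."""
    b = math.sqrt(-2.0 * math.log(r1))      # radius: B^2 is exponential with mean 2
    phi = 2.0 * math.pi * r2                # angle, assumed uniform on [0, 2*pi)
    return b * math.cos(phi), b * math.sin(phi)


def normal_variate(mu, sigma, z):
    """X = mu + sigma * Z."""
    return mu + sigma * z


def lognormal_variate(mu, sigma, z):
    """Y = exp(X) with X ~ N(mu, sigma^2)."""
    return math.exp(normal_variate(mu, sigma, z))


rng = random.Random(42)
z1, z2 = standard_normal_pair(1.0 - rng.random(), rng.random())
print(normal_variate(10.0, 2.0, z1), lognormal_variate(0.0, 1.0, z2))
```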
