Random Numbers and Statistical Tests at Chuo University on January 10, 2003 by J. C. Lee, Korea University


Random Numbers and Statistical Tests

at Chuo University on January 10, 2003

by J. C. Lee, Korea University

Jae Chang Lee, Korea Univ.

( I ) What is random?

1) “...the sequence of numbers we get when we draw tokens numbered 0, 1, 2, …, 9 from a bag one by one, each time after replacing the token drawn and thoroughly shuffling the bag. Such sequences are supposed to exhibit maximum uncertainty (chaos or entropy) in the sense that, given the past sequence of digits drawn, there is no clue for predicting the outcome of the next draw.”

2) L.H.C. Tippett (1927) produced a book titled Random Sampling Numbers. A book of random numbers! A meaningless and haphazard collection of numbers, neither fact nor fiction.

3) Randomness is a property of an abstract mathematical model that is characterized by probabilities. Random is also used as a synonym for “independent and uniformly distributed”.

The concept of randomness and random numbers has now become one of the most useful tools for simulation, the design of experiments, and information security through the encryption of messages. It has become one of the most important inventions of our time in the information society.

A simple example of random digits

Used as a source key for secure message transmission:

Key                0 1 0 0 0 1 1   (random digits)
Message            1 0 1 1 0 0 1   (sender's message)
Encrypted message  1 1 1 1 0 1 0   (transmitted message)
Key                0 1 0 0 0 1 1   (same random digits)
Recovered message  1 0 1 1 0 0 1   (by receiver)

The quality of the random digits, that is, a good KEY, guarantees the privacy of transactions.
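The digits in this example are consistent with combining message and key by addition modulo 2 (bitwise XOR); the slide does not name the operation, so that reading is an assumption. Under it, a minimal sketch of the round trip looks like this (the function name `xor_bits` is illustrative):

```python
def xor_bits(bits, key):
    """XOR two equal-length sequences of 0/1 digits, position by position."""
    if len(bits) != len(key):
        raise ValueError("message and key must have the same length")
    return [b ^ k for b, k in zip(bits, key)]


key = [0, 1, 0, 0, 0, 1, 1]        # random digits shared by sender and receiver
message = [1, 0, 1, 1, 0, 0, 1]    # sender's message

encrypted = xor_bits(message, key)     # transmitted message
recovered = xor_bits(encrypted, key)   # receiver applies the same key again

print(encrypted)   # [1, 1, 1, 1, 0, 1, 0]
print(recovered)   # [1, 0, 1, 1, 0, 0, 1]
```

Applying the same key twice cancels out, which is why the receiver recovers the message exactly.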

( II ) The Quality of Random Numbers

1) The quality depends on the method of generating the random numbers.

• Early pioneers used “TOP-HAT” methods.

• The use of computer-generated random numbers has now become commonplace.

• Numbers generated artificially in this way are not random but deterministic, because they come from a deterministic pseudostochastic process.

2) Are the decimal digits of π random? Y. Dodge (1996):

• Technically speaking, a random sequence of symbols is a sequence which cannot be recorded by means of an algorithm in a form shorter than the sequence itself. In this sense the sequence of decimal digits of π does not form a random sequence.

Digits       0    1    2    3    4    5    6    7    8    9
Frequency   93  116  103  102   93   97   94   95  101  106
Expected   100  100  100  100  100  100  100  100  100  100

No. of odd digits      0      1      2      3      4      5
Frequency              7     31     54     61     41      6
Expected            6.25  31.25  62.50  62.50  31.25   6.25
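The two frequency tables above can be summarized with Pearson's chi-square statistic, the sum of (observed - expected)^2 / expected over the cells. The sketch below is an illustration, not a computation shown on the slides; the helper name `pearson_chi_square` is hypothetical, and the frequencies are read from the tables with digits 0 to 9 and 0 to 5 odd digits in ascending order.

```python
def pearson_chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Digit frequencies for digits 0..9, each with expected count 100.
digit_freq = [93, 116, 103, 102, 93, 97, 94, 95, 101, 106]
digit_expected = [100] * 10

# Number of odd digits (0..5) per group, with the expected counts from the table.
odd_freq = [7, 31, 54, 61, 41, 6]
odd_expected = [6.25, 31.25, 62.50, 62.50, 31.25, 6.25]

chi2_digits = pearson_chi_square(digit_freq, digit_expected)   # 9 degrees of freedom
chi2_odd = pearson_chi_square(odd_freq, odd_expected)          # 5 degrees of freedom

print(round(chi2_digits, 3), round(chi2_odd, 3))   # 4.74 4.336
```

Both values lie well below the usual 5% chi-square critical values (about 16.92 for 9 degrees of freedom and 11.07 for 5), so neither table by itself signals a departure from uniformity.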

3) Performance check

a) Theoretical tests: tests that can be applied to the generator itself without obtaining samples of its output. Ex) Period test, spectral test.

b) Empirical tests (statistical tests): tests that are applied to samples of the generated output.

c) Desirable characteristics of random numbers, or defects, for the intended use.

d) Departure from uniformity.

e) Departure from independence can occur in many different ways.

f) Most statistical tests are designed to detect particular departure patterns relevant to the intended applications.

4) Statistical tests of whether a sample $X_1, \ldots, X_n$ is iid from U(0,1) or uniform on a discrete set S.

a) Kolmogorov-Smirnov test: tests for static departures from the hypothesized distribution.

$D = \sup_x |F_n(x) - F(x)|$

b) Digit frequency test (chi-square test of goodness of fit): compares the number of occurrences of each digit in a sequence with the expected number for that digit.
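As an illustration of the Kolmogorov-Smirnov statistic in a), the sketch below computes D against the U(0,1) cdf, F(x) = x. Because the empirical cdf is a step function, the supremum is reached at a sample point, just before or just after the jump. The function name and the three sample values are made up for the example.

```python
def ks_statistic_uniform(sample):
    """D = sup_x |F_n(x) - F(x)| for the U(0,1) cdf F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The empirical cdf jumps from (i-1)/n to i/n at x, so check both sides.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d


sample = [0.1, 0.4, 0.7]            # illustrative values, not from the slides
d = ks_statistic_uniform(sample)
print(round(d, 4))                  # 0.3
```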

c) Serial test: The digit frequency test can also be applied to the frequency of each pair of digits, or triplets and so on, thus attempting to test for serial correlations.

d) Gap test: This test is also a chi-square test, comparing the number of intervening digits between successive occurrences of a given digit with the expected number.

Ex.) 5, 3, 2, 4, 9, 5, 6, 5, 7, 3, 9, 1, 0, 5, …
Consider the digit 5 in the above sequence. The gaps are then of length 4, 1, 5, and so on. The distribution of the gap length is geometric.
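The gap lengths in this example can be reproduced with a short sketch. The function names are illustrative, and the geometric probabilities assume independent decimal digits, each occurring with probability 1/10.

```python
def gaps(sequence, digit):
    """Numbers of intervening digits between successive occurrences of `digit`."""
    positions = [i for i, d in enumerate(sequence) if d == digit]
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


def gap_probability(k, p=0.1):
    """P(gap length = k) under the geometric distribution with success probability p."""
    return (1 - p) ** k * p


digits = [5, 3, 2, 4, 9, 5, 6, 5, 7, 3, 9, 1, 0, 5]
print(gaps(digits, 5))   # [4, 1, 5]
```

In a gap test the observed gap lengths would be tallied into classes and compared with the counts expected from `gap_probability` by a chi-square statistic.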

e) Poker test: The test considers various patterns analogous to poker hands in groups of five consecutive digits in a sample. The number of groups in which all digits are distinct, and the numbers containing exactly one pair, two pairs, three of a kind, a full house, four of a kind, and five of a kind, are compared by a chi-square test with the expected number of each type in a sample of given size.

f) Coupon collector's test: This test involves counting the length of the sequences of digits required to obtain a complete set of all digits and comparing these counts with the expected counts. Greenwood (1955) gave the probability function for digits in base 10.
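For the poker test in e), the expected proportion of each hand type can be obtained by enumerating all 10^5 equally likely groups of five decimal digits. This is a sketch under that equal-likelihood assumption; the function name `poker_hand` and the hand labels are illustrative.

```python
from collections import Counter
from itertools import product


def poker_hand(group):
    """Classify a group of five digits by its pattern of repeated digits."""
    pattern = tuple(sorted(Counter(group).values(), reverse=True))
    return {
        (1, 1, 1, 1, 1): "all distinct",
        (2, 1, 1, 1): "one pair",
        (2, 2, 1): "two pairs",
        (3, 1, 1): "three of a kind",
        (3, 2): "full house",
        (4, 1): "four of a kind",
        (5,): "five of a kind",
    }[pattern]


# Enumerate every possible group of five digits to get exact proportions.
hand_counts = Counter(poker_hand(g) for g in product(range(10), repeat=5))
probabilities = {hand: count / 10**5 for hand, count in hand_counts.items()}

for hand, prob in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{hand:16s} {prob:.4f}")
```

Multiplying these proportions by the number of groups in a sample gives the expected counts used in the chi-square comparison.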

g) Permutation test: Consider the t! possible orderings of a partial sequence $(X_{jt}, X_{jt+1}, \ldots, X_{jt+t-1})$. Count the number of partial sequences falling in each ordering for n groups of partial sequences. The probability that a given partial sequence falls in a specific category is 1/t!. Compute the chi-square value based on the t! categories (df = t! - 1).

h) Run test: One important departure from randomness is a propensity for long monotonic subsequences to occur. Run tests are designed to detect nonrandom behavior of this type. A run is a monotonic subsequence.

"run up" - increasing subsequence. "run down" - decreasing subsequence.

The number of runs is used for the test.
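To make the notion of a "run up" concrete, the sketch below splits a sequence into maximal strictly increasing runs and counts them. The function name and the ten-digit example sequence are illustrative and do not come from the slides.

```python
def runs_up(sequence):
    """Split a sequence into maximal strictly increasing runs ("runs up")."""
    if not sequence:
        return []
    runs = [[sequence[0]]]
    for previous, current in zip(sequence, sequence[1:]):
        if current > previous:
            runs[-1].append(current)   # the current run keeps increasing
        else:
            runs.append([current])     # a new run starts here
    return runs


example = [1, 2, 9, 8, 5, 3, 6, 7, 0, 4]    # illustrative sequence
runs = runs_up(example)
print(runs)                       # [[1, 2, 9], [8], [5], [3, 6, 7], [0, 4]]
print(len(runs))                  # 5
```

Runs down can be found the same way by reversing the comparison.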

i) Maximum test: Let $V_j = \max\{X_{tj}, X_{tj+1}, \ldots, X_{tj+t-1}\}$, $0 \le j \le n$. Under $H_0$: X ~ Uniform, $V_j$ has cdf $F(v) = v^t$, $0 \le v \le 1$. Use the K-S test, $D = \sup_x |F_n(x) - F(x)|$.

j) Collision test: Used when the number of categories is much bigger than the number of observations.
Ex) n = 5, m = 6, c = 2 (collisions)
□ □ □ □ □ □ categories; need n - c = 3 non-empty cells.
Probability: $[\,{}_mP_{n-c} / m^n\,] \times {}_nC_{n-c}$

( III ) Tree Structure Method (TSM)

1) Construction of a tree by an example.

Given a sample binary sequence of length n+2 = 22, {0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1}, construct a TSM tree with step m = 3.

i) From the first 20 digits, count the zeros and the ones. There are 10 zeros and 10 ones.

ii) From the first 21 digits, make 20 groups of 2 consecutive digits, with one digit overlapping. They form two-digit binary numbers as follows.

0 0 (0) in decimal
0 1 (1)
1 0 (2)
:
0 1 (1)
1 1 (3)
1 0 (2)

Counting the pairs, there are
(0, 0) → 2
(0, 1) → 8
(1, 0) → 8
(1, 1) → 2

iii) Using all 22 binary digits, form 20 overlapping groups of 3 digits.

0 0 1 (1) in decimal
0 1 0 (2)
:
0 1 1 (3)
1 1 0 (6)
1 0 1 (5)

Counting the triples, there are
(0, 0, 0) → 0
(0, 0, 1) → 2
(0, 1, 0) → 6
(0, 1, 1) → 2
(1, 0, 0) → 1
(1, 0, 1) → 7
(1, 1, 0) → 2
(1, 1, 1) → 0

iv) Construct a Tree

Start
  0 (10)
    0 (2)
      0 (0)
      1 (2)
    1 (8)
      0 (6)
      1 (2)
  1 (10)
    0 (8)
      0 (1)
      1 (7)
    1 (2)
      0 (2)
      1 (0)

(Note) m = number of steps, using (n+m-1) binary digits. At the m-th step the expected number in each cell is $n/2^m$.
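The counts at each step of the tree can be reproduced by tallying overlapping patterns of length 1, 2, …, m over the first n window positions. The sketch below does this for the 22-digit example with m = 3; the function name `tsm_counts` is illustrative, not the author's.

```python
from collections import Counter


def tsm_counts(bits, m):
    """Counts of overlapping k-digit patterns for k = 1..m, using n windows each."""
    n = len(bits) - m + 1              # the sequence has n + m - 1 digits
    levels = []
    for k in range(1, m + 1):
        windows = (tuple(bits[i:i + k]) for i in range(n))
        levels.append(Counter(windows))
    return levels


bits = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1]
levels = tsm_counts(bits, m=3)

for k, counts in enumerate(levels, start=1):
    n = sum(counts.values())
    # Patterns that never occur (such as 000 here) are simply absent from the dict.
    print(k, dict(sorted(counts.items())), "expected per cell:", n / 2**k)
```

Each level sums to n = 20, so the expected cell counts are 10, 5, and 2.5 at steps 1, 2, and 3.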

2) Test Statistics

At each stage i,

where $n_0$ and $n_1$ are the observed counts of 0 and 1, and $np_{i0}$ and $np_{i1}$ are the respective expected counts ($j = 1, 2, \ldots, 2^{i-1}$).

$X_{ij}$ follows a chi-squared distribution with df = 1. Hence, we obtain $2^{i-1}$ values of $X_{ij}$.

Let $Y_i = \max\{X_{ij} ; j = 1, 2, \ldots, 2^{i-1}\}$.
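The expression for $X_{ij}$ is not reproduced above. Assuming it has the usual Pearson form in the observed and expected counts just named, which is consistent with the stated chi-squared distribution with df = 1, it could be written as:

```latex
X_{ij} = \frac{(n_0 - n p_{i0})^2}{n p_{i0}} + \frac{(n_1 - n p_{i1})^2}{n p_{i1}},
\qquad j = 1, 2, \ldots, 2^{i-1}
```

Here $n_0$ and $n_1$ are the counts on the 0 and 1 branches of the j-th node at stage i. This is a reconstruction under the stated assumption, not a formula taken from the slides.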

3) Distribution of test statistic Y

Let $X_i \sim \chi^2(1)$ for each i and $Y = \max\{X_i ; i = 1, \ldots, n\}$. The cdf of Y is given by

Let $t = x^2$; then,

Hence,

So, for a given value of p, we can find y by taking the inverse of the standard normal cdf.
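The formulas in this derivation are not reproduced above. Assuming the $X_i$ are independent, the steps can be written out as follows, where $\Phi$ denotes the standard normal cdf. This is a sketch consistent with the surrounding statements rather than the slide's own typesetting.

```latex
% cdf of the maximum of n independent chi-squared(1) variables
F_Y(y) = P(X_1 \le y, \ldots, X_n \le y)
       = \left[ \int_0^{y} \frac{1}{\sqrt{2\pi t}}\, e^{-t/2}\, dt \right]^{n}

% substitution t = x^2, dt = 2x\,dx
\int_0^{y} \frac{1}{\sqrt{2\pi t}}\, e^{-t/2}\, dt
  = 2 \int_0^{\sqrt{y}} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx
  = 2\,\Phi(\sqrt{y}) - 1

% hence
F_Y(y) = \left[ 2\,\Phi(\sqrt{y}) - 1 \right]^{n}

% solving F_Y(y) = p for y
y = \left[ \Phi^{-1}\!\left( \frac{1 + p^{1/n}}{2} \right) \right]^{2}
```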

In Our Example…

4) Alternative Test Statistic

5) Power Comparison

Consider a Markov transition matrix

Randomness implies a = b = 1/2, and the n-th transition is given by

A Monte Carlo simulation was performed at α = 0.005 to estimate the power of the tests. It was repeated 500 times.