BULETINUL INSTITUTULUI POLITEHNIC DIN IAŞI Publicat de
Universitatea Tehnică „Gheorghe Asachi” din Iaşi Tomul LVIII (LXII), Fasc. 2, 2012
Secţia AUTOMATICĂ şi CALCULATOARE
AN OPTIMISED IMPLEMENTATION OF MWC PSEUDO-
RANDOM NUMBER GENERATOR ANALYSED WITH THE
NIST STATISTICAL TEST SUITE FOR RANDOMNESS
BY
MARIAN CREŢU∗
“Politehnica” University of Bucharest, Faculty of Electronics, Telecommunications and Information Technology
Received: April 5, 2012 Accepted for publication: May 7, 2012
Abstract. This paper analyses the efficiency of a customized
implementation of the multiply-with-carry (MWC) pseudo-random number generator (PRNG) created by George Marsaglia. The algorithm was modified to run two MWC generators in parallel and to apply an XOR function to their individual outputs to obtain the final pseudo-random output. The implementation aims to demonstrate that applying the bitwise XOR function to two pseudo-random values yields a value that is also pseudo-random. The randomness of the generators was analysed statistically with the test battery created by the U.S. National Institute of Standards and Technology (NIST).
Key words: NIST; MWC; PRNG; statistical analysis; entropy; P-value.
2010 Mathematics Subject Classification: 54C70, 46N30, 65C10.
∗Corresponding author; e-mail: [email protected]
Marian CreŃu
1. Introduction
Random Number Generators (RNGs) are used in a wide spectrum of
applications such as neural networks, image processing, genetic algorithms, online gambling, video games and secure communications. The two main classes of RNGs are the Truly Random Number Generators (TRNGs) and the Pseudo-Random Number Generators (PRNGs). While the former use truly random factors from physical phenomena such as space noise, jittered oscillators or other unpredictable sources as inputs, the latter use deterministic algorithms whose inputs are predefined seeds or values obtained from the state of the machine on which they run (such as the system clock, mouse movement or other processes running at that moment on that machine) (Ferguson & Schneier, 2003; Viega, 2003).
Geographic information system (GIS) applications use PRNGs to calculate the impact of error (Holmes et al., 2000; Næsset, 1999), to realize dynamic modeling (Burrough et al., 2000) or for stochastic simulation (Goovaerts, 2000). Radio frequency identification (RFID) devices use them in security features for password-protected operations as in (Che et al., 2008). TRNGs were successfully implemented in field programmable gate arrays (FPGAs) as described in (Tsoi et al., 2003; Kohlbrenner & Gaj, 2004).
2. PRNGs in Cryptography
Cryptography is a domain where RNGs are extremely important: many
applications and protocols use random and pseudo-random numbers as encryption keys for messages, parameters in digital signatures, initialization vectors or session keys for stream ciphers. Some of the best-known PRNGs are:
a) Linear Congruential Generator (LCG); b) Combined Linear Congruential Generator (CLCG); c) Shuffled Multiple LCG (SMLCG); d) Recursive Linear Congruential Generator (RLCG); e) Multiple Recursive Generators (MRG); f) Inverse Congruential Generator (ICG); g) Combined Inverse Congruential Generators (CICG); h) Multiply With Carry (MWC); i) Mersenne Twister (MT); j) Linear Feedback Shift Register (LFSR); k) Generalized Feedback Shift Register (GFSR); l) Twisted Multiple Generalized Feedback Shift Register (TMGFSR); m) Primitive Trinomial Generators (PTG); n) Fibonacci Generators (Fib and Luxury); o) Combined Hybrid Generators; p) Blum Blum Shub Generator (BBS).
Bul. Inst. Polit. Iaşi, t. LVIII (LXII), f. 2, 2012
2.1. Attacks on PRNGs
Because of their importance in encryption processes, both software and hardware generators must withstand a wide range of attacks, classified in (Kelsey et al., 1998):
a) Direct Cryptanalytic Attack – the attacker is able to distinguish the outputs of the PRNG from truly random outputs;
b) Input-Based Attacks – the attacker controls the inputs and outputs of the PRNG and has the knowledge to analyse it;
c) State Compromise Extension Attacks – the attacker tries to extend the advantage gained from a previous successful attack that recovered the state S:
– Backtracking Attacks – this type of attack uses a weakness of the PRNG, taking advantage of the state S at a moment t to calculate previous outputs;
– Permanent Compromise Attacks – once the state S is compromised at a moment t, all the other values of S are vulnerable;
– Iterative Guessing Attacks – for known S values at the moment t, the S values at the moment t + x can be predicted by observing the PRNG outputs;
– Meet-in-the-Middle Attacks – a combination of the backtracking attack and the iterative guessing attack: knowing the S values at the moments t and t + 2x allows the attacker to calculate the S values at the moment t + x.
2.2. PRNG Requirements
To resist the attacks listed above, PRNGs must fulfill some security
requirements. Here are some guidelines for PRNG developers:
a) PRNGs must be based on something strong – a successful cryptanalytic attack on the PRNG should imply breaking a cryptographic primitive considered strong;
b) “Catastrophic reseeding” of the PRNG is preferred – the entropy pool must be separated from the internal generation state. To resist iterative guessing attacks, the generation state has to be modified only when enough entropy has been accumulated;
c) Resistance to backtracking – for a compromised state S at the moment t + 1, the state from the moment t must be impossible to guess. Passing the state through a one-way function every few outputs is recommended;
d) Change of the entire PRNG state over time – this measure prevents the gradual compromise of the internal state of the PRNG;
e) Quick recovery from compromises – every bit of entropy received from the inputs must be used to the PRNG's advantage, forcing the attacker to analyse the entire input sequence;
f) Resistance to chosen-input attacks – PRNG inputs must be combined into the state so that attackers who know the PRNG state without knowing the inputs, or vice versa, are unable to guess the final state.
2.3. NIST Statistical Battery Test
NIST SP 800-22 is a standard published by the U.S. National Institute of Standards and Technology; it originally included a suite of 16 tests to determine the quality of PRNGs. In the final revised edition (NIST S.P. 800-22, 2010), the Lempel-Ziv test was dropped, leaving 15 statistical tests for randomness:
2.3.1. Frequency Test
The purpose of this test is to determine the proportion of “1” and
“0” values within the given sequence. If each of the two values accounts for approximately 1/2 of the total number of elements, the sequence is considered random (Pitman, 1993). The test consists of:
a) Conversion to ±1: the “0” and “1” values of the sequence ε are converted to “–1” and “+1” and added to produce $S_n = X_1 + X_2 + \dots + X_n$, where $X_i = 2\varepsilon_i - 1$.
b) Computation of $S_{obs} = |S_n|/\sqrt{n}$.
c) Computation of P-value = $\mathrm{erfc}(S_{obs}/\sqrt{2})$, where
$$\mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_z^{\infty} e^{-u^2}\,du \qquad (1)$$
is the complementary error function.
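Steps a) to c) can be sketched in a few lines of Python. This is a minimal illustration rather than the sts-1.8 code; the function name `frequency_test` and the list-of-bits input format are assumptions made for the example.

```python
import math

def frequency_test(bits):
    """Monobit frequency test: returns the P-value for a list of 0/1 values."""
    n = len(bits)
    s_n = sum(2 * b - 1 for b in bits)        # step a): map 0 -> -1, 1 -> +1 and add
    s_obs = abs(s_n) / math.sqrt(n)           # step b)
    return math.erfc(s_obs / math.sqrt(2))    # step c), using Eq. (1)

bits = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]         # short sample sequence, S_n = 2
print(f"P-value = {frequency_test(bits):.6f}")   # about 0.527
```

A perfectly balanced sequence gives S_n = 0 and therefore a P-value of exactly 1.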
2.3.2. Frequency Test within a Block
This test computes the proportion of “1” values within M-bit blocks. If the frequency of ones in every M-bit block is approximately M/2, the sequence is considered random. The χ2 distribution is used as the reference distribution for the test statistic (Knuth, 1998).
Test description:
a) The input sequence is divided into $N = \lfloor n/M \rfloor$ non-overlapping blocks.
b) Computation of the proportion $\pi_i$ of “1” values for every M-bit block
$$\pi_i = \frac{1}{M} \sum_{j=1}^{M} \varepsilon_{(i-1)M+j}. \qquad (2)$$
c) Computation of the χ2 statistic
$$\chi^2(\mathrm{obs}) = 4M \sum_{i=1}^{N} \left(\pi_i - \frac{1}{2}\right)^2. \qquad (3)$$
d) Computation of P-value = igamc(N/2, χ2(obs)/2), where igamc is the
complemented (upper regularized) incomplete gamma function.
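The block frequency steps can be sketched as follows. To keep the example free of third-party libraries, `igamc` is a hypothetical helper that evaluates the upper regularized incomplete gamma function only for a = 1/2, 1, 3/2, … (which is all that N/2 can be), starting from Q(1/2, x) = erfc(√x) or Q(1, x) = e^(-x) and applying the standard recurrence; it is a sketch, not the routine used by sts-1.8.

```python
import math

def igamc(a, x):
    """Upper regularized incomplete gamma Q(a, x) for a = 1/2, 1, 3/2, ..."""
    if x <= 0:
        return 1.0
    if round(2 * a) % 2:                       # half-integer a: start from Q(1/2, x)
        s, q = 0.5, math.erfc(math.sqrt(x))
    else:                                      # integer a: start from Q(1, x)
        s, q = 1.0, math.exp(-x)
    while s < a - 1e-9:                        # Q(s+1, x) = Q(s, x) + x^s e^-x / Gamma(s+1)
        q += math.exp(s * math.log(x) - x - math.lgamma(s + 1))
        s += 1.0
    return q

def block_frequency_test(bits, M):
    N = len(bits) // M                                            # step a)
    pis = [sum(bits[i * M:(i + 1) * M]) / M for i in range(N)]    # step b), Eq. (2)
    chi2 = 4 * M * sum((p - 0.5) ** 2 for p in pis)               # step c), Eq. (3)
    return igamc(N / 2, chi2 / 2)                                 # step d)

# Blocks 011, 001, 101 (the last bit is discarded): chi2 = 1, P-value about 0.8013
print(block_frequency_test([0, 1, 1, 0, 0, 1, 1, 0, 1, 0], M=3))
```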
2.3.3. Runs Test
This test computes the total number of runs of
uninterrupted identical bits. A run of ones of length k consists of k consecutive ones bounded before and after by zeros. The test also determines whether the oscillation between “1” and “0” values is too fast or too slow, using the total number of runs, Vn(obs), for both zeros and ones (Godbole & Papastavridis, 1994).
Test description:
a) Computation of the proportion π of “1” values in the sequence ε:
$$\pi = \frac{1}{n} \sum_{j} \varepsilon_j. \qquad (4)$$
b) Check whether the frequency test is passed. If it is not (|π – 1/2| ≥ 2/√n), the runs test is not applicable and the P-value is set to 0.0000.
c) Computation of the test statistic
$$V_n(\mathrm{obs}) = \sum_{k=1}^{n-1} r(k) + 1, \qquad (5)$$
where r(k) = 0 for εk = εk+1, and r(k) = 1 otherwise.
d) P-value computation
$$P\text{-}value = \mathrm{erfc}\left(\frac{\left|V_n(\mathrm{obs}) - 2n\pi(1-\pi)\right|}{2\sqrt{2n}\,\pi(1-\pi)}\right). \qquad (6)$$
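A minimal Python sketch of the runs test, following Eqs. (4) to (6), is given below. The function name is an assumption, and the extra guard for a sequence made of a single repeated bit is an addition of this sketch (it avoids a division by zero for very short inputs).

```python
import math

def runs_test(bits):
    """Runs test: returns the P-value for a list of 0/1 values."""
    n = len(bits)
    pi = sum(bits) / n                                              # step a), Eq. (4)
    # step b): prerequisite frequency check; constant sequences are rejected too
    if abs(pi - 0.5) >= 2 / math.sqrt(n) or pi in (0, 1):
        return 0.0
    v_obs = 1 + sum(bits[k] != bits[k + 1] for k in range(n - 1))   # step c), Eq. (5)
    numerator = abs(v_obs - 2 * n * pi * (1 - pi))
    denominator = 2 * math.sqrt(2 * n) * pi * (1 - pi)
    return math.erfc(numerator / denominator)                       # step d), Eq. (6)

# Sample sequence with pi = 0.6 and V_n(obs) = 7; the P-value is about 0.147
print(runs_test([1, 0, 0, 1, 1, 0, 1, 0, 1, 1]))
```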
2.3.4. Test of the Longest Run of ones in a Block
This test focuses on the longest run of “1” values within M-bit blocks.
Its purpose is to determine whether the length of the longest run of ones in ε is consistent with the length expected in a random sequence (Godbole & Papastavridis, 1994). There are three values for the block length M, chosen in accordance with the length n of ε, as presented in Table 1.
Table 1
M Length Values
Minimum n    M
128          8
6272         128
750000       10^4
Test description:
a) Divide ε into N blocks of M bits.
b) Computation of the frequencies vi of the longest runs of ones in each block (Table 2).
Table 2
Frequencies Computation
vi    M = 8    M = 128    M = 10^4
v0    ≤ 1      ≤ 4        ≤ 10
v1    2        5          11
v2    3        6          12
v3    ≥ 4      7          13
v4    -        8          14
v5    -        ≥ 9        15
v6    -        -          ≥ 16
c) Computation of χ2(obs) with the relation
$$\chi^2(\mathrm{obs}) = \sum_{i=0}^{K} \frac{(v_i - N\pi_i)^2}{N\pi_i}, \qquad (7)$$
where the values of K and N depend on the value of M, as presented in Table 3.
Table 3
Values of K and N for the Computation of χ2
M       K    N
8       3    16
128     5    49
10^4    6    75
d) Computation of P-value:
$$P\text{-}value = \mathrm{igamc}\left(\frac{K}{2}, \frac{\chi^2(\mathrm{obs})}{2}\right). \qquad (8)$$
2.3.5. Binary Matrix Rank Test
This test calculates the rank of disjoint sub-matrices of ε, in order to check
for linear dependence among fixed-length substrings of ε (Marsaglia & Tsay, 1985). Here χ2(obs) is a measure of how well the observed numbers of ranks of various orders match the numbers expected under the assumption of randomness.
Test description:
a) Divide ε into N disjoint blocks of M·Q bits, $N = \lfloor n/(MQ) \rfloor$, each block forming an M × Q binary matrix.
b) Determine the binary rank Rl of every matrix in the sequence.
c) Let FM be the number of matrices with Rl = M and FM–1 the number of matrices with Rl = M – 1; then N – FM – FM–1 is the number of remaining matrices.
d) Computation of
$$\chi^2(\mathrm{obs}) = \frac{(F_M - 0.2888N)^2}{0.2888N} + \frac{(F_{M-1} - 0.5776N)^2}{0.5776N} + \frac{(N - F_M - F_{M-1} - 0.1336N)^2}{0.1336N}. \qquad (9)$$
e) Computation of P-value = $e^{-\chi^2(\mathrm{obs})/2}$.
2.3.6. Discrete Fourier Transform Test
This test seeks to detect repetitive patterns, using the Discrete Fourier
Transform (DFT) applied to the sequence ε. It detects deviations from the assumption of randomness for bit patterns. d represents the normalized difference between the observed and the expected number of frequency components with respect to the 95% threshold.
Test description:
a) The “0” and “1” values of ε are transformed into “–1” and “+1” to create the sequence X = x1, x2,…, xn, where xi = 2εi – 1.
b) Apply the DFT to X to compute S = DFT(X).
c) Computation of M = |S′|, the moduli of the elements of S′, where S′ is the substring formed by the first n/2 elements of S; M is a sequence of peak heights.
d) Computation of the 95% peak height threshold T:
$$T = \sqrt{\left(\log\frac{1}{0.05}\right) n}. \qquad (10)$$
The assumption of randomness presumes that 95% of the values are under T.
e) Computation of N0 = 0.95·n/2, where N0 is the expected number of peaks less than T.
f) Computation of N1, the number of peaks in M less than T.
g) Computation of
$$d = \frac{N_1 - N_0}{\sqrt{n(0.95)(0.05)/4}}. \qquad (11)$$
h) Computation of
$$P\text{-}value = \mathrm{erfc}\left(\frac{|d|}{\sqrt{2}}\right). \qquad (12)$$
2.3.7. Non-overlapping Template Matching Test
This test aims to detect generators that produce too many occurrences of a given aperiodic pattern. An
m-bit window is used to search for a specific m-bit pattern B, a string of length m (Barbour, 1992).
Test description:
a) The sequence ε is partitioned into N blocks of length M.
b) Let Wj (j = 1,…,N) be the number of occurrences of B (the template) within block j.
c) Computation of the variance σ2 and the mean µ, under the assumption of randomness:
$$\sigma^2 = M\left(\frac{1}{2^m} - \frac{2m-1}{2^{2m}}\right), \qquad (13)$$
$$\mu = \frac{M - m + 1}{2^m}. \qquad (14)$$
d) Computation of
$$\chi^2(\mathrm{obs}) = \sum_{j=1}^{N} \frac{(W_j - \mu)^2}{\sigma^2}. \qquad (15)$$
e) Computation of
$$P\text{-}value = \mathrm{igamc}\left(\frac{N}{2}, \frac{\chi^2(\mathrm{obs})}{2}\right). \qquad (16)$$
2.3.8. Overlapping Template Matching Test
This test counts the number of occurrences of previously specified
strings. As in the non-overlapping template matching test, it uses an m-bit window to search for a specific m-bit pattern B (a string of length m); the difference is that, when the pattern is found, the window slides only one bit before the search resumes (Hamano & Kaneko, 2007).
Test description:
a) The sequence ε is partitioned into N blocks of length M.
b) Count the number of occurrences of B in each of the N blocks and record the counts by incrementing the entries of an array vi (where i = 0,…,5).
c) Computation of λ and η, used to calculate the probabilities πi corresponding to the classes vi:
$$\lambda = \frac{M - m + 1}{2^m}, \qquad (17)$$
$$\eta = \frac{\lambda}{2}. \qquad (18)$$
d) Computation of
$$\chi^2(\mathrm{obs}) = \sum_{i=0}^{5} \frac{(v_i - N\pi_i)^2}{N\pi_i}. \qquad (19)$$
e) Computation of
$$P\text{-}value = \mathrm{igamc}\left(\frac{5}{2}, \frac{\chi^2(\mathrm{obs})}{2}\right). \qquad (20)$$
2.3.9. Universal Test (Maurer’s Statistical Test)
This test detects whether a sequence can be significantly compressed without
loss of information. A sequence that can be significantly compressed is considered non-random (Gustafson et al., 1994). With L being the length of each block, fn represents the sum of the log2 distances between matching L-bit templates.
Test description:
a) The n-bit sequence ε is divided into two segments: an initialization segment of Q L-bit blocks and a test segment of K L-bit blocks, as shown in Fig. 1 (NIST S.P. 800-22, 2010).
Fig. 1 − The segments for the n-bit sequence ε.
b) A table is created for each possible L-bit value, using the initialization segment; each entry Tj records the number of the last block in which that L-bit value occurred (Table 4).
Table 4
The Table for L-bit Values
Possible L-bit value    00 (T0)    01 (T1)    10 (T2)    11 (T3)
Initialization          0          2          4          0
c) Examination of each of the K blocks of the test segment and determination of the distance, in blocks, to the previous occurrence of the same L-bit value (i – Tj); the table entry is then updated to i.
d) Computation of the test statistic
$$f_n = \frac{1}{K} \sum_{i=Q+1}^{Q+K} \log_2(i - T_j), \qquad (21)$$
where Tj represents the table entry corresponding to the decimal representation of the i-th L-bit block.
e) Computation of
$$P\text{-}value = \mathrm{erfc}\left(\left|\frac{f_n - \mathrm{expectedValue}(L)}{\sqrt{2}\,\sigma}\right|\right), \qquad (22)$$
where σ is the theoretical standard deviation:
$$\sigma = \left(0.7 - \frac{0.8}{L} + \left(4 + \frac{32}{L}\right)\frac{K^{-3/L}}{15}\right)\sqrt{\frac{\mathrm{variance}(L)}{K}}. \qquad (23)$$
2.3.10. Linear Complexity Test
This test focuses on the length of the shortest LFSR (Linear Feedback Shift
Register) that generates the sequence. A short LFSR implies non-randomness. Here, χ2(obs) is a measure of how well the observed number of occurrences of fixed-length LFSRs matches the number expected under the assumption of randomness.
Test description:
a) The n-bit sequence is partitioned into N blocks of M bits.
b) The linear complexity Li of each of the N blocks is determined using the Berlekamp-Massey algorithm (Menezes et al., 1997).
c) Computation of the mean µ, under the assumption of randomness:
$$\mu = \frac{M}{2} + \frac{9 + (-1)^{M+1}}{36} - \frac{M/3 + 2/9}{2^M}. \qquad (24)$$
d) Computation of a value Ti for each block:
$$T_i = (-1)^M (L_i - \mu) + \frac{2}{9}. \qquad (25)$$
e) Record the Ti values in the classes v0,…,v6.
f) Computation of
$$\chi^2(\mathrm{obs}) = \sum_{i=0}^{K} \frac{(v_i - N\pi_i)^2}{N\pi_i}, \qquad (26)$$
where K = 6 for the seven classes v0,…,v6.
g) Computation of
$$P\text{-}value = \mathrm{igamc}\left(\frac{K}{2}, \frac{\chi^2(\mathrm{obs})}{2}\right). \qquad (27)$$
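Steps b) to d) can be illustrated with a compact Berlekamp-Massey routine over GF(2). This is a sketch under the assumption that a block is given as a list of 0/1 values; the function names are illustrative and do not come from sts-1.8.

```python
def linear_complexity(bits):
    """Berlekamp-Massey over GF(2): length of the shortest LFSR generating bits."""
    n = len(bits)
    c = [1] + [0] * n          # current connection polynomial
    b = [1] + [0] * n          # connection polynomial before the last length change
    L, m = 0, -1
    for i in range(n):
        d = bits[i]            # discrepancy between the predicted and the actual bit
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L

def t_value(bits):
    M = len(bits)
    mu = M / 2 + (9 + (-1) ** (M + 1)) / 36 - (M / 3 + 2 / 9) / 2 ** M   # Eq. (24)
    return (-1) ** M * (linear_complexity(bits) - mu) + 2 / 9            # Eq. (25)

block = [1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1]       # M = 13
print(linear_complexity(block), round(t_value(block), 6))   # 4 2.999444
```

In this block every bit from the fifth onwards is the XOR of the bits four and three positions before it, which is why an LFSR of length 4 suffices.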
2.3.11. Serial Test
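The serial test determines the number of occurrences of each of the 2^m
possible overlapping m-bit patterns across the entire sequence ε and checks whether every pattern has the same chance of appearing (Knuth, 1998). For m = 1, this test is equivalent to the frequency test.
Test description:
a) Create the augmented sequence ε′ by appending the first m – 1 bits of ε to its end.
b) Determine the frequencies of all possible overlapping m-bit, (m – 1)-bit and (m – 2)-bit blocks.
c) Computation of
$$\psi^2_m = \frac{2^m}{n} \sum_{i_1 \dots i_m} \left(v_{i_1 \dots i_m} - \frac{n}{2^m}\right)^2 = \frac{2^m}{n} \sum_{i_1 \dots i_m} v^2_{i_1 \dots i_m} - n, \qquad (28)$$
$$\psi^2_{m-1} = \frac{2^{m-1}}{n} \sum_{i_1 \dots i_{m-1}} \left(v_{i_1 \dots i_{m-1}} - \frac{n}{2^{m-1}}\right)^2 = \frac{2^{m-1}}{n} \sum_{i_1 \dots i_{m-1}} v^2_{i_1 \dots i_{m-1}} - n, \qquad (29)$$
$$\psi^2_{m-2} = \frac{2^{m-2}}{n} \sum_{i_1 \dots i_{m-2}} \left(v_{i_1 \dots i_{m-2}} - \frac{n}{2^{m-2}}\right)^2 = \frac{2^{m-2}}{n} \sum_{i_1 \dots i_{m-2}} v^2_{i_1 \dots i_{m-2}} - n. \qquad (30)$$
d) Computation of
$$\nabla\psi^2_m = \psi^2_m - \psi^2_{m-1}, \qquad (31)$$
$$\nabla^2\psi^2_m = \psi^2_m - 2\psi^2_{m-1} + \psi^2_{m-2}. \qquad (32)$$
e) Computation of
$$P\text{-}value_1 = \mathrm{igamc}\left(2^{m-2}, \frac{\nabla\psi^2_m}{2}\right), \qquad (33)$$
and
$$P\text{-}value_2 = \mathrm{igamc}\left(2^{m-3}, \frac{\nabla^2\psi^2_m}{2}\right). \qquad (34)$$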
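The serial test can be sketched in Python as below. Two assumptions are made here: the second argument of igamc is halved, as in the other χ2-based tests of this section, and m ≥ 3 so that both shape parameters 2^(m–2) and 2^(m–3) are integers, which lets `igamc_int` (a hypothetical helper) use the closed form Q(a, x) = e^(-x) Σ_{k<a} x^k/k!.

```python
import math
from collections import Counter

def psi2(bits, m):
    """psi^2_m as in Eqs. (28)-(30); taken as 0 when m <= 0."""
    if m <= 0:
        return 0.0
    n = len(bits)
    ext = bits + bits[:m - 1]                                   # step a)
    counts = Counter(tuple(ext[i:i + m]) for i in range(n))     # step b)
    return 2 ** m / n * sum(v * v for v in counts.values()) - n

def igamc_int(a, x):
    """Upper regularized incomplete gamma Q(a, x) for a positive integer a."""
    term, total = 1.0, 1.0
    for k in range(1, a):
        term *= x / k
        total += term
    return math.exp(-x) * total

def serial_test(bits, m):
    """Returns (P-value1, P-value2); this sketch requires m >= 3."""
    d1 = psi2(bits, m) - psi2(bits, m - 1)                            # Eq. (31)
    d2 = psi2(bits, m) - 2 * psi2(bits, m - 1) + psi2(bits, m - 2)    # Eq. (32)
    return igamc_int(2 ** (m - 2), d1 / 2), igamc_int(2 ** (m - 3), d2 / 2)

# For this sequence psi2 is 2.8, 1.2 and 0.4 for m = 3, 2, 1
p1, p2 = serial_test([0, 0, 1, 1, 0, 1, 1, 1, 0, 1], m=3)
print(round(p1, 6), round(p2, 6))   # 0.808792 0.67032
```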
2.3.12. Approximate Entropy Test
The approximate entropy test compares the frequencies of overlapping blocks of two consecutive lengths, m and m + 1, against the result expected for a random sequence ε (Rukhin, 2000).
Test description:
a) Extend the n-bit sequence to obtain n overlapping m-bit blocks, by appending the first m – 1 bits of the sequence to its end.
b) Count the occurrences of each possible m-bit value i among the n overlapping blocks.
c) Computation of $C_i^m$ = (number of occurrences of i)/n for each value of i.
d) Computation of
$$\varphi^{(m)} = \sum_{i=0}^{2^m-1} \pi_i \log \pi_i, \qquad (35)$$
where $\pi_i = C_i^m$.
e) Steps a),…,d) are repeated, replacing m by m + 1.
f) Computation of the test statistic
$$\chi^2 = 2n\left[\log 2 - ApEn(m)\right], \qquad (36)$$
where $ApEn(m) = \varphi^{(m)} - \varphi^{(m+1)}$.
g) Computation of
$$P\text{-}value = \mathrm{igamc}\left(2^{m-1}, \frac{\chi^2}{2}\right). \qquad (37)$$
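A short Python sketch of steps a) to g) follows. The function names are illustrative assumptions; because the shape parameter 2^(m–1) is an integer for m ≥ 1, the incomplete gamma function is evaluated here with the closed form e^(-x) Σ_{k<a} x^k/k! instead of a library routine.

```python
import math
from collections import Counter

def phi(bits, m):
    """phi^(m) of Eq. (35); patterns that never occur contribute 0 to the sum."""
    n = len(bits)
    ext = bits + bits[:m - 1]                                   # step a)
    counts = Counter(tuple(ext[i:i + m]) for i in range(n))     # step b)
    return sum(c / n * math.log(c / n) for c in counts.values())   # steps c), d)

def approximate_entropy_test(bits, m):
    """Returns (chi2, P-value) for block lengths m and m + 1."""
    n = len(bits)
    ap_en = phi(bits, m) - phi(bits, m + 1)          # step e)
    chi2 = 2 * n * (math.log(2) - ap_en)             # step f), Eq. (36)
    a, x = 2 ** (m - 1), chi2 / 2                    # step g), Eq. (37)
    term, total = 1.0, 1.0
    for k in range(1, a):                            # Q(a, x) for integer a
        term *= x / k
        total += term
    return chi2, math.exp(-x) * total

chi2, p = approximate_entropy_test([0, 1, 0, 0, 1, 1, 0, 1, 0, 1], m=3)
print(round(chi2, 4), round(p, 6))   # chi2 is about 10.0439, P-value about 0.262
```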
2.3.13. Cumulative Sums Test
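This test focuses on the maximal excursion (from zero) of the random walk defined
by the cumulative sum of the adjusted (–1, +1) digits in the sequence, and determines whether this excursion is too small or too large for a random sequence (Revesz, 1990).
Test description:
a) Convert the bits εi of the sequence ε into values Xi equal to –1 (for zeros) and +1 (for ones), using Xi = 2εi – 1.
b) Computation of the partial sums Sk of successively larger subsequences.
c) Computation of the test statistic z = max1≤k≤n|Sk|, the largest absolute value of the partial sums Sk.
d) Computation of
$$P\text{-}value = 1 - \sum_{k=\left(\frac{-n}{z}+1\right)/4}^{\left(\frac{n}{z}-1\right)/4} \left[\Phi\left(\frac{(4k+1)z}{\sqrt{n}}\right) - \Phi\left(\frac{(4k-1)z}{\sqrt{n}}\right)\right] + \sum_{k=\left(\frac{-n}{z}-3\right)/4}^{\left(\frac{n}{z}-1\right)/4} \left[\Phi\left(\frac{(4k+3)z}{\sqrt{n}}\right) - \Phi\left(\frac{(4k+1)z}{\sqrt{n}}\right)\right], \qquad (38)$$
where Φ is the standard normal cumulative distribution function.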
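Eq. (38) can be evaluated directly once Φ is expressed through erfc. The sketch below covers the forward mode only and assumes that the non-integer summation limits are rounded inwards (ceiling for the lower limit, floor for the upper one); the reference implementation may truncate them differently, so borderline cases could differ.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function, Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

def cumulative_sums_test(bits):
    """Returns (z, P-value) for the forward cumulative sums test."""
    n = len(bits)
    s, z = 0, 0
    for b in bits:                       # steps a) to c)
        s += 2 * b - 1
        z = max(z, abs(s))
    r = math.sqrt(n)
    k_hi = math.floor((n / z - 1) / 4)   # shared upper limit of both sums
    sum1 = sum(normal_cdf((4 * k + 1) * z / r) - normal_cdf((4 * k - 1) * z / r)
               for k in range(math.ceil((-n / z + 1) / 4), k_hi + 1))
    sum2 = sum(normal_cdf((4 * k + 3) * z / r) - normal_cdf((4 * k + 1) * z / r)
               for k in range(math.ceil((-n / z - 3) / 4), k_hi + 1))
    return z, 1 - sum1 + sum2            # step d), Eq. (38)

# Partial sums 1, 0, 1, 2, 1, 2, 1, 2, 3, 4 give z = 4 and a P-value of about 0.41
print(cumulative_sums_test([1, 0, 1, 1, 0, 1, 0, 1, 1, 1]))
```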
2.3.14. Random Excursion Test
The random excursion test focuses on the number of cycles having exactly K
visits to a given state in a cumulative sum random walk, and determines whether the number of visits to a state within a cycle deviates from what is expected for a random sequence (Baron & Rukhin, 1999).
Test description:
a) Convert the bits εi of the sequence ε into values Xi equal to –1 (for zeros) and +1 (for ones), using Xi = 2εi – 1.
b) Computation of the partial sums Si of successively larger subsequences, forming the set S = {Si}.
c) Formation of a new sequence, S′, by attaching a zero before and after the set S.
d) Let J be the total number of zeros in S′ (except the starting zero), that is, the number of cycles. If J is lower than 500, the test is discontinued.
e) Computation of the frequency of each state x within each cycle, where x takes values in –4 ≤ x ≤ –1 and 1 ≤ x ≤ 4 (Table 5).
Table 5
Frequency Computation in Each Cycle
State x    Cycle 1     Cycle 2    Cycle 3
           (0,–1,0)    (0,1,0)    (0,1,2,1,2,1,2,0)
–4         0           0          0
–3         0           0          0
–2         0           0          0
–1         1           0          0
1          0           1          3
2          0           0          3
3          0           0          0
4          0           0          0
f) Computation of vk(x), the total number of cycles in which state x occurs exactly k times, for each state x.
g) Computation of the test statistic χ2(obs) for each value of x:
$$\chi^2(\mathrm{obs}) = \sum_{k=0}^{5} \frac{\left(v_k(x) - J\pi_k(x)\right)^2}{J\pi_k(x)}. \qquad (39)$$
h) Computation of the P-value for each state x:
$$P\text{-}value = \mathrm{igamc}\left(\frac{5}{2}, \frac{\chi^2(\mathrm{obs})}{2}\right). \qquad (40)$$
2.3.15. Random Excursion Variant Test
This test focuses on the total number of times that a particular state is visited in a
cumulative sum random walk and detects deviations from the expected number of visits. It consists of eighteen subtests, each with its own conclusion (Revesz, 1990; Baron & Rukhin, 1999).
Test description:
a) Convert the bits εi of the sequence ε into values Xi equal to –1 (for zeros) and +1 (for ones), using Xi = 2εi – 1.
b) Computation of the partial sums Si of successively larger subsequences, forming the set S = {Si}.
c) Formation of a new sequence S′ by attaching a zero before and after the set S.
d) Computation of ξ(x), the total number of times the state x is visited across all J cycles, for each of the eighteen non-zero states –9 ≤ x ≤ 9.
e) Computation of the P-value for each ξ(x):
$$P\text{-}value = \mathrm{erfc}\left(\frac{|\xi(x) - J|}{\sqrt{2J(4|x| - 2)}}\right). \qquad (41)$$
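The random walk construction shared by the two excursion tests, together with Eq. (41), can be sketched as follows. The function name is an assumption, the minimum number of cycles is not enforced so that a short input can be used, and the trailing zero is attached only when the walk does not already end at zero (a choice of this sketch that avoids counting an empty cycle). The sample sequence produces exactly the three cycles listed in Table 5.

```python
import math

def random_excursions_variant(bits):
    """Returns (J, {x: P-value}) for the states x = -9..-1 and 1..9."""
    walk, s = [], 0
    for b in bits:                         # steps a), b): partial sums of +/-1 values
        s += 2 * b - 1
        walk.append(s)
    s_prime = [0] + walk                   # step c)
    if s_prime[-1] != 0:
        s_prime.append(0)
    J = s_prime[1:].count(0)               # number of cycles
    p_values = {}
    for x in range(-9, 10):
        if x == 0:
            continue
        xi = s_prime.count(x)              # step d): total visits to state x
        p_values[x] = math.erfc(abs(xi - J) / math.sqrt(2 * J * (4 * abs(x) - 2)))  # Eq. (41)
    return J, p_values

J, p = random_excursions_variant([0, 1, 1, 0, 1, 1, 0, 1, 0, 1])
print(J, round(p[1], 6))   # 3 cycles; state 1 is visited 4 times, P-value about 0.683
```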
3. Multiply-With-Carry (MWC) Generator
The Multiply-With-Carry (MWC) generator was proposed by George
Marsaglia in 1994 and analysed by Couture and L’Ecuyer in 1997. MWC was proposed as a modification of the Add-With-Carry (AWC) generator.
For an integer “base” b ≥ 2 and integer coefficients a0, a1,…,ar, with a0 relatively prime to b, the MWC generator of order r and base b has the state
$$\sigma = (x_{-r}, \dots, x_{-1}; c), \qquad (42)$$
where 0 ≤ xi < b and c ∈ Z. The state is updated by the transformation rule
$$T: \sigma \rightarrow \sigma' = (x'_{-r}, \dots, x'_{-1}; c'). \qquad (43)$$
For i < –1, x′i = xi+1, while x′–1 and c′ are the unique solutions of
$$a_0 x'_{-1} + c' b = \sum_{i=1}^{r} a_i x_{-i} + c, \qquad (44)$$
with 0 ≤ x′–1 < b. The values x′–1 and c′ are computed as follows. Let
$$A = a_0^{-1} \pmod{b}, \qquad (45)$$
realized as an integer in the interval [1, b – 1]. First we set
$$\tau = \sum_{i=1}^{r} a_i x_{-i} + c, \qquad (46)$$
and compute:
$$x'_{-1} = (A\tau) \pmod{b}, \qquad (47)$$
$$c' = \frac{1}{b}\left(\tau - a_0 x'_{-1}\right). \qquad (48)$$
The value c is the “carry” (or “memory”) of the state. The output of the
state σ is OUT(σ) = x–r, where σ is given by formula (42). The normalized output is the real number x–r/b.
Since c ∈ Z is arbitrary, there are infinitely many states and output sequences. Within a certain finite interval, w– ≤ c ≤ w+, there are only finitely many states, among which the periodic ones lie. For any initial state, the output of the generator is eventually periodic; the length of the initial transient segment depends on how far c is from this interval.
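The update rule of Eqs. (42) to (48) translates almost line by line into Python. The sketch below is illustrative only: the class name `MWC` and the parameters in the usage example (r = 1, a0 = 1, a1 = 36969, b = 2^16) are assumptions chosen for the demonstration, not the values used in the implementation analysed in this paper. The modular inverse via `pow(a0, -1, b)` requires Python 3.8 or later.

```python
from collections import deque

class MWC:
    """Order-r multiply-with-carry generator following Eqs. (42)-(48)."""

    def __init__(self, coeffs, base, state, carry=0):
        self.a0, self.a = coeffs[0], list(coeffs[1:])   # a0 and [a1, ..., ar]
        self.b = base
        self.A = pow(self.a0, -1, base)                 # Eq. (45); needs gcd(a0, b) = 1
        self.x = deque(state)                           # [x_-r, ..., x_-1]
        self.c = carry

    def next(self):
        out = self.x[0]                                 # OUT(sigma) = x_-r
        tau = sum(a_i * self.x[-i]
                  for i, a_i in enumerate(self.a, start=1)) + self.c   # Eq. (46)
        x_new = (self.A * tau) % self.b                 # Eq. (47)
        self.c = (tau - self.a0 * x_new) // self.b      # Eq. (48), an exact division
        self.x.popleft()                                # x'_i = x_{i+1} for i < -1
        self.x.append(x_new)
        return out

gen = MWC(coeffs=[1, 36969], base=1 << 16, state=[1])
print([gen.next() for _ in range(3)])   # [1, 36969, 19217]
```

With a0 = 1 the rule reduces to the familiar form x′ = τ mod b, c′ = ⌊τ/b⌋.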
4. XOR Operation on Parallel MWC Generator
A generalized scheme of the MWC generator using the system clock as seed is presented in Fig. 2.
Fig. 2 – Classic MWC generator.
The XOR-MWC generator implementation proposed in this paper uses two MWC generators working in parallel, started with a time delay so that they receive different inputs, and an XOR function applied to the individual generator outputs to obtain the final pseudo-random value (Fig. 3).
Fig. 3 – Parallel MWC generators with XOR function.
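The scheme of Fig. 3 can be sketched as below. Everything specific in this example is an assumption made for illustration: the lag-1 MWC with multiplier 36969 and base 2^16, the use of the low 8 bits as the 0…255 output, and the way the second seed imitates a clock reading taken 0.3 s later. It is not the code evaluated in this paper.

```python
import itertools
import time

def mwc_bytes(seed, a=36969, b=1 << 16):
    """Lag-1 MWC (a0 = 1); yields the low 8 bits of each new state value."""
    x, c = seed % b or 1, 0              # avoid the all-zero fixed point
    while True:
        tau = a * x + c
        x, c = tau % b, tau // b
        yield x & 0xFF

def xor_gen(seed1, seed2):
    """Yields (MWC 1 GEN, MWC 2 GEN, XOR GEN) triples of byte values."""
    for v1, v2 in zip(mwc_bytes(seed1), mwc_bytes(seed2)):
        yield v1, v2, v1 ^ v2

seed1 = time.time_ns()                   # system clock as seed
seed2 = seed1 + 300_000_000              # stands in for a reading taken 0.3 s later
for v1, v2, out in itertools.islice(xor_gen(seed1, seed2), 5):
    print(v1, v2, out)
```

Passing explicit seeds instead of clock readings makes a run reproducible, which is convenient when the three output files have to be regenerated for testing.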
In order to analyse the efficiency of the proposed generator, individual files were generated, containing the values of the two separate MWC generators (MWC 1 GEN and MWC 2 GEN), together with one file containing the final output values (XOR GEN). The generated values lie in the interval 0…255. Fig. 4 shows the distribution of the values for the first 1000 rounds of the implemented generator.
Fig. 4 – Distribution of MWC 1 GEN, MWC 2 GEN and XOR GEN
generated values for the first 1000 rounds.
The implemented generator was tested over 5000 rounds using the National Institute of Standards and Technology statistical test suite (sts-1.8). For each round 16 P-values were calculated, one for each test in the suite, except for the serial test, which requires the computation of two P-values.
The averages of the calculated P-values for MWC 1 GEN, MWC 2 GEN and XOR GEN, and the decision of test passing or failure (according to NIST Special Publication 800-22), can be found in Table 6.
Table 6
Average P-Values and Test Decision for MWC 1 GEN, MWC 2 GEN and XOR GEN
Test                                             MWC 1 GEN            MWC 2 GEN            XOR GEN
Nr.   Statistical test                           Average   Decision   Average   Decision   Average   Decision
                                                 P-value              P-value              P-value
1.    Frequency                                  0.5878    passed     0.6334    passed     0.0482    passed
2.    Frequency test within a block              0.5958    passed     0.7788    passed     0.3767    passed
3.    Runs                                       0.6609    passed     0.3554    passed     0.5501    passed
4.    Longest run of ones in a block             0.8878    passed     0.9778    passed     0.4646    passed
5.    Binary matrix rank                         0.7844    passed     0.3948    passed     0.7260    passed
6.    Discrete Fourier Transform                 0.8922    passed     0.7224    passed     0.4314    passed
7.    Non-Overlapping Template Matching          0.8844    passed     0.8100    passed     0.3696    passed
8.    Overlapping Template Matching              0.9766    passed     0.6888    passed     0.3767    passed
9.    Universal test (Maurer’s Statistical test) 0.9609    passed     0.3465    passed     0.5269    passed
10.   Linear complexity                          0.9922    passed     0.9308    passed     0.4795    passed
11.   Serial                                     0.0086    failed     0.8840    passed     0.7242    passed
                                                 0.5747    passed     0.5322    passed     0.6766    passed
12.   Approximate Entropy                        0.5008    passed     0.6945    passed     0.7044    passed
13.   Cumulative Sums                            0.5747    passed     0.7055    passed     0.6180    passed
14.   Random Excursions                          0.6877    passed     0.2997    passed     0.0256    passed
15.   Random Excursions Variant                  0.8753    passed     0.0098    failed     0.3114    passed
The distribution of the 80000 P-values computed in 5000 rounds for
MWC 1 GEN, MWC 2 GEN and XOR GEN is represented in Figs. 5, 6 and 7.
The length of the analysed sequences for MWC 1 GEN, MWC 2 GEN and
XOR GEN was n = 640000 bits (5000 blocks of 128 bits).
The P-value belongs to the interval [0, 1] and indicates whether a sequence is
considered random (for a value close to “1”) or non-random (for a value close to “0”). To reach a decision, a significance level (α) is chosen in the range [0.001, 0.01].
Fig. 5 – Distribution of P-values computed for all 15 randomness
statistical tests for MWC 1 GEN generator.
Fig. 6 – Distribution of P-values computed for all 15 randomness statistical tests for MWC 2 GEN generator.
Fig. 7 – Distribution of P-values computed for all 15 randomness statistical tests for XOR GEN generator.
For P-value ≥ α, the analysed sequence is considered random with a
confidence of 1 – α, while for P-value < α the sequence is considered non-random with the same confidence (NIST S.P. 800-22, 2010). In the XOR GEN implementation the α value is 0.01, which corresponds to a confidence of 99%.
5. Conclusions
Despite its age, the multiply-with-carry (MWC) generator is a simple
but efficient pseudo-random number generator (PRNG). Using the seed obtained from the system clock instead of predefined large numbers makes it even more efficient by reducing the predictability of the first input values.
In Figs. 5, 6 and 7 we can see that distributions of computed P-values for all statistical tests are quite similar in the interval [0, 1].
For a significance level (α) of 0.01, one sequence in 100 is expected to be considered non-random. As we can see in Table 6, the MWC 1 GEN and MWC 2 GEN generators failed the serial test and the random excursions variant test, respectively, while the bitwise XOR-ed sequences pass both tests.
Over 5000 rounds, the XOR GEN generator proved viable, even though the average P-values of the frequency and random excursions tests decreased significantly over the last 2000 rounds, meaning that more sequences were considered non-random.
REFERENCES
*** A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. NIST Special Publ. 800-22, Revision 1a, April 2010.
Barbour A.D., Holst L., Janson S., Poisson Approximation. Clarendon Press, Oxford, UK, (Section 8.4 and Section 10.4), 1992.
Baron M., Rukhin A.L., Distribution of the Number of Visits for a Random Walk. Commun. in Stat.: Stoch. Models, 15, 593−597 (1999).
Burrough P.A., Van Gaans P.F.M., Macmillan R.A., High Resolution Landform Classification Using Fuzzy k-Means. Fuzzy Sets a. Syst., 113, 1, 37−52 (2000).
Che W., Deng H., Tan X., Wang J., A Random Number Generator for Application in RFID Tags. In Cole P.H., Ranasinghe D.C. (Eds.), Networked RFID Systems and Lightweight Cryptography, Springer-Verlag, Berlin, 2008, 279−287.
Couture R., L’Ecuyer P., Distribution Properties of Multiply-With-Carry Random Number Generators. Mathem. of Comp., 66, 591−607 (1997).
Ferguson N., Schneier B., Practical Cryptography. J. Wiley & Sons, NY, USA, 2003.
Godbole A.P., Papastavridis S.G., Runs and Patterns in Probability: Selected papers. Kluwer Acad., Dordrecht, Netherlands, 1994.
Goovaerts P., Estimation or Simulation of Soil Properties? An Optimization Problem with Conflicting Criteria. Geoderma, 97, 165−186 (2000).
Gustafson H., Dawson E., Nielsen L., Caelli W., A Computer Package for Measuring the Strength of Encryption Algorithms. Comp. & Sec., 13, 687−697 (1994).
Hamano K., Kaneko T., The Correction of the Overlapping Template Matching Test Included in NIST Randomness Test Suite. IEICE Trans. of Electron., Commun. a. Comp. Sci., E90-A(9), 1788−1792 (2007).
Holmes K.W., Chadwick O.A., Kyriakidis P.C., Error in a USGS 30-Meter Digital Elevation Model and its Impact on Terrain Modeling. J. of Hydrol., 233, 154−173 (2000).
Kelsey J., Schneier B., Wagner D., Hall C., Cryptanalytic Attacks on Pseudorandom Number Generators. Springer-Verlag, NY, USA, 1998.
Kim S., Umeno K., Hasegawa A., Corrections of the NIST Statistical Test Suite for Randomness. Cryptol. ePrint Arch., Report 2004/018, 2004.
Knuth D.E., The Art of Computer Programming. Vol 2. Seminumerical Algorithms. 3rd Ed., Addison-Wesley, Reading, Mass, USA, 42−47 (1998).
Kohlbrenner P., Gaj K., An Embedded True Random Number Generator for FPGAs. Proc. of the ACM/SIGDA 12th Internat. Symp. on Field Program. Gate Arrays, FPGA 2004, Monterey, California, USA, February 22-24, 2004, 71−78.
Marsaglia G., Tsay L.H., Matrices and the Structure of Random Number Sequences. Linear Algebra and its Applications, 67, 147−156 (1985).
Marsaglia G., Yet Another RNG. Posted to Electronic Bull. Board Sci. Stat. Math., Aug. 1, 1994.
Menezes A.J., Van Oorschot P.C., Vanstone S.A., Handbook of Applied Cryptography. CRC Press, Boca Raton, FL, USA, 1997.
Næsset E., Effects of Delineation Errors in Forest Stand Boundaries on Estimated Area and Timber Volumes. Scand. J. of Forest Res., 14, 558−566 (1999).
Pitman J., Probability. Springer-Verlag, New York, USA, 1993, 93−108.
Revesz P., Random Walk in Random and Non-Random Environments. World Scientific, Singapore, 1990.
Rukhin A., Approximate Entropy for Testing Randomness. J. of Appl. Probab., 37 (2000).
Tsoi K.H., Leung K.H., Leong P.H.W., Compact FPGA-Based True and Pseudo Random Number Generators. 11th IEEE Symp. on Field-Program. Custom Comp. Machines (FCCM 2003), April 8-11, 2003, Napa, CA, Proc. IEEE Comp. Soc., 2003, 51−61.
Viega J., Practical Random Number Generation in Software. ACSAC '03, Proc. of the 19th Ann. Comp. Sec. Appl. Conf., Washington, DC, USA, IEEE Comp. Soc., 2003, 129.
AN OPTIMISED IMPLEMENTATION OF THE MWC PSEUDO-RANDOM NUMBER GENERATOR ANALYSED WITH THE
NIST STATISTICAL TEST SUITE FOR RANDOMNESS
(Summary)
The efficiency of a customized implementation of the MWC (Multiply With Carry)
pseudo-random number generator devised by George Marsaglia is analysed. In this implementation, the algorithm incorporates two MWC generators designed to work in parallel, together with an XOR function applied to the outputs of these generators in order to obtain a final pseudo-random result. The implementation was intended to demonstrate that applying the XOR function to the bit sequences produced by two pseudo-random generators yields a bit string with pseudo-random characteristics.
For the two pseudo-random generators (MWC 1 GEN and MWC 2 GEN), the initialization values were obtained from the time function of the system on which the application runs, with a delay of 0.3 s between the two generators. The resulting final generator is XOR GEN.
Plotting the computed values (P-values) for a chosen significance level α = 0.01 indicates a uniform distribution over the interval [0, 1] for the 5 000 rounds, without significant differences among the three generators analysed. Table 6 presents the average values computed for each of the 15 randomness tests for the three generators, together with the pass or fail decision for each test.
The statistical testing used the battery of statistical randomness tests produced by the U.S. National Institute of Standards and Technology (NIST).