Session 2: Secret key cryptography – stream ciphers – part 2
The Berlekamp-Massey algorithm
The computational complexity of the Berlekamp-Massey algorithm is quadratic in the length of the minimum LFSR capable of generating the intercepted sequence.
Thus, if the linear complexity is very high, the task of predicting the next bits of the sequence becomes computationally infeasible.
The Berlekamp-Massey algorithm
Then, in order to prevent the cryptanalysis of a pseudorandom sequence generator, we must design it in such a way that its linear complexity is too high for the application of the Berlekamp-Massey algorithm.
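To make the role of the linear complexity concrete, the following is a minimal sketch of the Berlekamp-Massey algorithm over GF(2). The helper names `lfsr` and `berlekamp_massey` and the tap convention (s[t+L] is the XOR of s[t+k] for k in `taps`) are assumptions chosen for this illustration, not notation from the slides.

```python
def lfsr(taps, state, n):
    """First n bits of the sequence with s[t+L] = XOR of s[t+k] for k in taps."""
    s = list(state)
    length = len(state)
    while len(s) < n:
        t = len(s) - length
        bit = 0
        for k in taps:
            bit ^= s[t + k]
        s.append(bit)
    return s[:n]


def berlekamp_massey(bits):
    """Linear complexity (length of the shortest LFSR) of a binary sequence."""
    n = len(bits)
    c = [1] + [0] * n   # current connection polynomial
    b = [1] + [0] * n   # connection polynomial before the last length change
    L, m = 0, -1
    for i in range(n):
        # discrepancy between bits[i] and the prediction of the current LFSR
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
    return L


# Sequence with s[t+4] = s[t+1] XOR s[t], i.e. a 4-stage LFSR.
seq = lfsr(taps=(0, 1), state=[0, 0, 0, 1], n=30)
print(berlekamp_massey(seq))
```

The double loop (one pass per intercepted bit, one inner pass over the current LFSR length) is where the quadratic cost comes from.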
Pseudorandom sequence generators Based on LFSRs
The goals:
• Preserve good characteristics of the PN-sequences
• Increase the linear complexity
The key is the initial state
Different families of generators
Combinational generators:
• Non linear filter (a single LFSR)
• Non linear combiner (several LFSRs)
Non linear filters
In general, it is difficult to calculate the value of the linear complexity of the resulting sequence.
However, under some special conditions, it is possible to estimate the linear complexity of the resulting sequence.
Algebraic normal form
It is the form of a Boolean function that uses only the operations $\oplus$ (sum modulo 2) and $\cdot$ (product modulo 2).
In the ANF, the number of variables in the product that includes the largest number of variables is called the non linear order of the function.
Example: The non linear order of the function $f(x_1,x_2,x_3)=x_1\oplus x_2\oplus x_3\oplus x_1x_3$ is 2.
Algebraic normal form
The ANF of a function can be determined from its truth table.
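$$f(x_0,\dots,x_{n-1}) = \bigoplus_{u\in\{0,1\}^n} a_u \prod_{i=0}^{n-1} x_i^{u_i}, \qquad a_u\in\{0,1\}$$
The Möbius transform:
$$a_u = \bigoplus_{x\preceq u} f(x_0,\dots,x_{n-1}), \qquad x\preceq u \iff x_i\le u_i \text{ for all } i,\ 0\le i\le n-1$$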
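Algebraic normal form
Example: n=3, u=001. Among the inputs x = 000, 001, 010, 011, 100, 101, 110, 111, those with $x_i \le u_i$ for every i are 000 and 001.
Algebraic normal form
Example: n=3, u=010. Among the inputs x = 000, 001, 010, 011, 100, 101, 110, 111, those with $x_i \le u_i$ for every i are 000 and 010.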
Algebraic normal form
Example: n=3
x0 x1 x2 f
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 0
Algebraic normal form
u=000: a000 = f(0,0,0) = 0
u=001: a001 = f(0,0,0) + f(0,0,1) = 0+1 = 1
u=010: a010 = f(0,0,0) + f(0,1,0) = 0+0 = 0
Algebraic normal form
u=011: a011 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) = 0+1+0+1 = 0
u=100: a100 = f(0,0,0) + f(1,0,0) = 0+0 = 0
u=101: a101 = f(0,0,0) + f(0,0,1) + f(1,0,0) + f(1,0,1) = 0+1+0+1 = 0
Algebraic normal form
u=110: a110 = f(0,0,0) + f(0,1,0) + f(1,0,0) + f(1,1,0) = 0+0+0+1 = 1
u=111: a111 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) + f(1,0,0) + f(1,0,1) + f(1,1,0) + f(1,1,1) = 0
Algebraic normal form
f(x0,x1,x2)=a001x2+a110x0x1=x2+x0x1
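The coefficient-by-coefficient computation above can be automated. The sketch below is a minimal in-place (butterfly) version of the binary Möbius transform; the function names and the indexing convention (table index read as the bit string x0 x1 ... x(n-1), with x0 most significant) are assumptions made for this illustration.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform: truth table -> list of ANF coefficients a_u."""
    a = list(truth_table)
    step = 1
    while step < len(a):
        for u in range(len(a)):
            if u & step:
                # fold in the entry that has this bit of u cleared
                a[u] ^= a[u ^ step]
        step <<= 1
    return a


def anf_terms(coefficients, n):
    """Readable monomials, e.g. ['x2', 'x0x1'], for the non-zero coefficients."""
    terms = []
    for u, coef in enumerate(coefficients):
        if coef:
            names = [f"x{i}" for i in range(n) if (u >> (n - 1 - i)) & 1]
            terms.append("".join(names) or "1")
    return terms


# Truth table of the n=3 example, rows ordered 000, 001, ..., 111.
f = [0, 1, 0, 1, 0, 1, 1, 0]
a = anf_coefficients(f)
print(anf_terms(a, 3))
```

The non linear order is then the largest number of variables in any monomial returned by `anf_terms`.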
Non linear filters
Theorem (Rueppel, 1984):
• For an LFSR of length n and a filter function whose unique ANF term of maximum order k is a product of equidistant phases, the linear complexity of the resulting sequence is bounded from below by
$$LC \ge \binom{n}{k}$$
Non linear filters
Design principles:
• The feedback polynomial: primitive
• The filter function must have several terms of each order.
• $k \approx n/2$
• Include a linear term in order to obtain good statistical properties of the resulting sequence (balanced filter function).
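As a small experiment with the lower bound, the sketch below filters a 5-stage LFSR with the product of two adjacent phases (k=2, distance 1) and measures the linear complexity of the result. The recurrence s[t+5] = s[t+2] XOR s[t] (assumed here to correspond to the primitive polynomial x^5+x^2+1), the initial state and the helper names are assumptions for this illustration only.

```python
from math import comb


def lfsr(taps, state, n):
    """First n bits of the sequence with s[t+L] = XOR of s[t+k] for k in taps."""
    s = list(state)
    length = len(state)
    while len(s) < n:
        t = len(s) - length
        bit = 0
        for k in taps:
            bit ^= s[t + k]
        s.append(bit)
    return s[:n]


def berlekamp_massey(bits):
    """Linear complexity of a binary sequence."""
    n = len(bits)
    c = [1] + [0] * n
    b = [1] + [0] * n
    L, m = 0, -1
    for i in range(n):
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
    return L


n_stages, k = 5, 2
s = lfsr(taps=(0, 2), state=[1, 0, 0, 0, 0], n=63)
# Filter: product of two adjacent (equidistant) phases of the LFSR sequence.
z = [s[t] & s[t + 1] for t in range(62)]
lc = berlekamp_massey(z)
print(lc, ">=", comb(n_stages, k))
```

Since the filter has order 2, the measured value also cannot exceed the number of monomials of order 1 and 2, which is 5 + 10 = 15 here.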
Non linear combiners
In these generators, the keystream sequence is obtained by combining the output sequences of various LFSRs in a non linear manner.
Example – it is possible to use a Boolean function (without memory).
Non linear combiners
Two cryptographic principles by Shannon:
• Confusion – we must use complicated transformations – as many bits of the key as possible should be involved in obtaining a single bit of the keystream sequence (and the ciphertext).
• Diffusion – Every bit of the key must affect many bits of the keystream sequence (and the ciphertext).
Non linear combiners
Possible flaws of non linear combiners (to be considered during the design):
• Bad statistical properties – e.g. too many zeros/ones in the output sequence.
• Correlation – The output sequence coincides too much with one or more internal sequences – this enables correlation attacks.
Non linear combiners
Correlation attacks:
• It is possible to divide the task of the cryptanalyst into several less difficult tasks – “Divide and conquer”.
• In order to prevent the correlation attacks, the non linear function of the combiner must have, at the same time:
  • as high non linear order as possible
  • as high correlation immunity as possible.
• These two requirements are opposite – we must find a trade off between these two values.
Non linear combiners
Correlation immunity:
• A Boolean function is correlation immune of order m if its output sequence is not correlated with any set of m or fewer input sequences.
• But, the higher the correlation immunity, the lower the non linear order k.
• The trade off (N is the number of variables): $m+k\le N$; $1\le k\le N$, $0\le m\le N-1$
Non linear combiners
A Boolean function is balanced if it has an equal number of 0s and 1s in its truth table.
The balanced correlation immune functions of order m are called m-resilient functions.
Non linear combiners
Example:
• The sum modulo 2 of N variables has the maximum possible value of correlation immunity, N-1, but its non linear order is 1.
If the combination function contains memory, then the trade off between the correlation immunity and non linearity is not needed – it is possible to maximize both values – a single bit of memory is enough (Rueppel, 1984).
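The trade off between correlation immunity m and non linear order k can be checked on small functions. The sketch below computes m through the Walsh transform (the Xiao-Massey characterisation: all Walsh coefficients of weight 1 to m vanish) and k through the Möbius transform. The function names and the truth-table indexing by the integer value of the input are assumptions of this illustration.

```python
def walsh(tt, n, w):
    """Walsh coefficient: sum over x of (-1)^(f(x) XOR <w,x>)."""
    total = 0
    for x in range(1 << n):
        dot = bin(x & w).count("1") & 1
        total += -1 if (tt[x] ^ dot) else 1
    return total


def correlation_immunity(tt, n):
    """Largest m with zero Walsh coefficients for every w of weight 1..m."""
    m = 0
    for weight in range(1, n + 1):
        masks = [w for w in range(1, 1 << n) if bin(w).count("1") == weight]
        if any(walsh(tt, n, w) != 0 for w in masks):
            break
        m = weight
    return m


def nonlinear_order(tt):
    """Largest number of variables in a monomial of the ANF."""
    a = list(tt)
    step = 1
    while step < len(a):
        for u in range(len(a)):
            if u & step:
                a[u] ^= a[u ^ step]
        step <<= 1
    return max((bin(u).count("1") for u, coef in enumerate(a) if coef), default=0)


N = 3
xor3 = [bin(x).count("1") & 1 for x in range(1 << N)]   # x0 + x1 + x2 mod 2
and3 = [1 if x == 7 else 0 for x in range(1 << N)]      # x0 x1 x2
for name, tt in (("xor3", xor3), ("and3", and3)):
    print(name, correlation_immunity(tt, N), nonlinear_order(tt))
```

In both cases m + k does not exceed N = 3, with the sum modulo 2 sitting at one extreme (m = N-1, k = 1) and the pure product at the other (m = 0, k = N).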
Non linear combiners
If F is a Boolean function of N periodic input sequences a1(t), a2(t), ..., aN(t), then the output sequence b(t) = F(a1(t), a2(t), ..., aN(t)) is a linear combination of various products of sequences.
These products are given by the ANF of the function F.
Non linear combiners
If, in the ANF of the function F, the sum and product modulo 2 are replaced by the sum and product of integers, the resulting function is denoted F*, and the following holds for the linear complexity and the period of the output sequence of F:
$$LC(b) \le F^*\big(LC(a_1), LC(a_2), \dots, LC(a_N)\big)$$
$$Per(b) \le \operatorname{lcm}\big(Per(a_1), Per(a_2), \dots, Per(a_N)\big)$$
Non linear combiners
Example:
$$F(x_0,x_1,x_2) = x_0 \oplus x_0x_1 \oplus x_0x_2$$
$$F^*(x_0,x_1,x_2) = x_0 + x_0x_1 + x_0x_2$$
If the characteristic polynomials of the input sequences are:
$$a_0:\ 1+X+X^4$$
$$a_1:\ 1+X^3+X^4$$
$$a_2:\ 1+X^2+X^5$$
All these polynomials are primitive!
Non linear combiners
Example (cont.):
• Then
$$Per(b) \le \operatorname{lcm}(15, 15, 31) = 465$$
$$LC(b) \le 4 + 4\cdot 4 + 4\cdot 5 = 40$$
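The numbers in this example can be reproduced with a short simulation. The sketch below assumes that the three polynomials correspond to the recurrences written in the comments (the slides do not fix a feedback convention), uses arbitrary non-zero initial states, and takes F = x0 + x0x1 + x0x2 modulo 2; the helper names are hypothetical.

```python
from math import gcd


def lfsr(taps, state, n):
    """First n bits of the sequence with s[t+L] = XOR of s[t+k] for k in taps."""
    s = list(state)
    length = len(state)
    while len(s) < n:
        t = len(s) - length
        bit = 0
        for k in taps:
            bit ^= s[t + k]
        s.append(bit)
    return s[:n]


def berlekamp_massey(bits):
    """Linear complexity of a binary sequence."""
    n = len(bits)
    c = [1] + [0] * n
    b = [1] + [0] * n
    L, m = 0, -1
    for i in range(n):
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
    return L


def lcm(*values):
    result = 1
    for v in values:
        result = result * v // gcd(result, v)
    return result


a0 = lfsr((0, 1), [1, 0, 0, 0], 930)      # s[t+4] = s[t+1] ^ s[t]  (1 + X + X^4)
a1 = lfsr((0, 3), [1, 0, 0, 0], 930)      # s[t+4] = s[t+3] ^ s[t]  (1 + X^3 + X^4)
a2 = lfsr((0, 2), [1, 0, 0, 0, 0], 930)   # s[t+5] = s[t+2] ^ s[t]  (1 + X^2 + X^5)
out = [x0 ^ (x0 & x1) ^ (x0 & x2) for x0, x1, x2 in zip(a0, a1, a2)]

lc_bound = 4 + 4 * 4 + 4 * 5          # F*(4, 4, 5)
per_bound = lcm(15, 15, 31)
print(berlekamp_massey(out), "<=", lc_bound)
print("repeats after", per_bound, "bits:",
      all(out[t] == out[t + per_bound] for t in range(per_bound)))
```

The measured linear complexity depends on the assumed recurrences; only the upper bounds are guaranteed by the statement above.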
Non linear combiners
The sum of N sequences in GF(q):
$$LC(b) \le \sum_{i=1}^{N} LC(a_i)$$
The equality holds if the characteristic polynomials of the input sequences are mutually prime.
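Non linear combiners
The sum of N sequences in GF(q):
$$\text{If } LC(b) = \sum_{i=1}^{N} LC(a_i) \text{ then } Per(b) = \operatorname{lcm}\big(Per(a_1), Per(a_2), \dots, Per(a_N)\big)$$
Obviously, if the periods of the input sequences are mutually prime then
$$Per(b) = \prod_{i=1}^{N} Per(a_i)$$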
Non linear combiners
Example:
$$f_1(X) = 1+X^5+X^6+X^{10}+X^{61}$$
$$f_2(X) = 1+X^3+X^5+X^6+X^{89}$$
Primitive! The periods are Mersenne primes.
$$Per = (2^{61}-1)(2^{89}-1)$$
$$LC = 61+89$$
Non linear combiners
Product of N sequences in GF(q):
• Theorem (Golić, 1989)
If Per(ai) are mutually prime, then
$$LC(b) = \prod_{i=1}^{N} LC(a_i)$$
• Theorem (Lidl, Niederreiter)
If Per(ai) are mutually prime, then
$$Per(b) = \prod_{i=1}^{N} Per(a_i)$$
Non linear combiners
Example:
$$f_1(X) = 1+X^5+X^6+X^{10}+X^{61}$$
$$f_2(X) = 1+X^3+X^5+X^6+X^{89}$$
Primitive! The periods are Mersenne primes.
$$Per = (2^{61}-1)(2^{89}-1)$$
$$LC = 61\cdot 89 = 5429$$
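The sum and product results can be observed on registers small enough to simulate. The sketch below uses a 2-stage and a 3-stage LFSR with periods 3 and 7 (mutually prime) instead of the 61- and 89-stage registers of the example; the recurrences, initial states and helper names are assumptions for this illustration.

```python
def lfsr(taps, state, n):
    """First n bits of the sequence with s[t+L] = XOR of s[t+k] for k in taps."""
    s = list(state)
    length = len(state)
    while len(s) < n:
        t = len(s) - length
        bit = 0
        for k in taps:
            bit ^= s[t + k]
        s.append(bit)
    return s[:n]


def berlekamp_massey(bits):
    """Linear complexity of a binary sequence."""
    n = len(bits)
    c = [1] + [0] * n
    b = [1] + [0] * n
    L, m = 0, -1
    for i in range(n):
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
    return L


def period(bits):
    """Smallest p with bits[t] == bits[t+p] over the whole observed window."""
    for p in range(1, len(bits)):
        if all(bits[t] == bits[t + p] for t in range(len(bits) - p)):
            return p
    return len(bits)


a1 = lfsr((0, 1), [0, 1], 84)       # s[t+2] = s[t+1] ^ s[t], period 3
a2 = lfsr((0, 1), [0, 0, 1], 84)    # s[t+3] = s[t+1] ^ s[t], period 7
total = [x ^ y for x, y in zip(a1, a2)]
prod = [x & y for x, y in zip(a1, a2)]

print("sum:    LC =", berlekamp_massey(total), " Per =", period(total))
print("product: LC =", berlekamp_massey(prod), " Per =", period(prod))
```

With linear complexities 2 and 3, the sum reaches 2+3 and the product 2·3, and both outputs have period 3·7.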
Non linear combiners
The general case:
• Let $\hat{F}$ be the Boolean function obtained by removing all the products from the function F except those of the maximum order. Let $\hat{F}^*$ be the corresponding integer function.
Non linear combiners
Theorem (Golić, 1989)
• F depends on all the N input variables.
• Per(ai) are mutually prime.
• Then
$$LC(b) \ge \hat{F}^*\big(LC(a_1)-1, \dots, LC(a_N)-1\big)$$
$$Per(b) = \prod_{i=1}^{N} Per(a_i)$$
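Non linear combiners
Example:
$$F(x_0,x_1,x_2) = x_0 \oplus x_0x_1 \oplus x_0x_2, \qquad F^*(x_0,x_1,x_2) = x_0 + x_0x_1 + x_0x_2$$
$$\hat{F}(x_0,x_1,x_2) = x_0x_1 \oplus x_0x_2, \qquad \hat{F}^*(x_0,x_1,x_2) = x_0x_1 + x_0x_2$$
If the characteristic polynomials of the input sequences are:
$$f_0(X) = 1+X^5+X^6+X^{10}+X^{61}$$
$$f_1(X) = 1+X^3+X^5+X^6+X^{89}$$
$$f_2(X) = 1+X^4+X^7+X^9+X^{107}$$
Primitive, periods Mersenne primes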
Non linear combiners
Example (cont.)
$$Per = (2^{61}-1)(2^{89}-1)(2^{107}-1)$$
$$LC \ge 60\cdot 88 + 60\cdot 106 = 11640$$
Geffe’s generator
$$F(x_1,x_2,x_3) = x_1x_2 \oplus (1\oplus x_2)x_3 = x_1x_2 \oplus x_2x_3 \oplus x_3$$
F balanced – good statistical properties
Geffe’s generator
The equivalent scheme
Geffe’s generator
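Example: polynomials – primitive, with periods that are Mersenne primes.
$$f_1(X) = 1+X^5+X^6+X^{10}+X^{61}$$
$$f_2(X) = 1+X^3+X^5+X^6+X^{89}$$
$$f_3(X) = 1+X^4+X^7+X^9+X^{107}$$
$$Per = (2^{61}-1)(2^{89}-1)(2^{107}-1)$$
$$LC \ge 60\cdot 88 + 88\cdot 106 = 14608$$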
Geffe’s generator
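Problem: Correlation!
With $s_n$ the output bit and $s_n^{(1)}, s_n^{(2)}, s_n^{(3)}$ the output bits of the three LFSRs:
$$\Pr\big(s_n = s_n^{(1)}\big) = \Pr\big(s_n^{(2)}=1\big) + \Pr\big(s_n^{(2)}=0\big)\Pr\big(s_n^{(3)}=s_n^{(1)}\big) = \frac{1}{2}+\frac{1}{2}\cdot\frac{1}{2} = \frac{3}{4}$$
$$\Pr\big(s_n = s_n^{(3)}\big) = \frac{3}{4}$$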
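The correlation of Geffe’s combining function with its inputs can be read directly from its truth table. The sketch below enumerates all eight inputs of F(x1,x2,x3) = x1x2 + (1+x2)x3 (mod 2), assuming independent and uniformly distributed input bits; the function and variable names are chosen for this illustration.

```python
from itertools import product


def geffe(x1, x2, x3):
    """x2 selects between x1 (when x2 = 1) and x3 (when x2 = 0)."""
    return (x1 & x2) ^ ((1 ^ x2) & x3)


inputs = list(product((0, 1), repeat=3))
outputs = [geffe(*x) for x in inputs]

# Fraction of inputs on which the output equals each individual input bit.
agreement = [
    sum(out == x[i] for x, out in zip(inputs, outputs)) / len(inputs)
    for i in range(3)
]
balanced = sum(outputs) == len(outputs) // 2
print(agreement, balanced)
```

The output is balanced, yet it agrees with x1 and with x3 on three inputs out of four, which is the leak that a divide and conquer correlation attack exploits; the selector x2 alone shows no such bias.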
Correlation immune functions
Is there a way to find a Boolean memoryless combiner that guarantees a high level of correlation immunity?
This is a difficult problem and there is no final answer.
However, some Boolean combiners are known to have a high level of correlation immunity.
Correlation immune functions
One of the classes of such “good” functions – Latin squares.
A Latin square is an n×n array of integers, filled with n different symbols, in which each element appears exactly once in each row and in each column.
Correlation immune functions
Basic property of Latin squares:
• If we exchange two rows/columns of a Latin square, the obtained scheme is also a Latin square.
This gives rise to a construction (one of the possible algorithms):
• We start from the table of addition of the additive group with n elements.
• We exchange some rows and columns of the table several times.
Correlation immune functions
Example – a Latin square of order 4:
3 2 0 1
1 0 2 3
0 3 1 2
2 1 3 0
Correlation immune functions
A Latin square of dimension n as a family of $\log_2 n$ Boolean functions (a vectorial Boolean function with $\log_2 n$ outputs):
• There are 2 address branches, $\log_2 n$ bits each
• The output has $\log_2 n$ bits.
Example (see previous slide):
• The address is 0110 (the two most significant bits address the row): row 01, column 10.
• The output is 10.
Correlation immune functions
Basic correlation-related property of Latin squares:
• Each bit of output is correlated with a linear combination of inputs that are located in both address branches.
• Consequence: there is no way of analyzing the address branches individually – no divide and conquer.
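The construction and the addressing scheme described above can be sketched as follows. The function names, the number of exchanges and the fixed random seed are assumptions of this illustration; rows and columns are numbered from 0.

```python
import random


def latin_square(n, swaps=10, seed=0):
    """Addition table of Z_n followed by random row and column exchanges."""
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    rng = random.Random(seed)
    for _ in range(swaps):
        i, j = rng.randrange(n), rng.randrange(n)
        square[i], square[j] = square[j], square[i]      # exchange two rows
        i, j = rng.randrange(n), rng.randrange(n)
        for row in square:                               # exchange two columns
            row[i], row[j] = row[j], row[i]
    return square


def is_latin(square):
    """Every symbol appears exactly once in each row and in each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def lookup(square, address, bits):
    """High `bits` of the address select the row, low `bits` the column."""
    row = address >> bits
    col = address & ((1 << bits) - 1)
    return square[row][col]


SLIDE_SQUARE = [[3, 2, 0, 1],
                [1, 0, 2, 3],
                [0, 3, 1, 2],
                [2, 1, 3, 0]]

print(is_latin(latin_square(8)), format(lookup(SLIDE_SQUARE, 0b0110, 2), "02b"))
```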
Correlation immune functions
Decimation of sequences
The principal characteristic:
• The output sequence of a subgenerator controls the clock sequence of one or more subgenerators.
$$f(n) = n + \sum_{i=0}^{n} Y_i, \qquad Z_n = X_{f(n)}, \qquad n = 0, 1, 2, \dots$$
Decimation of sequences
Example 1:
• X=1,1,0,1,0,1,0,1
• Y=0,1,0,0,1
• Z=1,0,1,0,0
Example 2:
• X and Y are generated by LFSRs and the BRM is applied
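The decimation rule Z_n = X_{f(n)} with f(n) = n + Y_0 + ... + Y_n can be sketched in a few lines and checked against Example 1. The function name and the choice to stop as soon as the index runs past the available part of X are assumptions of this illustration.

```python
def brm_decimate(x, y):
    """Return Z with Z[n] = X[n + Y[0] + ... + Y[n]] while the index stays inside X."""
    z = []
    ones_so_far = 0
    for n, y_n in enumerate(y):
        ones_so_far += y_n
        index = n + ones_so_far
        if index >= len(x):
            break
        z.append(x[index])
    return z


X = [1, 1, 0, 1, 0, 1, 0, 1]
Y = [0, 1, 0, 0, 1]
print(brm_decimate(X, Y))
```

Each 1 in the control sequence Y makes the decimated register advance one extra step, so the corresponding bit of X is skipped.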
Decimation of sequences
Theorem (Chambers, Jennings, 1984)
• R1, R2 – primitive polynomials, degrees m and n, respectively
• Periods $M=2^m-1$ and $N=2^n-1$
• All the prime factors of M divide N
• Then:
$$\gcd\left(\sum_{i=0}^{M-1} X_i,\ N\right) = 1, \qquad Per = MN, \qquad LC = nM$$
Decimation of sequences
The requirements of the Theorem are satisfied if the lengths of both LFSRs are equal and the feedback polynomials are primitive.
Decimation of sequences
Example: n=m=107, primitive polynomials
$LC = nM = 107\,(2^{107}-1)$
$Per = NM = (2^{107}-1)(2^{107}-1)$
The shrinking generator (1993)
A very simple binary sequence generator (Crypto’93)
It consists of two LFSRs: LFSR1 and LFSR2
Based on the decimation rule P, LFSR1 (the control register) decimates the sequence generated by LFSR2
[Figure: a common clock drives LFSR 1 (output $a_i$) and LFSR 2 (output $b_i$); the rule P produces the output $c_j$.]
The shrinking generator
If ai=0, bi is discarded, otherwise bi is sent to the output.
Thus the number of discarded bits from the sequence b depends on the lengths of runs of 0s in the sequence a.
The shrinking generator (an example)
LFSRs:
LFSR1: L1=3, $f_1(x)=1+x^2+x^3$, IS1=(1,0,0)
LFSR2: L2=4, $f_2(x)=1+x+x^4$, IS2=(1,0,0,0)
Decimation rule P: bi is sent to the output only if ai=1.
{ai}= 0 1 1 1 0 0 1 0 1 1 1 0 0 1 …
{bi}= 1 1 1 0 1 0 1 1 0 0 1 0 0 0 …
{cj}= 1 1 0 1 0 0 1 0 …
The bits of {bi} at the positions where ai=0 (underlined on the original slide) are eliminated.
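The decimation rule of the shrinking generator is short enough to check against the sequences listed above. The sketch below works directly on the listed bits of {ai} and {bi} rather than simulating the two LFSRs, because the slides do not specify the register wiring; the function name is an assumption of this illustration.

```python
def shrink(control, data):
    """Keep data[i] only at the positions where control[i] == 1."""
    return [b for a, b in zip(control, data) if a == 1]


a = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
b = [1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
print(shrink(a, b))
```

On average half of the bits of {bi} are discarded, so the output rate is irregular unless the generator is buffered.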
Characteristics of the output sequence
Period: $T = (2^{L_2}-1)\,2^{(L_1-1)}$
Linear complexity: $L_2\,2^{(L_1-2)} < LC \le L_2\,2^{(L_1-1)}$
Number of 1’s: $2^{(L_2-1)}\,2^{(L_1-1)}$ (balanced sequence)
Example – BRM vs. Shrinking
BRM:
• X=000100110101111…
• Y=001110100111010…
• Z=0010100111…
Shrinking:
• X=000100110101111…
• Y=001110100111010…
• Z=01011011
Statistical testing of PN generators
The output sequence of a generator of pseudorandom sequences looks random, but it is not.
Pseudorandom generators expand a truly random sequence (the key) to a much longer sequence, such that an adversary cannot distinguish between the pseudorandom sequence and a truly random sequence.
Statistical testing of PN generators
In order to obtain a guarantee of the security of this type of generators various statistical tests are applied, especially designed for this purpose.
The fact that a generator passes a set of statistical tests should be considered a necessary condition, although not a sufficient one, for the security of the generator.
Statistical testing of PN generators
If the result X of an experiment can take any real value, then X is a continuous random variable.
The probability density function f(x) of a continuous random variable X can be integrated and the following holds:
• $f(x)\ge 0$, for all $x\in\mathbb{R}$
• $\int_{-\infty}^{\infty} f(x)\,dx = 1$
• For all $a, b\in\mathbb{R}$ the following holds: $P(a < X \le b) = \int_a^b f(x)\,dx$
Statistical testing of PN generators
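A continuous random variable X has a normal distribution with the mean $\mu$ and the variance $\sigma^2$ if its probability density function is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty$$
We say that X is $N(\mu,\sigma^2)$. If X is $N(0,1)$, then we say that X has a standard normal distribution.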
Statistical testing of PN generators
If the random variable X is $N(\mu,\sigma^2)$, then the variable $Z=(X-\mu)/\sigma$ is $N(0,1)$.
Euler’s gamma function:
$$\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\,dx$$
Statistical testing of PN generators
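A continuous random variable X has a $\chi^2$ distribution with $\nu$ degrees of freedom if its probability density function is
$$f(x) = \begin{cases} \dfrac{1}{\Gamma(\nu/2)\,2^{\nu/2}}\, x^{(\nu/2)-1} e^{-x/2}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$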
Statistical testing of PN generators
A statistical hypothesis H is an affirmation about the distribution of one or more random variables.
A hypothesis test is a procedure based on the observed values of the random variable that leads to the acceptance or rejection of the hypothesis H.
Statistical testing of PN generators
The test only provides a measure of the strength of evidence given by the data against the hypothesis.
The conclusion is probabilistic.
The level of significance $\alpha$ of the test of the hypothesis H is the probability of rejecting the hypothesis H when it is true.
Statistical testing of PN generators
The hypothesis to be tested is called the null hypothesis, H0.
The alternative hypothesis is denoted by H1 or Ha.
In cryptography:
• H0 – the given generator is a random sequence generator.
Statistical testing of PN generators
If $\alpha$ is too small, the test could accept non random sequences.
If $\alpha$ is too high, the test could reject random sequences.
In cryptography: $\alpha$ is between 0.001 and 0.05.
Statistical testing of PN generators
A test:
• Determines a statistic for the sample of the output sequence.
• This statistic is compared with the expected value of a random sequence.
Statistical testing of PN generators
How is the comparison carried out?
• The computed statistic – X0 – follows a $\chi^2$ distribution with $\nu$ degrees of freedom.
• It is assumed that this statistic takes large values for non random sequences.
• In order to achieve the level of significance $\alpha$, a threshold $X_\alpha$ is chosen (by means of the corresponding table), such that $P(X_0 > X_\alpha) = \alpha$.
Statistical testing of PN generators
How is the comparison carried out? (cont.)
• If the value of the statistic for the sample of the output sequence, Xs, satisfies $X_s > X_\alpha$, then the sequence fails the test.
Basic tests for cryptographic use:
• Frequency test, serial test, poker test, runs test, autocorrelation test, etc.
Statistical testing of PN generators
Frequency test
• Purpose: determine if the number of zeros and ones in a sequence s is approximately the same.
• n0 – number of zeros, n1 – number of ones, $n = n_0 + n_1$.
• The statistic:
$$X_1 = \frac{(n_0-n_1)^2}{n}$$
Statistical testing of PN generators
Frequency test (cont.)
• The statistic follows a $\chi^2$ distribution with 1 degree of freedom.
• The approximation is good enough if $n \ge 10$.
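A minimal sketch of the frequency test follows. The 160-bit sample is an arbitrary illustrative sequence, the function name is hypothetical, and the threshold 3.8415 is the tabulated upper 5% point of the chi-square distribution with 1 degree of freedom, so the level of significance 0.05 is an assumption of this example.

```python
CHI2_1DF_ALPHA_005 = 3.8415   # tabulated threshold for alpha = 0.05, 1 degree of freedom


def frequency_statistic(bits):
    """X1 = (n0 - n1)^2 / n."""
    n0 = bits.count(0)
    n1 = bits.count(1)
    return (n0 - n1) ** 2 / len(bits)


pattern = "11100" "01100" "01000" "10100" "11101" "11100" "10010" "01001"
sample = [int(ch) for ch in pattern * 4]      # n = 160
x1 = frequency_statistic(sample)
print(x1, "pass" if x1 <= CHI2_1DF_ALPHA_005 else "fail")
```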
Statistical testing of PN generators
Serial test
• Tries to determine if the number of occurrences of 00, 01, 10 and 11, as subsequences of s is approximately the same.
• The subsequences may overlap, so $n_{00}+n_{01}+n_{10}+n_{11} = n-1$.
• The statistic:
$$X_2 = \frac{4}{n-1}\left(n_{00}^2+n_{01}^2+n_{10}^2+n_{11}^2\right) - \frac{2}{n}\left(n_0^2+n_1^2\right) + 1$$
• The statistic follows a $\chi^2$ distribution with 2 degrees of freedom.
• The approximation is good enough if $n \ge 21$.
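The serial statistic can be sketched as below, counting overlapping pairs. The function name is hypothetical, and the 4-bit input is only a toy case that is easy to verify by hand, far shorter than the sequence length the approximation requires.

```python
def serial_statistic(bits):
    """X2 computed from the counts of overlapping pairs 00, 01, 10, 11."""
    n = len(bits)
    n0 = bits.count(0)
    n1 = n - n0
    pairs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for left, right in zip(bits, bits[1:]):
        pairs[(left, right)] += 1
    sum_of_squares = sum(count * count for count in pairs.values())
    return 4 / (n - 1) * sum_of_squares - 2 / n * (n0 * n0 + n1 * n1) + 1


# Pairs in 0011: 00, 01, 11, so X2 = (4/3)*3 - (2/4)*8 + 1 = 1.
print(serial_statistic([0, 0, 1, 1]))
```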
Statistical testing of PN generators
Poker test
• A positive integer m is considered such that $k = \lfloor n/m \rfloor \ge 5\cdot 2^m$.
• The sequence s is divided into k parts of size m.
• ni is the number of occurrences of the type i of the sequence of length m, $1\le i\le 2^m$ (that is, i is the value of the integer whose binary representation is the sequence of length m).
• The test determines if every sequence of length m appears approximately the same number of times.
Statistical testing of PN generators
Poker test (cont.)
• The statistic:
$$X_3 = \frac{2^m}{k}\left(\sum_{i=1}^{2^m} n_i^2\right) - k$$
• The statistic follows approximately a $\chi^2$ distribution with $2^m-1$ degrees of freedom.
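A minimal sketch of the poker statistic is given below. The function name is hypothetical, block types are indexed from 0 instead of 1, and the 8-bit inputs are toy cases for hand verification that do not satisfy the condition on k stated above.

```python
def poker_statistic(bits, m):
    """X3 computed from the counts of the 2^m possible non-overlapping m-bit blocks."""
    k = len(bits) // m
    counts = [0] * (1 << m)
    for j in range(k):
        block = bits[j * m:(j + 1) * m]
        value = int("".join(map(str, block)), 2)   # integer whose binary form is the block
        counts[value] += 1
    return (1 << m) / k * sum(c * c for c in counts) - k


print(poker_statistic([0, 0, 0, 1, 1, 0, 1, 1], 2))   # every 2-bit block appears once
print(poker_statistic([0] * 8, 2))                    # only the block 00 appears
```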
Statistical testing of PN generators
Runs test
• A run of length i – a subsequence of s formed by i consecutive zeros or i consecutive ones that are neither preceded nor followed by the same symbol.
• A run of zeros – gap
• A run of ones – block
Statistical testing of PN generators
Runs test (cont.)
• Purpose: determine if the number of runs of different lengths in the sequence s is that expected in a random sequence.
• The expected number of gaps (or blocks) of length i in a random sequence of length n is
$$e_i = \frac{n-i+3}{2^{i+2}}$$
• It is considered that k is equal to the largest integer i for which $e_i \ge 5$.
• We denote by Bi and Hi the number of blocks and gaps of length i in s, for each i, $1\le i\le k$.
Statistical testing of PN generators
Runs test (cont.)
• The statistic
$$X_4 = \sum_{i=1}^{k} \frac{(B_i-e_i)^2}{e_i} + \sum_{i=1}^{k} \frac{(H_i-e_i)^2}{e_i}$$
• The statistic follows approximately a $\chi^2$ distribution with 2k-2 degrees of freedom.
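The runs statistic can be sketched as follows, using `itertools.groupby` to split the sequence into runs. The function name and the return value (the statistic together with k) are choices made for this illustration; the alternating input is a deliberately non random toy case.

```python
from itertools import groupby


def runs_statistic(bits):
    """Return (X4, k) for the runs test."""
    n = len(bits)

    def expected(i):
        # expected number of gaps (or blocks) of length i
        return (n - i + 3) / 2 ** (i + 2)

    k = 0
    while expected(k + 1) >= 5:
        k += 1
    blocks = [0] * (k + 1)   # blocks[i]: runs of ones of length i
    gaps = [0] * (k + 1)     # gaps[i]: runs of zeros of length i
    for bit, group in groupby(bits):
        length = sum(1 for _ in group)
        if length <= k:
            (blocks if bit == 1 else gaps)[length] += 1
    x4 = 0.0
    for i in range(1, k + 1):
        e = expected(i)
        x4 += (blocks[i] - e) ** 2 / e + (gaps[i] - e) ** 2 / e
    return x4, k


# 40 alternating bits: 20 gaps and 20 blocks of length 1, far more than e_1 = 5.25.
print(runs_statistic([0, 1] * 20))
```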
Statistical testing of PN generators
Autocorrelation test
• Checks the correlation between s and shifted versions of s.
• An integer d, $1\le d\le \lfloor n/2\rfloor$ is considered.
• The number of bits in s that are not equal to the d-shifts is
$$A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$$
Statistical testing of PN generators
Autocorrelation test (cont.)
• The statistic
$$X_5 = \frac{2\left(A(d) - \frac{n-d}{2}\right)}{\sqrt{n-d}}$$
• The statistic follows approximately a N(0,1) distribution.
• The approximation is good enough if $n-d \ge 10$.
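A minimal sketch of the autocorrelation statistic follows. The function name is hypothetical and the alternating input is a toy case. Because the statistic is approximately N(0,1), both large positive and large negative values point to non randomness; comparing its absolute value with 1.96 (a two-sided test at the 0.05 level) is an assumption of this example rather than something stated on the slides.

```python
import math


def autocorrelation_statistic(bits, d):
    """X5 for shift d, based on A(d) = number of positions where s[i] != s[i+d]."""
    n = len(bits)
    a_d = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    return 2 * (a_d - (n - d) / 2) / math.sqrt(n - d)


alternating = [0, 1] * 10
for d in (1, 2):
    x5 = autocorrelation_statistic(alternating, d)
    print(d, round(x5, 4), "fail" if abs(x5) > 1.96 else "pass")
```

For the alternating sequence every bit differs from its neighbour (d=1) and equals the bit two positions away (d=2), so the statistic is strongly positive in the first case and strongly negative in the second.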