sponge-based pseudo-random number generators · 2010. 5. 12. · sponge-based pseudo-random number...

15
Sponge-based pseudo-random number generators Guido Bertoni 1 , Joan Daemen 1 , Micha¨ el Peeters 2 , and Gilles Van Assche 1 1 STMicroelectronics 2 NXP Semiconductors Abstract. This paper proposes a new construction for the generation of pseudo-random numbers. The construction is based on sponge functions and is suitable for embedded security devices as it requires few resources. We propose a model for such generators and explain how to define one on top of a sponge function. The construction is a novel way to use a sponge function, and inputs and outputs blocks in a continuous fashion, allowing to interleave the feed of seeding material with the fetch of pseudo-random numbers without latency. We describe the consequences of the sponge indifferentiability results to this construction and study the resistance of the construction against generic state recovery attacks. Finally, we propose a concrete example based on a member of the Keccak family with small width. Keywords: pseudo-random numbers, hash function, stream cipher, sponge func- tion, indifferentiability, embedded security device, Keccak 1 Introduction In various cryptographic applications and protocols, random numbers are used to generate keys or unpredictable challenges. While randomness can be extracted from a physical source, it is often necessary to provide many more bits than the entropy of the physical source. A pseudo-random number generator (PRNG) provides a way to do so. It is initialized with a seed, generated in a secret or truly random way, and it then expands the seed into a sequence of bits. For cryptographic purposes, it is required that the generated bits cannot be predicted, even if subsets of the sequence are revealed. In this context, a PRNG is pretty similar to a stream cipher. If the key is unknown, it must be infeasible to infer anything on the key stream, even if it is partially known. The state of the PRNG must have sufficient entropy, from the point of view of the adversary, so that the prediction of the output bits cannot rely on simply guessing the state. Hence, the seeding material must provide sufficient entropy. Physical sources of randomness usually provide seeding material with relatively low entropy rate due to imbalance of or correlations between bits. To increase entropy, one may use the seeding material from several randomness sources. However, this entropy must be transferred to the finite state of the PRNG. Hence, we need a way to gather and combine seeding material coming from several sources into the state of the PRNG. Loading different seeds into the

Upload: others

Post on 08-Oct-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

Sponge-based pseudo-random number generators

Guido Bertoni1, Joan Daemen1, Michael Peeters2, and Gilles Van Assche1

1 STMicroelectronics2 NXP Semiconductors

Abstract. This paper proposes a new construction for the generation ofpseudo-random numbers. The construction is based on sponge functionsand is suitable for embedded security devices as it requires few resources.We propose a model for such generators and explain how to define one ontop of a sponge function. The construction is a novel way to use a spongefunction, and inputs and outputs blocks in a continuous fashion, allowingto interleave the feed of seeding material with the fetch of pseudo-randomnumbers without latency. We describe the consequences of the spongeindifferentiability results to this construction and study the resistanceof the construction against generic state recovery attacks. Finally, wepropose a concrete example based on a member of the Keccak familywith small width.

Keywords: pseudo-random numbers, hash function, stream cipher, sponge func-tion, indifferentiability, embedded security device, Keccak

1 Introduction

In various cryptographic applications and protocols, random numbers are usedto generate keys or unpredictable challenges. While randomness can be extractedfrom a physical source, it is often necessary to provide many more bits than theentropy of the physical source. A pseudo-random number generator (PRNG)provides a way to do so. It is initialized with a seed, generated in a secret ortruly random way, and it then expands the seed into a sequence of bits.

For cryptographic purposes, it is required that the generated bits cannot bepredicted, even if subsets of the sequence are revealed. In this context, a PRNGis pretty similar to a stream cipher. If the key is unknown, it must be infeasibleto infer anything on the key stream, even if it is partially known.

The state of the PRNG must have sufficient entropy, from the point of viewof the adversary, so that the prediction of the output bits cannot rely on simplyguessing the state. Hence, the seeding material must provide sufficient entropy.Physical sources of randomness usually provide seeding material with relativelylow entropy rate due to imbalance of or correlations between bits. To increaseentropy, one may use the seeding material from several randomness sources.However, this entropy must be transferred to the finite state of the PRNG.Hence, we need a way to gather and combine seeding material coming fromseveral sources into the state of the PRNG. Loading different seeds into the

Page 2: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

PRNG shall result in different output sequences. The latter implies that differentseeds result in different state values. In this respect, a PRNG is similar to acryptographic hash function that should be collision-resistant.

It is convenient for a pseudo-random number generator to be reseedable, i.e.,one can bring an additional source of entropy after pseudo-random bits have beengenerated. Instead of throwing away the current state of the PRNG, reseedingcombines the current state of the generator with the new seeding material. Froma user’s point of view, a reseedable PRNG can be seen as a black box with aninterface to request pseudo-random bits and an interface to provide fresh seeds.

The remainder of this paper is organized as follows. We continue our introduc-tion with the advantages and limitations of our construction and an illustrativeexample of a pseudo-random number generator mode of a hash function. Wethen define the reference model of a reseedable PRNG in Section 2 and specifyand motivate our sponge-based construction in Section 3. We discuss the securityaspects of our proposal in Section 4 and provide a concrete example in Section 5.

1.1 Advantages and limitations of our construction

With their variable-length input and variable-length output, sponge functionscombine in a unified way the functionality of hash functions and stream ciphers.They make therefore a natural candidate for building PRNGs, taking the seedingmaterial as input and producing a sequence of pseudo-random bits as output.

In this paper, we provide a clean and efficient way to construct a reseed-able PRNG with a sponge function. The main idea is to integrate in the sameconstruction the combination of the various sources of seeding material and thegeneration of pseudo-random output bits. The only requirement for seeding ma-terial is to be available as bit sequences, which can be presented as such withoutany additional preprocessing. So both seeding and random generation can workin a continuous fashion, making the implementation simple and avoiding extraiterations when providing additional seeding material.

In the context of an embedded security device, the efficiency and the sim-plicity of the implementation is important. In our construction we can keep thestate size small thanks to two reasons. First, the use of a permutation preservesthe entropy of the state (see Section 1.2). Second, we have strong bounds on theexpected complexity of generic state recovery attacks (see Section 4.2).

Making sure that the seeding material provides enough entropy is out of scopeof this paper. This aspect has been studied in the literature, e.g., [10,16] andis fairly orthogonal to the problem of combining various sources and generatingpseudo-random bits.

In our construction, forward security must be explicitly activated. Forwardsecurity (also called forward secrecy) requires that the compromise of the currentstate does not enable the attacker to determine the previously generated pseudo-random bits [2,9]. As our construction is based on a permutation, revealing thestate immediately allows the attacker to backtrack the generation up to the pre-vious combination of that state and seeding material. Nevertheless, reseeding

2

Page 3: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

regularly with sufficient entropy already prevents the attacker from going back-wards. Also, an embedded security device such as a smartcard in which such aPRNG would be used is designed to protect the secrecy of keys and thereforereading out the state is expected to be difficult. Yet, we propose in Section 4.3a simple solution to get forward secrecy at a small extra cost. Hence, if forwardsecurity is required, one can apply this mechanism at regular intervals.

1.2 Using a hash function for pseudo-random number generation

Sponge functions are a generalization of hash functions and using the latterfor generating pseudo-random bits is not new, e.g., [12,14]. For instance, NISTpublished a recommendation for random number generation using deterministicrandom bits generators [14]. They specify how to implement a PRNG using ahash function, a keyed hash function, a block cipher or an elliptic curve. Whenusing a hash function H, the state of the PRNG is essentially determined by twovalues, V and C, each of the size of the input block of H.

– At initialization, both V and C are obtained by hashing the seeding material,a nonce and an optional personalization string. If V and C are larger thanthe output size of H, a specific derivation function is used to produce alonger digest.

– The pseudo-random bits are produced by hashing V . If more than one outputblock is requested, further blocks are produced by hashing V + i, where iis the index of the produced output block. The value of V is then updatedby combining it with, amongst others, H(V ) and C. The value C is notmodified in this process.

– When reseeding, the new value of V is obtained by hashing the old value ofV together with the new seeding material. The value C is derived from thenew value of V by hashing.

For a PRNG based on a hash function, there are two aspects we wish to drawattention to.

First, due to the requirements they must satisfy, cryptographic hash func-tion are not injective. Iterating the function, i.e., computing H(H(. . . H(x)) . . . )reduces the size of the range resulting in entropy loss. To prevent this, one canfor instance keep the original seed along with the evolving state. In the hashfunction based PRNG specified in [14], the value V evolves by iterated hashingevery time output bits are produced, but the value C does not and thereforekeeps the full entropy of the seed. This comes at the cost of keeping a statetwice the block size of the hash function.

Second, when reseeding, the current state or the original seed must be hashedtogether with the seeding material. However, the current state V and the seedC are already the result of a hashing process.

The sponge-based construction we propose below addresses these two aspectsmore efficiently. First, by using a P-sponge, i.e., a sponge function based on apermutation, no entropy is lost when iterating the permutation and this allows

3

Page 4: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

one to have a smaller state for the same security level. Second, the current stateof our construction is precisely the state of the sponge function. Hence, reseedingis more efficient than in the example above, as the current state can be reusedimmediately instead of being hashed again.

Finally, the use of a sponge function for PRNG is conceptually simpler thanexisting constructions.

2 Modeling a reseedable pseudo-random numbergenerator

We define a reseedable PRNG as a stateful entity that supports two types ofrequests, in any order:

– feed request, feed(σ), injects a seed consisting of a non-empty string σ ∈ Z+2

into the state of the PRNG;– fetch request, fetch(l), instructs the PRNG to return l bits.

The seeding material is the concatenation of the σ’s received in all feed requests.Informally, the requirements for a reseedable PRNG can be stated as follows.

First, its output (i.e., responses to fetch requests) must depend on all seedingmaterial fed (i.e., payload of feed requests). Second, for an adversary not know-ing the seeding material and that has observed part of the output, it must beinfeasible to infer anything on the remaining part of the output.

To have more formal security requirements, one often defines a referencesystem that behaves ideally. For sponge functions, hash functions and streamciphers the appropriate reference system is the random oracle [1]. For reseedablePRNG we cannot just use a random oracle as it has a different interface. However,we define an ideal PRNG as a particular mode of use of a random oracle.

The mode we define is the following. It keeps as state the sequence of allfeed and fetch requests received, the history h. Upon receipt of a feed requestfeed(σ), it updates the history by incorporating it. Upon receipt of a fetchrequest fetch(l), it queries the random oracle with a string that encodes thehistory and returns the bits z to z + l − 1 of its response to the requester, withz the number of bits requested in the fetch requests since the last feed request.Hence, concatenating the responses of a run of fetch requests is just the responseof the random oracle to a single query. This is illustrated in Figure 1. We call thismode the history-keeping mode with encoding function e(h). The definition of ahistory-keeping mode hence reduces to the definition of this encoding function.

As the output of the PRNG must depend on the whole seeding materialreceived, the encoding function e(h) must be injective in the seeding material.In other words, for any two sequences of requests with different seeding ma-terials, the two images through e(h) must be different. We call this propertyseed-completeness. With a seed-complete encoding function, the response of themode to a fetch request corresponds with non-overlapping parts of the responseof the random oracle to different input strings. It follows that the PRNG returnsindependent and a priori uniformly distributed bits.

4

Page 5: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

Fig. 1. Response of an ideal reseedable PRNG to fetch requests

We thus propose the following definition of an ideal PRNG. In the sequel,we will use PRNG to indicate a reseedable pseudo-random number generator.

Definition 1. An ideal PRNG is a history-keeping mode calling a random or-acle with an encoding function e(h) that is seed-complete.

3 Constructing a PRNG using a sponge function

In general, the history-keeping mode is not practical as it needs to store all pastqueries and hence requires ever growing amounts of memory. In this section wewill show that if we use a sponge function instead of a random oracle we candefine an encoding function that can work with a limited amount of memory.

3.1 The sponge construction

The sponge construction [3] is a simple iterated construction for building a func-tion S[f ] with variable-length input and arbitrary output length based on afixed-length transformation (or permutation) f operating on a fixed number b ofbits. Here b is called the width. A sponge function, i.e., a function implementingthe sponge construction provides a particular way to generalize hash functionsand has the same interface as a random oracle.

For given values of r and c, the sponge construction operates on a state ofb = r+c bits. The value r is called the bitrate and the value c the capacity. First,all the bits of the state are initialized to zero. The input message is padded andcut into blocks of r bits. The sponge construction then proceeds in two phases:the absorbing phase followed by the squeezing phase.

– In the absorbing phase, the r-bit input message blocks are XORed into thefirst r bits of the state, interleaved with applications of the function f . Whenall message blocks are processed, the sponge construction switches to thesqueezing phase.

– In the squeezing phase, the first r bits of the state are returned as outputblocks, interleaved with applications of the function f . The number of outputblocks is chosen at will by the user.

5

Page 6: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

The last c bits of the state are never directly affected by the input blocks andare never output during the squeezing phase. The capacity c actually determinesthe attainable security level of the construction [4].

3.2 Reusing the state for multiple feed and fetch phases

It seems natural to translate the feed of seeding material into the absorbingphase and the fetch of pseudo-random numbers into the squeezing phase of asponge function, as illustrated in Figure 2. However, as such, a sponge functionexecution has only one absorbing phase (i.e., one input), followed by a singlesqueezing phase (i.e., one output, of arbitrary length), and thus cannot be usedto provide multiple “absorbing” phases and multiple “squeezing” phases.

Fig. 2. The sponge construction with multiple feed and fetch phases.

This apparent difficulty is easy to circumvent. Conceptually, it suffices toconsider that each time pseudo-random bits are fetched, a different execution ofthe sponge function is queried with a different input, as illustrated in Figure 3.When entering the squeezing phase of each of these queries (so before pseudo-random bits are requested), one must thus guarantee that the data absorbed sofar compose a valid sponge input, i.e., the input is properly padded [3]. This canbe achieved by defining an encoding function adapted to the particular sponge.

In the sponge construction, an input message m ∈ Z∗2 must be cut into blocksof r bits and padded. Let us denote as p(m) the function that does this, and weassume that this function only appends bits after m (as in the padding of most,if not all, practical hash functions). Let us assume that we wish to reuse thestate of the sponge whose input was the string m1 and from which l > 0 outputbits have been squeezed. The state of the sponge function at this point is as ifthe partial message m′1 = p(m1)||0r(dl/re−1) was absorbed. Note that the zeroblocks account for the extra iterations due to the squeezing phase. Restartingthe sponge from this point means that the input is going to be a message m2 ofwhich m′1 is a prefix.

6

Page 7: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

Fig. 3. The multiple feed and fetch phases of Figure 2 can be viewed as a spongefunction queried multiple times, each having only one absorbing and one squeez-ing phase. In this example, P0||P1, P0||P1||P2 and P0||P1||P2||0r||P3 must all bevalid sponge inputs.

3.3 Constructing a reseedable pseudo-random number generator

To define a PRNG formally, we need to specify a seed-complete encoding functione(h) that maps the sequence h of feed and fetch requests onto a string of bits,as in Section 2. The output of e(h) is then used as input to the sponge function.In practice, the idea is not to call the sponge function with the whole e(h) everytime a fetch is requested. Instead, the construction uses the sponge function ina cascaded way, reusing the state as explained in Section 3.2. To allow the stateof the sponge function to be reused as described above, e(h) must be such thatif h′ = h||fetch(l)||feed(σ), then p(e(h))||0r(dl/re−1) is a prefix of e(h′).

We now explain how to link a mode to a practical implementation. To makethe description easier, we describe a mode with two restrictions. We later discusshow to implement a more elaborate mode without these restrictions. The firstrestriction is on the length of the seed requests. For a fixed integer k, we requirethat the length of the seeding material σ in any feed request feed(σ) is such that|p(σ)| = kr. In other words, after padding, the seeding material covers exactlyk blocks of r bits. The second restriction is that the first request must be feed.

7

Page 8: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

The mode is stateful, and its state is composed of m ∈ N, the number of bitsfetched since the last feed. We start with a new execution of a sponge function,and we set m = 0. Depending on the type of requests, the following operationsare done on the sponge function on the one hand and on the encoding functione(h) on the other. We denote by e a string that reflects e(h) as the requests areappended to the history h.

– If the request is fetch(l), the following is done.

• The implementation produces l output bits by squeezing them from thesponge function. Formally, e will be adapted during the next feed request.

• The value of m is adapted: m← m+ l.

– If the request is feed(σ), the following is done.

• Formally, this feed request triggers a query to the sponge function withe as input. If it is not the first request, e is up-to-date only up to the lastfeed request. So, the effect of the fetch requests since the last feed requestmust be incorporated into e, as if e was previously absorbed. First, ebecomes p(e) to simulate the padding when switching to the squeezingphase after the previous feed request. Then dm/re− 1 blocks of r zeroesare appended to e to account for the extra calls to the f function duringthe subsequent fetch requests. Now m is reset: m← 0. (This part affectsonly e formally; nothing needs to be done in the implementation.)

• Then, the implementation absorbs σ. Formally, this is reflected by ap-pending σ to e.

• Finally, the implementation switches the sponge function to the squeez-ing phase. This means that the absorbed data must be padded and thepermutation f is applied to the state. (Formally, this does not change e,as the padding is by definition performed when switching to the squeez-ing phase.)

To show that the encoding function is seed-complete, let us demonstratehow to find the seeding material from it. If e(h) is empty, no feed request hasbeen done and the seeding material is the empty string. If e(h) is not empty,it necessarily ends with the fixed amount of seeding material from the last feedrequest, which we extract. Before that, there can be one or more blocks of r bitsequal to zero. This can only come from blocks that simulate fetch requests, asthe padding function p would necessarily create a non-zero block. So, we can skipbackwards consecutive blocks of zeroes, until the beginning of e(h) is reachedor a non-zero block is encountered. In this last case, we can extract the seedingmaterial from the k blocks of r bits and move backwards by the same amount.Finally, we repeat this process until the beginning of e(h) is reached.

The construction, described directly on top of the permutation f , is given inAlgorithm 1. For completeness, we also give in Algorithm 2 an implementationof the squeezing phase of a sponge function, although it follows in a straight-forward way from the definition [3]. The cost of a feed request is always k callsto the permutation f . Consecutive fetch requests totalling m bits of output costdm/re − 1 calls to f . So a fetch(l) with l ≤ r just after a feed request is free.

8

Page 9: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

Algorithm 1 Direct implementation of the PRNG using the permutation f

s = 0r+c

m = 0while requests are received do

if the request is fetch(σ) with |p(σ)| = kr thenP1|| . . . ||Pk = p(σ)for i = 1 to k dos = s⊕ (Pi||0c)s = f(s)

end form = 0

end ifif the request is fetch(l) then

Squeeze l bits from the sponge function (see Algorithm 2)end if

end while

Algorithm 2 Implementation of the squeezing of l bits from the sponge function

Let a be the number of available bits, i.e., a = r if m = 0 or a = (−m mod r)otherwisewhile l > 0 do

if a = 0 (we need to squeeze the sponge further) thens = f(s)a = r

end ifOutput l′ = min(a, l) bits by taking bits r − a to r − a+ l′ − 1 of the stateSubtract l′ from a and from l, and add l′ to m

end while

The restriction of fixed-size feed requests is not essential and can be removed.The description of the mode would be only a bit more complex, but woulddistract the reader from the aspects of this construction that tightly integrateto a sponge function and its underlying function f . In fact, the restriction offixed-size feed requests makes it easy to ensure and to show that the encodingfunction is seed-complete. To allow for variable length seeding materials andretain seed-completeness, some form of padding within the encoding functionmust be introduced to make sure that the boundaries of the seeding materialcan be identified. Furthermore, one may have to add a way to distinguish blocksof zero-valued seeding material from zero blocks due to fetch requests. This canbe done, e.g., by putting a bit 1 in every block that contains seeding material.

The restriction of the first request being a feed request can be removed, eventhough it makes little sense generating pseudo-random bits without first feedingseeding material. If the first request is a fetch, the implementation immediatelypads the (empty string) input, switches the sponge function to the squeezingphase and produces output bits by squeezing. Formally, in the next feed request,this must be accounted for in e by setting e to p(empty string)||0r(dm/re−1).

9

Page 10: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

4 Security

Hash functions are often designed in two steps. In the first step, one choosesa mode of operation that relies on a cryptographic primitive with fixed inputsize (e.g., a compression function or a permutation) and builds a function thatcan process a message of arbitrary size. If the security of the mode of opera-tion can be proven, it then guarantees that any potential flaw can only comefrom the underlying cryptographic primitive, and thereby reduces the scope ofcryptanalysis.

We proceed similarly to assess the security of the PRNG, in two steps. First,we look at the security of the construction against generic attacks, i.e., againstattacks that do not use the specific properties of f . We do this in the followingsubsections. Then, the security of the PRNG depends on the actual function fand we give an example in Section 5.

4.1 Indifferentiability

Indifferentiability is a concept developed by Maurer, Renner and Holenstein andallows one to compare the security of a system to that of an ideal object, suchas the random oracle [11]. The system can use an underlying cryptographicprimitive (e.g., a compression function or a permutation) as a public subsystem.For instance, many hash function constructions have been proven to be indif-ferentiable from a random oracle when using an ideal compression function or arandom permutation as public subsystem (e.g., [8]).

By using indifferentiability, one can build a construction that does not haveany generic flaw, i.e., any undesired property or attack that does not rely on thespecific properties of the underlying primitive.

Theorem 1. The pseudo-random number generator P[F ] that uses a permuta-tion F is (tD, tS , N, ε)-indifferentiable from an ideal PRNG, for any tD, tS =O(N2), N < 2c and any ε with ε > N2/2c+1 when 1� N .

Proof. The proof follows immediately from [4, Theorem 2], where the (tD, tS , N, ε)-indifferentiability is proven between the sponge construction and a random or-acle. In [4, Theorem 2], the adversary has access to two interfaces: one to thepermutation F or its simulator, and one to input a message m ∈ Z∗2. In thecontext of this theorem, the same settings apply, except that the adversary doesnot have a direct access to the latter interface but only through the encodingfunction e(h). The same restriction applies both on the side of the sponge con-struction and on the side of the random oracle. Since the adversary has no betteraccess than in [4, Theorem 2], her probability of success cannot be higher. ut

Distinguishing the sponge-based PRNG calling a random permutation froman ideal PRNG defined in Section 2 takes about 2c/2 operations. In other words,the former is as secure as the latter if c is large enough.

10

Page 11: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

4.2 Resistance against state recovery

Indifferentiability provides a proof of resistance against all possible generic at-tacks on the construction. However, in practice, we can also look at the resistanceof the construction against generic attacks with a specific goal. In this case, theresistance cannot be lower than 2c/2 but may be higher.

The main purpose of the PRNG is to avoid that an adversary, who has seensome of the generated bits, can predict other values. A way to predict otheroutput bits is to recover the state of the PRNG by observing the generatedpseudo-random bits. In fact, since we use a permutation, the adversary canequivalently recover the state at any time during a fetch request. She can alsodetermine the state before or after a feed request if she can guess the seedingmaterial input during that request.

Let the state of a sponge function be denoted as (a, x), where a is theouter part (i.e., the r-bit part output during the squeezing phase) and x repre-sents inner part (i.e., the remaining c bits). Let (a0, a1, . . . , a`) be a sequence ofknown output blocks. The goal of the adversary is to find a value x0 such thatf(ai−1, xi−1) = (ai, xi) for 1 ≤ i ≤ ` and some values xi. Notice that once x0is fixed, the values xi, 1 ≤ i ≤ ` follow immediately. Furthermore, since f is apermutation, the adversary can choose to first determine xi for some index i andthen compute all the other xj 6=i from (ai, xi by applying f and f−1.

An instance of the passive state recovery problem is given by a vector (a0,a1, . . . , a`) of r-bit values. We focus on the case where such a sequence of valueswas actually observed, so that we are sure there is at least one solution. Also, weassume that there is only one solution, i.e., one value x0. This is likely if `r > c,and the probability that more than one solution exists decreases exponentiallywith `r − c. The adversary wants to determine unseen output blocks, so shewants to have only one solution anyway and will ask for more output blocks toremove any ambiguity.

The adversary can query the permutation f with values (a, x) and get f(a, x)or its inverse to get f−1(a, x). If f is a random permutation, we wish to computean upper bound on the success probability after N queries.

Theorem 2. Given an instance of the passive state recovery problem A = (a0,a1, . . . , a`) and knowing that there is one and only one solution x0, the successprobability after N queries is at most N2−cm(A), with m(A) the multiplicitydefined as

m(A) = max{mf(A),mb(A)}, with

mf(A) = maxa∈Zr

2

|{i : 0 ≤ i < ` ∧ ai = a}|, and

mb(A) = maxa∈Zr

2

|{i : 1 ≤ i ≤ ` ∧ ai = a}|.

Proof. Let F1(A) be the set of permutations f such that there is only one so-lution to the state recovery problem with instance A. For a given value (a, x),within F1(A), the inner part of f(a, x) (or f−1(a, x)) can be symmetrically cho-sen among the 2c possible values as the problem instance does not express any

11

Page 12: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

constraints on the inner parts. In other words, if x is such that the outer partof f(a, x) is b, then for any x′ 6= x there exists another permutation f ′ ∈ F1(A)where x′ is such that the outer part of f ′(a, x′) is b too. Such symmetries existalso for multiple inner values, independently of each other, as long as the corre-sponding outer values are different. E.g., if a1 6= a2 and (x1, x2) is such that theouter parts of f(ai, xi) are bi for i = 1, 2, then for any (x′1, x

′2) 6= (x1, x2) there

exists another permutation f ′ ∈ F1(A) where (x′1, x′2) verifies the same equality.

Let us first consider that ` = 1. In this case, m(A) = 1.

Let F1(A, x0, x1) be the subset of F1(A) where the value x0 is the solutionand f(a0, x0) = (a1, x1). The sets F1(A, x0, x1) partition the set F1(A) into 22c

subsets of equal size identified by x0 and x1, or in other words, x0 and x1 cutthe set in an orthogonal way.

The goal of the adversary is to determine in which subset F1(A, x0, x1) thepermutation f is. To do so, she is going to make queries of the form (a0, x0) andcheck if the outer part of f(a0, x0) is a1 (called forward queries), or she can makequeries to the inverse permutation and check if f−1(a1, x1) gives a0 as outer part(called backward queries). As the subsets F1(A, x0, x1) cut F1(A) orthogonallyin x0 and x1, forward queries help determine whether x0 is the solution butwithout reducing the set of possible values for x1, and vice-versa for backwardqueries. So, after Nf forward queries and Nb backward queries, the probabilitythat one of them gives the solution is 1− (1−Nf/2

c)(1−Nb/2c) ≤ N/2c, where

the probability is taken over all permutations f drawn uniformly from F1(A).

Let us now consider the general case where ` > 1. The reasoning can be gen-eralized in a straightforward way if all the ai are different, but some adaptationshave to be made to take into account the values appearing multiple times. Givena set of indexes {i1, . . . , im} such that ai1 = ai2 = · · · = aim , there may or maynot be constraints on the possible values that the corresponding inner valuesxi1 , xi2 , . . . , xim can take. For instance, if ai1−1 6= ai2−1 or if ai1+1 6= ai2+1, thennecessarily xi1 6= xi2 . In another example, A can be periodic, allowing the xivalues to be equal.

Let i(j, k) be a partition of the indexes 0 to ` such that ai(j,k) = ai(j′,k′)

iff j = j′, i.e., the j index identifies the subsets and the k index the indiceswithin that subset. Let F1(A, x0, x1, . . . , x`) be the subset of F1(A) such that(x0, x1, . . . , x`) is the solution. Here, the set F1(A) is again cut into subsets ofequal size if we use the n vectors (xi(j,1), . . . , xi(j,mj)) as identifiers, and each ofthese vectors cut F1(A) in an orthogonal way. (In general, however, the valuesx corresponding to identical values a do not cut F1(A) in an orthogonal way.)

The adversary can make a forward query to check whether f(ai(j,k), xi(j,k))gives ai(j,k)+1 as outer value. Using the same query, she can also check whetherf(ai(j,k′), xi(j,k)) yields ai(j,k′)+1 for any other k′ (as long as i(j, k′) < `). Thesame reasoning goes for backward queries: does f−1(ai(j,k′), xi(j,k)) yield ai(j,k′)−1for any k′ (as long as i(j, k′) > 0). So, a forward (resp. backward) query cancount as up to mf(A) (resp. mb(A)) chances to hit the correct outer value. Af-ter N queries, the probability that one of them gives the solution is at most

12

Page 13: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

m(A)N/2c, where the probability is taken over all permutations f drawn uni-formly from F1(A). ut

The previous theorem also imposes an upper bound on the success probabilityof preimage attacks, generically against a sponge function. This follows from thefact that finding a preimage implies that the state can be recovered.

This theorem covers the case of a passive adversary who observes outputblocks. Now, the PRNG implementation could allow seeding material to be pro-vided from outside, hence allowing an active adversary to absorb blocks of hischoice. This case is covered in the next theorem. We assume that the adversarycontrols the blocks bi that are injected at each iteration, i.e., the PRNG computesf(ai ⊕ bi, xi) = (ai+1, xi+1) and the adversary observes ai+1. Now an instanceof the problem is also determined by the injected blocks B = (b0, b1, . . . , b`).

Theorem 3. Given an instance of the active state recovery problem A = (a0,a1, . . . , a`), B = (b0, b1, . . . , b`) and knowing that there is one and only onesolution x0, the success probability after N queries is at most N2−c`.

Proof. The reasoning is the same as in Theorem 2, except that the queries areslightly different. In a forward query, the adversary checks if the outer partof f(ai ⊕ bi, xi) is ai+1. In a backward query, she checks if the outer part off−1(ai, xi) is ai−1⊕bi−1. Another difference is that now the forward multiplicityto be considered is

mf(A,B) = maxa⊕b∈Zr

2

|{i : 0 ≤ i < ` ∧ ai ⊕ bi = a⊕ b}|,

as one forward query can be used to check inner values at up to mf(A,B) in-dexes at once. Furthermore, the adversary can influence the multiplicity, e.g.,by making sure ai ⊕ bi is always the same value. So m(A) ≤ ` and the successprobability after N queries is at most N2−c`. ut

An active attacker can use ` = 2c/2 output blocks and the complexity of herattack is going to be N = 2c/2, a result in agreement with the indifferentia-bility result of Theorem 1. However, here we can distinguish between the datacomplexity, i.e., the available number of output data of the PRNG and the timecomplexity, the number of queries to f , of the attack. If the implementation of aPRNG limits the number of output blocks to some value `max < 2c/2, the timecomplexity of a generic attack is bounded by N = 2c/`max > 2c/2.

4.3 Forward security

Our construction does not inherently provide forward security, but it can beexplicitly triggered by using the following technique. One can fetch r′ ≤ r bitsout of the current PRNG and feed them immediately afterwards. This way, ther′ bits of the outer part of the state will be set to zero, making this process anirreversible step. By repeating this process dc/r′e times, the adversary has toguess at least c bits when evaluating the state backwards. This process can beactivated, for instance, at regular intervals.

13

Page 14: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

5 A concrete example with Keccak

Keccak is a family of sponge functions submitted to the SHA-3 contest or-ganized by NIST [13,6,7]. The family uses seven permutations ranging froma width of 25 bits to a width of 1600 bits. While the SHA-3 proposal usesKeccak-f [1600] only, other members of the family with a smaller width canbe interesting in the context of a PRNG in an embedded device. For instance,Keccak[r = 96, c = 104] and Keccak[r = 64, c = 136] both use Keccak-f [200]as underlying permutation. This permutation is suitable for devices with scarseresources as the state can be stored in only 25 bytes. In hardware it can bebuilt in a very compact core and in software it can be implemented with bitwiseBoolean instructions and rotations within bytes only. These sponge functionscan produce 96 and 64 pseudo-random bits, resp., per call to Keccak-f [200].

In terms of security, Keccak follows what is called the hermetic sponge strat-egy [7,5]. This means that the Keccak-f permutations are designed with thetarget that they cannot be distinguished from a randomly-chosen permutation.Biased output bits on one of the Keccak members, for instance, would implya distinguisher on the underlying permutation Keccak-f and would thereforecontradict the design strategy.

Against passive state recovery attacks in the generic case, Theorem 2 provesa resistance of 2c/m(A). If a sequence of `r output bits is known, the expectedvalue of m(A) is close to 1 unless ` > 2r/2. One can limit to r2r/2 the number ofoutput bits between times where the state has gained at least c bits of fresh seed-ing material. This way, Keccak[r = 96, c = 104] and Keccak[r = 64, c = 136]provides a resistance of about 2104 and 2136, resp., against state recovery, at leastas long as no distinguisher on Keccak-f [200] is found.

If the PRNG allows the user to provide seeding material, active state recoveryattacks must also be considered. Here, the implementation can limit, e.g., to`max = 224 or 232 output blocks before the state has again been fed with cbits of fresh seeding material. In this case, Keccak[r = 64, c = 136] provides aresistance of about 2112 and 2104, respectively.

We have implemented our PRNG based on Keccak[r = 96, c = 104] andKeccak[r = 64, c = 136] and passed the statistical tests proposed by NIST [15].The tests were performed on 200 sequences of 106 bits each. The sequences weregenerated by squeezing 2 × 108 bits after providing the empty string as input,namely bKeccak[r = 96, c = 104]()c2×108 and bKeccak[r = 64, c = 136]()c2×108 .

6 Conclusions

We have presented a construction for building a reseedable pseudo-random num-ber generator using a sponge function. This construction is efficient in terms ofmemory use and processing, and inherits the provable security properties of thesponge construction. We have provided bounds on generic state recovery attacksallowing the use of a small state. We have given a concrete example of sucha PRNG based on Keccak with a state of only 25 bytes that is particularlysuitable for embedded devices.

14

Page 15: Sponge-based pseudo-random number generators · 2010. 5. 12. · Sponge-based pseudo-random number generators Guido Bertoni 1, Joan Daemen , Micha el Peeters2, and Gilles Van Assche1

References

1. M. Bellare and P. Rogaway, Random oracles are practical: A paradigm for designingefficient protocols, ACM Conference on Computer and Communications Security1993 (ACM, ed.), 1993, pp. 62–73.

2. M. Bellare and B. Yee, Forward-security in private-key cryptography, CryptologyePrint Archive, Report 2001/035, 2001, http://eprint.iacr.org/.

3. G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche, Sponge functions,Ecrypt Hash Workshop 2007, May 2007, also available as public commentto NIST from http://www.csrc.nist.gov/pki/HashWorkshop/Public_Comments/

2007_May.html.4. , On the indifferentiability of the sponge construction, Advances in Cryp-

tology – Eurocrypt 2008 (N. P. Smart, ed.), Lecture Notes in Computer Science,vol. 4965, Springer, 2008, http://sponge.noekeon.org/, pp. 181–197.

5. , Cryptographic sponges, 2009, http://sponge.noekeon.org/.6. , Keccak specifications, version 2, NIST SHA-3 Submission, September

2009, http://keccak.noekeon.org/.7. , Keccak sponge function family main document, NIST SHA-3 Submission

(updated), September 2009, http://keccak.noekeon.org/.8. J. Coron, Y. Dodis, C. Malinaud, and P. Puniya, Merkle-Damgard revisited: How

to construct a hash function, Advances in Cryptology – Crypto 2005 (V. Shoup,ed.), LNCS, no. 3621, Springer-Verlag, 2005, pp. 430–448.

9. A. Desai, A. Hevia, and Y. L. Yin, A practice-oriented treatment of pseudorandomnumber generators, Advances in Cryptology – Eurocrypt 2002 (L. R. Knudsen,ed.), Lecture Notes in Computer Science, vol. 2332, Springer, 2002, pp. 368–383.

10. N. Ferguson and B. Schneier, Practical cryptography, John Wiley & Sons, 2003.11. U. Maurer, R. Renner, and C. Holenstein, Indifferentiability, impossibility results

on reductions, and applications to the random oracle methodology, Theory of Cryp-tography - TCC 2004 (M. Naor, ed.), Lecture Notes in Computer Science, no. 2951,Springer-Verlag, 2004, pp. 21–39.

12. NIST, Federal information processing standard 186-2, digital signature standard(DSS), May 1994.

13. , Announcing request for candidate algorithm nominations for a new cryp-tographic hash algorithm (SHA-3) family, Federal Register Notices 72 (2007),no. 212, 62212–62220, http://csrc.nist.gov/groups/ST/hash/index.html.

14. , NIST special publication 800-90, recommendation for random number gen-eration using deterministic random bit generators (revised), March 2007.

15. , NIST special publication 800-22, a statistical test suite for random andpseudorandom number generators for cryptographic applications (revision 1), Au-gust 2008.

16. J. Viega, Practical random number generation in software, ACSAC ’03: Proceedingsof the 19th Annual Computer Security Applications Conference (Washington, DC,USA), IEEE Computer Society, 2003, p. 129.

15