

Designs, Codes and Cryptography, 23, 173–183, 2001. © 2001 Kluwer Academic Publishers. Manufactured in The Netherlands.

On String Replacement Exponentiation

L. J. O’CONNOR∗ [email protected]
IBM Zurich Research Laboratory, Saumerstrasse 4, Rüschlikon, CH-8803, Switzerland

Communicated by: F. Piper

Received June 15, 1999; Revised December 9, 1999; Accepted January 18, 2000

Abstract. The string replacement (SR) method was recently proposed as a method for exponentiation $a^e$ in a group $G$. The canonical $k$-SR method operates by replacing a run of $i$ ones in a binary exponent, $0 < i \le k$, with $i - 1$ zeroes followed by the single digit $b = 2^i - 1$. After recoding, it was shown in [5] that the expected weight of $e$ tends to $n/4$ for $n$-bit exponents. In this paper we show that the canonical $k$-SR recoding process can be described as a regular language and then use generating functions to derive the exact probability distribution of recoded exponent weights. We also show that the canonical 2-SR recoding produces weight distributions very similar to (optimal) signed-digit recodings, but no group inversions are required.

Keywords: Exponentiation, generating functions, regular languages

1. Introduction

One of the fundamental operations in cryptography is exponentiation $a^e$ over groups such as $\mathbb{Z}_p^*$, $\mathbb{Z}_n$, general finite fields, and the group of points on an elliptic curve [3,4,17]. Many algorithms offer complexity improvements over the standard binary method [10], including the sliding-window method ([1,8,13] for example), signed-digit representations [9,11,14,16,19], the signed-window method [12] and the Lempel-Ziv recoding [20] (see [13] for a survey). The String Replacement (SR) method for exponentiation was recently proposed and analysed by Gollman, Han and Mitchell [5]. Let $1^i$ denote a run of $i$ ones, and let $1^{i \le k}$ denote a run $1^i$ where $1 \le i \le k$. The basic approach of the SR method is to select a parameter $k$ and then replace runs in the exponent $e$ of the form $1^{i \le k}$ with the string $0^{i-1}b$ where $b = 2^i - 1$, which is known as the canonical $k$-SR recoding of the exponent. The possible values of $b$ are $3, 7, \ldots, 2^k - 1$; the corresponding powers $a^b$ are precomputed, and the value of $a^e$ is then determined by using the precomputed values and a variant of the $b$-ary method.
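The method just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [5]: the function names `sr_digits` and `sr_pow` are hypothetical, the group is taken to be the integers modulo `modulus`, and the recoding is done run by run (each run of ones is cut into pieces of length at most $k$ from its most significant end).

```python
import re

def sr_digits(e, k):
    """Canonical k-SR digits of e, most significant digit first."""
    digits = []
    # Split the binary expansion into runs of ones and single zeros.
    for token in re.findall(r"1+|0", bin(e)[2:]):
        if token == "0":
            digits.append(0)
            continue
        remaining = len(token)
        while remaining > 0:
            i = min(remaining, k)              # replace a run 1^i, i <= k
            digits.extend([0] * (i - 1) + [2**i - 1])
            remaining -= i
    return digits

def sr_pow(a, e, k, modulus):
    """Compute a**e mod modulus from the canonical k-SR digits of e."""
    # Precompute a^(2^i - 1) for 1 <= i <= k; each new entry costs one
    # squaring and one multiplication: a^(2^i - 1) = (a^(2^(i-1) - 1))^2 * a.
    power = a % modulus
    table = {1: power}
    for i in range(2, k + 1):
        power = power * power * a % modulus
        table[2**i - 1] = power
    result = 1 % modulus
    for d in sr_digits(e, k):
        result = result * result % modulus     # one squaring per digit
        if d:
            result = result * table[d] % modulus
    return result
```

A call such as `sr_pow(3, 0b110111, 3, 1000003)` should agree with Python's built-in `pow(3, 0b110111, 1000003)`; the number of multiplications in the main loop equals the number of nonzero digits returned by `sr_digits`.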

∗Correspondence should be sent to Unisys (Schweiz) AG, Zeucherstrasse, 59-61 CH-8800, Thalwil, Switzerland. Email: [email protected].

As is customary, the efficiency of the canonical $k$-SR method is measured in terms of the number of squarings and multiplications required to compute $a^e$. The $(k - 1)$ values $a^{2^i - 1}$, $2 \le i \le k$, can be precomputed using the binary method at a cost of $(k - 1)$ squarings and multiplications [5]. The number of squarings to complete $a^e$ is approximately $n - 1$, which depends on the length of the most significant run in $e$. On the other hand, the number of multiplications to complete $a^e$ is $w_{k,SR}(e) - 1$, where $w_{k,SR}(e)$ is the recoded weight of $e$, or equivalently, the number of nonzero digits in the recoded exponent. In [5] it was shown that

\[ E[w_{k,SR}(e)] \sim \frac{n\,2^{k-2}}{2^k - 1} + \frac{(2^k - k - 1)\,2^{k-2}}{(2^k - 1)^2}, \tag{1} \]

and thus the expected recoded weight tends to $n/4$ as $k$ increases.

In this paper we use techniques presented in [15] to precisely analyse the distribution of exponent weights after (canonical) $k$-SR recoding. For a recoding rule $A$, the basis of the analysis in [15] is to define a bivariate generating function (bgf) $G_A(x, z) = \sum_{n,m \ge 0} a_{n,m} z^m x^n$ such that $a_{n,m}$ is the number of exponents $e$ of length $n$ that have weight $m$ after recoding according to $A$. Equivalently, $\Pr(w_A(e) = m \mid \#e = n) = a_{n,m}/2^n$, where $\#e$ is the bit length of $e$. The bgf $G_A(x, z)$ can be directly determined when the steps performed by $A$ can be carried out by a finite state machine, or equivalently, described by a regular language [7]. We will show that $k$-SR recoding is regular, and prove that

\[ G_{k,SR}(x, z) = \sum_{n,m \ge 0} a_{n,m} x^n z^m = \frac{1 - x - zx^k + zx}{1 - 2x + x^2 - zx^k + 2zx^{k+1} - zx^2}. \tag{2} \]

Given $G_{k,SR}(x, z)$ it is straightforward to examine the exact weight distribution, compute moments of the distribution and hence determine the variance of $w_{k,SR}(e)$. The authors of [5] stress the advantages of $k$-SR recoding over signed-digit recoding, where the latter requires a group inversion to be computed. Using previous results we are able to compare the weight distributions of these two methods closely and show, for example, that 2-SR encoding is almost equivalent to signed-digit encoding with respect to weight, but no inversions are required.

The paper is set out as follows. In §2 we derive the bgf $G_{k,SR}(x, z)$ by first introducing some enumeration theorems for regular languages in §2.1, and then modeling $k$-SR recoding as a regular language in §2.2. In §3 we expand the coefficients of $G_{k,SR}(x, z)$, and then examine the weight distribution and its moments. We analyse the weight distributions of 512-, 768- and 1024-bit exponents for $k$-SR recoded weight, with $k$ in the range $2 \le k \le 5$. For example, we show that a 4-SR recoding of a 512-bit exponent will have a weight between 128 and 145 with probability greater than a half, and a weight between 125 and 148 with probability at least three quarters. We also show that for $n$-bit exponents the recoded weight for unrestricted $k$ (that is, $k \ge n$) has an expectation of $(n + 1)/4$ with a variance of $(n + 1)/16$.

2. An Algebraic Treatment of SR Recoding

The goal of this section is to derive the bgf $G_{k,SR}(x, z)$ which describes the weight distribution of the canonical $k$-SR recoding process (explained in §2.2). Our approach is to model the recoding process as a regular language and then use enumeration theorems established in §2.1 to derive $G_{k,SR}(x, z)$ in §2.2. Intuitively, this correspondence between regular languages and canonical SR recodings exists since the SR recoding is based on examining runs in the exponent, and these runs can be recognized by a deterministic finite automaton.

2.1. Enumerating Regular Languages

We assume that the reader is familiar with regular languages [7]. For regular expressions $R$ and $S$, we recall the common operations of union ($R + S$), concatenation ($RS$) and Kleene closure ($R^*$, where $R^* = \sum_{k \ge 0} R^k = \varepsilon + R + RR + RRR + \cdots$, and $\varepsilon$ is the empty string). Over a binary alphabet we will call $1^k$ a $k$-run, $k \ge 1$, and any word $\omega$ that is a $k$-run will also be simply referred to as a run. A regular expression $R$ over the alphabet generates a (regular) language $L_R$. Further, let $L_R^n \subseteq L_R$ be the set of all length $n$ words in $L_R$, $n \ge 0$. We will say that the (ordinary) generating function $G_R(x) = \sum_{n \ge 0} a_n x^n$ enumerates $L_R$ by length if $a_n = |L_R^n|$ for all $n \ge 0$. Let $[x^n]$ be the operator that extracts the coefficient of $x^n$, so that $[x^n]G_R(x) = a_n$. It is clear that the regular expression $R = (1 + 0)^*$ generates the language $L_R$ which is the set of all binary strings, and since $|L_R^n| = 2^n$, $L_R$ is enumerated by the geometric series $G_R(x) = 1/(1 - 2x)$. The key property that permits $G_R(x)$ to be derived from $R$ directly is given in the next definition.

Definition 2.1. A regular expression $R$ is unambiguous if there is only one way for $R$ to generate each $\omega \in L_R$.

For example $(1 + 0)^*$ is unambiguous, but $(1 + 0 + 10)^*$ is ambiguous since the string $\omega = 10$ can be generated by concatenating 1 and 0, or simply selecting 10. Since it is known that any regular language can be generated by an unambiguous regular expression [18, p. 378], the following theorem due to Chomsky and Schützenberger [2] will be our main enumeration tool.

THEOREM 2.1. Let $R$ and $S$ be unambiguous regular expressions that are enumerated by the gfs $G_R(x)$ and $G_S(x)$. Then if $R + S$, $RS$ and $R^*$ are also unambiguous, $G_R(x) + G_S(x)$ enumerates $R + S$, $G_R(x)G_S(x)$ enumerates $RS$, and $1/(1 - G_R(x))$ enumerates $R^*$.

EXAMPLE 2.1. Observe that $R = (1 + 01)^*(\varepsilon + 0)$ unambiguously generates the set of all binary words that do not contain the subword 00. The gf for $1 + 01$ is $x + x^2$, and the gf for $\varepsilon + 0$ is $1 + x$, so by Theorem 2.1 the gf for $R$ is

\[ G_R(x) = \frac{1}{1 - (x + x^2)} \cdot (1 + x) = \frac{1 + x}{1 - x - x^2} = \sum_{n \ge 0} F_{n+2}\, x^n \]

where $F_n$ is the $n$-th Fibonacci number.
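The count in Example 2.1 is easy to confirm by exhaustive enumeration for small lengths. The sketch below is only a sanity check; the helper names are hypothetical, and the indexing $F_1 = F_2 = 1$ is assumed.

```python
from itertools import product

def count_no_double_zero(n):
    """Number of binary words of length n that avoid the subword 00."""
    return sum("00" not in "".join(w) for w in product("01", repeat=n))

def fibonacci(n):
    """F_n with F_1 = F_2 = 1 (and F_0 = 0)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# [x^n] G_R(x) should equal F_{n+2} for every length n.
for n in range(12):
    assert count_no_double_zero(n) == fibonacci(n + 2)
```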

For our purposes we need to adapt Theorem 2.1 from univariate gfs to bgfs, where a second formal variable $z$ is used to mark strings for weight, in addition to $x$ marking strings for length. The appropriate definition of weight and how $z$ will be used to mark strings will depend on the recoding algorithm under consideration. In Example 2.2 below the binary method is considered, and the appropriate weight function in this case is Hamming weight and $z$ will be used to mark the number of ones in the exponent. A typical application of Theorem 2.1 is to enumerate arbitrarily long strings by combining many smaller strings according to a rule described by an unambiguous regular expression. To move from univariate gfs to bgfs, we must then state conditions under which the weight of an arbitrarily long string is given by combining the weights of the smaller strings that unambiguously generate it.

COROLLARY 2.1. Let $R$ and $S$ be unambiguous regular expressions that are enumerated by the bgfs $G_R(x, z)$ and $G_S(x, z)$. Let $w(\cdot)$ be a weight function that obeys $w(R + S) = w(R) + w(S)$ and $w(RS) = w(R)w(S)$. If $R + S$, $RS$ and $R^*$ are also unambiguous, then $G_R(x, z) + G_S(x, z)$ enumerates $R + S$, $G_R(x, z)G_S(x, z)$ enumerates $RS$, and $1/(1 - G_R(x, z))$ enumerates $R^*$.

Proof. Since $R + S$, $RS$ and $R^*$ are unambiguous, the corresponding gfs for these operations enumerate correctly for length as in Theorem 2.1. If $w(R + S) = w(R) + w(S)$ then $R + S$ is enumerated for weight by $G_R(x, z) + G_S(x, z)$ since $[x^n z^m](G_R(x, z) + G_S(x, z)) = [x^n z^m]G_R(x, z) + [x^n z^m]G_S(x, z)$. By a similar expansion of coefficients it can be shown that if $w(RS) = w(R)w(S)$ then $G_R(x, z)G_S(x, z)$ enumerates $RS$ and $1/(1 - G_R(x, z))$ enumerates $R^*$.

We now show how to apply Corollary 2.1 to determine the bgf for the weight of the binary method.

EXAMPLE 2.2. Since the binary method processes the exponent bit-by-bit, we consider the regular expression $R = (1 + 0)^*$, which generates each binary string unambiguously bit-by-bit. The weight function of interest in this case is the Hamming weight, since this will determine the number of multiplications executed by the binary method. Let 1 be marked for weight by $z$ such that $w(1) = z$, and let 0 be marked for weight as $1 = z^0$ such that $w(0) = 1$. Then the weight function expressed through the marking by $z$ satisfies the conditions of Corollary 2.1 since $w(1 + 0) = w(1) + w(0) = z + 1 = w(0 + 1)$, $w(10) = w(1)w(0) = z \cdot 1 = z = w(01)$, $w(00) = 1 = w(0)w(0)$, and $w(11) = z^2 = w(1)w(1)$. Then we may mark $(1 + 0)$ for length and weight as $(xz + x)$, and apply Corollary 2.1 to obtain $G_R(x, z)$ as

\[ G_R(x, z) = \frac{1}{1 - (xz + x)} = \frac{1}{1 - x(1 + z)} = \sum_{n \ge 0} x^n \sum_{m \ge 0} \binom{n}{m} z^m. \tag{3} \]

Thus the number of strings of length $n$ with (Hamming) weight $m$ is $[x^n z^m]G_R(x, z) = \binom{n}{m}$, as expected.

Let $A$ be an exponent recoding algorithm such that $G_R(x, z)$ is the gf for the weight distribution of $A$, and $w_A(e)$ is the recoded weight of $e$ by $A$. An advantage of using $G_R(x, z)$ for enumeration is that the expectation and variance of $w_A(e)$ can be directly determined by manipulating $G_R(x, z)$. Using standard operations on bgfs (see for example [18, p. 138]) it can be shown that

\[ E[w_A(e)] = [x^n]\left( \left.\frac{\partial G_R(x/2, z)}{\partial z}\right|_{z=1} \right), \tag{4} \]

\[ \mathrm{Var}[w_A(e)] = [x^n]\left( \left.\frac{\partial^2 G_R(x/2, z)}{\partial z^2}\right|_{z=1} + \left.\frac{\partial G_R(x/2, z)}{\partial z}\right|_{z=1} \right) - \left( [x^n]\left( \left.\frac{\partial G_R(x/2, z)}{\partial z}\right|_{z=1} \right) \right)^2. \tag{5} \]

Thus $E[w_A(e)]$ and $\mathrm{Var}[w_A(e)]$ can be extracted by differentiating $G_R(x/2, z)$ with respect to $z$, setting $z = 1$, and determining the coefficient of $x^n$.
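As a worked illustration (not part of the original derivation), formulas (4) and (5) can be applied to the bgf (3) of the binary method, for which $G_R(x/2, z) = 1/(1 - x(1 + z)/2)$. The familiar moments of the Hamming weight of a uniform $n$-bit string come out as expected:

\[ \left.\frac{\partial G_R(x/2, z)}{\partial z}\right|_{z=1} = \frac{x/2}{(1 - x)^2}, \qquad [x^n]\,\frac{x/2}{(1 - x)^2} = \frac{n}{2}, \]

\[ \left.\frac{\partial^2 G_R(x/2, z)}{\partial z^2}\right|_{z=1} = \frac{x^2/2}{(1 - x)^3}, \qquad [x^n]\,\frac{x^2/2}{(1 - x)^3} = \frac{1}{2}\binom{n}{2} = \frac{n(n - 1)}{4}, \]

\[ E[w(e)] = \frac{n}{2}, \qquad \mathrm{Var}[w(e)] = \frac{n(n - 1)}{4} + \frac{n}{2} - \frac{n^2}{4} = \frac{n}{4}. \]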

2.2. Bgfs for SR Recoding

We now describe the $k$-SR canonical recoding process, then show how it can be modeled by a regular expression $R_{k,SR}$, and lastly derive the rules for forming $G_{k,SR}(x, z)$ from $R_{k,SR}$. The $k$-SR recoding of $e$ is in general not unique in that $e$ may have several such encodings. For the purposes of analysis, the authors of [5] defined the canonical $k$-SR recoding of $e$ as the output of the following string replacement algorithm: for $i$ from $k$ down to 2, starting from the most significant bit of $e$ and scanning right towards the least significant bit, replace $1^i$ with $0^{i-1}b_i$, $b_i = 2^i - 1$. While the canonical $k$-SR recoding is not guaranteed to produce a recoded exponent of minimal weight, the resulting recoded exponents appear to have near optimal weight [5].

EXAMPLE 2.3. Consider the 64-bit exponent $e = 14312983206104981813$ whose binary expansion is

110001101010000111101110011001000111110000000100111000010011010, (6)

and thus $w(e) = 29$. The 3-SR and 4-SR canonical recodings are given, respectively, as

030000301010000007100070003001000007030000000100007000010003010,

030000301010000000F00070003001000000F10000000100007000010003010,

where 15 has been recoded as the hexadecimal digit F. In both cases the recoded weight of $e$ is 16, which is about 55% of the original weight.
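The string replacement algorithm can be written almost literally as repeated substitution. The following sketch is an illustration only: the name `canonical_sr_string` is hypothetical, the input is the binary expansion as printed in (6), and single hexadecimal digits are used for $b_i$, which limits this particular version to $k \le 4$.

```python
def canonical_sr_string(bits, k):
    """Canonical k-SR recoding of a binary string, MSB first (k <= 4)."""
    if not 1 <= k <= 4:
        raise ValueError("single hexadecimal digits only cover k <= 4")
    for i in range(k, 1, -1):
        digit = format(2**i - 1, "X")           # 3, 7 or F
        # str.replace scans left to right over non-overlapping matches,
        # i.e. from the most significant bit towards the least significant.
        bits = bits.replace("1" * i, "0" * (i - 1) + digit)
    return bits

E_BITS = "110001101010000111101110011001000111110000000100111000010011010"
THREE_SR = canonical_sr_string(E_BITS, 3)
FOUR_SR = canonical_sr_string(E_BITS, 4)
```

Under these assumptions `THREE_SR` and `FOUR_SR` reproduce the two recoded strings displayed in Example 2.3.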

We begin our analysis by observing that $e$ can be uniquely decomposed into substrings $e = s_1 s_2 \cdots s_m$ such that $s_i = 0^{j_i}$ or $1^{j_i}0$ for $1 \le i < m$ and $s_m = 0^{j_m}$ or $1^{j_m}$, where each $j_i$ is maximal, $1 \le i \le m$. We will call this the zero terminated run (ZTR) decomposition of $e$. All runs are terminated with a zero, with the possible exception of the last substring $s_m$. We will call $s_1, s_2, \ldots, s_m$ the ZTR substrings of $e$.
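One possible way to compute the ZTR decomposition is with a regular expression that tries, in order, a run of ones together with its terminating zero, a maximal block of zeros, and an unterminated final run. The function name `ztr_decomposition` is hypothetical and the sketch is only meant to make the definition concrete.

```python
import re

def ztr_decomposition(bits):
    """Split a binary string into zero-terminated-run (ZTR) substrings."""
    # Alternatives are tried left to right at each position:
    #   1+0  a run of ones with its terminating zero,
    #   0+   a maximal block of zeros,
    #   1+   an unterminated run of ones (only possible at the end).
    return re.findall(r"1+0|0+|1+", bits)
```

For instance, `ztr_decomposition("1100011")` yields the three substrings `110`, `00` and `11`, and concatenating the pieces always gives back the input.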

LEMMA 2.1. Let $e = s_1 s_2 \cdots s_m$ be the ZTR decomposition of $e$. Then in the canonical $k$-SR recoding of $e$ each $s_i$ is recoded independently.

Proof. The canonical $k$-SR recoding only operates on runs in $e$ of the form $1^t$. The recodings of the $s_i$ will be independent if no run spans two adjacent substrings $s_i$ and $s_{i+1}$. However this is not possible since each substring $s_i$ that contains a run terminates the run with a 0 within $s_i$ if $s_{i+1}$ exists. All $s_i$ of the form $0^{j_i}$ are recoded independently since each 0 is recoded independently (to itself).

We now consider how ZTR substrings that contain runs are recoded. Let $s_i = 1^{j_i}0$ be a substring of the ZTR decomposition of $e$, such that $j_i = kq + r$ where $r \equiv j_i \bmod k$. Recalling that $b_i = 2^i - 1$, the canonical $k$-SR recoding of $e$ will recode $s_i$ to $(0^{k-1}b_k)^q 0^{r-1} b_r 0$ since

\[ s_i = 1^{j_i}0 \;\Longrightarrow\; (1^k)^q 1^r 0 \;\Longrightarrow\; (0^{k-1}b_k)^q 1^r 0 \;\Longrightarrow\; (0^{k-1}b_k)^q 0^{r-1} b_r 0. \tag{7} \]

Thus conceptually the canonical $k$-SR recoding of $s_i = 1^{j_i}0$ where $j_i = kq + r$, $r < k$, is to parse $1^{j_i}$ into $q$ runs of length $k$ and a single zero terminated run of length $r$. After this parsing, $s_i = (1^k)^q 1^r 0$, $1^k$ is recoded as $0^{k-1}b_k$, $1^r$ is recoded as $0^{r-1}b_r$, and these recodings are substituted into $s_i$. Thus we may consider the canonical $k$-SR recoding of an exponent to proceed in two steps: first the parsing step, then the recoding step.

Now consider the following regular expression, $R_{k,SR}$, defined as

\[ R_{k,SR} = R_{k1}^* R_{k2} = \left( 0 + 1^k + \sum_{i=1}^{k-1} 1^i 0 \right)^* \left( \varepsilon + \sum_{i=1}^{k-1} 1^i \right) \tag{8} \]

where $\sum_{i=1}^{k-1} 1^i 0 = 10 + 110 + \cdots + 1^{k-1}0$. We will relate $R_{k,SR}$ to the canonical $k$-SR parse of an exponent, which will be evident from the proof of the next theorem.

THEOREM 2.2. $R_{k,SR}$ unambiguously generates all binary strings.

Proof. Since the ZTR decomposition is unique, and exists for all binary strings, we need only show that each ZTR substring is generated unambiguously by $R_{k,SR}$.

It is clear that $s_i = 0^{j_i}$ can only be generated in one way: select $j_i$ zeros from $R_{k1}$. To show that $s_i = 1^{j_i}0$ ($i \ne m$) is generated in exactly one way we observe that $s_i$ must be generated by $R_{k1}$, and that strings in $R_{k1}$ that contain a 1 are of the form $1^k$ or $1^r 0$, $1 \le r < k$. Thus $s_i$ is generated by selecting exactly one string of the form $1^r 0$, $0 \le r < k$, where $1^0 0 = \varepsilon 0 = 0$, since $s_i$ has one 0, and the remainder of $s_i$ is generated by selecting $q$ copies of $1^k$. Clearly $q$ and $r$ must satisfy $j_i = kq + r$, $r \equiv j_i \bmod k$, which are uniquely determined by $j_i$ and $k$. Thus $s_i$ is generated uniquely and hence unambiguously. A similar argument shows that $s_m = 1^{j_m}$ is also generated unambiguously by $R_{k2}$, and hence by $R_{k,SR}$.
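Theorem 2.2 can be checked exhaustively for short strings by counting, with a small dynamic program, the number of ways in which $R_{k1}^* R_{k2}$ produces a given string. This is a sketch under the assumption that the words of $R_{k1}$ and $R_{k2}$ are exactly those listed in (8); the name `count_parses` is hypothetical.

```python
from itertools import product

def count_parses(bits, k):
    """Number of ways R_{k,SR} = R_{k1}^* R_{k2} generates the string."""
    body = ["0", "1" * k] + ["1" * i + "0" for i in range(1, k)]   # R_{k1}
    tails = [""] + ["1" * i for i in range(1, k)]                  # R_{k2}
    n = len(bits)
    # ways[p] = number of ways to write bits[:p] as a product of body words
    ways = [1] + [0] * n
    for p in range(1, n + 1):
        ways[p] = sum(ways[p - len(w)] for w in body
                      if len(w) <= p and bits[p - len(w):p] == w)
    # Finish with exactly one (possibly empty) word from R_{k2}.
    return sum(ways[n - len(t)] for t in tails
               if len(t) <= n and bits.endswith(t))

# Every binary string should have exactly one parse.
for k in range(1, 5):
    for n in range(9):
        assert all(count_parses("".join(w), k) == 1
                   for w in product("01", repeat=n))
```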

COROLLARY 2.2. The bgf $G_{k,SR}$ for $R_{k,SR}$ can be derived via the rules of Corollary 2.1.

Proof. The Corollary follows from the fact that $R_{k,SR}$ is unambiguous and that each $s_i$ is recoded independently.

Thus in $R_{k,SR}$ we have a regular expression that generates all exponents (binary strings) $e$ unambiguously, such that the manner in which the substrings of $R_{k,SR}$ are selected to form $e$ corresponds exactly to the parse of $e$ produced by its $k$-SR canonical recoding. We now derive $G_{k,SR}$ using Theorem 2.1.


THEOREM 2.3. Let $a_{n,m}$ be the number of binary strings of length $n$ for which a $k$-SR recoding has Hamming weight $m$, $0 \le m < n$. Then

\[ G_{k,SR}(x, z) = \sum_{n,m \ge 0} a_{n,m} x^n z^m = \frac{1 - x - zx^k + zx}{1 - 2x + x^2 - zx^k + 2zx^{k+1} - zx^2}. \tag{9} \]

Proof. We use Theorem 2.1 to transform $R_{k,SR}$ to $G_{k,SR}(x, z)$. $R_{k1}$ and $R_{k2}$ are transformed as follows:

\[ R_{k1} = \left( 0 + 1^k + \sum_{i=1}^{k-1} 1^i 0 \right) \;\Longrightarrow\; G_{R_{k1}}(x, z) = x + zx^k + z\left( \frac{1 - x^{k+1}}{1 - x} - 1 - x \right), \]

\[ R_{k2} = \left( \varepsilon + \sum_{i=1}^{k-1} 1^i \right) \;\Longrightarrow\; G_{R_{k2}}(x, z) = 1 + z\left( \frac{1 - x^k}{1 - x} - 1 \right). \]

For example, in $G_{R_{k1}}(x, z)$ the term $x$ corresponds to 0 (length 1 and no weight in the recoded exponent), and the term $zx^k$ corresponds to $1^k$ (length $k$ and one non-zero digit in the recoded exponent). The theorem follows from simplifying $G_{k,SR}(x, z) = G_{R_{k2}}(x, z)/(1 - G_{R_{k1}}(x, z))$.

As a check on the correctness of our expression for $G_{k,SR}(x, z)$, we can consider some special cases for $k$ and $z$. When $z = 1$ it follows that $G_{k,SR}(x, 1)$ should enumerate all binary strings of length $n$ since $[x^n]G_{k,SR}(x, 1) = \sum_{m \ge 0} a_{n,m} = 2^n$. It is easily verified that $G_{k,SR}(x, 1) = 1/(1 - 2x)$, the gf for enumerating all binary strings by length. On the other hand, when $k = 1$ the $k$-SR method reduces to the binary method. Again it can be verified that $G_{1,SR}(x, z) = 1/(1 - x - xz)$ which agrees with the derivation of the bgf for the binary method given in Example 2.2.
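A further numerical check, sketched here under stated assumptions, is to expand the rational function of Theorem 2.3 as a power series in $x$ and compare the coefficients with counts obtained by recoding every short string. The names `bgf_rows` and `brute_force_row` are hypothetical, and the brute-force weight uses the run-by-run parse of (7), so a run of length $j$ contributes $\lceil j/k \rceil$ nonzero digits.

```python
from collections import Counter, defaultdict
from itertools import product

def bgf_rows(k, n_max):
    """rows[n][m] = [x^n z^m] of the rational function in Theorem 2.3."""
    num, den = Counter(), Counter()      # keys are (power of x, power of z)
    for key, c in [((0, 0), 1), ((1, 0), -1), ((k, 1), -1), ((1, 1), 1)]:
        num[key] += c
    for key, c in [((0, 0), 1), ((1, 0), -2), ((2, 0), 1),
                   ((k, 1), -1), ((k + 1, 1), 2), ((2, 1), -1)]:
        den[key] += c
    rows = []
    for n in range(n_max + 1):
        row = defaultdict(int)
        for (i, j), c in num.items():
            if i == n:
                row[j] += c
        # G * den = num  =>  row_n = num_n - sum_{i >= 1} den_i * row_{n-i}
        for (i, j), c in den.items():
            if 1 <= i <= n:
                for m, a in rows[n - i].items():
                    row[m + j] -= c * a
        rows.append({m: a for m, a in row.items() if a})
    return rows

def brute_force_row(k, n):
    """Weight counts from recoding every n-bit string run by run."""
    counts = Counter()
    for w in product("01", repeat=n):
        runs = "".join(w).split("0")
        counts[sum(-(-len(r) // k) for r in runs)] += 1   # ceil(len / k)
    return dict(counts)
```

Comparing `bgf_rows(k, n_max)[n]` with `brute_force_row(k, n)` for small `k` and `n` gives a direct test of (9), and setting `k = 1` recovers the binomial coefficients of Example 2.2.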

In practice we expect $k$ to be small, say less than 10, but we may also consider the weight distribution when $k$ is large, or at least $k \ge n$, where $n$ is the exponent length. Let $G_{SR} = \sum_{n,m \ge 0} a_{n,m} z^m x^n$ be the bgf such that $a_{n,m}$ is the number of $n$-bit exponents recoded to weight $m$ by a $k$-SR canonical recoding where $k$ is unrestricted.

THEOREM 2.4. Let $a_{n,m}$ be the number of binary strings of length $n$ for which an unrestricted SR recoding has Hamming weight $m$, $0 \le m < n$. Then

\[ G_{SR}(x, z) = \sum_{n,m \ge 0} a_{n,m} x^n z^m = \frac{1 - x + xz}{1 - 2x + x^2 - zx^2}. \tag{10} \]

Proof. In the unrestricted case it is directly observed that the parsing of the unrestricted canonical recoding is described by the regular expression

\[ R_{SR} = \left( 0 + \sum_{i \ge 1} 1^i 0 \right)^* \left( \varepsilon + \sum_{i \ge 1} 1^i \right). \tag{11} \]

$R_{SR}$ unambiguously generates all binary strings, and can be enumerated using Theorem 2.1. The proof of this is very similar to Theorem 2.2 and Corollary 2.2 for the same properties with respect to $R_{k,SR}$, and is thus omitted.


3. The Canonical SR Weight Distribution

We begin by examining the unrestricted $k$-SR canonical weight of an exponent. Since the weight depends directly on the distribution of runs in the exponent, we then expect the weight reduction due to canonical recoding to have a limit independent of $k$.

THEOREM 3.1. Let $w_{SR}(e)$ be the weight of an exponent $e$ after canonical recoding. Then for a uniformly distributed $n$-bit exponent $e$

\[ E[w_{SR}(e)] = \frac{n + 1}{4}, \qquad \mathrm{Var}[w_{SR}(e)] = \frac{n + 1}{16}. \tag{12} \]

Proof. Apply (4) and (5) directly.
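For small $n$ the two moments can also be obtained by exhaustive enumeration, which gives an independent check of Theorem 3.1. The sketch below uses exact rational arithmetic and assumes that, with $k$ unrestricted, every run of ones is recoded to a single nonzero digit; the name `unrestricted_moments` is hypothetical. The check starts at $n = 2$, since for a one-bit exponent the weight is a single fair bit with variance $1/4$.

```python
from fractions import Fraction
from itertools import product

def unrestricted_moments(n):
    """Exact mean and variance of the unrestricted recoded weight."""
    # With k unrestricted every run of ones becomes one nonzero digit,
    # so the recoded weight is the number of runs of ones.
    weights = [sum(1 for r in "".join(w).split("0") if r)
               for w in product("01", repeat=n)]
    mean = Fraction(sum(weights), 2**n)
    var = Fraction(sum(x * x for x in weights), 2**n) - mean**2
    return mean, var

for n in range(2, 12):
    assert unrestricted_moments(n) == (Fraction(n + 1, 4), Fraction(n + 1, 16))
```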

Recall that Chebyshev’s inequality bounds the deviation of a random variable $X$ from its mean $\mu$ in terms of its variance $\sigma^2$: $\Pr(|X - \mu| \ge d) \le \sigma^2/d^2$. Then define $\alpha(X, p)$ as

\[ \alpha(X, p) = \min\left\{ d : \frac{\sigma^2}{d^2} < (1 - p) \right\} \tag{13} \]

which states that $\alpha(X, p)$ is the smallest $d$ for which $\Pr(|X - \mu| < d) > p$ according to bounds derived by Chebyshev’s inequality. Using $\alpha(X, p)$ from (13) we may bound the weight distribution for unrestricted $k$, as shown in Table 1 for 512-, 768- and 1024-bit exponents. For example, the table states that for a 1024-bit exponent, its weight deviates by less than 12 from its mean value of 256.25 for more than half the exponents. On the other hand, 99% of exponents deviate by less than 81 from the mean value.
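The quantity in (13) is simple to compute. The sketch below assumes that $d$ ranges over the positive integers, which is what the integer table entries suggest, and uses exact fractions so that borderline cases are not decided by rounding; the name `alpha` and the constant `PROBABILITIES` are illustrative.

```python
from fractions import Fraction

def alpha(variance, p):
    """Smallest positive integer d with variance / d**2 < 1 - p."""
    variance, p = Fraction(variance), Fraction(p)
    d = 1
    while variance >= (1 - p) * d * d:
        d += 1
    return d

PROBABILITIES = ["0.50", "0.60", "0.75", "0.90", "0.95", "0.99"]
# Unrestricted recoding of 512-bit exponents: variance (512 + 1) / 16.
row_512 = [alpha(Fraction(513, 16), p) for p in PROBABILITIES]
```

With the variance $(n + 1)/16$ of Theorem 3.1, `row_512` evaluates to the deviations 9, 9, 12, 18, 26 and 57.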

The simple expressions for the expectation and variance given in Theorem 3.1 can be found since $G_{SR}(x, z)$ does not depend on a parameter $k$. It is more difficult to analyse $G_{k,SR}(x, z)$, which does depend on $k$. We have calculated $E[w_{k,SR}(e)]$ and $\mathrm{Var}[w_{k,SR}(e)]$ for $k$ in the range $2 \le k \le 5$, which appears to cover those values of practical interest. We have verified that $E[w_{k,SR}(e)]$ in this range is given as in (1) and the variances are given in Table 2. For example, we can now conclude that a 4-SR recoding of a 512-bit exponent will have a weight between 128 and 145 with probability greater than a half, and a weight between 125 and 148 with probability at least three quarters. Considering the last row from Table 2 we see that for 1024-bit exponents and $k = 5$, $E[w_{SR}(e)]/E[w_{k,SR}(e)] \approx 0.97$, and that the deviations up to $p = 0.75$ are quite similar. Thus a large amount of the potential weight reduction from canonical recoding can be achieved by $k = 5$ for 1024-bit exponents. Table 2 complements the computational results presented in Table 2 of [5] by bounding the deviation from the expectations. We remind the reader that the deviation bounds in Tables 1 and 2 are based on Chebyshev’s inequality and more precise information for a given exponent length $n$ (say $n = 160$ as in Figure 1) can be obtained by expanding $G_{k,SR}(x, z)$ as a power series.

Table 1. The weight distribution of unrestricted $k$-SR canonical recodings for 512-, 768- and 1024-bit exponents. The columns show the value of $\alpha(w_{SR}(e), p)$, $p \in \{0.50, 0.60, 0.75, 0.90, 0.95, 0.99\}$.

n     E[w_SR(e)]  Var[w_SR(e)]  0.50  0.60  0.75  0.90  0.95  0.99
512   128.25      32.0625       9     9     12    18    26    57
768   192.25      48.0625       10    11    14    22    32    70
1024  256.25      64.0625       12    13    17    26    36    81

Table 2. Canonical $k$-SR recoding distributions for 512-, 768-, 1024-bit exponents. The columns show the value of $\alpha(w_{k,SR}(e), p)$, $p \in \{0.50, 0.60, 0.75, 0.90, 0.95, 0.99\}$. For $k = 2, 3, 4, 5$, $\mathrm{Var}[w_{k,SR}(e)]$ is asymptotic to $2n/27$, $18n/343$, $172n/3375$ and $1592n/29791$ respectively, and $E[w_{k,SR}(e)] \sim \frac{n\,2^{k-2}}{2^k - 1} + \frac{(2^k - k - 1)\,2^{k-2}}{(2^k - 1)^2}$.

n     k  E[w_k,SR(e)]  Var[w_k,SR(e)]  0.50  0.60  0.75  0.90  0.95  0.99
512   2  170.7         38.0            9     10    13    20    28    62
512   3  146.4         26.9            8     9     11    17    24    52
512   4  136.7         26.2            8     9     11    17    23    52
512   5  132.3         27.4            8     9     11    17    24    53
768   2  256.1         56.9            11    12    16    24    34    76
768   3  219.6         40.4            9     11    13    21    29    64
768   4  204.9         39.2            9     10    13    20    29    63
768   5  198.4         41.1            10    11    13    21    29    65
1024  2  341.4         75.9            13    14    18    28    39    88
1024  3  292.7         53.8            11    12    15    24    33    74
1024  4  273.3         52.3            11    12    15    23    33    73
1024  5  264.5         54.8            11    12    15    24    34    75

Figure 1. Weight distribution for optimal signed-digit (OSD) and $k$-SR canonical recoding of 160-bit exponents, $2 \le k \le 5$.

We now consider the case where $k = 2$, since the expected canonical 2-SR recoded weight and the optimal (see Note 1) signed-digit weight are both approximately equal to $n/3$. It is clear from the original SR method paper [5] that one of the motivations for the method was to propose a run-based recoding technique that did not require group inversions. Let $w_{OSD}(e)$ be the optimal signed-digit (OSD) weight of an exponent. Then it is known [5,15] that

\[ E[w_{OSD}(e)] = \frac{n}{3} + \frac{4}{9} + o(1), \qquad \mathrm{Var}[w_{OSD}(e)] = \frac{2n}{27} + \frac{14}{81} + o(1), \]

and using $G_{2,SR}(x, z)$ we can directly prove that

\[ E[w_{2,SR}(e)] = \frac{n}{3} + \frac{1}{9} + o(1), \qquad \mathrm{Var}[w_{2,SR}(e)] = \frac{2n}{27} + \frac{8}{81} + o(1). \]

Thus we see that the two weight distributions agree quite closely in expectation and variance. The similarity is highlighted in Figure 1 where the recoded weight of 160-bit exponents is plotted for the OSD method, and the $k$-SR method for $2 \le k \le 5$. Here the OSD and 2-SR distributions are essentially identical, and the 5-SR distribution is already clustering around the expected weight for an unrestricted recoding. We consider 160-bit exponents because fields of this size are suitable for use in elliptic curve cryptosystems, where OSD recodings could be used since group inversion is a cheap operation.
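The closeness of the two expectations can be observed directly for short exponents. The sketch below is an illustration under assumptions: it takes the non-adjacent form (NAF) as the sparse signed-digit form described in Note 1, averages over all $n$-bit strings with leading zeros allowed, and uses hypothetical helper names.

```python
def naf_weight(x):
    """Number of nonzero digits in the non-adjacent form of x >= 0."""
    weight = 0
    while x:
        if x & 1:
            x -= 2 - (x & 3)      # digit +1 if x = 1 mod 4, else -1
            weight += 1
        x >>= 1
    return weight

def sr2_weight(x, n):
    """Canonical 2-SR weight of x written as an n-bit string."""
    runs = format(x, "0{}b".format(n)).split("0")
    return sum((len(r) + 1) // 2 for r in runs)   # ceil(run length / 2)

def mean_weights(n):
    """Mean NAF weight and mean 2-SR weight over all n-bit exponents."""
    total = 2**n
    return (sum(naf_weight(x) for x in range(total)) / total,
            sum(sr2_weight(x, n) for x in range(total)) / total)

osd_mean, sr2_mean = mean_weights(12)
```

For $n = 12$ the two means returned by `mean_weights` lie within $10^{-3}$ of $n/3 + 4/9$ and $n/3 + 1/9$ respectively, so they differ by about one third of a digit.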

4. Conclusion

Our main result has been to derive $G_{k,SR}(x, z)$, the gf describing the probability distribution of canonical $k$-SR recodings, which is the parameter of interest in the analysis of SR exponentiation. Extending the method presented in [15], we have used regular languages to characterize the canonical $k$-SR parsing of the exponent which leads directly to $G_{k,SR}(x, z)$. We were also able to show that 2-SR recodings and optimal signed-digit recodings produce weights that are distributed very similarly, but the 2-SR recoding has the advantage of requiring no group inversions.

Though the canonical $k$-SR representation produces low weight representations of integers, other SR recodings may produce lower weight representations. The problem of finding optimal (minimal) weight SR recodings was considered by Han, Gollman and Mitchell [6] where several heuristic algorithms were considered. It is not clear how the regular expression method outlined in this paper could be used to provide analytical information concerning optimal SR representations, and this is a topic for further work. The reason that the method is successful on the canonical representation is that this form of recoding can be expressed as a pattern matching problem on the exponent, in particular, recognizing and recoding runs of a certain length. The regular expression method is potentially applicable to optimal SR representations if a similar characterization in terms of regular patterns can be found.

Acknowledgments

I would like to thank the referees for correcting several errors in the original manuscript, and for general remarks that improved the manuscript overall.


Note

1. We use the word optimal here since there are several schemes for signed-digit recoding. By optimal here we refer to the schemes that produce sparse forms [5], where adjacent digits have a product of zero.

References

1. J. Bos and M. Coster, Addition chain heuristics. Advances in Cryptology, CRYPTO 89 (G. Brassard, ed.), volume 218, Lecture Notes in Computer Science, Springer-Verlag (1990) pp. 400–407.

2. N. Chomsky and P. Schützenberger, The algebraic theory of context-free languages, Computer Programming and Formal Languages (P. Braffort and D. Hirschberg, eds.), North Holland (1963) pp. 118–161.

3. W. Diffie and M. Hellman, New directions in cryptography, IEEE Transactions on Information Theory, Vol. 22, No. 6 (1976) pp. 472–492.

4. T. ElGamal, A public key cryptosystem and signature system based on discrete logarithms, IEEE Transactions on Information Theory, Vol. 31, No. 4 (1985) pp. 473–481.

5. D. Gollman, Y. Han and C. Mitchell, Redundant integer representations and fast exponentiation, Designs, Codes and Cryptography, Vol. 7 (1996) pp. 135–151.

6. Y. Han, D. Gollman and C. Mitchell, Minimal weight k-SR representations, Cryptography and Coding (C. Boyd, ed.), volume 1025, Lecture Notes in Computer Science, Springer-Verlag (1995) pp. 34–35.

7. J. Hopcroft and J. Ullman, An Introduction to Automata, Languages and Computation, Reading, MA, Addison-Wesley (1979).

8. L. Hui and K.-Y. Lam, Fast square-and-multiply exponentiation for RSA, Electronics Letters, Vol. 30, No. 17 (1994) pp. 1396–1397.

9. Ç. K. Koç, Higher-radix and bit recoding techniques for modular exponentiation, International Journal of Computer Mathematics, Vol. 40 (1991) pp. 139–156.

10. D. E. Knuth, The Art of Computer Programming: Vol. 2, Seminumerical Algorithms, Addison-Wesley (1981).

11. N. Koblitz, CM curves with good cryptographic properties. Advances in Cryptology, CRYPTO 91 (J. Feigenbaum, ed.), volume 576, Lecture Notes in Computer Science, Springer-Verlag (1992) pp. 279–287.

12. K. Koyama and T. Tsuruoka, Speeding up elliptic curve cryptosystems using a signed binary window method. Advances in Cryptology, CRYPTO 92 (E. F. Brickell, ed.), volume 740, Lecture Notes in Computer Science, Springer-Verlag (1992) pp. 345–357.

13. A. Menezes, P. van Oorschot and S. Vanstone, Handbook of Applied Cryptography, CRC Press (1996).

14. F. Morain and J. Olivos, Speeding up the computations on an elliptic curve using addition-subtraction chains, Theoretical Informatics and Applications, Vol. 24, No. 6 (1990) pp. 531–544.

15. L. J. O’Connor, An analysis of exponentiation based on formal languages. Advances in Cryptology, EUROCRYPT 99 (J. Stern, ed.), volume 1592, Lecture Notes in Computer Science, Springer-Verlag (1999) pp. 375–388.

16. G. Reitwiesener, Binary arithmetic, Advances in Computers (F. L. Alt, ed.) (1960) pp. 232–308.

17. R. L. Rivest, A. Shamir and L. Adleman, A method for obtaining digital signatures and public key cryptosystems, Communications of the ACM, Vol. 21, No. 2 (1978) pp. 120–126.

18. R. Sedgewick and P. Flajolet, An Introduction to the Analysis of Algorithms, Addison-Wesley Publishing Company (1996).

19. J. A. Solinas, An improved algorithm for arithmetic on a family of elliptic curves. Advances in Cryptology, CRYPTO 97 (B. S. Kaliski, ed.), volume 1294, Lecture Notes in Computer Science, Springer-Verlag (1997) pp. 357–371.

20. Y. Yacobi, Exponentiating faster with addition chains. Advances in Cryptology, EUROCRYPT 90 (I. B. Damgård, ed.), volume 473, Lecture Notes in Computer Science, Springer-Verlag (1991) pp. 222–229.