information security, coding theory and related combinatorics 2011 by cool release


NATO Science for Peace and Security Series

This Series presents the results of scientific meetings supported under the NATO Programme: Science for Peace and Security (SPS).

The NATO SPS Programme supports meetings in the following Key Priority areas: (1) Defence Against Terrorism; (2) Countering other Threats to Security and (3) NATO, Partner and Mediterranean Dialogue Country Priorities. The types of meeting supported are generally Advanced Study Institutes and Advanced Research Workshops. The NATO SPS Series collects together the results of these meetings. The meetings are co-organized by scientists from NATO countries and scientists from NATO's Partner or Mediterranean Dialogue countries. The observations and recommendations made at the meetings, as well as the contents of the volumes in the Series, reflect those of participants and contributors only; they should not necessarily be regarded as reflecting NATO views or policy.

Advanced Study Institutes (ASI) are high-level tutorial courses to convey the latest developments in a subject to an advanced-level audience.

Advanced Research Workshops (ARW) are expert meetings where an intense but informal exchange of views at the frontiers of a subject aims at identifying directions for future action.

Following a transformation of the programme in 2006 the Series has been re-named and re-organised. Recent volumes on topics not related to security, which result from meetings supported under the programme earlier, may be found in the NATO Science Series.

The Series is published by IOS Press, Amsterdam, and Springer Science and Business Media, Dordrecht, in conjunction with the NATO Emerging Security Challenges Division.

Sub-Series

A. Chemistry and Biology (Springer Science and Business Media)
B. Physics and Biophysics (Springer Science and Business Media)
C. Environmental Security (Springer Science and Business Media)
D. Information and Communication Security (IOS Press)
E. Human and Societal Dynamics (IOS Press)

http://www.nato.int/science
http://www.springer.com
http://www.iospress.nl

Sub-Series D: Information and Communication Security, Vol. 29

ISSN 1874-6268 (print)
ISSN 1879-8292 (online)


Information Security, Coding Theory and Related Combinatorics

Information Coding and Combinatorics

Edited by

Dean Crnković
University of Rijeka, Rijeka, Croatia

and

Vladimir Tonchev
Michigan Technological University, Houghton, Michigan, USA

Published in cooperation with NATO Emerging Security Challenges Division


Proceedings of the NATO Advanced Study Institute on Information Security and Related Combinatorics
Opatija, Croatia
31 May - 11 June 2010

© 2011 The authors and IOS Press.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-60750-662-1 (print)
ISBN 978-1-60750-663-8 (online)
Library of Congress Control Number: 2010941318

Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: [email protected]

Distributor in the USA and Canada
IOS Press, Inc.
4502 Rachael Manor Drive
Fairfax, VA 22032
USA
fax: +1 703 323 3668
e-mail: [email protected]

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS


Preface

This book contains papers based on lectures presented at the NATO Advanced Study Institute "Information Security and Related Combinatorics", held in the beautiful town of Opatija on the Adriatic coast of Croatia from May 31 to June 11, 2010. On behalf of all participants, we would like to thank the NATO Science for Peace and Security Programme for providing funds for the conference, as well as the local sponsors, which included the Ministry of Science and Education of the Republic of Croatia, the Croatian Academy of Sciences and Arts, the Primorsko-goranska County, the University of Rijeka and its Mathematics Department, the Foundation of the University of Rijeka, the Society of Mathematicians and Physicists, the Login Co., the Opatija Tourist Board, the City of Opatija, the City of Rijeka, and Brodokomerc.nova.

The Advanced Study Institute had fourteen lecturers: K.T. Arasu (USA), C. Colbourn (USA), R. Fuji-Hara (Japan), W. Haemers (The Netherlands), M. Jimbo (Japan), J.D. Key (USA), H. Kharaghani (Canada), C. Lam (Canada), S. Magliveras (USA), J. Moori (South Africa), T. Shaska (USA), L. Storme (Belgium), V.D. Tonchev (USA), and R. Wilson (USA). It was attended by over 60 graduate students and junior scientists from Albania, Armenia, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Germany, Italy, Macedonia, The Netherlands, Russia, Turkey, and the USA.

The unifying theme of the conference was combinatorial mathematics used in applications related to information security, cryptography, and coding theory.

The book will be of interest to mathematicians, computer scientists and engineers working in the area of digital communications, as well as to researchers and graduate students who are willing to learn more about the applications of combinatorial mathematics to problems arising in communications and information security. The majority of papers are surveys on topics that are subject to current research and are written in a tutorial textbook style that makes this volume a good source as an additional text for a course in discrete mathematics or applied combinatorics. The book can be used in graduate courses of applied combinatorics with a focus on coding theory and cryptography.

Dean Crnković and Vladimir Tonchev

Information Security, Coding Theory and Related Combinatorics
D. Crnković and V. Tonchev (Eds.)
IOS Press, 2011
© 2011 The authors and IOS Press. All rights reserved.


Contents

Preface
Dean Crnković and Vladimir Tonchev

Crypto Applications of Combinatorial Group Theory
Ivana Ilić and Spyros S. Magliveras

Generating Rooted Trees of m Nodes Uniformly at Random
Kenneth Matheis and Spyros S. Magliveras

On Jacobsthal Binary Sequences
Spyros S. Magliveras, Tran van Trung and Wandi Wei

Applications of Finite Geometry in Coding Theory and Cryptography
A. Klein and L. Storme

The Arithmetic of Genus Two Curves
T. Shaska and L. Beshaj

Covering Arrays and Hash Families
Charles J. Colbourn

Sequences and Arrays with Desirable Correlation Properties
K.T. Arasu

Permutation Decoding for Codes from Designs, Finite Geometries and Graphs
J.D. Key

Finite Groups, Designs and Codes
J. Moori

Designs, Strongly Regular Graphs and Codes Constructed from Some Primitive Groups
Dean Crnković, Vedrana Mikulić Crnković and B.G. Rodrigues

Matrices for Graphs, Designs and Codes
Willem H. Haemers

Finding Error-correcting Codes Using Computers
Clement Lam

Quantum Jump Codes and Related Combinatorial Designs
Masakazu Jimbo and Keisuke Shiromoto

Unbiased Hadamard Matrices and Bases
Hadi Kharaghani

Multi-structured Designs and Their Applications
Ryoh Fuji-Hara and Ying Miao

Recent Results on Families of Symmetric Designs and Non-embeddable Quasi-residual Designs
Mohan S. Shrikhande and Tariq A. Alraqad

Codes and Modules Associated with Designs and t-uniform Hypergraphs
Richard M. Wilson

Finite Geometry Designs, Codes, and Hamada's Conjecture
Vladimir D. Tonchev

Subject Index

Author Index


Crypto applications of combinatorial group theory

Ivana Ilić and Spyros S. Magliveras

CCIS, Department of Math. Sciences, Florida Atlantic University, Boca Raton, FL 33431, USA

e-mail: [email protected], [email protected]

Abstract. The design of a large number of cryptographic primitives is based on the intractability of the traditional discrete logarithm problem (tDLP). However, the well-known quantum algorithm of P. Shor [9] solves the tDLP in polynomial time, thus rendering all cryptographic schemes based on tDLP ineffective, should quantum computers become a practical reality. In [5] M. Sramka et al. generalize the DLP to arbitrary finite groups. The DLP for a non-abelian group is based on a particular representation of a chosen family of groups, and a choice of a class of generators for these groups. In this paper we show that for $PSL(2, p) = \langle \alpha, \beta \rangle$, $p$ an odd prime, certain choices of generators $(\alpha, \beta)$ must be avoided to ensure that the resulting generalized DLP is indeed intractable. For other types of generating pairs we suggest possible cryptanalytic attacks, reducing the new problem to the earlier case. We note however that the probability of success is asymptotic to $1/p$ as $p \to \infty$. The second part of the paper summarizes our successful attack on the $SL(2, 2^n)$ based Tillich-Zémor cryptographic hash function [2], and shows how to construct collisions between palindromic strings of length $2n + 2$.

2000 Mathematics Subject Classification: 68P25, 94A60.

Keywords. Discrete logarithm, finite groups, intractability, representations and presentations of groups, $PSL(2, p)$, public key cryptosystems, Tillich-Zémor hash function.

Introduction

In a recent quote, P. Nguyen states "Due to Shor's algorithms for computing prime factorizations and discrete logarithms on quantum computers, most of present day public key cryptosystems must be considered insecure, if sufficiently large quantum computers became available. ... One interesting line of research in this direction is the use of computational problems in non-abelian groups ..." [6]. In this article we discuss recent results on the generalized discrete logarithm problem (GDLP) in the family of non-abelian simple groups $PSL(2, p)$, $p$ an odd prime. In particular we examine these groups in their representations as matrices over $GF(p)$, and investigate weak generator choices for the generalized DLP. In the second part of the paper we summarize the interesting approach in [2] which culminated with the demise of the well-known Tillich-Zémor cryptographic hash function [13].

doi:10.3233/978-1-60750-663-8-1

1. Preliminaries

The authors of [5] generalize the discrete logarithm problem from finite cyclic groups to arbitrary finite groups. We restate the definition. Let $G$ be a finite group generated by $\alpha_1, \ldots, \alpha_t$, i.e., $G = \langle \alpha_1, \ldots, \alpha_t \rangle$. Denote by $\alpha = (\alpha_1, \ldots, \alpha_t)$ the ordered tuple of generators of the group $G$. As defined in [5], for a given $\beta \in G$, the generalized discrete logarithm problem (GDLP) of $\beta$ with respect to $\alpha$ is to determine a positive integer $k$ and a $(kt)$-tuple of non-negative integers $x = (x_{11}, \ldots, x_{1t}, \ldots, x_{k1}, \ldots, x_{kt})$ such that

\[ \beta = \prod_{i=1}^{k} \left( \alpha_1^{x_{i1}} \cdots \alpha_t^{x_{it}} \right). \]

We can write this formally as $\beta = \alpha^x$. The $(kt)$-tuples $(x_{11}, \ldots, x_{1t}, \ldots, x_{k1}, \ldots, x_{kt})$ are called the generalized discrete logarithms of $\beta$ with respect to $\alpha = (\alpha_1, \ldots, \alpha_t)$.

Denote by

\[ S_k = \left\{ \prod_{i=1}^{k} \left( \alpha_1^{x_{i1}} \cdots \alpha_t^{x_{it}} \right) \;\middle|\; x_{ij} \in \mathbb{Z}_{n_j} \right\} \]

where $n_j$ denotes the order of the element $\alpha_j$. Then, the smallest positive integer $k_0$ such that $G \subseteq S_k$ for all $k \geq k_0$ is called the depth of the group $G$ with respect to $(\alpha_1, \ldots, \alpha_t)$. There could be more than one generalized discrete logarithm of $\beta$ with respect to $\alpha$. Actually, there will be infinitely many generalized discrete logarithms: if $x$ is a generalized discrete logarithm of $\beta$ with respect to $\alpha$ and if $\alpha^{x'} = 1$, then the catenations $x \| x'$ and $x' \| x$ are also generalized discrete logarithms of $\beta$ with respect to $\alpha$.
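To make the definition concrete, the following sketch evaluates $\alpha^x$ for a tuple of $2 \times 2$ matrices modulo a prime and checks the catenation remark on one example. The helper names (`mat_mul`, `mat_pow`, `evaluate`), the choice $p = 7$ and the sample exponents are illustrative assumptions, not taken from [5].

```python
IDENTITY = ((1, 0), (0, 1))

def mat_mul(X, Y, p):
    """Multiply two 2x2 matrices (tuples of rows) modulo p."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

def mat_pow(X, n, p):
    """Raise X to the non-negative power n by repeated multiplication."""
    result = IDENTITY
    for _ in range(n):
        result = mat_mul(result, X, p)
    return result

def evaluate(alpha, x, p):
    """Evaluate alpha^x: x is a flat tuple of k*t exponents, read in
    k consecutive blocks of length t = len(alpha)."""
    t = len(alpha)
    if len(x) % t != 0:
        raise ValueError("exponent tuple length must be a multiple of t")
    result = IDENTITY
    for idx, exponent in enumerate(x):
        result = mat_mul(result, mat_pow(alpha[idx % t], exponent, p), p)
    return result

# Hypothetical example data: p = 7 and two generators of order p.
p = 7
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
alpha = (A, B)

x = (2, 3, 1, 4)            # k = 2 blocks of t = 2 exponents
beta = evaluate(alpha, x, p)
x_prime = (p, 0)            # A^p * B^0 is the identity, so alpha^x' = 1
```

Because `x_prime` evaluates to the identity, both `x + x_prime` and `x_prime + x` (tuple catenation) are again generalized discrete logarithms of `beta`.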

The generalization of the discrete logarithm problem to finite groups has potential applications in cryptography. To be able to construct secure cryptographic primitives based on the generalized discrete logarithm problem in finite groups, care must be taken to ensure that the groups along with their representations and choice of generators have an intractable generalized discrete logarithm problem.

The traditional discrete logarithm problem is generally considered computationally intractable. However, there exist groups and their representations in which the problem can be solved efficiently. For example, in $\mathbb{Z}_n$, the additive group of integers modulo $n$, the discrete logarithm can be easily computed. For a given element $\beta$ in $\mathbb{Z}_n$ and generator $\alpha$ of $\mathbb{Z}_n$, it is easy to find a non-negative integer $x$ such that $x\alpha = \beta$. Since $\alpha$ is a generator, $\gcd(n, \alpha) = 1$, and the multiplicative inverse of $\alpha$ in the ring $(\mathbb{Z}_n, +, \cdot)$ can be computed by the extended Euclidean algorithm. In general, one may speak of a tractable/intractable GDLP for a given infinite family of pairs $\{(G_\ell, A_\ell)\}_{\ell \in L}$ indexed by $L$, where the $G_\ell$ are groups in a common representation $\rho$, and $A_\ell$ is a particular set of generators for $G_\ell$.
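The additive example can be sketched in a few lines. The function names and the sample values below are illustrative assumptions; the only technique used is the extended Euclidean algorithm mentioned in the text.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def additive_dlog(alpha, beta, n):
    """Solve x * alpha = beta in the additive group Z_n for a generator alpha."""
    g, s, _ = extended_gcd(alpha % n, n)
    if g != 1:
        raise ValueError("alpha is not a generator of Z_n")
    alpha_inverse = s % n          # multiplicative inverse of alpha modulo n
    return (beta * alpha_inverse) % n

# Hypothetical sample instance.
n, alpha, beta = 101, 37, 58
x = additive_dlog(alpha, beta, n)
```

The running time is polynomial in the bit length of $n$, which is why this representation of a cyclic group is unsuitable for discrete-logarithm-based cryptography.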


The generalized discrete logarithm problem may be tractable for some groups and generators in representation $\rho$. We examined the groups $PSL(2, p)$ as potential candidates for cryptographic applications, but our results show that when $PSL(2, p)$ is represented by matrices, the generalized discrete logarithm problem with respect to several types of generating sets does not provide the required strength.

As is customary, we denote by $\mathbb{Z}$ the ring of integers. We also denote by $\mathbb{Z}^{+}$ the positive integers, and by $\mathbb{Z}_{\geq 0}$ the non-negative integers.

2. Generalized discrete logarithm problem in PSL(2, p)

Suppose that for an odd prime $p$ the group $G = PSL(2, p)$ is represented by matrices of $SL(2, p)$, up to a factor $\pm I$, where $I$ is the $2 \times 2$ identity matrix. Suppose further that $G$ is generated by two elements, i.e., $G = \langle A, B \rangle$. We have examined the tractability of the generalized discrete logarithm problem in this setup with respect to different generating pairs of elements $(A, B)$. The results of our research show that the hardness of computation of the generalized discrete logarithm problem depends not only on the group representation, but also on the choice of generators. To perform a detailed analysis on whether the generalized discrete logarithm can be computed efficiently, we considered the following cases:

1) group $G$ is generated by the special elements $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$;
2) group $G$ is generated by two elements both of order $p$;
3) group $G$ is generated by two elements, one of which is of order $p$;
4) group $G$ is generated by two elements none of which is of order $p$.

We have analyzed the first two cases in [4]. Suppose that $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in G$, with $a, b, c, d \in \mathbb{F}_p$, the field of order $p$. The matrices

\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \]

are both of order $p$, non-commuting and generate $G$, i.e., $G = \langle A, B \rangle$. Moreover, the authors of [5] show that the depth of the group $G$ with respect to $(A, B)$ is two, so that the element $M \in G$ can be written as $M = A^i B^j A^k B^\ell$. We have

\[ A^i B^j A^k B^\ell = \begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ j & 1 \end{pmatrix} \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \ell & 1 \end{pmatrix}. \]

Hence,

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 + ij + ((1 + ij)k + i)\ell & (1 + ij)k + i \\ j + (jk + 1)\ell & jk + 1 \end{pmatrix}. \]

By equating corresponding entries in the previous equality we obtain the system of equations

\[ \begin{aligned} 1 + ij + ((1 + ij)k + i)\ell &= a \\ (1 + ij)k + i &= b \\ j + (jk + 1)\ell &= c \\ jk + 1 &= d \end{aligned} \]

which can be solved for $i, j, k, \ell$ by computing a Gröbner basis of the ideal

\[ I = \langle\, 1 + k\ell + ij + ijk\ell + i\ell - a,\; k + ijk + i - b,\; j + jk\ell + \ell - c,\; jk + 1 - d \,\rangle. \]

A Gröbner basis for the above ideal is computed over the set of rational numbers:

\[ [\, \ell - jic + ja - c,\; k + id - b,\; jibc + ji - jab - a + bc + 1,\; jid - jb + d - 1,\; ad - bc - 1 \,], \]

which yields the following system of equations in $i, j, k, \ell \in \mathbb{Z}_p$:

\[ \begin{aligned} \ell - jic + ja - c &= 0 \\ k + id - b &= 0 \\ jid - jb + d - 1 &= 0 \end{aligned} \]

whose solutions in $i, j, k, \ell$ represent the generalized discrete logarithms of $M$ with respect to $(A, B)$. The solutions are given by the following proposition:

Proposition 2.1 Let $A$, $B$ and $M$ be as above. Then, there exists a non-negative integer $n < p$ such that $nd - b \neq 0$ over $\mathbb{Z}_p$, and such that the 4-tuple $(i, j, k, \ell)$ with $i = n$, $j = (1 - d)(nd - b)^{-1}$, $k = b - nd$, $\ell = (1 - d)(nc - a)(nd - b)^{-1} + c$ provides a solution to $M = A^i B^j A^k B^\ell$.

Proof. It can be directly verified that the given values for $i, j, k, \ell$ satisfy the above system of equations. The existence of $n$ is ensured since $M \in PSL(2, p)$ and hence $b$ and $d$ cannot simultaneously be equal to zero. □
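The closed-form solution of Proposition 2.1 can be checked mechanically. The sketch below works with matrix representatives in $SL(2, p)$; the function names are illustrative assumptions, and the modular inverse is computed through Fermat's little theorem, which is one of several possible choices.

```python
def mat_mul(X, Y, p):
    """Multiply two 2x2 matrices (tuples of rows) modulo p."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

def word(i, j, k, l, p):
    """Return A^i B^j A^k B^l for A = [[1,1],[0,1]] and B = [[1,0],[1,1]]."""
    factors = [((1, i % p), (0, 1)), ((1, 0), (j % p, 1)),
               ((1, k % p), (0, 1)), ((1, 0), (l % p, 1))]
    result = ((1, 0), (0, 1))
    for factor in factors:
        result = mat_mul(result, factor, p)
    return result

def factor_AB(M, p):
    """Return (i, j, k, l) with M = A^i B^j A^k B^l, following Proposition 2.1."""
    (a, b), (c, d) = M
    if (a * d - b * c) % p != 1:
        raise ValueError("M must have determinant 1 modulo p")
    # Pick the first n with n*d - b != 0 (mod p); one exists because
    # b and d cannot both vanish in a matrix of determinant 1.
    for n in range(p):
        u = (n * d - b) % p
        if u != 0:
            break
    u_inv = pow(u, p - 2, p)                 # inverse via Fermat's little theorem
    i = n
    j = (1 - d) * u_inv % p
    k = (b - n * d) % p
    l = ((1 - d) * (n * c - a) * u_inv + c) % p
    return i, j, k, l
```

Each call performs a constant number of modular operations plus one inversion, so the generalized discrete logarithm with respect to this particular pair $(A, B)$ is computed in time polynomial in $\log p$.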

We have shown that the generalized discrete logarithm problem can be solved efficiently in $PSL(2, p)$ with respect to the special given generators $(A, B)$ as defined above. Further, as in [4], we construct an algorithm for computing the generalized discrete logarithm in $PSL(2, p)$ with respect to any two generators of order $p$. Assume that $C$, $D$ are two non-commuting elements of order $p$ in $PSL(2, p)$. Then, since any two non-commuting elements of order $p$ from $PSL(2, p)$ generate the whole group, it follows that $PSL(2, p) = \langle C, D \rangle$. To determine non-negative integers $i, j, k, \ell$ such that $M = C^i D^j C^k D^\ell$, we look for an element $g \in G$ which satisfies $C = g^{-1} A^s g$ and $D = g^{-1} B^t g$, for some non-negative integers $s, t < p$, where $A$ and $B$ are the matrices defined above.

Then,

\[ \begin{aligned} M &= C^i D^j C^k D^\ell \\ &= (g^{-1} A^s g)^i (g^{-1} B^t g)^j (g^{-1} A^s g)^k (g^{-1} B^t g)^\ell \\ &= (g^{-1} A^{si} g)(g^{-1} B^{tj} g)(g^{-1} A^{sk} g)(g^{-1} B^{t\ell} g) \\ &= g^{-1} A^{si} B^{tj} A^{sk} B^{t\ell} g. \end{aligned} \]

Denote $x = si$, $y = tj$, $v = sk$ and $w = t\ell$. Then, $g M g^{-1} = A^x B^y A^v B^w$. Let $M_1 = g M g^{-1}$. Obviously, $M_1 \in G$ and $M_1 = A^x B^y A^v B^w$. We have transformed the generalized discrete logarithm problem of $PSL(2, p)$ with respect to $(C, D)$ to the generalized discrete logarithm problem of $PSL(2, p)$ with respect to $(A, B)$, which we are able to solve as described earlier.

To determine an element $g$ for which the conditions $C = g^{-1} A^s g$ and $D = g^{-1} B^t g$ hold simultaneously, we write the system of equations $gC = A^s g$ and $gD = B^t g$, for some non-negative integers $s, t < p$. Since $g = \begin{pmatrix} g_1 & g_2 \\ g_3 & g_4 \end{pmatrix}$, we obtain a system of equations in $g_1, \ldots, g_4$ and $s$ and $t$ from which an element $g$ is determined. The existence of such an element $g$ is ensured since $PSL(2, p)$ acts doubly transitively by conjugation on its $(p + 1)$ Sylow-$p$ subgroups. Then, for any two pairs of Sylow-$p$ subgroups, and hence for the particular pairs $(\langle A \rangle, \langle B \rangle)$ and $(\langle C \rangle, \langle D \rangle)$, there exists an element $g \in G$ such that $(\langle C \rangle, \langle D \rangle) = (\langle A \rangle^g, \langle B \rangle^g)$.
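One way to carry out this reduction in code is sketched below. It is not the authors' procedure of solving $gC = A^s g$, $gD = B^t g$ as a system of equations: instead it builds $g^{-1}$ directly from the vectors fixed by $C$ and $D$, which is an assumption of this sketch. It also assumes that $C$ and $D$ are given by representatives of trace 2 in $SL(2, p)$ (replace a representative by its negative otherwise). All function names are hypothetical.

```python
def mul(X, Y, p):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

def inv(X, p):
    """Inverse of a determinant-1 matrix modulo p."""
    (a, b), (c, d) = X
    return ((d % p, -b % p), (-c % p, a % p))

def power(X, n, p):
    R = ((1, 0), (0, 1))
    for _ in range(n):
        R = mul(R, X, p)
    return R

def factor_AB(M, p):
    """Exponents (i, j, k, l) with M = A^i B^j A^k B^l (Proposition 2.1)."""
    (a, b), (c, d) = M
    for n in range(p):
        u = (n * d - b) % p
        if u != 0:
            break
    u_inv = pow(u, p - 2, p)
    return (n, (1 - d) * u_inv % p, (b - n * d) % p,
            ((1 - d) * (n * c - a) * u_inv + c) % p)

def fixed_vector(C, p):
    """Non-zero column vector u with C u = u, for C != I of trace 2."""
    (a, b), (c, d) = C
    for x, y in (((a - 1) % p, b % p), (c % p, (d - 1) % p)):  # rows of C - I
        if (x, y) != (0, 0):
            return (y, -x % p)
    raise ValueError("C is the identity")

def factor_CD(M, C, D, p):
    """Exponents (i, j, k, l) with M = C^i D^j C^k D^l."""
    u, v = fixed_vector(C, p), fixed_vector(D, p)
    delta = (u[0] * v[1] - u[1] * v[0]) % p
    if delta == 0:
        raise ValueError("C and D commute")
    delta_inv = pow(delta, p - 2, p)
    # g^{-1} has columns u and v/delta, so it has determinant 1,
    # g C g^{-1} fixes (1,0) and g D g^{-1} fixes (0,1).
    g_inv = ((u[0], v[0] * delta_inv % p), (u[1], v[1] * delta_inv % p))
    g = inv(g_inv, p)
    s = mul(mul(g, C, p), g_inv, p)[0][1]      # g C g^{-1} = A^s
    t = mul(mul(g, D, p), g_inv, p)[1][0]      # g D g^{-1} = B^t
    x, y, v_exp, w = factor_AB(mul(mul(g, M, p), g_inv, p), p)
    s_inv, t_inv = pow(s, p - 2, p), pow(t, p - 2, p)
    return (x * s_inv % p, y * t_inv % p, v_exp * s_inv % p, w * t_inv % p)
```

The exponents with respect to $(A, B)$ are divided by $s$ and $t$ modulo $p$ at the end, which inverts the substitutions $x = si$, $y = tj$, $v = sk$, $w = t\ell$ used in the text.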

The third case in our analysis of the hardness of the generalized discrete logarithm problem in $PSL(2, p)$, with respect to a pair of generators, is when one of the generators is of order $p$. Suppose now that $PSL(2, p) = \langle A, B \rangle$ where $|A| = p$. Note that the order of the element $B$ can only be a divisor of the order of the group, $p(p^2 - 1)/2$. Given an element $M \in PSL(2, p)$ our goal is to write $M$ in terms of the generators $(A, B)$. In the construction of a word in $A$ and $B$ that represents the element $M$, we will use the result of the following proposition.

Proposition 2.2 If $G = PSL(2, p) = \langle A, B \rangle$ where $|A| = p$, then $G = \langle A, A^B \rangle$, where $A^B = B^{-1} A B$.

Proof. Every two non-commuting elements of order $p$ from $PSL(2, p)$ generate the whole group. So we prove that the elements $A$ and $A^B$ are non-commuting of order $p$. Conjugate elements have the same order, so $|A^B| = |A| = p$. Now, suppose that the elements $A$ and $A^B$ commute. Then, $A^B$ is in the centralizer of the element $A$, i.e., $A^B \in C_G(A) = \langle A \rangle$. So, $A^B = A^i$ for some $i \in \{0, \ldots, p - 1\}$. But then, $B$ normalizes $\langle A \rangle$, hence $\langle A \rangle$ is a proper normal subgroup of $\langle A, B \rangle$. But $PSL(2, p)$ is simple, thus $\langle A, B \rangle$ cannot be all of $PSL(2, p)$, a contradiction to the fact that $A$ and $B$ generate $G$. □

The proposition that follows provides an upper bound for the depth of $PSL(2, p)$ with respect to two generators one of which is of order $p$, and its proof provides an algorithm for constructing a word in the generators $A$ and $B$ that represents a given element $M$.

Proposition 2.3 Suppose that $G = PSL(2, p) = \langle A, B \rangle$, where $|A| = p$, with no further assumptions on $|B| = m$. Then, the depth of $G$ with respect to the generating tuple $(A, B)$ is less than or equal to four.


Proof. Let $C = A^B = B^{-1} A B$. By Proposition 2.2 the group $PSL(2, p)$ is generated by the elements $A$ and $C$, both of order $p$. The generalized discrete logarithm problem can be solved efficiently in $PSL(2, p)$ represented by matrices, with respect to two generators of order $p$. By the method described earlier, the generalized discrete logarithm $(i, j, k, \ell)$ can be found such that $M = A^i C^j A^k C^\ell$. To represent the element $M$ in terms of the generators $A$ and $B$ we write the following sequence of equalities.

\[ \begin{aligned} M &= A^i C^j A^k C^\ell \\ &= A^i (B^{-1} A B)^j A^k (B^{-1} A B)^\ell \\ &= A^i B^{-1} A^j B A^k B^{-1} A^\ell B \\ &= A^i B^{m-1} A^j B A^k B^{m-1} A^\ell B \end{aligned} \]

Therefore, the generalized discrete logarithm of $M \in PSL(2, p)$ with respect to the generating tuple $(A, B)$, where $|A| = p$ and $|B| = m$, is $(i, m - 1, j, 1, k, m - 1, \ell, 1)$. It follows that every element $M$ from $PSL(2, p) = \langle A, B \rangle$, where $|A| = p$ and $|B| = m$, can be represented as $M = A^{x_1} B^{y_1} A^{x_2} B^{y_2} A^{x_3} B^{y_3} A^{x_4} B^{y_4}$ for some integers $x_1, x_2, x_3, x_4 \in \{0, \ldots, p - 1\}$ and $y_1, y_2, y_3, y_4 \in \{0, \ldots, m - 1\}$. The proposition follows. □

The described method for writing the element $M$ as a word in the generators $A$ and $B$ does not assure obtaining the shortest possible word that represents $M$ in these generators.
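The rewriting step in the proof of Proposition 2.3 can be checked numerically. The sketch below only verifies the identity $A^i C^j A^k C^\ell = A^i B^{m-1} A^j B A^k B^{m-1} A^\ell B$ in $PSL(2, p)$ for one sample choice of $B$; the matrix $B$, the prime $p = 7$, the exponents and all function names are illustrative assumptions.

```python
IDENTITY = ((1, 0), (0, 1))

def mul(X, Y, p):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

def inv(X, p):
    """Inverse of a determinant-1 matrix modulo p."""
    (a, b), (c, d) = X
    return ((d % p, -b % p), (-c % p, a % p))

def power(X, n, p):
    R = IDENTITY
    for _ in range(n):
        R = mul(R, X, p)
    return R

def projective(M, p):
    """Canonical representative of the class {M, -M} in PSL(2, p)."""
    negative = tuple(tuple(-e % p for e in row) for row in M)
    return min(M, negative)

def order_in_psl(M, p):
    """Smallest n >= 1 with M^n = +-I."""
    n, X = 1, M
    while projective(X, p) != IDENTITY:
        X = mul(X, M, p)
        n += 1
    return n

def rewrite_in_AB(i, j, k, l, m):
    """Exponent tuple with respect to (A, B) from one with respect to (A, A^B)."""
    return (i, m - 1, j, 1, k, m - 1, l, 1)

def evaluate_word(A, B, exponents, p):
    """Evaluate A^{x1} B^{y1} A^{x2} B^{y2} ... for alternating exponents."""
    result = IDENTITY
    for idx, e in enumerate(exponents):
        result = mul(result, power(A if idx % 2 == 0 else B, e, p), p)
    return result

# Hypothetical sample data.
p = 7
A = ((1, 1), (0, 1))
B = ((2, 1), (1, 1))                    # determinant 1, does not normalize <A>
m = order_in_psl(B, p)
C = mul(mul(inv(B, p), A, p), B, p)     # C = A^B = B^{-1} A B
i, j, k, l = 3, 2, 5, 4
M = mul(mul(power(A, i, p), power(C, j, p), p),
        mul(power(A, k, p), power(C, l, p), p), p)
exponents = rewrite_in_AB(i, j, k, l, m)
```

Comparison is done on the canonical representative of $\{M, -M\}$ because $B^m$ may equal $-I$ rather than $I$ in $SL(2, p)$.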

Next, we take a look into a possible strategy for writing an element $M$ of the group $PSL(2, p)$ in terms of two generators none of which is of order $p$. Suppose that we have an efficient method for constructing an element of order $p$ in terms of the generators $A$ and $B$. In the following proposition we will use the notation $w_p(A, B)$ to represent a word in $A$ and $B$ which is of order $p$ as an element of $G$.

Proposition 2.4 If $G = PSL(2, p) = \langle A, B \rangle$ where the orders of $A$ and $B$ are relatively prime to $p$, and if $P = w_p(A, B)$ is a word in $A$ and $B$, of order $p$ as an element of $G$, then $G = \langle A, P \rangle$ or $G = \langle B, P \rangle$.

Proof. Let $N$ be the normalizer in $G$ of $\langle P \rangle$, i.e. $N = N_G(\langle P \rangle)$. Then, at least one of the elements $A$, $B$ is not in $N$. Otherwise, if $A$, $B$ were both in $N$, then $\langle A, B \rangle$ would be a subgroup of $N$, that is $G = \langle A, B \rangle \leq N$, and therefore we would have that $N = G$. This would imply that $\langle P \rangle$ is a non-trivial, proper, normal subgroup of $G$, contradicting the fact that $G$ is simple. Without loss of generality, suppose that $A \notin N$. Then $\langle A, P \rangle = PSL(2, p)$, because the only proper subgroups of $PSL(2, p)$ containing $\langle P \rangle$ are subgroups of the normalizer of $\langle P \rangle$. Similarly, if $B \notin N$, it follows that $PSL(2, p) = \langle B, P \rangle$. □

If $A$, $B$ and $P$ are as in Proposition 2.4 we can solve efficiently the generalized discrete logarithm problem with respect to $(A, P)$ since $PSL(2, p) = \langle A, P \rangle$ and $|P| = p$. Therefore, we can solve the generalized discrete logarithm problem with respect to $(A, B)$. Given $M \in PSL(2, p) = \langle A, B \rangle$ and $P = w_p(A, B)$ as in Proposition 2.4 we can write the element $M$ as a word in $A$, $B$ as follows. Without loss of generality, assume that $A \notin N_G(\langle P \rangle)$. Conjugate the element $P$ by the element $A$, i.e., compute $P^A = A^{-1} P A$. Based on Proposition 2.2, $PSL(2, p) = \langle P, P^A \rangle$. Based on the proof of Proposition 2.3, if $|A| = s$, we have:

\[ \begin{aligned} M &= P^i (P^A)^j P^k (P^A)^\ell \\ &= P^i A^{s-1} P^j A P^k A^{s-1} P^\ell A \\ &= w_p(A, B)^i A^{s-1} w_p(A, B)^j A \, w_p(A, B)^k A^{s-1} w_p(A, B)^\ell A. \end{aligned} \]

The direct consequence is that the depth of $PSL(2, p)$ with respect to generators both of order relatively prime to $p$ will depend on the word $P = w_p(A, B)$.

We examine a bit further possible attacks on the GDLP for $G = PSL(2, p)$ based on Proposition 2.4. A word of shortest possible length in $A$ and $B$ to produce an element of order $p$ is $AB$ or $BA$. We will consider the case where $|A| = |B| = d = (p - 1)/2$ and $|AB| = p$. This condition occurs systematically in $PSL(2, p)$; however, unfortunately for the cryptanalyst, the probability of this occurrence goes to zero as $p \to \infty$.

We will need some well-known facts about the group $PSL(2, q)$, $q = p^m$, $p$ an odd prime, which we state below, without proof, as a proposition. In what follows $\varphi$ stands for Euler's $\varphi$ function.

Proposition 2.5 Suppose that $G = PSL(2, q)$, $q = p^m$, $p$ an odd prime. Then,

(a) The Sylow-$p$ subgroup of $G$ is elementary abelian of order $q$,
(b) If $x \in G$ is of order $d$, then $d$ divides $(q - 1)/2$, or $d = p$, or $d$ divides $(q + 1)/2$,
(c) There is a single conjugacy class of subgroups of order $(q - 1)/2$, and these are cyclic. Similarly, there is a single conjugacy class of subgroups of order $(q + 1)/2$, and they are cyclic.
(d) If $x \in G$ is of order $d \neq 2$ dividing $(q - 1)/2$ then $x$ belongs to one and only one cyclic subgroup of $G$ of order $(q - 1)/2$.
(e) If $d \neq 2$ divides $(q - 1)/2$ there are $\varphi(d)/2$ conjugacy classes of elements of order $d$ in $G$.
(f) If $x \in G$ is of order $d \mid (q - 1)/2$, $d \neq 2$, then the centralizer $C_G(x)$ is $\langle x \rangle$, while the normalizer $N_G(\langle x \rangle)$ is dihedral of order $q - 1$.

We will now examine the very special case where $G = PSL(2, p)$ is generated by two elements of order $(p - 1)/2$. Similar results can be derived for the other possible cases. In what follows, let $X$ be the set of all elements of order $d = (p - 1)/2$ in $G$. We will consider the action of $G$ by conjugation on $X \times X$. Note that all pairs $(A, B)$ in a $G$-orbit on $X \times X$ share almost all critical properties of interest to our problem, as conjugation by an element $g \in G$ induces an automorphism of $G$. For example, if $(A, B)$ generate $G$ so does $(A, B)^g = (A^g, B^g)$, for $g \in G$. Similarly, the order of $AB$ is the same as the order of $A^g B^g = (AB)^g$, etc. Thus it suffices to examine one representative from each orbit of $G$ on $X \times X$.

Since $G$ acts transitively by conjugation on the cyclic subgroups of order $(p - 1)/2$, without loss of generality, we will select one such subgroup, say $C$, and one fixed generator $x \in C$, so that $C = \langle x \rangle$. Now, $C_G(x) = \{y \in G \mid xy = yx\} = \langle x \rangle = C$. We have the following consequences of Proposition 2.5:


Proposition 2.6 If $G$ and $X$ are as above, and $d = (p - 1)/2$, then:

(a) $|X| = \varphi(d) p (p + 1)/2$,
(b) Let $x$ be any fixed element of $X$. In the action of $C = C_G(x)$ on $X$ by conjugation there are exactly $\varphi(d)$ orbits of length 1, and $v = \varphi(d)(p(p + 1) - 2)/(2d)$ orbits of length $d$.
(c) Of the $v$ orbits $O_i$ of length $d$ exactly $2\varphi(d) - 2$ are such that if $y \in O_i$ then $|xy| = p$.

Proof. (a) Since each of the $\varphi(d)/2$ conjugacy classes of elements of order $d$ has $|G|/d = p(p + 1)$ elements, it follows that $|X| = [\varphi(d) p (p + 1)]/2$.

(b) $C = C_G(x) = \langle x \rangle$ has exactly $\varphi(d)$ elements $y$ of order $d$ in it, and since these elements commute with $x$, the orbit $y^C = \{y\}$ has length 1. If $y \in X \setminus C$ then $K = C_G(y) = \langle y \rangle$, and $K \cap C = \{1\}$, hence the orbit $y^C$ has exactly $|C|$ elements. Thus, the number of orbits of length $d$ is $[\varphi(d) p (p + 1)/2 - \varphi(d)]/d = \varphi(d)(p(p + 1) - 2)/(2d)$.

(c) We will only give an idea about the proof here. The result follows from calculations in the center of the group ring $\mathbb{Z}G$. In particular, if $\{K_i\}_{i=1}^{c}$ are the conjugacy classes of $G$, they form a basis for the center of $\mathbb{Z}G$ and $K_i K_j = \sum_{k=1}^{c} a_{ijk} K_k$, with the $a_{ijk}$ computable from the character table of $G$. We have that $X$ is the sum of the $\varphi(d)/2$ classes $\{K_i\}$ with elements of order $d$. Thus, in the group ring, the number of elements in $xX$ of order $p$ is the sum of the coefficients of the two classes $K_{p^-}$ and $K_{p^+}$ in

\[ \frac{1}{|K_x|} K_x \sum_{i=1}^{\varphi(d)/2} K_i = \frac{1}{p(p + 1)} \sum_{i=1}^{\varphi(d)/2} K_x K_i . \]

Since each $C$-orbit on $X \setminus C$ is of length $d$, we further divide by $d$ for the number of $C$-orbits. □
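Part (a) is easy to confirm by exhaustive enumeration for small primes. The sketch below lists $PSL(2, p)$ as classes $\{M, -M\}$ of matrices in $SL(2, p)$ and counts the elements of order $d = (p - 1)/2$; it covers only part (a), and the function names and the choice of primes are illustrative assumptions.

```python
from math import gcd

IDENTITY = ((1, 0), (0, 1))

def mul(X, Y, p):
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return (((a * e + b * g) % p, (a * f + b * h) % p),
            ((c * e + d * g) % p, (c * f + d * h) % p))

def projective(M, p):
    """Canonical representative of the class {M, -M} in PSL(2, p)."""
    negative = tuple(tuple(-e % p for e in row) for row in M)
    return min(M, negative)

def psl2_elements(p):
    """All elements of PSL(2, p), one canonical matrix per class."""
    elements = set()
    for a in range(p):
        for b in range(p):
            for c in range(p):
                for d in range(p):
                    if (a * d - b * c) % p == 1:
                        elements.add(projective(((a, b), (c, d)), p))
    return elements

def order_in_psl(M, p):
    """Smallest n >= 1 with M^n = +-I."""
    n, X = 1, M
    while projective(X, p) != IDENTITY:
        X = mul(X, M, p)
        n += 1
    return n

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def count_order_d(p):
    """Number of elements of order d = (p - 1)/2 in PSL(2, p)."""
    d = (p - 1) // 2
    return sum(1 for M in psl2_elements(p) if order_in_psl(M, p) == d)

def predicted_size(p):
    """|X| according to Proposition 2.6 (a)."""
    d = (p - 1) // 2
    return euler_phi(d) * p * (p + 1) // 2
```

The enumeration is cubic in the group order at worst and is meant only as a sanity check for primes such as 7, 11 or 13, where $d > 2$.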

We are now able to state a proposition which is not of much help to the cryptanalyst, but which lends evidence to the notion that strong generators may be possible for a GDLP based system.

Proposition 2.7 Let $G = PSL(2, p)$ and let $d$, $X$ and $x \in X$ be as above. If we select a second element $y \in X$ randomly, then the probability that the order of $xy$ is $p$ is

\[ \frac{2(\varphi(d) - 1)(p - 1)}{\varphi(d)\, p (p + 1)}, \]

which is of course asymptotic to $2/p$ as $p \to \infty$.

Proof. Having fixed $x \in X$, by Proposition 2.6 the number of elements $y \in X$ such that $|xy| = p$ is $2(\varphi(d) - 1)d$. Since $|X| = \frac{\varphi(d)}{2} \cdot \frac{|G|}{d}$ we have:

\[ \Pr\{|xy| = p\} = \frac{2(\varphi(d) - 1)d}{\frac{\varphi(d)}{2}\, p (p + 1)} = \frac{4(\varphi(d) - 1)d}{\varphi(d)\, p (p + 1)} = \frac{2(\varphi(d) - 1)(p - 1)}{\varphi(d)\, p (p + 1)} \sim \frac{2}{p}, \]

hence the result. □

It is clear of course that if $(A, B) \in X \times X$ with $|AB| = p$, then $\langle A, B \rangle = G$.


    3. Relations

    By solving the generalized discrete logarithm problem for a finite group with respect to

    a given set of generators we are factorizing group elements in terms of the generators.By equating two different factorizations of the same group element, we obtain a relation.

    This observation holds in any finite group as we discuss in the next section.

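    Let $G$ be a finite group generated by $\alpha_1, \ldots, \alpha_t$, i.e., $G = \langle \alpha_1, \ldots, \alpha_t \rangle$. Denote by $\alpha = (\alpha_1, \ldots, \alpha_t)$ the ordered tuple of generators of the group $G$. For a given $\beta \in G$, assume that

    \[ \beta = \prod_{i=1}^{k} \left( \alpha_1^{x_{i1}} \cdots \alpha_t^{x_{it}} \right), \]

    i.e., $\beta = \alpha^{x}$, where $x = (x_{11}, \ldots, x_{1t}, \ldots, x_{k1}, \ldots, x_{kt})$. Recall that $x$, the generalized discrete logarithm of $\beta$ with respect to the generators $\alpha = (\alpha_1, \ldots, \alpha_t)$, is not unique; in fact there exist infinitely many distinct $y = (y_{11}, \ldots, y_{1t}, \ldots, y_{s1}, \ldots, y_{st})$ such that $\beta = \alpha^{y} = \prod_{i=1}^{s} \alpha_1^{y_{i1}} \cdots \alpha_t^{y_{it}}$. For any such $y$ we have:

    \[ \prod_{i=1}^{k} \alpha_1^{x_{i1}} \cdots \alpha_t^{x_{it}} = \prod_{i=1}^{s} \alpha_1^{y_{i1}} \cdots \alpha_t^{y_{it}}. \]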

    In this way we obtain non-trivial relations among the generators. Further, by collecting different relations we may obtain a presentation of the group: $G = \langle X \mid R \rangle$, where $X$ is the set of generators, and $R$ a set of relations of the above type, sufficiently many to completely determine the group.

    Relations of particular interest in cryptography are those which represent the identity element of the group, that is, of the form $1_G$ = a word in the generators. Moreover, in a finite group $G$ we can always convert a presentation of the form $G = \langle X \mid R \rangle$ into one of the form $G = \langle X \mid R' \rangle$, where $R'$ is a set of relations of the type $\prod_{i=1}^{k} \alpha_1^{x_{i1}} \cdots \alpha_t^{x_{it}} = 1_G$.

    The length of a word $w = \prod_{i=1}^{k} \alpha_1^{x_{i1}} \cdots \alpha_t^{x_{it}}$ in the symbols $\alpha_1, \ldots, \alpha_t$, where the $x_{ij}$ are non-negative integers, is defined to be the integer $|w| = \sum_{i=1}^{k} \sum_{j=1}^{t} x_{ij}$. Moreover, if $w_1$ and $w_2$ are words in the symbols $\alpha_1, \ldots, \alpha_t$ and $\rho : w_1 = w_2$ is a relation, the length of the relation $\rho$ is defined to be the integer $|\rho| := |w_1| + |w_2|$.

    If $G$ is a finite group generated by $\alpha_1, \ldots, \alpha_t$, a relation $\rho$ in the $\alpha_1, \ldots, \alpha_t$ is said to be short if $|\rho| = O(\log(|G|))$; otherwise $\rho$ is said to be long. Relations of importance to cryptographic hash functions of the Tillich-Zémor type are those which are short.

    We turn to our group of interest, $PSL(2, p)$, and examine the length of some relations there.

    Let $G = PSL(2, p)$, and consider the elements $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ in $G$. The matrices $A$ and $B$ are both of order $p$ and do not commute, and thus generate $PSL(2, p)$. As we have seen earlier, the depth of $PSL(2, p)$ with respect to the generating tuple $(A, B)$ is two. Therefore, the identity matrix $I \in PSL(2, p)$ can be written as $I = A^i B^j A^k B^\ell$ for some non-negative integers $i$, $j$, $k$ and $\ell$. In the next proposition we establish that for any prime $p$, any relation of the form $I = A^i B^j A^k B^\ell$ in $PSL(2, p)$ is long.

    Proposition 3.1 Let $A$, $B$ and $I$ be matrices in $PSL(2, p)$ as above. Then, a solution $(i, j, k, \ell)$ to the generalized discrete logarithm problem $I = A^i B^j A^k B^\ell$ is such that either $i + j + k + \ell \geq p$ or $i = j = k = \ell = 0$.

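    Proof.

    \[ A^i B^j A^k B^\ell = \begin{pmatrix} 1 & i \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ j & 1 \end{pmatrix} \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \ell & 1 \end{pmatrix}. \]

    Therefore,

    \[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 + ij + \ell((1 + ij)k + i) & (1 + ij)k + i \\ j + \ell(jk + 1) & jk + 1 \end{pmatrix}. \]

    Then, $jk + 1 \equiv 1 \pmod{p}$ and hence $jk \equiv 0 \pmod{p}$. By using $jk \equiv 0 \pmod{p}$, we obtain

    \[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 + ij + \ell(k + i) & k + i \\ j + \ell & 1 \end{pmatrix}. \]

    So, $j + \ell \equiv 0 \pmod{p}$ and $k + i \equiv 0 \pmod{p}$, i.e., $j + \ell = s_1 p$ with $s_1 \in \mathbb{Z}_{\geq 0}$ and $k + i = s_2 p$ with $s_2 \in \mathbb{Z}_{\geq 0}$. If $s_1 \geq 1$, then $j + \ell \geq p$. Hence, $i + j + k + \ell \geq p$. If $s_1 = 0$, i.e., $j + \ell = 0$, then $j = \ell = 0$. Similarly, $s_2 \geq 1$ leads to $i + j + k + \ell \geq p$, and $s_2 = 0$ leads to $k = i = 0$. Hence for the word $1_G = A^i B^j A^k B^\ell$ either $i + j + k + \ell \geq p$ or $i = j = k = \ell = 0$. Thus, every non-trivial solution satisfies $i + j + k + \ell \geq p$. □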

    We remark that since for $p > 7$, $p > 3 \log p > \log(|PSL(2, p)|)$, any relation of the form $I = A^i B^j A^k B^\ell$ is long, for all $p > 7$.
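    Proposition 3.1 can be checked by exhaustive search for small primes. The following is a minimal sketch, not part of the original text: the helper names are illustrative, and the search compares the product with the identity matrix itself, as the proof does (the matrix $-I$, which also represents the identity of $PSL(2, p)$, is not examined here).

```python
from itertools import product

IDENTITY = ((1, 0), (0, 1))

def mat_mul(X, Y, p):
    """Multiply two 2x2 integer matrices and reduce the entries modulo p."""
    return tuple(
        tuple(sum(X[r][t] * Y[t][c] for t in range(2)) % p for c in range(2))
        for r in range(2)
    )

def word(i, j, k, l, p):
    """Return A^i B^j A^k B^l mod p, using A^e = [[1, e], [0, 1]] and B^e = [[1, 0], [e, 1]]."""
    A = lambda e: ((1, e % p), (0, 1))
    B = lambda e: ((1, 0), (e % p, 1))
    return mat_mul(mat_mul(A(i), B(j), p), mat_mul(A(k), B(l), p), p)

def shortest_identity_word(p):
    """Smallest i+j+k+l > 0, with exponents in 0..p, such that A^i B^j A^k B^l = I mod p."""
    return min(
        i + j + k + l
        for i, j, k, l in product(range(p + 1), repeat=4)
        if (i, j, k, l) != (0, 0, 0, 0) and word(i, j, k, l, p) == IDENTITY
    )

for p in (3, 5, 7):
    # The minimum is attained, for instance, by A^p = I.
    print(p, shortest_identity_word(p))
```

    The exponent range 0..p is enough for this check because it already contains the relation $A^p = I$ of length exactly $p$.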

    4. The demise of the Tillich-Zémor hash function

    Let $V = \{0, 1\}^*$ be the Kleene closure of $\{0, 1\}$, i.e. the set of all binary sequences of arbitrary but finite length. Moreover, for $n \in \mathbb{Z}^+$, denote by $V_n = \{0, 1\}^n$ the set of all binary sequences of length $n$. For a given parameter $n \in \mathbb{Z}^+$, by a hash function we mean any function $h : V \to V_n$. If $v \in V$, we denote by $v^r$ the reversal of $v$, i.e. the reflection of $v$ with respect to a central axis. For example if $v = 00111$, then $v^r = 11100$.

    Definition 4.1 For a fixed parameter $n \in \mathbb{Z}^+$, a hash function $h : V \to V_n$ is said to be a cryptographic hash function if $h$ has the following additional properties:

    1. preimage resistance: For essentially all $y \in V_n$ it is computationally infeasible to find $x \in V$ such that $h(x) = y$,


    2. 2nd-preimage resistance: For any given $x \in V$ it is computationally infeasible to determine any $x' \in V$ such that $x' \neq x$ and $h(x') = h(x)$, and

    3. collision resistance: It is computationally infeasible to find any $x, x' \in V$ such that $x \neq x'$ and $h(x) = h(x')$.

    It is clear that the three properties are not independent, but if for a given $h$ a cryptanalyst succeeds in breaching any one of the three, then $h$ is considered compromised. However, a satisfactory attack on the collision resistance property must also satisfy a rather severe length requirement: the lengths of $x$ and $x'$ must be polynomial in the parameter $n$.

    In their paper "Hashing with SL2" [13], Tillich and Zémor propose a cryptographic hash function based on computing matrix products in the non-abelian group $SL(2, q)$. A brief history of the evolution of the Tillich-Zémor hash function (TZ) is given in the introduction of [2]. We give here a brief description of the scheme in its final form and a summary of the main steps that led to its cryptanalysis.

    4.1. The final version of the Tillich-Zémor hash function

    Input parameters are a positive integer $n$, and an irreducible polynomial $q(x)$ of degree $n$ over the field of two elements $\mathbb{F}_2 = GF(2)$. Let $\mathbb{F}$ be the finite field of order $2^n$ represented as $\mathbb{F} = \mathbb{F}_2[x]/(q(x))$. Let $\alpha$ be a root of $q(x)$ and define

    \[ s_0 := \begin{pmatrix} \alpha & 1 \\ 1 & 0 \end{pmatrix}, \qquad s_1 := \begin{pmatrix} \alpha & \alpha + 1 \\ 1 & 1 \end{pmatrix}. \]

    Then the matrices $s_0$ and $s_1$ generate the group $G = SL(2, \mathbb{F})$ of all unimodular matrices over $\mathbb{F}$. The Tillich-Zémor hash function $h$ is defined as follows:

    For a bitstring $v = b_0 b_1 \cdots b_m \in V$ define $h(v) := s_{b_0} s_{b_1} \cdots s_{b_m} \in G$.

    Note that $h$ maps a binary string $v$ of arbitrary length to a matrix in $G$, which requires 4 entries from $\mathbb{F}$, and thus maps $V$ to $V_{4n}$. Any satisfactory attack must work for any $n$ and any irreducible polynomial $q(x)$ of degree $n$ over $\mathbb{F}_2$. Thus, we note that the problem is specific to the representation of $\mathbb{F}$ as well as to the generators. The orders $k$, $\ell$ of $s_0$ and $s_1$ could be very large, for example any divisors of $2^n + 1$ or $2^n - 1$, and can be efficiently calculated. If $k$ or $\ell$ is small, then the system can be effectively attacked because one can write a short relation, such as $s_0^k = I$ or $s_1^\ell = I$, where $I$ is the identity of $G$. Thus a successful attack must assume nothing about the orders $k$ and $\ell$.

    In the proposition that follows we prove the existence of short relations in any finite group $G$ generated by two elements.

    Proposition 4.1 Let $G$ be a finite group generated by two elements $A$ and $B$. Then there exists a relation $\rho : w_1 = w_2$, where $w_1$ and $w_2$ are two different words in $A$ and $B$, such that $|w_1| + |w_2| = O(\log_2 |G|)$.


    Proof. We construct the blocks of all words of successive lengths in $A$ and $B$. Let $\mathcal{B}_0 = \{I\}$, where $I$ is the identity of the group $G$. Let $\mathcal{B}_k$ be the collection of all words in $A$ and $B$ of length $k$. Then $|\mathcal{B}_k| = 2^k$.

    Let $n$ be the positive integer such that $\sum_{k=0}^{n+1} |\mathcal{B}_k| > |G|$ and such that $\sum_{k=0}^{n} |\mathcal{B}_k| \leq |G|$. Since $\sum_{k=0}^{n} |\mathcal{B}_k| = 2^{n+1} - 1$ we can write $2^{n+1} - 1 \leq |G|$, i.e., $2^{n+1} \leq |G| + 1$. By taking logarithms of both sides of the inequality, we obtain that $n + 1 \leq \log_2(|G| + 1)$.

    By the pigeon-hole principle, two distinct words, say $w_1$ and $w_2$, belonging to $\mathcal{B}_0 \cup \mathcal{B}_1 \cup \cdots \cup \mathcal{B}_{n+1}$ must correspond to the same element of $G$. Then, $|w_1| + |w_2| \leq 2(n + 1) \leq 2 \log_2(|G| + 1) = O(\log_2(|G|))$. □
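    The pigeon-hole argument can be turned into a naive search: enumerate words by increasing length and stop at the first repeated group element. The sketch below is only an illustration under assumptions not in the original: it uses a small permutation group (generated by a transposition and a 4-cycle) instead of a matrix group, and all function names are hypothetical.

```python
from itertools import product
from math import log2

def compose(f, g):
    """Return the permutation 'apply f, then g' (permutations stored as tuples)."""
    return tuple(g[x] for x in f)

def evaluate(word, gens, identity):
    """Multiply out a word such as 'ABBA' in the given generators."""
    elem = identity
    for letter in word:
        elem = compose(elem, gens[letter])
    return elem

def group_order(gens, identity):
    """Order of the group generated by gens, found by closing under right multiplication."""
    seen, frontier = {identity}, [identity]
    while frontier:
        nxt = []
        for e in frontier:
            for g in gens.values():
                h = compose(e, g)
                if h not in seen:
                    seen.add(h)
                    nxt.append(h)
        frontier = nxt
    return len(seen)

def first_collision(gens, identity):
    """Enumerate words by increasing length; return the first two distinct words with equal value."""
    first_word = {}
    length = 0
    while True:
        for letters in product(sorted(gens), repeat=length):
            w = "".join(letters)
            e = evaluate(w, gens, identity)
            if e in first_word:
                return first_word[e], w
            first_word[e] = w
        length += 1

identity = (0, 1, 2, 3)
gens = {"A": (1, 0, 2, 3), "B": (1, 2, 3, 0)}   # a transposition and a 4-cycle
w1, w2 = first_collision(gens, identity)
bound = 2 * log2(group_order(gens, identity) + 1)
print(repr(w1), repr(w2), len(w1) + len(w2), bound)
```

    For these generators the first repeat is the empty word against AA, i.e. the relation $A^2 = 1$. The search stores one word per group element reached, so it is only feasible for very small groups; the difficulty of finding short relations in large groups is exactly the point made in the text.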

    Of course the proof can be generalized to any finite group $G$ generated by $k$ generators. A direct consequence of Proposition 4.1 is that short relations in two generators do exist in $SL(2, q)$. In particular, for $G = SL_2(2^n)$, $|G| = 2^n(2^{2n} - 1)$, and there are short relations of length at most $6n$. The question, of course, is how does one find them?
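    The bound $6n$ follows from the estimate in the proof of Proposition 4.1 by a short calculation, spelled out here for convenience:

    \[ |G| + 1 = 2^n(2^{2n} - 1) + 1 = 2^{3n} - 2^n + 1 \leq 2^{3n} \quad\Longrightarrow\quad |w_1| + |w_2| \leq 2 \log_2(|G| + 1) \leq 2 \cdot 3n = 6n. \]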

    4.2. Experimentation

    Early cryptanalytic experiments [2] were restricted to cases in which the defining irreducible polynomial $q(x)$ was of degree small enough to allow brute force searching for collisions. Data analysis of experimental results showed that for every input $q(x)$ of degree $n$, collisions of words of length $2n + 2$ were obtained and that among those collisions there were colliding palindromes. Computations were performed on a standard PC, using the computer algebra system Magma [1]. For example,

    Example 4.1 With irreducible polynomial $q(x) = x^5 + x^4 + x^3 + x + 1$ used to define the field $\mathbb{F}_{2^5} = \mathbb{F}_2[x]/(q(x))$ and with Tillich-Zémor generators $s_0$, $s_1$, the following collisions of palindromes of length $2n + 2$ occur:

    \[ h(0\,\underbrace{\overbrace{00110}^{v}\,\overbrace{01100}^{v^r}}_{\text{palindrome}}\,0) = h(1\,\underbrace{\overbrace{00110}^{v}\,\overbrace{01100}^{v^r}}_{\text{the same palindrome}}\,1) \]

    \[ h(011101101110) = h(111101101111) \]

    Experimental results showed that for each tested choice of $\mathbb{F}_{2^n} = \mathbb{F}_2[x]/(q(x))$ two bit strings $v_1, v_2 \in \{0, 1\}^n$, $|v_1| = n$, $|v_2| = n$, with

    \[ h(0 v_i v_i^r 0) = h(1 v_i v_i^r 1) \qquad (i = 1, 2), \]

    are obtained.
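    The collisions of Example 4.1 can be reproduced with a few lines of code. The sketch below is not the authors' Magma code; it is an illustration that assumes a particular encoding (a polynomial over $\mathbb{F}_2$ is stored as an integer whose bit $i$ is the coefficient of $x^i$) and uses hypothetical function names.

```python
def gf_mul(a, b, q):
    """Multiply a and b in F_2[x]/(q(x)); requires a, b of degree below deg q."""
    n = q.bit_length() - 1          # degree of q
    result = 0
    while b:
        if b & 1:
            result ^= a             # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1                     # multiply a by x ...
        if (a >> n) & 1:
            a ^= q                  # ... and reduce modulo q
    return result

def mat_mul(X, Y, q):
    """Product of two 2x2 matrices with entries in F_2[x]/(q(x))."""
    return tuple(
        tuple(gf_mul(X[r][0], Y[0][c], q) ^ gf_mul(X[r][1], Y[1][c], q) for c in range(2))
        for r in range(2)
    )

def tz_hash(bits, q):
    """Tillich-Zemor hash of a bit string such as '0110' for the defining polynomial q."""
    alpha = 0b10                    # the class of x, a root of q(x)
    s = {
        "0": ((alpha, 1), (1, 0)),
        "1": ((alpha, alpha ^ 1), (1, 1)),
    }
    h = ((1, 0), (0, 1))
    for b in bits:
        h = mat_mul(h, s[b], q)
    return h

q = 0b111011                        # x^5 + x^4 + x^3 + x + 1
palindrome = "00110" + "01100"      # v followed by its reversal
print(tz_hash("0" + palindrome + "0", q) == tz_hash("1" + palindrome + "1", q))
print(tz_hash("011101101110", q) == tz_hash("111101101111", q))
```

    Both comparisons are expected to print True, in agreement with the example.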


    4.3. The successful attack

    It was shown in [2] how to construct collisions between palindromes of length $2n + 2$ for any defining irreducible polynomial of degree $n$, that is, pairs $(u, v) \in V \times V$ such that $u \neq v$ and $h(u) = h(v)$. It was demonstrated that the attack is practical by constructing collisions for the challenge parameters. The method finds collisions of length a few hundred bits on a standard PC within a second. For the challenge polynomial of largest degree,

    \[ x^{2039} + x^{10} + x^9 + x^8 + x^7 + x^5 + x^4 + x^2 + 1, \]

    computation still took a few seconds.

    With very few exceptions we will only state lemmas, propositions or theorems here but will refer to [2] for their proof.

    4.3.1. Change of generators

    Recall that for $\alpha$ a root of the irreducible $q(x)$ of degree $n$ in $\mathbb{F}_2[x]$,

    \[ s_0 := \begin{pmatrix} \alpha & 1 \\ 1 & 0 \end{pmatrix}, \qquad s_1 := \begin{pmatrix} \alpha & \alpha + 1 \\ 1 & 1 \end{pmatrix} \]

    and

    \[ h(b_1 \ldots b_m) := s_{b_1} \cdots s_{b_m} \in G. \]

    By conjugating the pair of generators $(s_0, s_1)$ by any element of $G$ we clearly get another pair of generators of $G$. In particular, conjugating $(s_0, s_1)$ by $s_0$ yields $(s_0^{s_0}, s_1^{s_0}) = (s_0, s_0^{-1} s_1 s_0) = (c_0, c_1)$. Computation results in:

    \[ c_0 := \begin{pmatrix} \alpha & 1 \\ 1 & 0 \end{pmatrix}, \qquad c_1 = \begin{pmatrix} \alpha + 1 & 1 \\ 1 & 0 \end{pmatrix}, \]

    and the two new generators $c_0$ and $c_1$ define a new hash function $\tilde{h}$ by:

    \[ \tilde{h}(b_1 \ldots b_m) := c_{b_1} \cdots c_{b_m} \in G. \]

    We have the following:

    Lemma 4.1 Let $v, v' \in V$. Then $h(v) = h(v')$ if and only if $\tilde{h}(v) = \tilde{h}(v')$.

    Lemma 4.1 transforms the original problem with respect to $s_0$, $s_1$ into the equivalent problem of finding short collisions with respect to the new, symmetric generators $c_0$, $c_1$. This is critical in our solution. More generally, conjugating by any element $t \in G$ transforms the generators and hash values but preserves collisions.
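    The reason conjugation preserves collisions is a telescoping product. Writing $\tilde{h}$ for the hash function built from $c_0 = s_0^{-1} s_0 s_0$ and $c_1 = s_0^{-1} s_1 s_0$ (the tilde is notation chosen here), one has

    \[ \tilde{h}(b_1 \ldots b_m) = \prod_{i=1}^{m} s_0^{-1} s_{b_i} s_0 = s_0^{-1} \left( \prod_{i=1}^{m} s_{b_i} \right) s_0 = s_0^{-1}\, h(b_1 \ldots b_m)\, s_0, \]

    so that $h(v) = h(v')$ holds exactly when $s_0^{-1} h(v) s_0 = s_0^{-1} h(v') s_0$, i.e. when $\tilde{h}(v) = \tilde{h}(v')$. The same computation goes through with any $t \in G$ in place of $s_0$.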


    4.3.2. The structure of palindrome collisions

    Since a solution must be independent of the choice of the irreducible polynomial $q(x)$, we proceed to work in $SL_2(\mathbb{F}_2[x])$. Accordingly, matrices $C_0, C_1 \in SL_2(\mathbb{F}_2[x])$ are defined with polynomial entries as follows:

    \[ C_0 := \begin{pmatrix} x & 1 \\ 1 & 0 \end{pmatrix}, \qquad C_1 = \begin{pmatrix} x + 1 & 1 \\ 1 & 0 \end{pmatrix}, \]

    and a new hash function $H : V \to SL_2(\mathbb{F}_2[x])$ is defined by:

    \[ H(b_1 \ldots b_m) := C_{b_1} \cdots C_{b_m} \in SL_2(\mathbb{F}_2[x]). \]

    We further have:

    Lemma 4.2 Let $v \in V$ be a palindrome, and write $H(v) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then $b = c$, i.e., $H(v)$ is symmetric. Moreover, $\deg(a) = |v|$, and $\max\{\deg(b), \deg(d)\} < |v|$.

    Now, define the function $\Delta : V \to \mathbb{F}_2[x]^{2 \times 2}$ by:

    \[ \Delta(v) := H(0v0) + H(1v1). \]

    For a given irreducible polynomial $q(x)$, $\Delta(v) \equiv \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \bmod q(x)$ if and only if $h(0v0) = h(1v1)$ is a collision in $SL_2(\mathbb{F}_2[x]/(q(x)))$.

    Lemma 4.3 If $v \in V$ is a palindrome of length $|v|$, then $\Delta(v) = \begin{pmatrix} a & a \\ a & 0 \end{pmatrix}$, where $a \in \mathbb{F}_2[x]$ has degree $|v|$. Moreover, $a$ is the upper left entry of $H(v)$.

    Lemma 4.4 If $v \in V$ is a palindrome of even length, then $H(v) = \begin{pmatrix} a^2 & b \\ b & d^2 \end{pmatrix}$ for some $a, b, d \in \mathbb{F}_2[x]$.

    Proof. If $u \in V$, denote by $u^r$ the reversal of $u$. Let $v = u u^r$ for some $u \in V$. The proof is by induction on $|u|$. When $|u| = 0$ the hash $H(u u^r)$ is the identity matrix and the statement holds trivially.

    Suppose now we extend a string $u$ of given length by one bit, yielding a palindrome $v' = (\beta u)(u^r \beta)$ with $\beta \in \{0, 1\}$. By the induction hypothesis we have that $H(v) = H(u u^r) = \begin{pmatrix} a^2 & b \\ b & d^2 \end{pmatrix}$, so that:

    \[ H(v') = C_\beta \begin{pmatrix} a^2 & b \\ b & d^2 \end{pmatrix} C_\beta = \begin{pmatrix} (x + \beta)^2 a^2 + d^2 & (x + \beta) a^2 + b \\ (x + \beta) a^2 + b & a^2 \end{pmatrix}. \]

    Consequently, both diagonal entries of $H(v')$ are squares, and the result follows. □

    Combining Lemmas 4.3 and 4.4 yields:


    Corollary 4.1 Let $v \in V$ be a palindrome of even length. Then $\Delta(v) = \begin{pmatrix} a^2 & a^2 \\ a^2 & 0 \end{pmatrix}$ for some $a \in \mathbb{F}_2[x]$ with $\deg(a) = |v|/2$. In particular, the entry $a^2$ is the upper left entry of $H(v)$.

    Further, from the proof of Lemma 4.4 we are able to deduce the following recurrence relation:

    Corollary 4.2 Let $b_n \ldots b_1 b_1 \ldots b_n \in V$ be a palindrome of length $2n$. Then, for $0 \leq i \leq n$, the square root $p_i$ of the upper left entry of $H(b_i \ldots b_1 b_1 \ldots b_i)$ is given by

    \[ p_i = \begin{cases} 1, & \text{if } i = 0; \\ x + b_1 + 1, & \text{if } i = 1; \\ (x + b_i)\, p_{i-1} + p_{i-2}, & \text{if } 1 < i \leq n. \end{cases} \]

    Now, for the given irreducible polynomial $q = q(x) \in \mathbb{F}_2[x]$ of degree $n$, we seek a palindrome $v \in V$ of length $2n$ such that $\Delta(v) = H(0v0) + H(1v1) \equiv \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ (modulo $q(x)$) in $\mathbb{F}_2[x]$.
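    The recurrence of Corollary 4.2 is easy to evaluate by machine. The sketch below is an illustration with assumed conventions (polynomials over $\mathbb{F}_2$ stored as integers, bit $i$ being the coefficient of $x^i$; function names are hypothetical). Applied to the palindrome 0011001100 from Example 4.1 it should end in $p_5 = q(x) = x^5 + x^4 + x^3 + x + 1$, which is why that palindrome yields a collision modulo $q(x)$.

```python
def poly_mul(a, b):
    """Carry-less product of two polynomials over F_2 (bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def upper_left_roots(inner_to_outer_bits):
    """Given the bits b_1, ..., b_n (innermost first), return [p_0, ..., p_n] of Corollary 4.2."""
    X = 0b10                                  # the polynomial x
    p = [1]                                   # p_0 = 1
    for i, b in enumerate(inner_to_outer_bits, start=1):
        if i == 1:
            p.append(X ^ b ^ 1)               # p_1 = x + b_1 + 1
        else:
            p.append(poly_mul(X ^ b, p[i - 1]) ^ p[i - 2])
    return p

palindrome = "0011001100"                     # b_5 b_4 b_3 b_2 b_1 b_1 b_2 b_3 b_4 b_5
n = len(palindrome) // 2
bits = [int(c) for c in palindrome[n:]]       # b_1, ..., b_n
p = upper_left_roots(bits)
print([bin(v) for v in p])                    # last entry should be 0b111011
```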

    4.3.3. Mesirov and Sweet

    In view of Corollaries 4.1 and 4.2, finding such a $v \in V$ can be accomplished by determining a second polynomial $p(x) \in \mathbb{F}_2[x]$ of degree $n - 1$ such that:

    1. $\gcd(q(x), p(x)) = 1$,

    2. during the execution of the Euclidean algorithm with input $(q(x), p(x))$, the successive quotients are all of degree 1,

    3. the degree of each remainder is only one less than the degree of the respective divisor.

    This will ensure a Euclidean algorithm chain of maximal length and adherence to the recurrence relation in Corollary 4.2. The existence of such a polynomial $p(x)$ follows from a 1987 result by J.P. Mesirov and M.M. Sweet [8].

    Proposition 4.2 [Mesirov and Sweet [8]] Given any irreducible polynomial $q$ of degree $n$ over $\mathbb{F}_2$, there is a sequence of polynomials $p_n, p_{n-1}, \ldots, p_0$ with $p_n = q$ and $p_0 = 1$, such that $\deg(p_i) = i$, and $p_i \equiv p_{i-2} \bmod p_{i-1}$.

    Once we know a polynomial $p = p_{n-1}$, as mentioned in Proposition 4.2, which matches our given polynomial $p_n = q$, the Euclidean algorithm will uniquely complete the sequence $p_n, p_{n-1}, \ldots, p_1, p_0 = 1$. The linear quotients $x + \beta_i$ $(i = 1, \ldots, n)$ occurring in Euclid's algorithm allow us to derive the bits $b_i$ of the palindrome in Corollary 4.2. This has been a brief summary of the cryptanalysis of the last variant of the Tillich-Zémor cryptographic hash function published in [2]. A much more comprehensive development occurs in [2], including an efficient algorithm for determining from the irreducible $q(x) = p_n(x)$ the two solutions for $p(x) = p_{n-1}(x)$ satisfying the Mesirov-Sweet conditions. The paper [2] also contains the solution to all suggested challenge parameters for the Tillich-Zémor hash function.
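    To make the last step concrete, the following sketch runs Euclid's algorithm on a pair $(q, p_{n-1})$ and reads the palindrome bits off the linear quotients. It does not show how to find $p_{n-1}$ (that is the algorithm of [2]); the two values of $p_{n-1}$ used below were obtained by running the recurrence of Corollary 4.2 on the palindromes of Example 4.1. The integer encoding of polynomials and the function names are assumptions made for this illustration.

```python
def poly_divmod(a, b):
    """Quotient and remainder of a by b in F_2[x] (bit i = coefficient of x^i)."""
    quotient = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        quotient ^= 1 << shift
        a ^= b << shift
    return quotient, a

def palindrome_from_pair(q, p):
    """Run Euclid on (q, p) and derive the palindrome b_n ... b_1 b_1 ... b_n."""
    consts = []                      # constant terms of the quotients, outermost bit first
    a, b = q, p
    while b:
        quot, rem = poly_divmod(a, b)
        if quot >> 1 != 1:           # a linear quotient is x or x + 1
            raise ValueError("a quotient is not linear")
        consts.append(quot & 1)
        a, b = b, rem
    if a != 1:
        raise ValueError("gcd(q, p) is not 1")
    consts[-1] ^= 1                  # the last quotient is p_1 = x + b_1 + 1
    half = "".join(str(c) for c in consts)
    return half + half[::-1]

q = 0b111011                                  # x^5 + x^4 + x^3 + x + 1
print(palindrome_from_pair(q, 0b11010))       # p_4 = x^4 + x^3 + x
print(palindrome_from_pair(q, 0b10010))       # p_4 = x^4 + x
```

    The two calls should return 0011001100 and 1110110111, the palindromes enclosed between the outer bits in Example 4.1.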

    5. Conclusions

    With the advent of P. Shor's quantum algorithms for solving the traditional DLP in polynomial time on a quantum computer, attention has been drawn to the generalized discrete logarithm problem in non-abelian groups. In this paper we consider the family of groups $PSL(2, p)$, $p$ an odd prime, and expose certain bad choices for generators, for which the GDLP can be easily solved. We delineate some strategies for solving the GDLP in these groups, but point out that, for these strategies, the probability of success goes to zero as $p$ gets large. We still believe that if generators are chosen wisely, the GDLP in $PSL(2, p)$ will be intractable. In a related problem, we summarize the general successful attack presented in [2], which breaches the Tillich-Zémor hash function. In this segment, we give no proofs for the structure lemmas, and no details of the final solution based on a theorem of Mesirov and Sweet [8].

    References

    [1] Wieb Bosma, John Cannon, and Catherine Playoust. The Magma algebra system. I. The user language. J. Symbolic Comput., 24(3-4):235-265, 1997.

    [2] Markus Grassl, Ivana Ilić, Spyros Magliveras, Rainer Steinwandt. Cryptanalysis of the Tillich-Zémor hash function. To appear in the Journal of Cryptology, 2010. Cryptology ePrint Archive: Report 2009/376, 2009. Available at: http://eprint.iacr.org/2009/376

    [3] Derek Holt, Bettina Eick, Eamonn A. O'Brien. Handbook of computational group theory. Chapman & Hall/CRC Press, Boca Raton, 2005.

    [4] Ivana Ilić and Spyros S. Magliveras. Weak discrete logarithms in non-abelian groups, to appear in the Journal of Combinatorial Math. and Comb. Computing (JCMCC), 2009.

    [5] Lee C. Klingler, Spyros S. Magliveras, Fred Richman, Michal Sramka. Discrete logarithms for finite groups. Computing 85 (2009), pp. 3-19.

    [6] P. Nguyen, New Trends in Cryptology, European project "STORK - Strategic Roadmap for Crypto" (IST-2002-38273).

    [7] Alfred Menezes, Paul C. van Oorschot, Scott A. Vanstone. Handbook of Applied Cryptography, CRC Press, 1996.

    [8] Jill P. Mesirov and Melvin M. Sweet. Continued Fraction Expansions of Rational Expressions with Irreducible Denominators in Characteristic 2. Journal of Number Theory 27, pp. 144-148, 1987.

    [9] P. W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. on Computing, 26(5), pp. 1484-1509, 1997.

    [10] Michal Sramka. New Results in Group Theoretic Cryptology. Ph.D. Thesis, Florida Atlantic University, Boca Raton, FL, 2006.

    [11] Douglas R. Stinson. Cryptography: Theory and Practice, 2nd ed., CRC Press, New York, NY, 2002.

    [12] Michio Suzuki. Group Theory I. Springer-Verlag, New York, 1982.

    [13] Jean-Pierre Tillich and Gilles Zémor. Hashing with SL2. LNCS 839, Advances in Cryptology, CRYPTO '94, pp. 40-49, 1994.


    Generating rooted trees of m nodes uniformly at random

    Kenneth Matheis and Spyros S. Magliveras

    CCIS, Department of Mathematical Sciences, Florida Atlantic University,

    777 Glades Road, Boca Raton, FL 33431


    Abstract. A rooted tree is an ordinary tree with an equivalence condition: two trees are the same if and only if one can be transformed into the other by reordering subtrees. In this paper, we construct a bijection and use it to generate rooted trees (or forests) of any specified nodecount $m$ uniformly at random. As an application, we consider the following. In [6], Raddum and Semaev propose a technique to solve systems of polynomial equations over $\mathbb{F}_2$ as occurring in algebraic attacks on block ciphers. This approach is known as MRHS. In [3], Geiselmann, Matheis, and Steinwandt propose an ASIC hardware design to implement MRHS, and they show that the use of ASICs seems to enable significant performance gains over a software implementation of MRHS. What has not been established is the total time complexity of their platform, though the runtimes of individual components are provided. If one supposes that deletions in MRHS occur as rooted trees generated uniformly at random, then one application of the proposed algorithm would be to contribute to such a time complexity; experiments are run to provide statistical averages of key quantities.

    Keywords. rooted tree, rooted forest, uniform random generation, genetic

    programming, MRHS, PET SNAKE

    Introduction

    We view a rooted tree as an equivalence class of ordinary trees, where two trees are

    equivalent if one can be transformed into the other by re-ordering subtrees [7]. Similarly,

    we view a rooted forest as an equivalence class of forests, where two rooted forests

    are equivalent if one can be transformed into the other by re-ordering the rooted trees.

    Alternately, we may consider a rooted forest as nothing more than the subtrees of a rooted

    tree (of one node more) whose root is hidden.

    The idea of a rooted tree has been around since 1875 [2], when the numbers of rooted trees with small node counts were computed. Since then there have been a few proposals constructing bijections between all rooted trees and $\mathbb{N}$.

    For each $m \in \mathbb{N}$, we define $T_m$ to be the set of rooted trees of $m$ nodes, and $F_m$ to be the set of rooted forests of $m$ nodes. Our contribution is an implicit construction of a bijection between $F_m$ and $\mathbb{Z}_{|F_m|}$. Such a bijection can then be used to generate a rooted forest of $m$ nodes uniformly at random.

    As an immediate application to cryptography, we note that some statistics generated from these trees can be used to help calculate a time estimate for PET SNAKE. In [6], Raddum and Semaev proposed a technique known as MRHS (Multiple Right Hand Sides) to handle polynomial systems of equations over $\mathbb{F}_2$. This algorithm is particularly well-suited for describing systems of equations for an algebraic key recovery attack against common block ciphers such as AES or DES, but a complete time estimate of it was not forthcoming. Later, the hardware platform PET SNAKE [3] was designed to implement MRHS attacks in hardware, but PET SNAKE's time estimate is also hard to calculate. The statistics mentioned can help contribute to such a time estimate.

    Information Security, Coding Theory and Related Combinatorics

    D. Crnković and V. Tonchev (Eds.)

    IOS Press, 2011

    © 2011 The authors and IOS Press. All rights reserved.

    doi:10.3233/978-1-60750-663-8-17

    Related Work

    It has been established that we can map rooted trees to natural numbers [5] and that there

    is a rooted tree for every natural number [4]. If one relaxes the equivalence condition and

    merely examines arbitrary trees, then a uniform random generation algorithm is known

    [1] by modeling them using a context-free grammar for use in a genetic algorithm. However, it is not clear that it is possible to create rooted trees using a context-free grammar,

    so we do not use this algorithm. We instead develop a different algorithm which, as it

    happens, shares some features with the one in [1].

    Further, much information about rooted trees is available in [8, sequence A000081],

    and some of those facts will be used in this paper.

    Structure of the Paper

    We first discuss the construction of the bijection between $F_m$ and $\mathbb{Z}_{|F_m|}$. Once this is established, we review the relevant details about MRHS and show how $F_m$ is related to the processing of PET SNAKE (notably the deletion count therein). Finally, we generate some statistics based on $F_m$ for $m \leq 1000$ and relate those to time estimate processing for PET SNAKE.

    1. Generating Rooted Forests Uniformly at Random

    We begin with some notation: we define the natural numbers $\mathbb{N}$ to be $\{1, 2, 3, \ldots\}$, the whole numbers $\mathbb{W}$ to be $\mathbb{N} \cup \{0\}$, and for each $n \in \mathbb{N}$, we define $\mathrm{seg}_n$ to be $\{1, 2, \ldots, n\}$.

    In order to generate a rooted forest of $m$ nodes uniformly at random, we first construct some data tables dynamically (so that no unneeded space is allocated), and then we perform many lookups on those tables.

    We view a rooted forest of $m$ nodes as being constructed by a collection of $r$ rooted trees $a_1, a_2, \ldots, a_r$, for some $r \in \mathrm{seg}_m$, with respective nonincreasing node counts $c_1, c_2, \ldots, c_r$ such that $\sum c_i = m$.

    We then construct sequences of counts $b_i$ such that $b_{11}, b_{12}, \ldots, b_{1 s_1}$ are the $s_1$ counts starting with $c_1$ that are equal to $c_1$, and $b_{21}, b_{22}, \ldots, b_{2 s_2}$ are the $s_2$ counts starting with $c_{1 + s_1}$ that are equal to $c_{1 + s_1}$, and so on, and we suppose there are $d$ such sequences. This breaks up the counts into subsequences of equal-valued terms. For example, if the counts $c$ were 9, 8, 8, 8, 7, 7, 6, 4, 3, 3, 3, 3, 2, 1, 1, 1, then $b_1$ has one term (namely 9), $b_2$ has three terms (all of which are 8), $b_3$ has two terms (both of which are 7), $b_4$ has one term (namely 6), and so on, ending with $b_8$ having three terms (all of which are 1) and $d = 8$.
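    Splitting a nonincreasing count sequence into runs of equal values is a one-liner in most languages. As a small illustration (the function name is hypothetical), the example above can be reproduced with Python's itertools.groupby:

```python
from itertools import groupby

def equal_runs(counts):
    """Split a sequence into maximal runs of equal consecutive values."""
    return [list(run) for _, run in groupby(counts)]

counts = [9, 8, 8, 8, 7, 7, 6, 4, 3, 3, 3, 3, 2, 1, 1, 1]
b = equal_runs(counts)
print(len(b))                    # d, the number of runs
print([len(run) for run in b])   # the run lengths s_1, ..., s_d
```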

    18 K. Matheis and S.S. Magliveras / Generating Rooted Trees of m Nodes Uniformly at Random


    Since we envision the trees $T_k$ (for any $k \in \mathbb{N}$) as being ordered, for each $i \in \mathrm{seg}_d$ we must count the number of ordered arrangements of $s_i$ trees in $T_{b_{i1}}$. Call this number $B_i$. We then calculate the number of rooted forests with this count sequence as $\prod B_i$.

    In order to correlate a number in $\mathbb{Z}_{|F_m|}$ to a forest in $F_m$, we must have a way to obtain the number of forests of subtrees with any nonincreasing count sequence. As one might imagine, this is done recursively using the building blocks described below.

    1.1. Setup

    The setup phase of the algorithm consists of building three tables. First, for each $i \in \mathrm{seg}_m$, $|T_i|$ is calculated using the recurrence formula

    \[ |T_i| = \frac{1}{i - 1} \sum_{k=1}^{i-1} \left( \sum_{d \mid k} d\, |T_d| \right) |T_{i-k}| \]

    with $|T_1| = 1$ [8]. This takes $O(i^2)$ time for each $i$, totalling a time of $O(m^3)$.

    Then an $m \times m$ table $R$ called the runtable is created. Its purpose is to store forest counts in the following way: for any two $i, j \in \mathrm{seg}_m$, $R_{ij}$ is the number of sequences $u : \mathrm{seg}_j \to T_i$ such that $u_1 \leq u_2 \leq \cdots \leq u_j$; in other words, it is the number of non-decreasing $j$-length sequences of $i$-node rooted trees. (Note that we have not mentioned how to order the trees $T_i$, but certainly an ordering exists. Indeed, a side effect of this process is to construct the bijections $f_m : \mathbb{Z}_{|F_m|} \to F_m$ (one-to-one and onto), which in turn constructs the bijections $t_m : \mathbb{Z}_{|T_m|} \to T_m$ (one-to-one and onto), so for two trees $p$ and $q$, $p \leq q$ if and only if the index that constructs $p$ is less than or equal to the index that constructs $q$. Since $R$ simply concerns itself with the number of sequences $u$, and not the individual sequences themselves, we run into no difficulty constructing $R$.)
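    The recurrence for $|T_i|$ used in the first step of the setup can be sketched as follows. This is an illustrative implementation, not the authors' code; the function name and the auxiliary array of divisor sums are choices made here. Python integers are unbounded, which matters because $|T_i|$ grows exponentially.

```python
def rooted_tree_counts(m):
    """Return a list t with t[i] = |T_i| for 1 <= i <= m (t[0] is unused)."""
    t = [0] * (m + 1)
    # weighted[k] accumulates the inner sum: sum of d * |T_d| over the divisors d of k.
    weighted = [0] * (m + 1)
    for i in range(1, m + 1):
        if i == 1:
            t[1] = 1
        else:
            # Every divisor d of k < i satisfies d < i, so weighted[k] is complete here.
            total = sum(weighted[k] * t[i - k] for k in range(1, i))
            t[i] = total // (i - 1)
        for k in range(i, m + 1, i):
            weighted[k] += i * t[i]
    return t

print(rooted_tree_counts(10)[1:])
```

    The printed values should be the first terms of sequence A000081: 1, 1, 2, 4, 9, 20, 48, 115, 286, 719.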

    To calculate these values, we take advantage of the following theorem:

    Theorem 1

    \[ (\forall n \in \mathbb{N})(\forall k \in \mathbb{W}) \qquad \sum_{i=1}^{n} \binom{i + k}{k + 1} = \binom{n + k + 1}{k + 2}. \]

    To prove this, simply use induction on $n$. Five lines are all that are necessary.

    We now make an observation about sequences of finite length. (To be clear, we make no claim about the originality of Theorems 1 and 2, but absent appropriate references, proofs are provided to justify their correctness.)

    Theorem 2 For each $i, j \in \mathbb{N}$, let $S_{ij}$ be the number of nondecreasing sequences $u : \mathrm{seg}_j \to \mathrm{seg}_i$. Then

    \[ (\forall j \in \mathbb{N})(\forall i \in \mathbb{N}) \qquad S_{ij} = \binom{i + j - 1}{j}. \]


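    To prove this, we proceed by induction on $j$, writing $P(j)$ for the statement $(\forall i \in \mathbb{N})\ S_{ij} = \binom{i + j - 1}{j}$.

    $P(1)$: Let $i \in \mathbb{N}$. Then $S_{i1}$ is the number of nondecreasing sequences $u : \mathrm{seg}_1 \to \mathrm{seg}_i$. But there are only $i$ such things, one for each choice of $u_1$. Hence, $S_{i1} = i = \binom{i}{1} = \binom{i + 1 - 1}{1}$. Thus, $P(1)$.

    Let $k \in \mathbb{N}$ and assume $P(k)$: $(\forall i \in \mathbb{N})\ S_{ik} = \binom{i + k - 1}{k}$.

    $P(k + 1)$: Let $i \in \mathbb{N}$. Consider $S_{i(k+1)}$. These are all the nondecreasing sequences $u : \mathrm{seg}_{k+1} \to \mathrm{seg}_i$. Now let us consider the possibilities for $u_1$. If $u_1 = 1$, then the remaining terms comprise a $k$-length nondecreasing sequence to $\mathrm{seg}_i$. But the number of such sequences is just $S_{ik}$. Now, if $u_1 = 2$, the remaining terms comprise a $k$-length sequence to $\mathrm{seg}_i \setminus \mathrm{seg}_1$. But the number of such sequences is the same as the number of $k$-length sequences to $\mathrm{seg}_{i-1}$ (just subtract 1 from each term), which is $S_{(i-1)k}$. Similarly, for each $v \in \mathrm{seg}_i$, if $u_1 = v$, then the remaining terms comprise a $k$-length sequence to $\mathrm{seg}_i \setminus \mathrm{seg}_{v-1}$, whose count is the same as the count of $k$-length sequences to $\mathrm{seg}_{i-(v-1)}$ (by subtracting $v - 1$ from each term), which is $S_{(i-v+1)k}$. Hence,

    \begin{align*}
    S_{i(k+1)} &= S_{ik} + S_{(i-1)k} + \cdots + S_{1k} \\
               &= \sum_{v=1}^{i} S_{vk} \\
               &= \sum_{v=1}^{i} \binom{v + k - 1}{k} && \text{by Ind. Hyp.} \\
               &= \binom{i + k}{k + 1} && \text{by Theorem 1} \\
               &= \binom{i + (k + 1) - 1}{k + 1}.
    \end{align*}

    Thus, $P(k + 1)$. The rest follows by the Principle of Mathematical Induction and routine steps.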
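    Both theorems are easy to sanity-check numerically for small parameters. The sketch below (function names are illustrative) counts nondecreasing sequences by brute force, using the fact that itertools.combinations_with_replacement on a sorted range yields exactly the nondecreasing tuples, and compares the counts with the binomial coefficients.

```python
from itertools import combinations_with_replacement
from math import comb

def count_nondecreasing(i, j):
    """Brute-force S_ij: nondecreasing sequences of length j with terms in {1, ..., i}."""
    return sum(1 for _ in combinations_with_replacement(range(1, i + 1), j))

def theorem1_holds(n, k):
    """Check sum_{i=1..n} C(i+k, k+1) == C(n+k+1, k+2)."""
    return sum(comb(i + k, k + 1) for i in range(1, n + 1)) == comb(n + k + 1, k + 2)

print(all(theorem1_holds(n, k) for n in range(1, 9) for k in range(0, 6)))
print(all(count_nondecreasing(i, j) == comb(i + j - 1, j)
          for i in range(1, 7) for j in range(1, 6)))
```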

    Since, for each $j \in \mathrm{seg}_m$, $R_{ij}$ is the number of nondecreasing sequences from $\mathrm{seg}_j$ to $T_i$, this is the same as the number of nondecreasing sequences from $\mathrm{seg}_j$ to $\mathrm{seg}_{|T_i|}$, which is $\binom{|T_i| + j - 1}{j}$ by Theorem 2. Hence, we build the $R$ table by populating it with this binomial coefficient for each $i \in \mathrm{seg}_m$ and $j \in \mathrm{seg}_m$ such that $j \leq \lfloor m/i \rfloor$; $j$ is restricted in this way since, for any choice of tree size $i$ in an $m$-node forest, you can only have at most $\lfloor m/i \rfloor$ such trees. As a side effect, we see that $|T_i|$ is stored in $R_{i1}$ by this process.

    As a point of interest, note that we had to use this binomial simplification when populating the $R$ table. Otherwise, since $|T_i|$ is asymptotically $0.4399 \cdot 2.9558^i \cdot i^{-3/2}$ [8, sequence A000081], asking the computer to perform the sum listed in the proof of Theorem 2 would become infeasible very quickly.

    We construct two more tables, the two-dimensional partable denoted $P$, and the three-dimensional table lentable denoted $L$. For each $i, j \in \mathrm{seg}_m$, $P_{ij}$ is the number of rooted forests of $i$ nodes whose first tree has $j$ nodes. It could be that the first few trees have $j$ nodes, so we keep track of this using the lentable: $L_{ijk}$ is the number of rooted forests of $i$ nodes whose first $k$ trees have $j$ nodes. To calculate $L_{ijk}$, we use


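    \[ L_{ijk} = \begin{cases} R_{jk} & \text{if } i - jk = 0, \\ R_{jk} \sum_{q=1}^{j-1} P_{(i - jk)q} & \text{otherwise.} \end{cases} \]

    Here the factor $R_{jk}$ counts the choices for the $k$ leading trees of $j$ nodes, and the sum counts the forests on the remaining $i - jk$ nodes whose first tree has fewer than $j$ nodes. Then,

    \[ P_{ij} = \sum_{k=1}^{\lfloor i/j \rfloor} L_{ijk}. \]

    Finally, we recognize that $|F_m| = |T_{m+1}|$ gives us no intuitive breakdown of all the counts, but

    \[ |F_m| = \sum_{j=1}^{m} P_{mj} \]

    does. Note that, though we concern ourselves with how a given number of nodes $m$ breaks down into each partition of $m$, this setup prevents us from having to loop through each partition of $m$, which also would be infeasible very quickly.

    We remark that the storage for $R$ is $O(m^2)$ but is significantly less than $m^2$ since, for each $i \in \mathrm{seg}_m$, we only populate $R_{ij}$ when $j \leq \lfloor m/i \rfloor$. Further, for similar reasons the storage for $L$ is $O(m^3)$, but is significantly less than $m^3$.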
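    The three tables can be built in a few lines, and the identity $\sum_j P_{mj} = |T_{m+1}|$ gives a convenient consistency check. The sketch below is an illustration rather than the authors' implementation: it stores the tables as dictionaries keyed by index tuples, takes the tree counts $|T_1|, \ldots, |T_{10}|$ from sequence A000081 as given, and treats $P_{nq}$ as zero when $q > n$.

```python
from math import comb

def build_tables(m, t):
    """Build the runtable R, lentable L and partable P for forests of up to m nodes.

    t[i] must hold |T_i| for 1 <= i <= m.
    """
    R = {(i, j): comb(t[i] + j - 1, j)
         for i in range(1, m + 1) for j in range(1, m // i + 1)}
    L, P = {}, {}
    for i in range(1, m + 1):
        for j in range(1, i + 1):
            for k in range(1, i // j + 1):
                rest = i - j * k                 # nodes left after the k trees of size j
                if rest == 0:
                    L[i, j, k] = R[j, k]
                else:
                    # Remaining forest must start with a tree of fewer than j nodes.
                    smaller = sum(P[rest, q] for q in range(1, min(j - 1, rest) + 1))
                    L[i, j, k] = R[j, k] * smaller
            P[i, j] = sum(L[i, j, k] for k in range(1, i // j + 1))
    return R, L, P

t = [0, 1, 1, 2, 4, 9, 20, 48, 115, 286, 719]    # |T_1| .. |T_10|, index 0 unused
R, L, P = build_tables(9, t)
forest_counts = [sum(P[i, j] for j in range(1, i + 1)) for i in range(1, 10)]
print(forest_counts == t[2:11])                  # |F_i| = |T_{i+1}| for i = 1..9
```

    For example, the row for forests of five nodes is $P_{55} = 9$, $P_{54} = 4$, $P_{53} = 4$, $P_{52} = 2$, $P_{51} = 1$, which sums to $|T_6| = 20$.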

    1.2. Teardown

    In order to generate a forest in $F_m$ uniformly at random, we first generate a number $r$ in $\mathbb{Z}_{|F_m|}$ (called an index) uniformly at random. Then, we go through the process of whittling down $r$ by successively discovering which count sequence to use for that forest, and which indices to use for each tree of that forest. (Such data collectively is called a decomposition of the index $r$.) After the decomposition is constructed, we recur on each tree size of the decomposition, noting that if the $i$th tree has $c_i$ nodes, it can be viewed as a forest (of its subtrees) of $c_i - 1$ nodes, the root itself being one node. The recursion terminates when we are faced with generating a forest of one node with index zero, at which point we return a leaf.

    1.2.1. Composing Decompositions

For any forest of $n$ nodes whose first tree can have as many as $h$ nodes (called the head size), composing a decomposition is itself a recursive process which relies on three algorithms which we call PTCOORDS, LENREM, and RUNCOORDS. The process produces two vectors, sizes and idxs.

PTCOORDS identifies which column of $P_n$ the index $r$ is in (say it is the $j$th), reduces the index, and provides a new head. LENREM identifies which tower of $L_{nj}$ the reduced index is in (say it is the $k$th) and produces a re-reduced index, a remainder index, and a remaining node count for subsequent trees. RUNCOORDS converts the re-reduced index into a sequence of indices for each of the $k$ trees. We then recur on the remaining node count, the new head minus 1, and the remainder index, and we append a sequence of $k$ entries of $j$ to the front of the result's sizes, and also the sequence of indices to the front of the result's idxs. This process is started with a call to DECOMP$(m, m, r)$.

Algorithm 1 DECOMP

Require: A nodecount $n$, a head $h$, an index $r \in \mathbb{Z}_{|F_n|}$.

1: set sizes and idxs to be empty lists
2: if $n \le 0$ or $h < 1$ then
3:     return (sizes, idxs)
4: else if $h > n$ then
5:     $h \leftarrow n$
6: end if
7: $(r', h') \leftarrow$ PTCOORDS$(n, h, r)$
8: $(k, n', r'', x) \leftarrow$ LENREM$(n, h', r')$
9: frontidxs $\leftarrow$ RUNCOORDS$(h', r'', k)$
10: set frontsizes to be a list of $k$ copies of $h'$
11: (backsizes, backidxs) $\leftarrow$ DECOMP$(n', h' - 1, x)$
12: return (append(frontsizes, backsizes), append(frontidxs, backidxs))

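Algorithm 2 PTCOORDS

Require: A nodecount $n$, a head $h$, an index $r \in \mathbb{Z}_{|F_n|}$.

1: $r' \leftarrow r$, $h' \leftarrow h$
2: while $P_{nh'} \le r'$ do
3:     $r' \leftarrow r' - P_{nh'}$
4:     $h' \leftarrow h' - 1$
5: end while
6: return $(r', h')$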
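The column search in PTCOORDS can be sketched in Python as follows. This is a minimal illustration, assuming the row $P_n$ is available as a mapping from head size $j$ to $P_{nj}$ and that the index is in range, so the loop stops before the head reaches zero. The name `ptcoords` and the sample row for $n = 3$ (with $P_{31} = 1$, $P_{32} = 1$, $P_{33} = 2$) are illustrative choices, not taken from the paper.

```python
def ptcoords(p_row, h, r):
    """Return (reduced index, new head) for index r, starting at head h.

    p_row[j] is assumed to hold P_{nj} for the fixed node count n.
    """
    while p_row[h] <= r:
        r -= p_row[h]   # skip the whole column of forests with head h
        h -= 1          # and move to the next smaller head size
    return r, h


if __name__ == "__main__":
    p_row_3 = {1: 1, 2: 1, 3: 2}   # row P_3 of the table, 4 forests in total
    for index in range(4):
        print(index, ptcoords(p_row_3, 3, index))
```

Indices 0 and 1 stay in the column of head size 3, index 2 falls into the column of head size 2, and index 3 into the column of head size 1.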

LENREM sends two of its outputs to RUNCOORDS, which uses a binary search to determine what the indices should be for each of the $k$ trees of nodesize $h'$, based on the re-reduced index $r''$. This approach is needed since $r''$ is an index into one of the $\binom{|T_{h'}| + k - 1}{k}$ nondecreasing sequences from $\operatorname{seg} k$ to $\operatorname{seg} |T_{h'}|$, but this quantity is a sum (as per Theorem 1), so we have to figure out where $r''$ is in that sum without examining $|T_{h'}| - 2$ individual binomial coefficients, as $|T_{h'}|$ can get very large. (Indeed, this is the part that is significantly different than the uniform random generation algorithm in [1].)
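The binary-search unranking described above can be sketched in Python. This is a hedged illustration rather than the authors' code: it assumes the nondecreasing sequences take values in $\{0, \ldots, top - 1\}$ and are ranked in lexicographic order, and it uses a standard lower-bound binary search instead of the exact loop structure of RUNCOORDS. The name `runcoords` and its argument order are illustrative.

```python
from math import comb


def runcoords(top, r, k):
    """Unrank index r into a nondecreasing length-k sequence over range(top)."""
    idxs = []
    prev = 0                      # offset of the current value window
    for t in range(k, 1, -1):     # t = number of entries still to choose
        total = comb(top + t - 1, t)
        pen = total - 1 - r       # rank counted from the far end
        # Smallest mid in [1, top] with C(mid + t - 1, t) > pen.
        lo, hi = 1, top
        while lo < hi:
            mid = (lo + hi) // 2
            if comb(mid + t - 1, t) > pen:
                hi = mid
            else:
                lo = mid + 1
        mid = lo
        prev += top - mid         # value of the next entry
        idxs.append(prev)
        r -= total - comb(mid + t - 1, t)   # drop sequences with a smaller entry
        top = mid                 # later entries come from the last `mid` values
    idxs.append(r + prev)         # the final entry is read off directly
    return idxs


if __name__ == "__main__":
    for index in range(comb(3 + 2 - 1, 2)):
        print(index, runcoords(3, index, 2))
```

Each entry costs one binary search over at most `top` candidates, so no run of individual binomial coefficients has to be summed.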

We remark in passing that, after the Setup phase, building a rooted forest of $m$ nodes corresponding to an index $r$ takes slightly more than $O(m)$ time but definitely within $O(m^2)$ time.


    2. Application to MRHS and PET SNAKE

Now that we have a reliable method to generate rooted forests uniformly at random, one application would be to compute relevant statistics from them to help predict PET SNAKE's run time. We recall the relevant facts about MRHS and PET SNAKE. MRHS operates on a collection of pairs of matrices called symbols, and one phase of its processing is called the Agreement Phase, where each symbol must be agreed to each other symbol. Sometimes the act of agreeing a pair of symbols induces a deletion in one (or both) symbols; other times, nothing changes. If a deletion occurs, then the process starts over: each symbol must be re-agreed to each other symbol. This continues until no (more) deletions are detected, at which point the symbols are said to be pairwise agreed. Hence, for a body of $n$ symbols, at least $\binom{n}{2}$ agreements must be performed. In software, each agreement must be performed one at a time. PET SNAKE is a hardware design employing many processors, and it uses them to perform half of the $\binom{n}{2}$ agreements simultaneously. If no deletion is detected, it then performs half of the remaining agreements simultaneously. And so on.

Since some deletions cannot occur until other deletions occur first, we choose to model the deletions as a collection of rooted trees. In each tree, each node symbolizes a deletion after two symbols are agreed, and each child of a node symbolizes a deletion that can now occur as a result of the parent node's deletion taking place. In the beginning of an agreement phase, it is certainly possible that many deletions do not depend on each other, so these deletions are the roots of the trees in this collection.

We observe that the order of subtrees of a given node is irrelevant; it does not matter which subtree is the first subtree, which is the second, and so on; hence the choice of a rooted forest is appropriate. We notice that at any stage, PET SNAKE will perform half of the agreements necessary simultaneously, so at any point, about half of the deletions that can be performed will be performed on average.

Now, if a deletion gets performed, then that deletion's children will then be available to be deleted. Examining the consequences for the model, we see that only the roots of the trees in the forest are available for deletion, so when such a deletion is performed, the corresponding root must be eliminated. This, however, means that that root's children are now roots in the forest. This operation of deleting a root and promoting its children we call a lift.

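Algorithm 3 LENREM

Require: A nodecount $n$, a new head $h'$, a reduced index $r'$.

1: $k \leftarrow \lfloor n/h' \rfloor$
2: while $L_{nh'k} \le r'$ do
3:     $r' \leftarrow r' - L_{nh'k}$
4:     $k \leftarrow k - 1$
5: end while
6: $n' \leftarrow n - (k \cdot h')$
7: $c \leftarrow L_{nh'k} / R_{h'k}$
8: $r'' \leftarrow \lfloor r'/c \rfloor$
9: $x \leftarrow r' \bmod c$
10: return $(k, n', r'', x)$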


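Algorithm 4 RUNCOORDS

Require: A new head $h'$, a re-reduced index $r''$, a length $k$.

1: set idxs to be an empty list
2: top $\leftarrow R_{h'1}$, prev $\leftarrow 0$
3: for $t \in \{k, k-1, \ldots, 2\}$ by $-1$ do
4:     $i \leftarrow$ top $+ 1$, $j \leftarrow 1$, mid $\leftarrow \lfloor (i + j)/2 \rfloor$
5:     total $\leftarrow \binom{\text{top} + t - 1}{t}$, pen $\leftarrow$ total $- 1 - r''$
6:     found $\leftarrow$ false
7:     while not found do
8:         $c_1 \leftarrow \binom{\text{mid} + t - 1}{t} >$ pen, $c_2 \leftarrow$ pen $\ge \binom{\text{mid} + t - 2}{t}$
9:         found $\leftarrow$ ($c_1$ and $c_2$)
10:         if not found then
11:             if $c_1$ then
12:                 $i \leftarrow$ mid
13:             else
14:                 $j \leftarrow$ mid
15:             end if
16:             mid $\leftarrow \lfloor (i + j)/2 \rfloor$
17:         end if
18:     end while
19:     prev $\leftarrow$ top $-$ mid $+$ prev
20:     insert prev onto the back of the list idxs
21:     top $\leftarrow$ mid, $r'' \leftarrow r'' - \left(\text{total} - \binom{\text{mid} + t - 1}{t}\right)$
22: end for
23: insert $r'' +$ prev onto the back of the list idxs
24: return idxs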

Hence, about half of the roots are lifted in a given stage. (Such an action we will refer to as a parallel lift.) The agreement phase is not complete until all the nodes in the forest are eliminated. To get a handle on time estimates, it is pertinent to ask how many roots exist at a given time, and how many times we expect to perform parallel lifts until the forest is eliminated. Since we do not have theoretical answers to these questions, we assume that the $m$ deletions in an agreement phase occur as a forest of $m$ nodes chosen uniformly at random. With this assumption, we design an experiment as follows: for various $m \le 1000$, we perform the Experimental Procedure (see Figure 1) several times (say, $s$ times). Throughout each procedure run, we count the number of roots that the forest has (once before each parallel lift) so as to calculate the average when the forest is eliminated, and we also count the number of times we have to parallel lift.

Once the number of parallel lifts and the average number of roots are calculated, we do it again for the same forest. This is repeated $s$ times. Once these $s$ procedures are complete, we choose another rooted forest of $m$ nodes uniformly at random and perform the procedure again. We construct $t$ such forests (each giving rise to $s$ procedures), and a global average of the number of parallel lifts required and the number of roots appearing at any point is calculated.

This procedure was performed for $s = t = 1000$ and $m \in \{50, 100, 150, \ldots, 1000\}$ and the results are summarized in Table 1.


Construct a rooted forest of $m$ nodes uniformly at random.
While it is nonempty,
    take note of the number of roots of the forest,
    uniformly at random choose half of the roots, and
    lift them from the forest.
Calculate the average number of roots the forest had.

Figure 1. Experimental Procedure
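One run of the Experimental Procedure can be sketched in Python as below. The sketch makes several assumptions that the text does not fix: a forest is represented as a list of trees, each tree being the list of its child subtrees (a leaf is an empty list); "half of the roots" is rounded up so that a single root is always lifted; and the uniformly random forest is supplied by the caller rather than generated here. The function name `parallel_lift_stats` is hypothetical.

```python
import random


def parallel_lift_stats(forest, rng):
    """Return (number of parallel lifts, average root count before each lift)."""
    roots = list(forest)
    root_counts = []
    while roots:
        root_counts.append(len(roots))        # note the number of roots
        take = (len(roots) + 1) // 2          # assumption: round half up
        chosen = set(rng.sample(range(len(roots)), take))
        next_roots = []
        for position, tree in enumerate(roots):
            if position in chosen:
                next_roots.extend(tree)       # lift: children become roots
            else:
                next_roots.append(tree)       # root survives this stage
        roots = next_roots
    if not root_counts:
        return 0, 0.0
    return len(root_counts), sum(root_counts) / len(root_counts)


if __name__ == "__main__":
    rng = random.Random(2011)
    sample_forest = [[[], []], [[[]]], []]    # 7 nodes in three trees
    print(parallel_lift_stats(sample_forest, rng))
```

Averaging the two returned values over many runs and many forests would give figures comparable in kind to the two columns of the results table.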

Table 1. Experimental Procedure Results (s = t = 1000)

       m    Avg parallel lifts    Avg roots
      50         25.4741            3.9869
     100         38.7107            5.3268
     150         49.2455            6.330
     200         56.9224            7.3193
     250         65.1864            7.9930
     300         71.7676            8.7246
     350         78.5635            9.3096
     400         83.6236           10.0201
     450         89.7707           10.4778
     500         94.8623           11.0328
     550        100.283            11.4762
     600        105.187            11.9226
     650        109.466            12.4351
     700        112.619            13.01
     750        119.717            13.1435
     800        123.128            13.6412
     850        125.295            14.2402
     900        129.423            14.5746
     950        133.577            14.9439
    1000        135.625            15.4812

If we multiply the average roots by the average parallel lifts and plot this result for all twenty pairs, we discover that the plot forms a near-straight line of slope approximately $\frac{40}{19}$. This isn't too surprising, since in each parallel lift we eliminate about half the roots, and the roots multiplied by the parallel lifts (if we eliminated every root per lift) should give us the total number of nodes in the forest. Further, if we multiply the number of roots by itself and plot this, we get a near-straight line of slope approximately $\frac{9}{38}$. From these two observations, we propose the following:

Proposition 1 A rooted forest of $m$ nodes chosen uniformly at random will have, on average, $\sqrt{\tfrac{9}{38} m} \approx 0.4866\sqrt{m}$ roots as its corresponding set of deletions gets deleted through an agreement phase.

Further, the number of parallel lifts required to eliminate such a forest is on average approximately $\tfrac{40}{19} m / (0.4866\sqrt{m}) \approx 4.3264\sqrt{m}$.
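The arithmetic behind Proposition 1 can be replayed in a few lines of Python. This is only a numerical illustration of the two fitted slopes quoted in the text ($\frac{40}{19}$ and $\frac{9}{38}$), compared against three rows copied from Table 1; the function names are illustrative.

```python
from math import sqrt

SLOPE_ROOTS_TIMES_LIFTS = 40 / 19   # (avg roots) * (avg lifts) is about slope * m
SLOPE_ROOTS_SQUARED = 9 / 38        # (avg roots)^2 is about slope * m


def estimated_roots(m):
    """Average number of roots, sqrt(9/38 * m)."""
    return sqrt(SLOPE_ROOTS_SQUARED * m)


def estimated_parallel_lifts(m):
    """Average number of parallel lifts, (40/19) * m / estimated_roots(m)."""
    return SLOPE_ROOTS_TIMES_LIFTS * m / estimated_roots(m)


# Three rows of Table 1: m -> (avg parallel lifts, avg roots).
TABLE_1_SAMPLE = {
    50: (25.4741, 3.9869),
    500: (94.8623, 11.0328),
    1000: (135.625, 15.4812),
}

if __name__ == "__main__":
    for m, (lifts, roots) in TABLE_1_SAMPLE.items():
        print(m,
              "lifts:", lifts, "vs", round(estimated_parallel_lifts(m), 3),
              "roots:", roots, "vs", round(estimated_roots(m), 3))
```

The estimates track the measured averages most closely for the larger values of $m$, as one would expect from slopes fitted to a near-straight line.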

These estimates can be used in conjunction with estimates of how many deletions to expect per agreement phase to help predict the runtime of PET SNAKE.

As a point of interest, if we choose not to lift half of the roots, but instead all of them, we can use a similar procedure to determine the average depth and the average number of nodes per depth for these trees.


    3. Conclusion

We have provided a way to implicitly construct bijections between $T_m$ and $\mathbb{Z}_{|T_m|}$, and between $F_m$ and $\mathbb{Z}_{|F_m|}$, with reasonable time and space consumption, for any $m \in \mathbb{N}$, and we hope that this proves useful in many environments. One such environment is in the realm of cryptography, where we aid in the construction of a time estimate for a hardware platform implementing an algebraic attack on block ciphers. Another might be in genetic programming to create initial trees corresponding to non-context-free grammars.

    References

[1] W. Böhm and A. Geyer-Schulz, Exact Uniform Initialization for Genetic Programming, Foundations of Genetic Algorithms IV (1997), 379–403.

[2] A. Cayley, On the Analytical Forms called Trees, American Journal of Mathematics 4 (1881), 266–268.

[3] W. Geiselmann, K. Matheis and R. Steinwandt, PET SNAKE: A Special Purpose Architecture to Implement an Algebraic Attack in Hardware. Cryptology ePrint Archive, Report 2009/222 (2009), available at http://eprint.iacr.org/2009/222.

[4] F. Göbel, On a 1-1 Correspondence between Rooted Trees and Natural Numbers, Journal of Combinatorial Theory B 29 (1980), 141–143.

[5] D. Matula, A Natural Rooted Tree Enumeration by Prime Factorization, SIAM Review 10 (1968), 273.

[6] H. Raddum and I. Semaev, Solving Multiple Right Hand Sides Linear Equations, Designs, Codes and Cryptography 49 (2008), 147–160.

[7] F. Ruskey, Information on Rooted Trees. The Combinatorial Object Server, University of Victoria, Canada (2003), available at http://www.theory.cs.uvic.ca/~cos/inf/tree/RootedTree.html.

[8] N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences. AT&T Research Labs (2009), available at http://www.research.att.com/~njas/sequences/.


On Jacobsthal Binary Sequences

Spyros S. Magliveras a, Tran van Trung b and Wandi Wei a

a CCIS, Department of Math. Sciences, Florida Atlantic University, Boca Raton, FL 33431, USA
e-mail: [email protected], [email protected]

b Institute for Experimental Mathematics, University of Duisburg-Essen, Essen, Germany
e-mail: [email protected]

Abstract. Let $\Sigma = \{0, 1\}$ be the binary alphabet, and $A = \{0, 01, 11\}$ be the set of three strings 0, 01, 11 over $\Sigma$. Let $A^*$ denote the Kleene closure of $A$, $\mathbb{Z}^{\ge 0}$ the set of nonnegative integers, and $\mathbb{Z}^+$ the set of positive integers. A sequence in $A^*$ is called a Jacobsthal binary sequence. Let $J(n)$ denote the set of Jacobsthal binary sequences of length $n$. For $k \in \mathbb{Z}^+$, $\{s_1, s_2, \ldots, s_k\} \subset \mathbb{Z}^{\ge 0}$, and $n - 1 \ge s_1 > s_2 > \ldots > s_k \ge 0$, let $J_1(n; s_1, s_2, \ldots, s_k)$ denote the subset $J_1(n; s_1, s_2, \ldots, s_k) = \{a_{n-1}a_{n-2} \ldots a_1 a_0 \in J(n) : a_{s_i} = 1\ (1 \le i \le k)\}$ of $J(n)$, and let $N_1(n; s_1, s_2, \ldots, s_k) = |J_1(n; s_1, s_2, \ldots, s_k)|$. When $k = 1$, a formula for $N_1(n; s)$ has been derived recently. In this paper we consider the general case of $N_1(n; s_1, s_2, \ldots, s_k)$, and study some other special types of Jacobsthal binary sequences. Some identities involving these numbers are also given.

    Keywords. Jacobsthal numbers, combinatorial identities, combinatorial enumeration

    Introduction

Let $\Sigma = \{0, 1\}$ be the binary alphabet, and $A = \{0, 01, 11\}$ the set of three strings 0, 01, 11 over $\Sigma$. Let $A^*$ denote the Kleene closure of $A$, $\mathbb{Z}^{\ge 0}$ the set of nonnegative integers, and $\mathbb{Z}^+$ the set of positive integers. A sequence in $A^*$ is called a Jacobsthal binary sequence. Let $J(n)$ denote the set of Jacobsthal binary sequences of length $n$ and let $|J(n)|$ denote the cardinality of $J(n)$.

The Jacobsthal numbers are defined by the recursion

$$J_n = J_{n-1} + 2J_{n-2}, \quad n \ge 2 \qquad (1)$$

together with the initial values

$$J_0 = J_1 = 1. \qquad (2)$$

Note that some other authors use the initial values $J_0 = 0$, $J_1 = 1$ instead. Using the initial values in (2), a known result can be stated more conveniently as

Information Security, Coding Theory and Related Combinatorics
D. Crnković and V. Tonchev (Eds.)
IOS Press, 2011
© 2011 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-663-8-27


$$|J(n)| = J_n. \qquad (3)$$

$J_n$ is also called the $n$th Jacobsthal number. For convenience, we also define

$$J_m = 0, \quad m \in \mathbb{Z},\ m < 0. \qquad (4)$$

Based on (4), we state an obvious fact and a known result as a lemma for easy reference.

Lemma 1 The recursion (1) can be extended as

$$J_t = J_{t-1} + 2J_{t-2}, \quad t \in \mathbb{Z},\ t \ne 0.$$

The value of $J_n$ ($n \in \mathbb{Z}^{\ge 0}$) can be computed by

$$J_n = \frac{1}{3}\left(2^{n+1} + (-1)^n\right), \quad n \in \mathbb{Z}^{\ge 0}. \qquad (5)$$
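The recursion (1) with initial values (2), the closed form (5), and the identity (3) can be cross-checked numerically for small $n$. The sketch below is illustrative only: the helper names are hypothetical, and membership in $A^*$ is decided by a simple dynamic program over prefixes rather than by any method from the paper.

```python
from itertools import product

TOKENS = ("0", "01", "11")   # the set A


def jacobsthal(n):
    """J_n with J_0 = J_1 = 1 and J_m = 0 for m < 0."""
    if n < 0:
        return 0
    a, b = 1, 1              # (J_0, J_1)
    for _ in range(n):
        a, b = b, b + 2 * a  # J_{t} = J_{t-1} + 2 J_{t-2}
    return a


def closed_form(n):
    """Formula (5); the numerator is always divisible by 3."""
    return (2 ** (n + 1) + (-1) ** n) // 3


def in_kleene_closure(s):
    """True if s is a concatenation of strings from A."""
    ok = [True] + [False] * len(s)
    for end in range(1, len(s) + 1):
        ok[end] = any(
            end >= len(t) and ok[end - len(t)] and s[end - len(t):end] == t
            for t in TOKENS
        )
    return ok[-1]


def count_jacobsthal_sequences(n):
    """|J(n)| by brute force over all binary strings of length n."""
    return sum(in_kleene_closure("".join(bits)) for bits in product("01", repeat=n))


if __name__ == "__main__":
    for n in range(8):
        print(n, jacobsthal(n), closed_form(n), count_jacobsthal_sequences(n))
```

For each $n$ the three printed values coincide, which is exactly what (3) and (5) assert.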

The Jacobsthal numbers have applications in such areas as tiling, graph matching, alternating sign matrices, etc. ([1,2,4,5]).

Let

$$k \in \mathbb{Z}^+,\quad \{s_1, s_2, \ldots, s_{k-1}, s_k\} \subset \mathbb{Z}^{\ge 0};\quad n - 1 \ge s_1 > s_2 > \ldots > s_k \ge 0. \qquad (6)$$

Let $J_1(n; s_1, s_2, \ldots, s_k)$ denote the following subset of $J(n)$:

$$J_1(n; s_1, s_2, \ldots, s_k) = \{a_{n-1}a_{n-2} \ldots a_1 a_0 \in J(n) : a_{s_i} = 1\ (1 \le i \le k)\},$$

i.e., the subset of Jacobsthal binary sequences that have the digit 1 at each of the $s_i$th ($1 \le i \le k$) positions from the right. Let $N_1(n; s_1, s_2, \ldots, s_k) = |J_1(n; s_1, s_2, \ldots, s_k)|$. R. Grimaldi [4] considers the case where $k = 1$, establishing a recursion for $N_1(n; s_1)$ and then deriving the following formula:

$$N_1(n; s) = \frac{1}{3}\left(2^n + (-1)^n + (-1)^{n-s} 2^s\right) \qquad (7)$$

$$= J_n - \frac{2^s}{3}\left(2^{n-s} + (-1)^{n-s-1}\right). \qquad (8)$$
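Formula (7) can be checked by brute force for small parameters. In the sketch below, $J(n)$ is generated by appending a last token from $A$ to shorter sequences; this generation scheme and the function names are illustrative assumptions, chosen only to keep the check short. Position $s$ is counted from the right, starting at 0, as in the definition of $J_1(n; s)$.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def jacobsthal_sequences(n):
    """J(n): every nonempty member ends in one of the tokens 0, 01, 11."""
    if n < 0:
        return frozenset()
    if n == 0:
        return frozenset({""})
    by_last_0 = {w + "0" for w in jacobsthal_sequences(n - 1)}
    by_last_pair = {w + t for w in jacobsthal_sequences(n - 2) for t in ("01", "11")}
    return frozenset(by_last_0 | by_last_pair)


def n1_brute(n, s):
    """N_1(n; s): members of J(n) whose digit a_s (from the right) is 1."""
    return sum(1 for w in jacobsthal_sequences(n) if w[n - 1 - s] == "1")


def n1_formula(n, s):
    """Formula (7); the numerator is divisible by 3."""
    return (2 ** n + (-1) ** n + (-1) ** (n - s) * 2 ** s) // 3


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, [(n1_brute(n, s), n1_formula(n, s)) for s in range(n)])
```

For example, $n = 4$ and $s = 1$ give 5 on both sides.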

For the general case, finding a formula for $N_1(n; s_1, s_2, \ldots, s_k)$ by using a recursion seems extremely difficult. In this article we employ a different approach to dealing with this problem, namely, considering the following dual problem of $N_1(n; s_1, s_2, \ldots, s_k)$.

Let

$$r \in \mathbb{Z}^+,\quad \{t_1, t_2, \ldots, t_{r-1}, t_r\} \subset \mathbb{Z}^{\ge 0},\quad n - 1 \ge t_1 > t_2 > \ldots > t_r \ge 0. \qquad (9)$$

Let $J_0(n; t_1, t_2, \ldots, t_r)$ denote the following subset of $J(n)$:

$$J_0(n; t_1, t_2, \ldots, t_r) = \{a_{n-1}a_{n-2} \ldots a_1 a_0 \in J(n) : a_{t_i} = 0\ (1 \le i \le r)\},$$

i.e., the subset of Jacobsthal binary sequences that have the digit 0 at each of the $t_i$th ($1 \le i \le r$) positions from the right. Let $N_0(n; t_1, t_2, \ldots, t_r) = |J_0(n; t_1, t_2, \ldots, t_r)|$.

In the next section we present characterizations of the sets $J(n)$ and $J_0(n; t_1, t_2, \ldots, t_r)$. Based on them, some combinatorial identities involving $J_n$, $N_0(n; t_1, t_2, \ldots, t_r)$ and $N_1(n; s_1, s_2, \ldots, s_k)$ are derived in Section 2. From these identities, formulas for $N_0(n; t_1, t_2, \ldots, t_r)$ and $N_1(n; s_1, s_2, \ldots, s_k)$ are obtained in the last section.

1. Characterizations of the sets $J(n)$ and $J_0(n; t_1, t_2, \ldots, t_r)$

For easy reference we state a trivial fact, that is

Lemma 2 For any $i, j \in \mathbb{Z}^+$, $J(i) \| J(j) \subseteq J(i + j)$, where $J(i) \| J(j) = \{a \| b : a \in J(i),\ b \in J(j)\}$ and $\|$ stands for the concatenation operation on strings.

We now characterize the set $J(n)$. We need

Lemma 3 Let $l \in \mathbb{Z}^+$. The string consisting of the 0-digit followed by $l - 1$ 1-digits is a Jacobsthal binary string of length $l$.

Proof. If $l = 2m + 1$ for some $m \in \mathbb{Z}^{\ge 0}$, the $l - 1 = 2m$ 1-digits in the string can be regarded as $m$ copies of the string 11. Since both strings $11, 0 \in A$, the string is in $A^*$. If $l = 2m$ for some $m \in \mathbb{Z}^+$, the last $l - 2 = 2m - 2$ 1-digits in the string can be regarded as $m - 1$ copies of the string 11. Since both strings $11, 01 \in A$, the string is in $A^*$. □

Theorem 1 For any $n \in \mathbb{Z}^+$, a binary sequence of length $n$ is in $J(n)$ if and only if it is an all-1 sequence of even length or its first 0-digit from the left is preceded by an all-1 subsequence of even length.
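The characterization in Theorem 1 amounts to a one-line test: the number of leading 1-digits must be even. The following sketch compares that test with direct membership in $A^*$ for all short binary strings. The helper names are illustrative, and the prefix dynamic program is just one convenient way to decide membership, not something taken from the paper.

```python
from itertools import product

TOKENS = ("0", "01", "11")   # the set A


def in_kleene_closure(s):
    """Direct test: can s be split into strings from A?"""
    ok = [True] + [False] * len(s)
    for end in range(1, len(s) + 1):
        ok[end] = any(
            end >= len(t) and ok[end - len(t)] and s[end - len(t):end] == t
            for t in TOKENS
        )
    return ok[-1]


def theorem1_predicate(s):
    """Theorem 1: the run of 1s before the first 0 (or the whole string) is even."""
    first_zero = s.find("0")
    leading_ones = len(s) if first_zero == -1 else first_zero
    return leading_ones % 2 == 0


def agree_up_to(max_n):
    """Compare both tests on every binary string of length 1..max_n."""
    return all(
        in_kleene_closure(s) == theorem1_predicate(s)
        for n in range(1, max_n + 1)
        for s in map("".join, product("01", repeat=n))
    )


if __name__ == "__main__":
    print(agree_up_to(12))
    print(theorem1_predicate("110"), theorem1_predicate("10"))
```

The second test runs in linear time without any table, which is what makes the later counting arguments straightforward.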

Proof. Since the string $1 \notin A$ but the string $11 \in A$, the all-1 sequence of length $n$ is in $J(n)$ if and only if $n$ is even. Therefore, in what follows we only need to consider the case in which the sequence $a_{n-1}a_{n-2} \ldots a_1 a_0$ has at least one 0-digit.

Let $a_{n-i}$ be the first 0-digit from the left. Then

$$a_{n-1} = a_{n-2} = \ldots = a_{n-(i-1)} = 1.$$

Since the two strings $1, 10 \notin A$, in order for $a_{n-1}a_{n-2} \ldots a_1 a_0$ to be in $J(n)$, the subsequence $a_{n-1}a_{n-2} \ldots a_{n-(i-1)}$ has to be formed by copies of the element $11 \in A$. This is impossible when $i - 1$ is odd.

We now prove that when $i - 1$ is even, the sequence $a_{n-1}a_{n-2} \ldots a_1 a_0$ is in $J(n)$ by induction on the number, say $u$, of 0-digits in the sequence. For the case where $u = 1$, let $a_i = 0$. By Lemma 3, the subsequence $a_i a_{i-1} \ldots a_1 a_0 \in J(i + 1)$. Recalling that $a_{n-1}a_{n-2} \ldots a_{i+1} \in J(n - i - 1)$, we know $a_{n-1}a_{n-2} \ldots a_1 a_0 \in J(n)$ by Lemma 2. This establishes the induction basis.

For the inductive step, suppose that $u > 1$ and the conclusion is true for any sequence having exactly $u - 1$ 0-digits. Let $a_l$ be the first 0-digit from the right in a sequence having $u$ 0-digits. By Lemma 3, we know $a_l a_{l-1} \ldots a_0 = 011 \ldots 1 \in J(l + 1)$. By the induction hypothesis, $a_{n-1}a_{n-2} \ldots a_{l+1} \in J(n - l - 1)$. Therefore, $a_{n-1}a_{n-2} \ldots a_1 a_0 \in J(n)$ by Lemma 2. This completes the induction. □

From this theorem, one can obtain the known formula (5) for $|J(n)|$.

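Corollary 1

$$|J(n)| = \frac{2^{n+1} + (-1)^n}{3}.$$

Proof. Let $J(n, i)$ denote the set of those Jacobsthal binary sequences that have their first 0-digit at the $(2i + 1)$st position from the left, and $\Omega_n$ the set consisting of the all-1 sequence of length $n$ when $2 \mid n$, and $\Omega_n = \emptyset$ when $2 \nmid n$. Then

$$J(n) = \Big( \bigcup_{0 \le i \le (n-1)/2} J(n, i) \Big) \cup \Omega_n$$

is a partition of $J(n)$. By Theorem 1, when $n = 2m$ ($m \in \mathbb{Z}^+$), we have:

$$|J(n)| = \sum_{i=0}^{m-1} 2^{2m-(2i+1)} + 1 = \frac{1}{2} \sum_{i=0}^{m-1} 4^{m-i} + 1 = \frac{1}{2} \sum_{i=1}^{m} 4^{i} + 1 = 2 \sum_{i=0}^{m-1} 4^{i} + 1 = 2\Big(\frac{4^m - 1}{3}\Big) + 1 = \frac{2^{n+1} + (-1)^n}{3}.$$

When $n = 2m + 1$ ($m \in \mathbb{Z}^{\ge 0}$), we have:

$$|J(n)| = \sum_{i=0}^{m} 2^{2m+1-(2i+1)} = \sum_{i=0}^{m} 2^{2(m-i)} = \sum_{i=0}^{m} 2^{2i} = \sum_{i=0}^{m} 4^{i} = \frac{4^{m+1} - 1}{3} = \frac{2^{n+1} + (-1)^n}{3}. \qquad \Box$$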

By Theorem 1 we can give a characterization of the set $J_0(n; t_1, t_2, \ldots, t_r)$. Recall that the parameters satisfy (9):

$$r \in \mathbb{Z}^+,\quad \{t_1, t_2, \ldots, t_{r-1}, t_r\} \subset \mathbb{Z}^{\ge 0},\quad n - 1 \ge t_1 > t_2 > \ldots > t_r \ge 0.$$

Theorem 2 For any $n \in \mathbb{Z}^+$, the binary sequence $a_{n-1}a_{n-2} \ldots a_1 a_0$ of length $n$ is in $J_0(n; t_1, t_2, \ldots, t_r)$ if and only if the subsequence $a_{n-1}a_{n-2} \ldots a_{t_1+1}$ is in $J(n - 1 - t_1)$ and $a_{t_i} = 0$ ($1 \le i \le r$).


Proof. Let $a_j$ be the first 0-digit from the left. Then $j \ge t_1$. By Theorem 1, $a_{n-1}a_{n-2} \ldots a_1 a_0 \in J(n)$ if and only if the number of entries before $a_j$, all of which are 1s, is even, i.e., $2 \mid n - 1 - j$, which is the necessary and sufficient condition for $a_{n-1}a_{n-2} \ldots a_{t_1+1}$ to be in $J(n - 1 - t_1)$. □

It is somewhat surprising that whether $a_{n-1}a_{n-2} \ldots a_1 a_0 \in J_0(n; t_1, t_2, \ldots, t_r)$ or not is determined only by the subsequence $a_{n-1}a_{n-2} \ldots a_{t_1+1}$ and $a_{t_i} = 0$ ($1 \le i \le r$), but is independent of the digits $a_j$ ($0 \le j \le t_1 - 1$, $j \ne t_i$).

Based on these theorems, some combinatorial identities involving $J_n$, $N_0(n; t_1, t_2, \ldots, t_r)$ and $N_1(n; s_1, s_2, \ldots, s_k)$ can be established, which will be presented in the next section.

2. Some Combinatorial Identities Involving $J_n$, $N_0(n; t_1, t_2, \ldots, t_r)$ and $N_1(n; s_1, s_2, \ldots, s_k)$

In this section some combinatorial identities involving $J_n$, $N_0(n; t_1, t_2, \ldots, t_r)$ and $N_1(n; s_1, s_2, \ldots, s_k)$ are proved. Applying them to obtain formulas for $N_0(n; t_1, t_2, \ldots, t_r)$ and $N_1(n; s_1, s_2, \ldots, s_k)$ will be the task of the next section.

We need a simple lemma:

Lemma 4 For any $n \in \mathbb{Z}^{\ge 0}$, $2^n = 3J_{n-1} + (-1)^n$.

Proof. Recalling that $J_{-1} = 0$ (cf. (4)), we know that the statement is true when $n = 0$. When $n \in \mathbb{Z}^+$, the statement is equivalent to (5). □

We can now state the following

Theorem 3

$$N_0(n; t_1, t_2, \ldots, t_r) = \left[3J_{t_1 - r} + (-1)^{t_1 - r + 1}\right] J_{n - t_1 - 1} \qquad (10)$$

$$N_0(n; t_1, t_2, \ldots, t_r) = J_{n-r} + (-1)^{n - t_1 - 1} J_{t_1 - r} \qquad (11)$$
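Identities (10) and (11) can be checked exhaustively for small $n$ against a brute-force count. The sketch below is an illustration under simple assumptions: $J(n)$ is generated by appending a last token from $A$, positions are counted from the right starting at 0, and $J_m = 0$ for $m < 0$ as in (4). All function names are hypothetical.

```python
from functools import lru_cache
from itertools import combinations


def jac(n):
    """J_n with J_0 = J_1 = 1 and J_m = 0 for negative m."""
    if n < 0:
        return 0
    a, b = 1, 1
    for _ in range(n):
        a, b = b, b + 2 * a
    return a


@lru_cache(maxsize=None)
def jacobsthal_sequences(n):
    """J(n), built by appending a last token 0, 01 or 11."""
    if n < 0:
        return frozenset()
    if n == 0:
        return frozenset({""})
    return frozenset(
        {w + "0" for w in jacobsthal_sequences(n - 1)}
        | {w + t for w in jacobsthal_sequences(n - 2) for t in ("01", "11")}
    )


def n0_brute(n, ts):
    """N_0(n; t_1, ..., t_r): members of J(n) with a 0 at every position in ts."""
    return sum(1 for w in jacobsthal_sequences(n) if all(w[n - 1 - t] == "0" for t in ts))


def n0_by_10(n, ts):
    t1, r = max(ts), len(ts)
    return (3 * jac(t1 - r) + (-1) ** (t1 - r + 1)) * jac(n - t1 - 1)


def n0_by_11(n, ts):
    t1, r = max(ts), len(ts)
    return jac(n - r) + (-1) ** (n - t1 - 1) * jac(t1 - r)


def identities_hold(max_n):
    return all(
        n0_brute(n, ts) == n0_by_10(n, ts) == n0_by_11(n, ts)
        for n in range(1, max_n + 1)
        for r in range(1, n + 1)
        for ts in combinations(range(n), r)
    )


if __name__ == "__main__":
    print(identities_hold(9))
```

Only $t_1$ and $r$ enter the two formulas, which matches the remark that the remaining prescribed positions do not affect the count beyond their number.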

Proof. By Theorem 2, for a sequence $a_{n-1}a_{n-2} \ldots a_1 a_0$ in $J_0(n; t_1, t_2, \ldots, t_r)$, there are $|J(n - t_1 - 1)| = J_{n - t_1 - 1}$ many choices for the subsequence $a_{n-1}a_{n-2} \ldots a_{t_1+1}$. For each of these choices, there are two choices for each of the digits $a_j$ ($0 \le j \le t_1 - 1$, $j \ne t_2, t_3, \ldots, t_r$). Noting that $a_{t_j} = 0$ ($1 \le j \le r$), we have

$$N_0(n; t_1, t_2, \ldots, t_r) = |J(n - t_1 - 1)| \cdot 2^{t_1 + 1 - r} = J_{n - t_1 - 1}\, 2^{t_1 - r + 1}.$$

By Lemma 4,

$$2^{t_1 - r + 1} = 3J_{t_1 - r} + (-1)^{t_1 - r + 1}.$$


    Ther