encoding constants in watermarking graph

Upload: jyothimonc

Post on 14-Apr-2018


  • 7/27/2019 Encoding constants in Watermarking graph


    Encoding constants in Watermarking Structure

    A Graph-based Software Watermarking Technique

    Riya Rajan

    Computer Science and Engineering Department, College of Engineering SNGIST

    Ernakulam, Kerala

    [email protected]

    Jyothimon C

    Computer Science and Engineering Department, College of Engineering SNGIST

    Ernakulam, Kerala

    [email protected]

    Abstract: Software watermarking embeds an identification mark, i.e. a watermark value, within software to discourage software theft. Several graph-theoretic watermarking techniques encode the watermark value as a graph structure, which is then embedded in the application program. In this paper we propose an efficient algorithm that encodes both the constants in the program and the watermark value in a graph called a Reducible Permutation Graph. Since both the watermark value and the program constants are encoded in a single graph structure, any modification of this graph leads to execution failure. This property makes the watermarking system resilient to attacks. Moreover, our encoding and decoding algorithms have low time complexity and can be easily implemented.

    Keywords- Software watermarking, Watermark value,

    encoding, Reducible Permutation Graph, Self Inverting

    Permutation.

    INTRODUCTION

    With the development of Internet technology and the widespread use of peer-to-peer resource sharing, software products are spreading faster than ever. A survey conducted by the Business Software Alliance [1] shows that more than 41% of the software in use worldwide is pirated.

    Protecting the intellectual property rights of software products has become a serious challenge for the software industry. Software watermarking is a technique to prevent or discourage software piracy and copyright infringement.

    This paper is structured as follows. Section 2 gives the formal definition of graph-based software watermarking. Section 3 presents preliminaries. Section 4 describes the attack models that apply to graph-based watermarking algorithms. Section 5 presents our proposed algorithm, which effectively overcomes these attacks. In Section 6 we conclude by discussing the main advantages of the proposed algorithm.

    SOFTWARE WATERMARKING

    Software Watermarking

    Software watermarking can be described as the problem of embedding a structure w into a program P such that w can be reliably located and extracted from P even after P has been subjected to code transformations. Precisely, given a program P, a watermark w, and a key k, the software watermarking problem can be described by the following two functions, where $P_w$ denotes the watermarked program:

    $\mathrm{embed}(P, w, k) \to P_w$ and

    $\mathrm{extract}(P_w, k) \to w$.

    A. Classification of Watermarking Algorithms

    Watermarking algorithms can be broadly divided into static algorithms and dynamic algorithms [2,3,4]. A static watermark is stored inside the program code in a certain format and does not change during program execution. A dynamic watermark is built during program execution, perhaps only after a particular sequence of input.

    According to the representation of the watermark information, there are two types of static watermarks: data watermarks and code watermarks. A data watermark stores the watermark information as program data and can be placed anywhere inside a program, such as in comments or in variables. A code watermark is represented by choosing a particular sequence of instructions among several that have an equivalent effect. There are three dynamic watermarking techniques [2]: Easter Egg Watermarking, Execution Trace Watermarking and Dynamic Data Structure Watermarking.

    B. Characteristics of a Good Watermarking Algorithm

    Data rate, stealth, and resilience are the characteristics of software watermarking [2]. The data rate expresses the quantity of hidden data that can be embedded within the cover message. The stealth expresses how imperceptible the embedded data is to an observer, and the resilience expresses the hidden message's degree of immunity to attack by an adversary [10,11,12].


    C. Attacks against Watermarks

    A successful attack against the watermarked program Pw prevents the recognizer from extracting the watermark while not seriously harming the performance or correctness of Pw. Attacks against software watermarks can be classified into subtractive attacks, distortive attacks and additive attacks. In a subtractive attack the attacker, perhaps with some tool support, tries to locate and remove the watermark. In a distortive attack, an attacker who cannot locate the watermark and is willing to accept some degradation in the quality of the watermarked program applies distortive transformations uniformly over the whole program. In an additive attack the attacker inserts his own watermark in an attempt to override the owner's watermark, or at least to make it plausible that the owner's watermark was not inserted before the attacker's [5,6,7].

    The most relevant existing software watermarking techniques can be classified as graph-based, register-based, thread-based, obfuscation-based, branch-based, program-slicing-based and abstract-interpretation-based software watermarking [6,7,12,13]. In this paper we propose a graph-based software watermarking algorithm in which the constants used in the program and the watermark information are encoded into a graph called a Reducible Permutation Graph. This graph is embedded into the program. Hence any modification of the graph will break the proper execution of the program.

    GRAPH BASED SOFTWARE WATERMARKING

    Several software watermarking algorithms have been proposed that encode watermarks as graph structures [9, 10, 11]. In general, such an encoding makes use of a function encode, which converts a watermark number w into a graph G, $\mathrm{encode}(w) \to G$, and a function decode, which converts the graph G back into the number w, $\mathrm{decode}(G) \to w$.

    We usually call the pair (encode, decode) a graph codec. From a graph-theoretic point of view, we are looking for a class of graphs G and a corresponding codec $(\mathrm{encode}, \mathrm{decode})_G$.

    Collberg and Thomborson [2,4] proposed the first dynamic graph watermarking scheme, CT, to overcome problems with static watermarking schemes. Static watermarks are highly fragile and therefore susceptible to semantics-preserving transformation attacks. Dynamic graph watermarking schemes are similar to static graph watermarking except that the graph is built at run-time.
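    To make the notion of a graph codec concrete, the following Python sketch defines a deliberately simple toy codec. It is neither the CT scheme nor the reducible permutation graph encoding: the chain-with-self-loops representation and the function names encode and decode are assumptions chosen only to show the round-trip property decode(encode(w)) = w.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # adjacency list: node -> successor nodes


def encode(w: int) -> Graph:
    """Toy encoding (an assumption, not a published scheme).

    One node per bit of w, chained most significant bit first.
    A node carries a self-loop exactly when its bit is 1.
    """
    if w < 1:
        raise ValueError("watermark number must be positive")
    bits = bin(w)[2:]
    graph: Graph = {}
    for i, bit in enumerate(bits):
        successors = [i + 1] if i + 1 < len(bits) else []
        if bit == "1":
            successors.append(i)  # self-loop marks a 1 bit
        graph[i] = successors
    return graph


def decode(graph: Graph) -> int:
    """Walk the chain from node 0 and read one bit per node."""
    bits = []
    node = 0
    while True:
        bits.append("1" if node in graph[node] else "0")
        onward = [v for v in graph[node] if v != node]
        if not onward:
            break
        node = onward[0]
    return int("".join(bits), 2)


print(decode(encode(12)))  # 12
```

    A toy codec like this has none of the resilience properties discussed in this paper; it only illustrates the interface that any class of watermark graphs must provide.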

    PRELIMINARIES

    This section describes an efficient and easily implemented algorithm for encoding numbers as reducible permutation graphs through the use of self-inverting permutations. It also gives the basic definitions required to understand how a self-inverting permutation is produced from a watermark value. We describe the algorithm proposed by Maria Chroni and Nikolopoulos [14,15] for converting a watermark value W to a self-inverting permutation, together with the reverse process. They also presented algorithms for embedding a reducible permutation graph into program code and explained how the graph can be extracted from the program code.
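    The conversion between a watermark value and a self-inverting permutation can be sketched in Python as follows. The steps below (padding the binary representation, flipping the bits, building a bitonic sequence and pairing its ends) reflect our reading of the construction in [14]; they are an assumption that should be checked against that paper, and the function names are hypothetical.

```python
from typing import List


def encode_w_to_sip(w: int) -> List[int]:
    """Sketch: watermark integer -> self-inverting permutation (1-based values)."""
    if w < 1:
        raise ValueError("watermark value must be positive")
    bits = bin(w)[2:]                      # B, with p = len(bits)
    p = len(bits)
    padded = "0" * p + bits + "0"          # B' has 2p + 1 bits
    flipped = "".join("1" if c == "0" else "0" for c in padded)
    zeros = [i + 1 for i, c in enumerate(flipped) if c == "0"]
    ones = [i + 1 for i, c in enumerate(flipped) if c == "1"]
    bitonic = zeros + ones[::-1]           # increasing, then decreasing
    size = len(bitonic)
    sip = [0] * size
    for i in range(size // 2):             # pair the two ends into 2-cycles
        a, b = bitonic[i], bitonic[size - 1 - i]
        sip[a - 1], sip[b - 1] = b, a
    middle = bitonic[size // 2]            # the middle element is a 1-cycle
    sip[middle - 1] = middle
    return sip


def decode_sip_to_w(sip: List[int]) -> int:
    """Sketch of the reverse process for permutations built by encode_w_to_sip."""
    size = len(sip)
    p = (size - 1) // 2
    fixed = next(i + 1 for i, v in enumerate(sip) if v == i + 1)
    bitonic = sip[:p] + [fixed] + list(range(p, 0, -1))
    zeros = bitonic[:bitonic.index(size)]  # increasing prefix before the maximum
    padded = ["0"] * size
    for position in zeros:
        padded[position - 1] = "1"
    return int("".join(padded[p:2 * p]), 2)


print(encode_w_to_sip(12))                   # [5, 6, 9, 8, 1, 2, 7, 4, 3]
print(decode_sip_to_w(encode_w_to_sip(12)))  # 12
```

    For W = 12 the binary representation has p = 4 bits and the resulting permutation has 2p + 1 = 9 elements. Both functions do a constant number of passes over a sequence of that length.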

    We consider finite graphs with no multiple edges. For a graph G, we denote by V(G) and E(G) the vertex set and edge set of G, respectively. Next we introduce some definitions that are key objects in the algorithms for encoding numbers as graphs. Let $\pi$ be a permutation over the set $N_n = \{1, 2, \ldots, n\}$. We think of the permutation $\pi$ as a sequence $(\pi_1, \pi_2, \ldots, \pi_n)$.

    Definition 1:

    The inverse of a permutation $(\pi_1, \pi_2, \ldots, \pi_n)$ is the permutation $(q_1, q_2, \ldots, q_n)$ with $q_{\pi_i} = \pi_{q_i} = i$. A self-inverting permutation is a permutation that is its own inverse: $\pi_{\pi_i} = i$. By definition, every permutation has a unique inverse, and the inverse of the inverse is the original permutation. Clearly, a permutation is a self-inverting permutation if and only if all its cycles are of length 1 or 2; hereafter, we shall denote a 2-cycle as c = (x, y) and a 1-cycle as c = (x), or, equivalently, c = (x, x).
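    Definition 1 can be checked mechanically. The short Python sketch below computes the inverse of a permutation, tests whether it is self-inverting, and lists its cycles. The function names and the sample permutation are illustrative assumptions, not taken from the source.

```python
from typing import List, Tuple


def inverse(pi: List[int]) -> List[int]:
    """Return q with q[pi_i] = i, using 1-based permutation values."""
    q = [0] * len(pi)
    for i, value in enumerate(pi, start=1):
        q[value - 1] = i
    return q


def is_self_inverting(pi: List[int]) -> bool:
    """A permutation is self-inverting when it equals its own inverse."""
    return inverse(pi) == list(pi)


def cycles(pi: List[int]) -> List[Tuple[int, ...]]:
    """Decompose pi into cycles, each starting at its smallest element."""
    seen = set()
    result = []
    for x in range(1, len(pi) + 1):
        if x in seen:
            continue
        cycle = [x]
        seen.add(x)
        y = pi[x - 1]
        while y != x:
            cycle.append(y)
            seen.add(y)
            y = pi[y - 1]
        result.append(tuple(cycle))
    return result


sample = [5, 6, 9, 8, 1, 2, 7, 4, 3]  # illustrative permutation
print(is_self_inverting(sample))       # True
print(cycles(sample))                  # [(1, 5), (2, 6), (3, 9), (4, 8), (7,)]
print(is_self_inverting([2, 3, 1]))    # False: it is a single 3-cycle
```

    The output illustrates the equivalence stated in the definition: the self-inverting sample has only 1-cycles and 2-cycles, while the 3-cycle fails the test.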

    Definition 2:

    Let $C_{1,2} = \{c_1 = (x_1, y_1), c_2 = (x_2, y_2), \ldots, c_k = (x_k, y_k)\}$ be the set of all the cycles of a self-inverting permutation such that $x_i < y_i$ (1


    ENCODING CONSTANTS IN WATERMARKING STRUCTURE

    A program consists of n functions or objects. These functions or objects are the building blocks of the program, each with properties specific to the program requirements. Constants appear in most program code, and in most cases their values determine whether the program executes properly. We use this property of constants in our new watermarking technique.

    Suppose P is a program consisting of n functions or objects and C1, C2, ..., Cn are the constants collected from it. We select one constant from each function. If a function contains more than one constant, we select one of them. If it contains no constants, we add an arbitrary constant. W is the value of the watermark to be encoded and B is the binary representation of W. Suppose p is the number of bits needed to represent W in binary. The self-inverting permutation (SIP) then has length 2p+1, so we select 2p+1 constants from the program P.
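    The selection rule described above can be sketched as follows. The representation of a program as a mapping from function names to the constants they contain, the choice of the first constant in each function, and the default value used for functions without constants are all assumptions made for illustration.

```python
from typing import Dict, List


def select_constants(functions: Dict[str, List[int]],
                     watermark: int,
                     default: int = 1) -> List[int]:
    """Pick one constant per function until 2p + 1 constants are collected.

    p is the number of bits of the watermark value. A function with no
    constants contributes the hypothetical `default` value instead.
    """
    p = watermark.bit_length()
    needed = 2 * p + 1
    chosen = [consts[0] if consts else default
              for consts in functions.values()]
    if len(chosen) < needed:
        raise ValueError(
            f"need {needed} constants but the program offers {len(chosen)}")
    return chosen[:needed]


# Hypothetical program with nine functions; f3 contains no constants.
program = {f"f{i}": [10 * i, 10 * i + 1] for i in range(1, 10)}
program["f3"] = []

print(select_constants(program, watermark=12))
# [10, 20, 1, 40, 50, 60, 70, 80, 90]
```

    With W = 12 the watermark needs p = 4 bits, so nine constants are required, one for each element of the permutation.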

    Suppose C = C1, C2, ..., Cm are the constants collected from the program P, W is the watermark value and k is the key used to extract the watermark value. We use the algorithms Encode W to SIP and Encode RPG from SIP proposed by Maria Chroni et al. to encode the watermark value W as a Self Inverting Permutation and the Self Inverting Permutation as a Reducible Permutation Graph. RPG' is the new form obtained after encoding the constants into RPG. We embed the new graph RPG' into the program. The algorithm for encoding both the constants C and the watermark W is explained in this section.

    Algorithm computing Constants from Program

    Suppose C consists of a sequence of constants C1, C2, ..., Cm collected from the program P having n functions or objects (where m


    the value of the constant corresponding to RPG. The memory location that holds the starting location of the reducible permutation graph is denoted by 'start'. k is the sequence from which the decoding starts.

    This algorithm is essentially a graph traversal, so for a graph with n nodes its running time is O(n). It needs no extra space beyond that required to store the graph of n nodes, so the space complexity is also O(n).
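    The linear bound can be illustrated with a generic traversal that assigns a new value to every node reachable from 'start'. This is only a sketch under assumptions: the adjacency-list representation, the breadth-first visiting order, the function name and the sample values are hypothetical, since the surviving text does not specify how the node values are derived or ordered.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List


def replace_node_values(adjacency: Dict[Hashable, List[Hashable]],
                        start: Hashable,
                        new_values: Iterable[int]) -> Dict[Hashable, int]:
    """Visit each reachable node once and give it the next value.

    Every node is dequeued once and every edge is examined once, so the
    running time is linear in the size of the graph.
    """
    values: Dict[Hashable, int] = {}
    supply = iter(new_values)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        try:
            values[node] = next(supply)
        except StopIteration:
            raise ValueError("fewer values than reachable nodes") from None
        for successor in adjacency[node]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return values


# Hypothetical four-node graph with forward and backward edges.
graph = {"s": ["a"], "a": ["b", "s"], "b": ["t", "a"], "t": []}
print(replace_node_values(graph, "s", [7, 11, 13, 17]))
# {'s': 7, 'a': 11, 'b': 13, 't': 17}
```

    Only the values attached to the nodes change; the adjacency structure passed in is left untouched.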

    Algorithm to Extract W from RPG

    We know that RPG' is the graph that encodes the constants C; hence recovering RPG from it is the first step in extracting the watermark. We use the key value k to extract the watermark, which the owner inserted as proof of ownership. The inputs of the algorithm are RPG' and W. RPG and RPG' do not differ structurally; they differ only in their node values.

    If n is the number of nodes in the reducible permutation graph RPG', then replacing the n node values takes O(n) time. Both algorithms, Decode SIP from RPG and Decode W from SIP, take O(n) time and space. Hence the time complexity of the above algorithm is O(n).

    IMPLEMENTATION AND RESULTS

    All software watermarking algorithms need an embedder to embed the watermark code into the program structure and a recognizer to recognize the watermark, which is encoded either in the program code or in a separate section outside the program code that helps the user locate the watermark. The number of constants in the program can be determined either by manual inspection or with a separate program that locates the constants.

    The Reducible Permutation Graph has error-correcting properties, which means that a small modification of the graph does not prevent the watermark value from being extracted. Hence their algorithm is more efficient than other graph-based algorithms. Removal of a node or of some edges, however, would make extraction of the watermark difficult. In the newly proposed algorithm we encode both the watermark value and the constants in the program into a reducible permutation graph. Even though we encode the constants and the watermark value into a single graph, the properties of the reducible permutation graph do not change, because we do not modify the graph structure; only the node values change.

    Next we check the correctness of the algorithm. We have seen that the algorithm computing constants from the program simply collects the constants from the program. The Reducible Permutation Graph is then generated from a watermark value; the correctness of that algorithm is already proven. We compute the factors of the least common multiple of the constant sequence, which is a simple arithmetic computation for which standard algorithms exist. These factors are supplied to the Reducible Permutation Graph. A simple replacement procedure, which is also a graph traversal, replaces the node values. Hence the algorithm always produces an RPG', which is an encoded form of the constants and the watermark value.

    CONCLUSION

    We succeeded in developing a method for encoding a sequence of numbers in an already encoded watermarking graph, thereby achieving a double encoding of numbers. This is an entirely new concept in software watermarking. The efficiency of our algorithm depends on the graph selected to encode the multiple sequences of values. The algorithm extracts the constants without harming the graph structure, which is very simple compared to other methods. However, a technique similar to the one used to extract the watermark value could also be used to extract the constants, which would make the algorithm stealthier. We leave this as a problem for future investigation. We also leave open the problem of finding a graph in which a double encoding of sequences of values can be done in the graph structure.

    REFERENCES

    [1] Business Software Alliance. Sixth annual BSA and IDC global software piracy study. Technical Report, Business Software Alliance, 2008.

    [2] Christian Collberg and Clark Thomborson. Software

    watermarking: Models and Dynamic embeddings. In

    principles of Programming Languages 1999, POPL'99,

    January 1999.

    [3] William Feng Zhu. Concepts and Technologies in Software

    Watermarking and Obfuscation. PhD Thesis, The University

    of Auckland, 2007.

    [4] Christian Collberg, Stephen Kobourov, Edward Carter, and Clark Thomborson. Error-Correcting graphs for software watermarking. In proceedings of the 29th Workshop on Graph Theoretic Concepts in Computer Science, pages 156-167, 2003.

    [5] X. Chen, D. Fang, J. Shen, F. Chen, W. Wang, L. He, A

    Dynamic Graph Watermark Scheme of Tamper Resistance,

    Fifth International Conference on Information Assurance and

    Security, IEEE Computer Society, ISBN 978-0-7695-3744-3,

    Pages 3-6, 2009.

    [6] William Zhu and Clark Thomborson. Extraction in

    software watermarking. In Sviatoslav Voloshynovskiy, Jana

    Dittmann, and Jessica J, pages 175-181. ACM, 2006. ISBN 1-

    59593.

    [7] S. Jamal, H. Zaidi and Hongxia Wang. On the Analysis of

    Software Watermarking, 2nd International Conference on

    Software Technologies and Engineering, IEEE, ISBN 978-1-

    4244-8666-3, pages VI 26-VI 30, 2010.

    [8] L. Zhang, Y. Yang, X. Niu, and S. Niu, A Survey of

    software Watermarking, Journal of Software, volume 14,

    pages 268-277, 2003.

    [9] James Hamilton, Sebastian Danicic. An Evaluation of the

    Resilience of Static Java Bytecode Watermarks Against

    Distortive Attacks, IAENG International Journal of Computer

    Science, 2011.

    [10] C. Collberg, A. Huntwork, E. Carter, and G. Townsend.

    Graph Theoretic software watermarks: Implementation,

    analysis, and attacks. In workshop on Information Hiding,

    2004.

    [11] Ramarathnam Venkatesan and Vijay Vazirani. Technique

    for producing through watermarking highly tamper-resistant

    executable code and resulting watermarked code so formed,

    May 2006. Microsoft Corporation, US Patent: 70521208.

    [12] Ramarathnam Venkatesan, Vijay Vazirani and Saurabh Sinha. A graph theoretic approach to software watermarking. In proceedings of the 4th International Workshop on Information Hiding, 2001.

    [13] Robert Davidson and Nathan Myhrvold. Method and

    system for generating and auditing a signature for a computer

    program, June 1996. Microsoft Corporation, US Patent

    5559884.

    [14] Maria Chroni and Stavros D. Nikolopoulos. Encoding

    watermark integers as self-inverting permutations. In

    proceedings of the 11th International Conference on Computer

    Systems and Technologies and Workshop for PhD Students,

    pages 125-130, Sofia, Bulgaria, 2010. ACM. ISBN 978-1-4503-0243-2.

    [15] Maria Chroni and Stavros D. Nikolopoulos. Efficient

    Encoding of Watermark Numbers as Reducible Permutation

    Graphs. In proceedings of the 10th International Conference

    on Computer Systems and Technologies and Workshop for

    PhD Students, 25-130, Sofia, Bulgaria, 2009.

    [16] Maria Chroni and Stavros D. Nikolopoulos. Efficient

    encoding of Watermark numbers as Cographs using Self

    inverting permutations. In proceedings of the 12th

    International Conference on Computer Systems and

    Technologies and Workshop for PhD Students, pages 142-

    148, Sofia, Bulgaria, 2011. ACM ICPS 578, 2011.

    [17] Maria Chroni and S.D. Nikolopoulos. An Embedding

    graph based model for software watermarking, 8th

    International Conference on Information Hiding and

    Multimedia Signal Processing, IEEE proceedings, 2012.