
ICECR 2004

Software Watermarking via Opaque Predicates: Implementation, Analysis, and Attacks

Ginger Myles

Christian Collberg

{mylesg,collberg}@cs.arizona.edu

University of Arizona

Department of Computer Science


What is Software Watermarking?

Technique used to aid in the prevention of software piracy.

embed(P, w, key) → P′

recognize(P′, key) → w

Watermark: w uniquely identifies the owner of P.

Fingerprint: w uniquely identifies the purchaser of P.

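[Figure: the original program P, the watermark W, and the key K are input to Embed Watermark, which outputs the watermarked program P′; Extract Watermark takes P′ and K and outputs W.]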



Static: the watermark is stored directly in the data or code sections of a native executable or class file. Static algorithms make use of the features of an application that are available at compile-time.

Dynamic: the watermark is stored in the run-time structures of the program.



Blind: the recognizer is given the watermarked program and the watermark key as input.

Informed: the recognizer is given the watermarked program and the watermark key as input, and it also has access to the unwatermarked program.


Why use Software Watermarking?

Discourages illegal copying and redistribution.

A copyright notice can be used to provide proof of ownership.

A fingerprint can be used to trace the source of the illegal redistribution.

Does not prevent illegal copying and redistribution.


How can we watermark software?

Insert new (non-functional or non-executed) code

Reorder code where it does not change the functionality

Manipulate instruction frequencies

    switch (E) {
        case 1: { ... }
        case 5: { ... }
        case 9: { ... }
    }

⇒

    switch (E) {
        case 5: { ... }
        case 1: { ... }
        case 9: { ... }
    }
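The reordering above only leaves behaviour unchanged when no case falls through into the next one. A minimal sketch of that condition, assuming every case returns (the class name, method names, and return strings are illustrative, not from the talk):

```java
class SwitchReorderDemo {
    // Original case order: 1, 5, 9. Every case returns, so there is no fall-through.
    static String original(int e) {
        switch (e) {
            case 1: return "one";
            case 5: return "five";
            case 9: return "nine";
            default: return "other";
        }
    }

    // Reordered case order: 5, 1, 9. Same labels and bodies, different layout.
    static String reordered(int e) {
        switch (e) {
            case 5: return "five";
            case 1: return "one";
            case 9: return "nine";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        for (int e = 0; e <= 10; e++) {
            if (!original(e).equals(reordered(e))) {
                throw new AssertionError("behaviour differs for input " + e);
            }
        }
        System.out.println("both orderings agree on all tested inputs");
    }
}
```

The chosen order of the cases is what would carry information; the program's observable behaviour stays the same.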


Opaque Predicates

Used to make it more difficult for an adversary to analyze the control-flow of the application.

Can make it more difficult to identify that certain portions of the application are superfluous.


Opaque Predicates

Opaque Predicate: A predicate P is opaque at a program point p if, at point p, the outcome of P is known at embedding time.

Opaque Method: A boolean method M is opaque at an invocation point p if, at point p, the return value of M is known at embedding time.


Opaque Predicates

A variety of techniques have been suggested:

number theoretic results

pointer aliases

concurrency

Example: x(x + 1) ≡ 0 (mod 2)
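The number-theoretic example can be checked directly. The sketch below is illustrative only (the class and method names are assumptions): of two consecutive integers one is even, so their product is even. With Java's `int` type this still holds when the multiplication overflows, because wrap-around is arithmetic modulo 2^32, which preserves the lowest bit.

```java
class ParityPredicate {
    // Opaquely true: x and x + 1 are consecutive, so their product is even.
    // Overflow does not matter here because wrapping keeps the lowest bit intact.
    static boolean alwaysTrue(int x) {
        return (x * (x + 1)) % 2 == 0;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -7, -1, 0, 1, 2, 46341, Integer.MAX_VALUE};
        for (int x : samples) {
            if (!alwaysTrue(x)) {
                throw new AssertionError("predicate failed for " + x);
            }
        }
        System.out.println("x(x + 1) is even for every sample");
    }
}
```

An embedder knows the outcome in advance, while an attacker reading the bytecode has to prove the property before the branch can be simplified away.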


Arboit Algorithm

Originally proposed by Genevieve Arboit at ICECR-5

Embeds the watermark at select branching points in the application through the use of opaque predicates.


Arboit Algorithm

Two techniques were originally proposed:

1. An opaque predicate is appended to the predicate at the selected branch.

    class C {
        void m1(int a, int b) {
            ...
            if (a <= b) { ... }
            else { ... }
            ...
        }
    }

W ⇒

    class C {
        void m1(int a, int b) {
            ...
            if ((a <= b) && (c*c >= 0)) { ... }
            else { ... }
            ...
        }
    }


Arboit Algorithm

Two techniques were originally proposed:

2. A call to an opaque method is appended to the predicate at the selected branch.

    class C {
        void m1(int a, int b) {
            ...
            if (a <= b) { ... }
            else { ... }
            ...
        }
    }

W ⇒

    class C {
        boolean m2() {
            int c = 1;
            return (c*c >= 0);
        }

        void m1(int a, int b) {
            ...
            if ((a <= b) && m2()) { ... }
            else { ... }
            ...
        }
    }


Arboit Algorithm

Watermark Encoding:

Watermark string is encoded as an integer.

Split into k pieces {w_1, ..., w_k} where 0 ≤ w_i ≤ n.

Order of the pieces is unimportant.
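The slides do not say how the integer is split so that the order of the pieces does not matter. One possible scheme, shown purely as an assumption, tags each piece with its own position: piece i is stored as i · BASE + digit_i, where the digits are the base-BASE digits of the watermark integer. With k pieces this gives n = k · BASE − 1. `WatermarkSplit`, `BASE`, and all method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WatermarkSplit {
    static final int BASE = 256; // assumed digit size, not taken from the talk

    // Encode the watermark string as a non-negative integer.
    static BigInteger encode(String watermark) {
        return new BigInteger(1, watermark.getBytes(StandardCharsets.UTF_8));
    }

    // Split into pieces; each piece carries its index so the order can be forgotten.
    static List<Integer> split(BigInteger w) {
        List<Integer> pieces = new ArrayList<>();
        BigInteger base = BigInteger.valueOf(BASE);
        int index = 0;
        do {
            BigInteger[] qr = w.divideAndRemainder(base);
            pieces.add(index * BASE + qr[1].intValue());
            w = qr[0];
            index++;
        } while (w.signum() > 0);
        return pieces;
    }

    // Rebuild the integer from the pieces, in whatever order they were found.
    static BigInteger combine(List<Integer> pieces) {
        BigInteger base = BigInteger.valueOf(BASE);
        BigInteger w = BigInteger.ZERO;
        for (int piece : pieces) {
            int index = piece / BASE;
            int digit = piece % BASE;
            w = w.add(BigInteger.valueOf(digit).multiply(base.pow(index)));
        }
        return w;
    }

    public static void main(String[] args) {
        BigInteger w = encode("WM");
        List<Integer> pieces = split(w);
        Collections.reverse(pieces); // simulate recovering the pieces in another order
        if (!combine(pieces).equals(w)) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("recovered the watermark integer from reordered pieces");
    }
}
```

Other order-independent splits are possible; the point of the sketch is only that each piece must carry enough information to be placed without relying on where it was found.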


Arboit Algorithm

Watermark Encoding:

Each w_i is encoded in an opaque predicate in one of two ways:

1. Use constants in the predicate. For example, 4 | x²(x + 1)(x + 2) encodes the value 6. New constants can also be inserted into the predicate: the value 42 can be encoded by multiplying both sides by 18, giving (18)(4) | (18)x²(x + 1)(x + 2).

2. Assign a rank to each opaque predicate in the library.
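Inserting new constants by multiplying both sides of a divisibility predicate works because d | v implies cd | cv for any positive constant c. The sketch below checks this on the simpler predicate 2 | x(x + 1) from the earlier example slide, scaled by 18. It does not model how the constants map to the encoded value, which the slides do not spell out. `ScaledPredicate` and its methods are hypothetical names, and `BigInteger` is used so that overflow cannot affect the divisibility test.

```java
import java.math.BigInteger;

class ScaledPredicate {
    // True when d divides v. d must be positive: BigInteger.mod rejects other moduli.
    static boolean divides(BigInteger d, BigInteger v) {
        return v.mod(d).signum() == 0;
    }

    // x(x + 1), computed without overflow.
    static BigInteger product(long x) {
        BigInteger bx = BigInteger.valueOf(x);
        return bx.multiply(bx.add(BigInteger.ONE));
    }

    // Base predicate: 2 | x(x + 1).
    static boolean basePredicate(long x) {
        return divides(BigInteger.valueOf(2), product(x));
    }

    // Scaled predicate: (c)(2) | (c) x(x + 1), for a positive constant c.
    static boolean scaledPredicate(long x, long c) {
        BigInteger bc = BigInteger.valueOf(c);
        return divides(bc.multiply(BigInteger.valueOf(2)), bc.multiply(product(x)));
    }

    public static void main(String[] args) {
        for (long x = -1000; x <= 1000; x++) {
            if (!basePredicate(x) || !scaledPredicate(x, 18)) {
                throw new AssertionError("predicate failed for x = " + x);
            }
        }
        System.out.println("both forms hold on the tested range");
    }
}
```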


Arboit Algorithm

Added features to improve strength:

Identify local variables to use in the opaque predicate through slicing.

If w_i is encoded using rank and opaque methods, it may be possible to reuse methods instead of adding k new methods.


Dynamic Arboit Algorithm

Motivation:

To examine whether converting a known static algorithm to a dynamic algorithm creates a technique that is inherently more resilient to attack.

It is believed that truly dynamic algorithms are more resilient.


Dynamic Arboit Algorithm

Uses the program’s execution state for both embedding and recognition.

The application is run with a specific input.

Example: the order of X’s and O’s placed on a Tic-Tac-Toe board.

The execution path is used to identify the branching points.
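The idea of using the execution path to pick branching points can be sketched as follows. This is a toy illustration, not the SandMark implementation: it assumes instrumented code reports each branch it takes through a hypothetical `hit` hook, and the recorded sequence for the secret input then determines which branches are candidates for embedding. All names and the toy program are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class BranchTrace {
    // Distinct branch identifiers, kept in the order they were first reached.
    private final Set<String> visited = new LinkedHashSet<>();

    // Hypothetical hook that instrumented code calls when a branch is taken.
    void hit(String branchId) {
        visited.add(branchId);
    }

    List<String> branchPoints() {
        return new ArrayList<>(visited);
    }

    // Toy program whose branches depend on the input sequence (for example, board moves).
    static List<String> run(int[] moves) {
        BranchTrace trace = new BranchTrace();
        for (int m : moves) {
            if (m % 2 == 0) {
                trace.hit("even-move");
            } else {
                trace.hit("odd-move");
            }
            if (m > 4) {
                trace.hit("high-cell");
            }
        }
        return trace.branchPoints();
    }

    public static void main(String[] args) {
        System.out.println(run(new int[] {1, 6}));
        System.out.println(run(new int[] {6, 1}));
    }
}
```

Different inputs reach the branches in a different order, so the input acts as the key: the recognizer has to replay the same input to find the same branching points.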


Implementation

Implemented in Java using the BCEL bytecode editor.

Incorporated into the SandMark framework.

Static algorithms can be applied to an entire application or a single class file, but the dynamic algorithm can only be used on an entire application.


Arboit Algorithm Evaluation

Performed a variety of empirical tests to evaluate each algorithm’s overall effectiveness.

Implementation within SandMark facilitated the study of manual attacks and the application of obfuscations.

The evaluation examined six software watermarking properties.


Watermark Evaluation Properties

Credibility: The watermark should be readily detectable for proof of authorship while minimizing the probability of coincidence.

Data-rate: Maximize the length of message that can be embedded.

Perceptual Invisibility (Stealth): A watermark should exhibit the same properties as the code around it so as to make detection difficult.

Part Protection: A good watermark should be distributed throughout the software in order to protect all parts of it.

Overhead: A watermark should have little impact on the performance of the application, and the embedding/recognition procedure should not be costly.



Resilience: A watermark should withstand a variety of attacks.




Subtractive Attack: The adversary attempts to remove all or part of the watermark.

[Figure: subtractive attack. Labels: Alice, Bob, W, K, P, P′, P′′.]




Additive Attack: The adversary adds a new watermark.

[Figure: additive attack. Labels: Alice, Bob, W, W1, K, K1, P, P′, P′′.]




Distortive Attack: The attacker applies a series of semantics-preserving transformations to render the watermark useless.

[Figure: distortive attack. Labels: Alice, Bob, W, W′, K, P, P′, P′′.]




Collusive Attack: The adversary compares two differently fingerprinted copies of the software to identify the location of the fingerprint.

[Figure: collusive attack. Labels: Alice, Bob, F1, F2, K1, K2, P, P1, P2.]
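The comparison at the heart of a collusive attack can be sketched as a position-by-position diff of two copies. This is a deliberately simplified illustration: it assumes both copies are available as instruction listings of equal layout, whereas a real attack would need an alignment-based diff because fingerprints can change code length. The class name, method name, and instruction strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollusiveDiff {
    // Returns the positions at which the two listings differ.
    // Positions that exist in only one listing are reported as differences too.
    static List<Integer> differingPositions(List<String> p1, List<String> p2) {
        List<Integer> diff = new ArrayList<>();
        int common = Math.min(p1.size(), p2.size());
        for (int i = 0; i < common; i++) {
            if (!p1.get(i).equals(p2.get(i))) {
                diff.add(i);
            }
        }
        for (int i = common; i < Math.max(p1.size(), p2.size()); i++) {
            diff.add(i);
        }
        return diff;
    }

    public static void main(String[] args) {
        // Two copies that differ only in one embedded constant.
        List<String> copy1 = Arrays.asList("load a", "load b", "cmp", "const 3", "branch");
        List<String> copy2 = Arrays.asList("load a", "load b", "cmp", "const 7", "branch");
        System.out.println("suspect positions: " + differingPositions(copy1, copy2));
    }
}
```

Everything that is identical in both copies can be ruled out, so the differing positions point the adversary at the fingerprint.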


Summary of Results

Technique 1 is stronger than Technique 2.

Technique 1 has a lower overhead, is more resilient to attack, and demonstrates a higher degree of stealth.

Demonstrated that the dynamic algorithm is only minimally stronger than the static version.


Summary of Contributions

We added features to the original techniques to improve their strength.

slicing to identify live variables

method reuse

Presented a novel extension of the technique to study static versus dynamic algorithms.

Implemented and evaluated all techniques.


A shameless plug to conclude

http://www.cs.arizona.edu/sandmark
