Preventing Reverse Engineering by Obfuscating
Bharath Kumar
Reverse Engineering
  Process of backtracking through the software process.
  Obtaining source code from binary or byte code.
  Understanding programs to realize intent.
  Intellectual property issues.
Example
import java.util.Vector;

public class TreeEnum
{
    Vector treesBasedOnNumber;

    // Returns the trees stored for the given node count, or null if there are none.
    private Vector getTrees(int numberOfNodes)
    {
        if (numberOfNodes == 0) return null;
        if (numberOfNodes <= treesBasedOnNumber.size()) {
            // System.out.println("Trying for " + (numberOfNodes - 1) + " with " + treesBasedOnNumber.size());
            Object o = treesBasedOnNumber.get(numberOfNodes - 1);
            if (o instanceof Vector) return (Vector)o;
            else return null;
        }
        return null;
    }
}
Reverse engineered!
public synchronized class TreeEnum
{
    Vector treesBasedOnNumber;

    private Vector getTrees(int i)
    {
        if (i == 0) return null;
        if (i > treesBasedOnNumber.size()) return null;
        Object object = treesBasedOnNumber.get(i - 1);
        if (object instanceof Vector) return (Vector)object;
        else return null;
    }
}
Approaches against reverse engineering
  Legal battles
  Service-based software
  Thin mobile code
  Code encryption
  Distributing binaries
  Obfuscation
Obfuscation
  Obfuscate: "to confuse"
  Alter code so as to confuse the reverse engineer, but preserve functionality.
  Behavior-preserving transformations on code that reduce readability or understandability.
How do we confuse the reader?
Software metrics
  Program length: complexity of a program P increases with the number of operators and operands in P.
  Cyclomatic complexity: complexity increases with the number of predicates in a function.
  Nesting complexity: complexity increases with the nesting level of conditionals in a program.
  Data flow complexity: complexity increases with the number of inter-block variable references.
Software metrics (continued)
  Fan-in/fan-out complexity: complexity increases with the number of formal parameters to a function, and with the number of global data structures read or updated in the function.
  Data structure complexity: complexity increases with the complexity of the static data structures in the program (variables, vectors, records).
  OO metrics: complexity increases with
    Level of inheritance
    Coupling
    Number of methods triggered by another method
    Non-cohesiveness
A classification of obfuscations
  Layout transformations: change formatting information.
  Control transformations: alter program control and computation.
  Aggregation transformations: refactor the program using aggregation methods.
  Data transformations: alter the storage and encoding of data.
Some metrics for obfuscations
  Assume the complexity of a program P is E(P) (based on metrics).
  Potency of a transformation is the level of complexity it introduces: E(P')/E(P) - 1, where P' is the transformed program.
  Resilience of a transformation measures how well it withstands a deobfuscation "attack", on a scale from trivial to one-way.
  Execution cost: free, cheap, costly, dear.
  Quality of an obfuscation: a combination of potency, resilience, and execution cost.
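The potency formula can be worked through with a small sketch. The class and method names here are hypothetical, and the complexity values are made-up inputs standing in for whatever metric E is chosen.

```java
// Hypothetical helper: computes potency E(P')/E(P) - 1 from two complexity scores.
final class ObfuscationPotency {

    /**
     * @param complexityBefore E(P), the complexity of the original program
     * @param complexityAfter  E(P'), the complexity of the transformed program
     */
    static double potency(double complexityBefore, double complexityAfter) {
        if (complexityBefore <= 0) {
            throw new IllegalArgumentException("E(P) must be positive");
        }
        return complexityAfter / complexityBefore - 1.0;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 12 predicates before, 18 after.
        System.out.println(potency(12, 18)); // prints 0.5
    }
}
```

A potency of 0 means the transformation added no measured complexity; a positive value means the transformed program scores as more complex under the chosen metric.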
Control transformations
  Opaque predicates
  [Figure: the statement sequence S1; S2; shown alongside variants in which statements are guarded by if (Pred)]
  Opaque constructs always evaluate one way; the outcome is known to the obfuscator but unknown to the deobfuscator.
  Trivial and weak opaque constructs.
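A minimal sketch of an opaque predicate guarding a statement may help here. It assumes a simple arithmetic identity as the predicate (the product of two consecutive integers is always even, which also holds under Java's int overflow); the class, the methods, and the statements standing in for S1 and S2 are all hypothetical. An analyser that knows this identity can resolve the predicate easily, so it is only meant as an illustration.

```java
// Hypothetical demo: S1; S2; rewritten as S1; if (Pred) S2; else <dead code>.
final class OpaquePredicateDemo {

    // Opaquely true: x and x + 1 are consecutive, so x * (x + 1) is always even.
    static boolean opaquelyTrue(int x) {
        return (x * (x + 1)) % 2 == 0;
    }

    // Original sequence.
    static int original(int a) {
        int r = a + 1; // S1
        r = r * 2;     // S2
        return r;
    }

    // Obfuscated sequence: same behavior, extra apparent branch.
    static int obfuscated(int a, int x) {
        int r = a + 1;         // S1
        if (opaquelyTrue(x)) {
            r = r * 2;         // S2, always executed
        } else {
            r = r - 7;         // dead code, never executed
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(original(4) == obfuscated(4, 123)); // prints true
    }
}
```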
Control transformations
  Insert dead or irrelevant code
  Extend loop conditions
  Convert a reducible to a non-reducible flow graph
  Redundant operands
  Parallelize code
  Replace standard library routines by custom routines
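As one illustration from this list, "extend loop conditions" could look like the sketch below. The loop condition gains an extra term that is always true (k*k + k is even for every int k, including after overflow), so the loop runs exactly as before while appearing to depend on k. All names are hypothetical.

```java
// Hypothetical demo of extending a loop condition with an always-true term.
final class LoopConditionDemo {

    static int sumOriginal(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    static int sumObfuscated(int[] data) {
        int sum = 0;
        int k = 3; // irrelevant variable, only feeds the extended condition
        for (int i = 0; i < data.length && (k * k + k) % 2 == 0; i++) {
            sum += data[i];
            k += 5; // keeps k changing so the condition looks data dependent
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(sumOriginal(data) == sumObfuscated(data)); // prints true
    }
}
```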
Aggregation transformations
  Inline and outline methods
  Interleave methods
  Clone methods
  Loop transformations
    Loop blocking
    Loop unrolling
    Loop fission
  Ordering transformations
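Two of the loop transformations named above, fission and unrolling, can be sketched as follows. This is an illustrative example with hypothetical names, not code from the presentation; each variant computes the same result as the original loop.

```java
// Hypothetical demo of loop fission and loop unrolling.
final class LoopTransformDemo {

    // Original: one loop computes both the sum and the maximum.
    static int[] sumAndMax(int[] a) {
        int sum = 0;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
            if (a[i] > max) max = a[i];
        }
        return new int[] {sum, max};
    }

    // Loop fission: the single loop body is split across two loops.
    static int[] sumAndMaxFissioned(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > max) max = a[i];
        }
        return new int[] {sum, max};
    }

    // Loop unrolling by a factor of two, with a tail step for odd lengths.
    static int sumUnrolled(int[] a) {
        int sum = 0;
        int i = 0;
        for (; i + 1 < a.length; i += 2) {
            sum += a[i];
            sum += a[i + 1];
        }
        if (i < a.length) {
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = {3, 9, 2, 7, 5};
        System.out.println(sumUnrolled(a)); // prints 26
    }
}
```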
Data transformations
  Change encoding
  Pack variables into bigger variables
  Pack variables into arrays
  Convert static to procedural data
  Restructure arrays
  Alter inheritance hierarchies
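The first two items can be sketched in a few lines. The example below packs two int variables into one long, and re-encodes a loop counter i as e = 8*i + 3. The names and the constants 8 and 3 are arbitrary choices for illustration, and the encoded loop assumes n is small enough that 8*n + 3 does not overflow an int.

```java
// Hypothetical demo of two data transformations.
final class DataTransformDemo {

    // Pack variables into a bigger variable: two ints share one long.
    static long pack(int high, int low) {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    static int high(long packed) {
        return (int) (packed >> 32);
    }

    static int low(long packed) {
        return (int) packed;
    }

    // Original: sums 1..n with a plain counter.
    static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Change encoding: the counter is stored as e = 8*i + 3 and decoded on use.
    // Assumption: 8*n + 3 fits in an int.
    static int sumToEncoded(int n) {
        int total = 0;
        for (int e = 8 * 1 + 3; e <= 8 * n + 3; e += 8) {
            total += (e - 3) / 8;
        }
        return total;
    }

    public static void main(String[] args) {
        long packed = pack(3, -2);
        System.out.println(high(packed) + " " + low(packed)); // prints 3 -2
        System.out.println(sumToEncoded(4));                  // prints 10
    }
}
```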
Opaque constructs
  The pointer aliasing problem
    Shown to be NP-hard or even undecidable
  Dynamic structures for producing opaque constructs.
  Opaque constructs using threads.
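One way to picture an alias-based opaque construct built from a dynamic structure is sketched below. It assumes two separate circular lists: reference p only ever moves within the first ring and q within the second, so p != q always holds, yet proving that statically requires alias analysis. The structure, the names, and the guarded computation are all hypothetical.

```java
// Hypothetical demo: an opaque predicate based on two references that never alias.
final class AliasOpaqueDemo {

    static final class Node {
        Node next;
    }

    // Builds a circular singly linked list with the given number of nodes.
    static Node ring(int size) {
        Node first = new Node();
        Node current = first;
        for (int i = 1; i < size; i++) {
            current.next = new Node();
            current = current.next;
        }
        current.next = first;
        return first;
    }

    private Node p = ring(3); // p stays inside the first ring
    private Node q = ring(5); // q stays inside the second ring

    // Moves both references around their own rings; they can never meet.
    private void shuffle(int steps) {
        for (int i = 0; i < steps; i++) {
            p = p.next;
            q = q.next.next;
        }
    }

    int square(int a) {
        shuffle(a & 7);   // pointer updates that depend on the input
        if (p != q) {     // opaquely true
            return a * a; // the real computation
        }
        return a + 42;    // never reached
    }

    public static void main(String[] args) {
        System.out.println(new AliasOpaqueDemo().square(9)); // prints 81
    }
}
```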
Deobfuscation
  Almost all obfuscating transforms have a deobfuscating transform
  Essentially boils down to evaluating opaque constructs
  Program slicing
  Pattern matching
  Statistical analysis
  Data flow analysis
  Theorem proving
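As a rough sketch of the statistical analysis idea, a deobfuscator could sample a suspected predicate on many inputs and flag it as a candidate opaque construct if it never evaluates to false. The class and method names are hypothetical, and a predicate flagged this way is only a candidate: sampling cannot prove that it is always true.

```java
import java.util.Random;
import java.util.function.IntPredicate;

// Hypothetical demo: flag predicates that look opaquely true under random sampling.
final class StatisticalDeobfuscationDemo {

    // Returns true if the predicate held on every sampled input.
    static boolean looksOpaquelyTrue(IntPredicate predicate, int samples, long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < samples; i++) {
            if (!predicate.test(random.nextInt())) {
                return false; // one counterexample is enough to rule it out
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IntPredicate suspicious = x -> (x * (x + 1)) % 2 == 0; // always true
        IntPredicate genuine = x -> x % 2 == 0;                // depends on the input
        System.out.println(looksOpaquelyTrue(suspicious, 1000, 42L)); // prints true
        System.out.println(looksOpaquelyTrue(genuine, 1000, 42L));    // prints false
    }
}
```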