Proteus: Virtualization for Diversified Tamper-Resistance
Bertrand Anckaert, Ghent University, Belgium
Mariusz Jakubowski and Ramarathnam Venkatesan, Microsoft Research, USA
The 6th ACM Workshop on Digital Rights Management
October 30, 2006 - Alexandria VA, USA
2
Tampering: Applications
3
It’s tough to win a battle
o Incentive goes beyond fame
  • Software piracy: $31 billion
  • Virtual space resort: $100,000
  • Virtual sword: 1 human life
  • …
o Cat and mouse game
o Cracker usually gets the last word
o Protections have usually been broken relatively quickly
4
Can we win the war?
[Figure: bit strings of program copies, one group in which every copy differs and one group in which all copies are identical.]
5
Why not?
o Requires more bandwidth
o Software aging, tailored updates
o Hardware dependencies
o Contain private information
…
What will keep a cracker from distributing the cracked program as a whole?
6
Overview
o Intro
o Proteus:
o Virtualization for
o Diversified Tamper-Resistance
7
Proteus: definition
(From OED)
8
Overview
o Intro
o Proteus,
o Virtualization for
o Diversified Tamper-Resistance
9
Virtualization
o Choose ISA and micro-architecture
o Many degrees of freedom
o Use freedom for
  • Diversity
  • Tamper-resistance
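One way to picture "use freedom for diversity" is a per-copy opcode numbering. The sketch below is only an illustration in Python: the mnemonics, the seeds and the helper names (`make_instance`, `assemble`, `disassemble`) are assumptions, not part of Proteus. Each distributed copy gets its own numbering, so the same program is stored as different numbers in each copy.

```python
import random

# Hypothetical instruction names and seeds, chosen only for this illustration.
MNEMONICS = ["ldloc", "ldc", "sub", "stloc", "ret"]

def make_instance(seed):
    """Derive one copy's private opcode numbering from its seed."""
    rng = random.Random(seed)
    numbers = list(range(len(MNEMONICS)))
    rng.shuffle(numbers)
    encode = dict(zip(MNEMONICS, numbers))
    decode = {number: name for name, number in encode.items()}
    return encode, decode

def assemble(program, encode):
    """Translate mnemonics into this copy's opcode numbers."""
    return [encode[name] for name in program]

def disassemble(codes, decode):
    """Translate opcode numbers back, using the same copy's table."""
    return [decode[code] for code in codes]

PROGRAM = ["ldloc", "ldc", "sub", "stloc", "ret"]
encode_a, decode_a = make_instance(seed=1)
encode_b, decode_b = make_instance(seed=2)
copy_a = assemble(PROGRAM, encode_a)
copy_b = assemble(PROGRAM, encode_b)
```

Each copy decodes correctly only with its own table, which is the property a diversified VM would rely on; a real system would diversify far more than the opcode numbers.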
10
Overall design
[Diagram labels: ORIGINAL MSIL BINARY, PROTEUS FRONTEND, VM DESCRIPTION, CUSTOM BYTECODE BINARY, PROTEUS BACKEND, CUSTOM VM, RESULTING BINARY. Annotation: Easily decompiled.]
11
Function Stubification
[Diagram labels: VM.dll (CUSTOM BYTECODE BINARY, CUSTOM VM); REWRITTEN MSIL BINARY.]
REWRITTEN MSIL BINARY:
    public static void main(string[] args)
    {
        Object[] array = { args };
        InvokeVM(array, PC);               // PC: entry point of function
    }
    public Int32 foo(Int32 i, Int32 j)
    {
        Object[] array = { this, i, j };
        Object ret = InvokeVM(array, PC);  // PC: entry point of function
        return (Int32) ret;
    }
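The stub pattern can be sketched in a few lines of Python. This is a toy model under stated assumptions: `BYTECODE`, `FOO_PC` and `invoke_vm` are hypothetical stand-ins, and the "bytecode" is just a Python callable rather than a real custom instruction stream.

```python
# Hypothetical stand-ins: BYTECODE, FOO_PC and invoke_vm are illustrative
# names, not the Proteus API.

# "Custom bytecode binary": here simply Python callables keyed by entry point.
BYTECODE = {
    0: lambda i, j: i + j,   # body of foo, moved out of the original function
}
FOO_PC = 0  # entry point of foo inside the bytecode


def invoke_vm(args, pc):
    """Toy VM: run the body stored at entry point `pc` on the boxed arguments."""
    body = BYTECODE[pc]
    return body(*args)


def foo(i, j):
    """Stub left in the rewritten binary: box the arguments, enter the VM."""
    array = [i, j]
    ret = invoke_vm(array, FOO_PC)
    return int(ret)


result = foo(2, 3)
```

The caller still sees an ordinary function `foo`; only the stub knows that the real work happens behind the entry point.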
12
Virtualization: Design Principles
o Tamper-Resistant ISA:
  • Complicate analysis
  • Prevent local modifications
  • Make observation hard
  • …
o Traditional ISAs:
  • Performance, compaction
  • Portability, verifiability
  • Automatic garbage collection
  • …
conflicts?
[Diagram: axis "Complexity of an attack" with the points CISC, RISC, Java Bytecode and MSIL, Our Bytecode.]
13
Virtualization: VM Operation
[Diagram: interpreter loop While (true) { } built from the stages Fetch, ExecuteIns, DecodeOpcode, EmulateIns, DecodeOperands and ManipulateMethodState. The loop reads the CUSTOM BYTECODE BINARY at the PC and works on the CURRENT METHOD FRAME: Caller, Evaluation Stack, Local Variables, Local Allocation, Arguments.]
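A minimal Python sketch of such an interpreter loop follows. It assumes a tiny stack-based instruction set with MSIL-like names and a simplified method frame; the Caller and Local Allocation parts of the frame are omitted, and none of the names are taken from the Proteus implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MethodFrame:
    """Simplified method frame: arguments, locals and an evaluation stack."""
    arguments: list
    local_variables: list = field(default_factory=lambda: [0] * 4)
    evaluation_stack: list = field(default_factory=list)


def run(code, frame):
    """Fetch, decode and emulate instructions until a ret is reached."""
    pc = 0
    while True:
        opcode, operand = code[pc]          # fetch, decode opcode and operand
        pc += 1
        stack = frame.evaluation_stack      # emulation manipulates the frame
        if opcode == "ldarg":
            stack.append(frame.arguments[operand])
        elif opcode == "ldc":
            stack.append(operand)
        elif opcode == "ldloc":
            stack.append(frame.local_variables[operand])
        elif opcode == "stloc":
            frame.local_variables[operand] = stack.pop()
        elif opcode == "sub":
            right = stack.pop()
            left = stack.pop()
            stack.append(left - right)
        elif opcode == "ret":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Computes argument 0 minus 1.
PROGRAM = [("ldarg", 0), ("ldc", 1), ("sub", None), ("ret", None)]
result = run(PROGRAM, MethodFrame(arguments=[10]))
```

Every stage of this loop (what an instruction does, how it is encoded, how it is fetched) is a place where a custom VM is free to differ from this straightforward version.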
14
Virtualization: Choices
We get to design our own
• ISA
  o Instruction semantics (1)
  o Opcode encoding (2)
  o Operand encoding (3)
  o Fetch cycle (4)
  o Program representation and counter (5)
• Micro-Architecture (6)
[Diagram: the VM operation diagram (interpreter loop, CUSTOM BYTECODE BINARY with PC, CURRENT METHOD FRAME), annotated with the numbers (1) to (6) of the choices listed here.]
15
Overview
o Intro
o Proteus,
o Virtualization for
o Diversified Tamper-Resistance
  • Instruction Semantics (1)
  • Opcode and Operand Encoding (2 & 3)
  • Fetch Cycle (4)
  • Program Representation and Counter (5)
  • Micro-Architecture (6)
16
Choice: Instruction Semantics
[Diagram: VM operation diagram annotated with the choice numbers (1) to (6).]
17
Instruction Semantics
[Diagram labels: ldloc, add, br, pop, callvirt, newobj, newarr, ldc; μOps.]
SuperIns: ldloc, ldc, sub, stloc
18
Instruction Semantics: Tamper-Resistance
o Semantic overlap
o Limited instruction set
  • nop
  • Invertible jump conditions
[Diagram: the sequence ldloc, ldc, sub, stloc repeated several times, next to SuperIns: ldloc, ldc, sub, stloc. Further labels: pop, Tradeoff.]
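The super-instruction idea can be sketched as a fusion of small operations. The Python below is an assumption-laden illustration: the micro-operations act on a plain dictionary, and `make_super_ins` is a hypothetical helper, not the mechanism used by Proteus.

```python
# Micro-operations on a toy state: {"stack": [...], "locals": [...]}.
# All names here are illustrative assumptions.

def ldloc(state, index):
    state["stack"].append(state["locals"][index])

def ldc(state, value):
    state["stack"].append(value)

def sub(state, _unused=None):
    right = state["stack"].pop()
    left = state["stack"].pop()
    state["stack"].append(left - right)

def stloc(state, index):
    state["locals"][index] = state["stack"].pop()

def make_super_ins(*micro_ops):
    """Fuse micro-operations into one instruction, one operand per micro-op."""
    def super_ins(state, *operands):
        for micro_op, operand in zip(micro_ops, operands):
            micro_op(state, operand)
    return super_ins

# SuperIns: ldloc, ldc, sub, stloc  (subtract a constant from a local)
sub_const_from_local = make_super_ins(ldloc, ldc, sub, stloc)

state = {"stack": [], "locals": [10]}
sub_const_from_local(state, 0, 3, None, 0)   # locals[0] = locals[0] - 3
```

The same effect is still reachable through the four separate instructions, so the instruction set contains several ways to express one behaviour; that redundancy is one reading of "semantic overlap".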
19
Choice: Opcode and Operand Encoding
[Diagram: VM operation diagram annotated with the choice numbers (1) to (6).]
20
Encoding Opcodes and Operands
o Any prefix encoding
o Tamper-resistant
  • physical overlap (unary encoding)
      1: add
      01: mul
      001: sub
      0001: div
  • variable length
Tradeoff
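The unary table from the slide can be decoded with a few lines of Python. The decoder below is a sketch: `decode_opcodes` is a hypothetical helper, and reading "physical overlap" as "one code contains another code" is an interpretation of the slide, not a statement from it.

```python
# Opcode table copied from the slide; the decoder itself is illustrative.
OPCODES = {"1": "add", "01": "mul", "001": "sub", "0001": "div"}

def decode_opcodes(bits):
    """Split a bit string into opcodes using the prefix code above."""
    decoded = []
    current = ""
    for bit in bits:
        current += bit
        if current in OPCODES:
            decoded.append(OPCODES[current])
            current = ""
        elif len(current) >= 4:
            raise ValueError(f"invalid code word: {current}")
    if current:
        raise ValueError("bit string ends inside a code word")
    return decoded

stream = "1" + "001" + "01"                 # add, sub, mul
decoded = decode_opcodes(stream)            # ['add', 'sub', 'mul']

# The code for div ("0001") physically contains the codes for mul and add:
tail_of_div = decode_opcodes("0001"[2:])    # ['mul']

# Flipping one bit of div changes the instruction boundaries:
tampered = decode_opcodes("1001")           # ['add', 'sub']
```

Because the code words have different lengths, a single flipped bit does not swap one instruction for another of the same size; it changes how the following bits are split.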
21
Variable Encoding
o Encoding does not need to be constant
o Instructions to reorder subtrees
o Bit sequences get different meaning in different interpretation states
o Semantic overlap
[Figure: binary decoding tree with edges labeled 0 and 1 starting at the Root; the Leaves are ADD, SUB, MUL and DIV.]
22
Choice: Fetch Filters
[Diagram: VM operation diagram annotated with the choice numbers (1) to (6).]
23
Fetch filters
o Combine bit pattern with
  • Program counter
  • Other parts of the program
  • Key
  • …
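A fetch filter that combines the stored bits with the program counter and a key might look like the Python sketch below. XOR is chosen here only because it is easy to invert; the slide does not say which combination Proteus uses, and `protect`, `fetch` and the key value are hypothetical.

```python
def protect(plain, key):
    """Produce the stored bytecode: each byte is mixed with key and position."""
    return bytes(byte ^ key ^ (pc & 0xFF) for pc, byte in enumerate(plain))

def fetch(stored, pc, key):
    """Fetch filter: undo the mixing for the byte at the program counter."""
    return stored[pc] ^ key ^ (pc & 0xFF)

KEY = 0x5A                                  # assumed one-byte key
plain = bytes([0x07, 0x07, 0x10, 0x20])     # note the repeated 0x07
stored = protect(plain, KEY)
recovered = bytes(fetch(stored, pc, KEY) for pc in range(len(stored)))
```

The two identical plain bytes are stored as different values because the position is part of the filter, so equal instructions no longer look equal in the stored program.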
24
Choice: Code Representation
[Diagram: VM operation diagram annotated with the choice numbers (1) to (6).]
25
Splay tree representation
LINEAR
int32 Fac(int32):
    ldarg.0
    ldc.i4.1
    bne.un.s
    ldc.i4.1
    ret
    ldarg.0
    ldarg.0
    ldc.i4.1
    sub
    call int32 Fac(int32)
    mul
    ret
SPLAY TREE (1) and SPLAY TREE (2)
[Both figures show the same four nodes; the tree layout is not recoverable from the transcript.]
1: ldarg.0
   ldc.i4.1
   bne.un.s 3
   br 2
2: ldc.i4.1
   ret
3: ldarg.0
   ldarg.0
   ldc.i4.1
   sub
   call 1
4: mul
   ret
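The effect of storing basic blocks in a splay tree can be sketched in Python. This is a generic textbook splay tree keyed by block number, holding the four blocks of Fac; it is an assumption about how such a representation could work, not the Proteus data structure.

```python
class Node:
    """One basic block stored as a tree node (key = block number)."""
    def __init__(self, key, block):
        self.key, self.block = key, block
        self.left = self.right = None

def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def splay(root, key):
    """Move the node with `key` (or the last node visited) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                    # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                         # zig-zig
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                       # zig-zag
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)

def insert(root, key, block):
    if root is None:
        return Node(key, block)
    root = splay(root, key)
    if root.key == key:
        root.block = block
        return root
    node = Node(key, block)
    if key < root.key:
        node.right, node.left, root.left = root, root.left, None
    else:
        node.left, node.right, root.right = root, root.right, None
    return node

def lookup(root, key):
    """Fetch a block; the access itself reshapes the tree."""
    root = splay(root, key)
    found = root is not None and root.key == key
    return root, (root.block if found else None)

def in_order(node):
    return [] if node is None else in_order(node.left) + [node.key] + in_order(node.right)

BLOCKS = {
    1: ["ldarg.0", "ldc.i4.1", "bne.un.s 3", "br 2"],
    2: ["ldc.i4.1", "ret"],
    3: ["ldarg.0", "ldarg.0", "ldc.i4.1", "sub", "call 1"],
    4: ["mul", "ret"],
}
root = None
for number, instructions in BLOCKS.items():
    root = insert(root, number, instructions)
root, block3 = lookup(root, 3)      # block 3 is now at the root
```

Every lookup moves the requested block to the root, so the position of a block inside the tree depends on the execution history while the set of blocks stays the same.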
26
Choice: MicroArchitecture
[Diagram: VM operation diagram annotated with the choice numbers (1) to (6).]
o ISA determined
o Determine MicroArchitecture
o Combine code and auto-generate code
o Diversify result
27
Heuristic Benefits
o Complicate analysis
  • Custom bytecode language
  • Variable instruction length
  • Variable encoding
o Complicate local modifications
  • Semantic overlap
  • Physical overlap
o Complicate global modifications
  • (blur distinction between code, data and addresses)
o Complicate observing the execution
  • Constant relocation of the code/data
28
Ultimate goal: prevent class attacks
Sufficient Diversification:
Complexity of converting the attack to another instance ≥ Complexity of attacking the other instance from scratch
29
Sufficient Diversification
o A chain is only as strong as its weakest link
o If attacking an instance from scratch is easier than converting an existing attack, the weakest link is the tamper-resistance and not the diversification
30
Conclusion
o Virtualization gives us the freedom to choose the ISA and MicroArchitecture
o This choice can be used for
  • Diversity
  • Tamper-resistance
o And hopefully lead to a provable degree of protection
Questions?