Efficient Code Obfuscation for Android
Master Thesis Defense Presentation
Author: Alexandrina Kovacheva
Supervisor: Prof. Alex BIRYUKOV
Reviewer: Prof. Jean-Sébastien CORON
Advisor: Dr. Ralf-Philipp WEINMANN
2
In the next 30 mins...
4
What is obfuscation?
&
Why is it useful?
5
Introduction
Java
API
DVM + Libs
Kernel
7
Dalvik Virtual Machine
● Register based (32-bit)
● Optimized for Android:
– Slower CPU
– Little RAM: 20 MB
– No swap
– Quick replication (UID)
– Energy efficient
● Instruction set: 218 different opcodes (~26 groups), 38 unused opcodes
8
Build, install, verify → obfuscate?
++ VFY, dexopt
9
Tools: analysis & protection
● Protection
– ProGuard
– dalvik-obfuscator
– APKfuscator
– DexGuard
● Analysis
– androguard
– baksmali (apktool)
– dedexer
– dexdump
– dex2jar
– dexter
– dexguard
– IDA Pro
– radare2
– …
10
How are apps protected currently?
11
Case Study
● ~1700 APK files
● Only free apps!
● Two phases:
(1) coarse automation
(2) manual
● Profiling the apps:
– ProGuard obfuscation
– Base64 strings
– Dynamic code
– Native code
– Crypto code
– Reflection
– Header size
– Encoding
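The slides do not show how the coarse automated phase recognised Base64 strings, so the following is only a minimal sketch of one plausible heuristic. The function names and the `MIN_LEN` threshold are assumptions introduced for illustration, not taken from the thesis.

```python
import base64
import binascii
import re

# Assumption: strings shorter than MIN_LEN are ignored, because short
# alphanumeric strings match the Base64 alphabet by accident too often.
MIN_LEN = 16
BASE64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def looks_like_base64(s: str) -> bool:
    """Heuristic check: is s plausibly a Base64-encoded blob?"""
    if len(s) < MIN_LEN or len(s) % 4 != 0:
        return False
    if BASE64_RE.fullmatch(s) is None:
        return False
    try:
        base64.b64decode(s, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def profile_strings(strings):
    """Split an app's string constants into Base64 candidates and the rest."""
    candidates = [s for s in strings if looks_like_base64(s)]
    others = [s for s in strings if not looks_like_base64(s)]
    return candidates, others
```

A purely alphabetic string whose length is a multiple of four also passes this check, so the candidates would still need the manual second phase.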
12
Case Study: results (pt1)
● ProGuard
● Reflection
● Header size
● Encoding
● Base64
13
Case Study: results (pt2)
● Base64
– Multimedia (GIF, JPEG, PNG)
– Text (ASCII, UTF-8 text)
● UTF-8 names of fields and classes
– 文章 :Ljava/util/ArrayList;
● Interesting strings:
– http://media.admob.com/
– Tel://6509313940
– http://dl.dropbox.com/u/...../inmobi_mraid.js
– plaintext passwords
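Findings like the ones above can be reproduced with a few lines of scripting. The sketch below decodes a Base64 string and classifies the payload by its well-known file signature, and flags URL-like strings. The function names and the URL pattern are assumptions for illustration; the slides do not say which tooling the study actually used.

```python
import base64
import re

# Well-known file signatures (magic bytes) of the media types on the slide.
MAGIC = {
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
}

# Assumption: "interesting" is approximated as "scheme://something".
URL_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://\S+$")


def classify_payload(b64_string: str) -> str:
    """Decode a Base64 string and guess what kind of data it carries."""
    data = base64.b64decode(b64_string)
    for magic, kind in MAGIC.items():
        if data.startswith(magic):
            return kind
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "text"


def is_url_like(s: str) -> bool:
    """Return True for strings shaped like scheme://rest."""
    return URL_RE.match(s) is not None
```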
14
Conclusions
● ProGuard usage is popular
● UTF-8 names found in very few apps (breaks some tools)
● Reflection & native code → hide code from static analysis
But basically... we could find everything we were looking for.
15
What can we do to better protect our apps?
16
(Yet another) obfuscator
A similar approach to dalvik-obfuscator
– Four transformations
– Design emphasis on: generic, cheap
17
Adding Native Call Wrappers
● Targets: metadata information extraction
● Adds complexity: data flow, control flow
18
Packing Numeric Variables
● Targets: data extraction
● Adds complexity: data flow, control flow
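The transcript does not preserve the details of the packing scheme, so the following is only a sketch of the general idea under stated assumptions: a numeric constant is stored in an encoded form inside a container object and decoded when it is used. The affine encoding, the key values `A` and `B`, and the `PackedInt` class are hypothetical and not the thesis's actual transformation.

```python
MASK = 0xFFFFFFFF  # work on 32-bit values, matching Dalvik's register width

# Hypothetical key material; A must be odd to be invertible modulo 2**32.
A = 0x9E3779B1
B = 0x7F4A7C15
A_INV = pow(A, -1, 1 << 32)  # modular inverse (needs Python 3.8+)


class PackedInt:
    """Container that never stores the plain constant."""

    def __init__(self, value: int):
        # Encode: packed = (A * value + B) mod 2**32
        self._packed = (A * (value & MASK) + B) & MASK

    def get(self) -> int:
        # Decode: value = A_INV * (packed - B) mod 2**32
        return (A_INV * (self._packed - B)) & MASK
```

In this sketch a use such as `timeout = 42` would become `timeout = PackedInt(42).get()`, which adds both a data-flow and a control-flow step for an analyst to follow. Values are treated as unsigned 32-bit integers.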
19
Packing Numeric Variables (ctd)
22
String obfuscation
● Targets: metadata information extraction
● Adds complexity: data flow, control flow
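The slides do not state which encoding the obfuscator applies to strings. As an illustration only, the sketch below uses a repeating-key XOR, a deliberately simple stand-in: the string is stored encoded and decoded right before use. The function names and the key are assumptions.

```python
from itertools import cycle


def encode_string(plain: str, key: bytes) -> bytes:
    """XOR the UTF-8 bytes of plain with a repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    data = plain.encode("utf-8")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def decode_string(blob: bytes, key: bytes) -> str:
    """Invert encode_string; XOR with the same key restores the bytes."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(blob, cycle(key))).decode("utf-8")
```

With such a scheme, a constant like `http://media.admob.com/` no longer appears verbatim in the string table, although anyone who recovers the key and the decoder can still reverse it.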
23
Add “bad” code
● Targets: defeat popular static analysis tools
● Adds complexity: control flow
24
Obfuscator evaluation
● All transformations applied together: < +1MB
● UTF-8 names + our obfuscator = good protection
25
What are our limits?
26
What can(not) we do?
● Static techniques
– Encoding
– Reordering code and data
– Merging and splitting code
– Jump exploit limitations
● Dynamic techniques (possible with a custom class loader)
– Dynamic code changes
– Code encryption
27
Summary
● Showed that applications receive little protection
● Proposed an obfuscator implementation
(code available on GitHub)
● Attempted to discuss which techniques from
x86 can be applied to Dalvik bytecode
28
Question time
&&
Thank you!