Advanced JS DeObfuscation via AST
Stefano Di Paola CTO + Chief Scientist @MindedSecurity
JS And Obfuscation ❖ JS is super flexible!
❖ 1k+N ways to do the same thing - +N is the JS way
❖ OK from a Dev POV - performance aside
❖ Not Always OK for readability.
❖ SUPER OK for Obfuscation!
Goals of Obfuscation ❖ Block or limit reverse engineering (RE)
– Intellectual Property preservation
– AV Bypass of Exploits
– WAF Bypass of Cross Site Scripting Payload
JS Obfuscators ❖Several Public Obfuscation techniques:
– Eval Packer: http://dean.edwards.name/packer/
– Metasploit JSObfu: https://github.com/rapid7/jsobfu
– JSFuck (From Slackers): http://www.jsfuck.com/
– JJEncode : http://utf-8.jp/public/jjencode.html
– AAEncode: http://utf-8.jp/public/aaencode.html
– Node-Obf: https://github.com/wearefractal/node-obf
– https://github.com/search?p=2&q=obfuscator+JavaScript&type=Repositories&utf8=%E2%9C%93
– In the wild: some commercial obfuscators and malware obfuscators
Output Example: JSObfu and JSF#*k
Output Example: AAEncode and JJEncode
Why Do We Want to Deobfuscate? ❖Defense!
❖ Mainly to reverse the goals of obfuscation:
– AV detection of known Exploits
– Precise WAF identification of Cross Site Scripting Payload
– Intellectual property (yeah that too)
The Final Goal is to create a "Normalized" version of the code that will allow easier comparison and analysis
Deobfuscation from P to P1
❖ Semantics preservation:
– P1 must have the same semantics (observable behavior) as P.
❖ Automation:
– P1 is obtained from P without the need for manual work (ideally).
❖ Robustness:
– All code valid to the interpreter should be parsable by the deobfuscator.
❖ Readability:
– P1 is easy to adapt and analyze.
❖ Efficiency:
– Program P1 should not be much slower or larger than P.
Deobfuscation Techniques
❖ Easy way:
– Runtime: a sandboxed environment executes the payload (PhantomJS, Thug, JSCli..).
– Pro: Easy.
– Cons: Behavior based. Can't classify by source code. Hard to analyze what's going on. Possible auto-pwnage.
❖ Harder way:
– By hand (!!!)
– Pro: Human brain can be used.
– Cons: Human brain MUST be used. Slow, high expertise… a lot.
❖ Hard/Easy way:
– Runtime + static analysis -> hybrid approach via Partial Evaluation.
– Pro: Leads to interesting results.
– Cons: Hard to implement. Not trivial to cover all techniques.
Deobfuscation Via Partial Evaluation
❖ A partial evaluator's task is to split a program into two parts:
– Static part: precomputed by the partial evaluator (reduced to lowest terms).
– Dynamic part: executed at runtime (dependent on the runtime environment).
❖ Two possible approaches:
– Online: all evaluations are made on-the-fly.
– Offline: multipass. Performs binding-time analysis to classify expressions as static or dynamic, according to whether their values will be fully determined at specialisation time.
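To make the static/dynamic split concrete, here is a minimal sketch of a binding-time classifier. It is not taken from the talk: the function name `bindingTime`, the set of known names, and the hand-built ESTree-shaped nodes are all assumptions made for illustration. A real offline pass would cover many more node types.

```javascript
// Minimal binding-time classifier over ESTree-shaped nodes (hand-built here).
// "static"  = value fully determined at specialisation time
// "dynamic" = depends on the runtime environment
function bindingTime(node, staticNames) {
  switch (node.type) {
    case "Literal":
      return "static";
    case "Identifier":
      // An identifier is static only if its value is already known.
      return staticNames.has(node.name) ? "static" : "dynamic";
    case "UnaryExpression":
      return bindingTime(node.argument, staticNames);
    case "BinaryExpression":
      return bindingTime(node.left, staticNames) === "static" &&
        bindingTime(node.right, staticNames) === "static"
        ? "static"
        : "dynamic";
    default:
      // Anything not modelled is conservatively treated as dynamic.
      return "dynamic";
  }
}

// Hypothetical node builders, used only to keep the example short.
const lit = (value) => ({ type: "Literal", value });
const id = (name) => ({ type: "Identifier", name });
const bin = (operator, left, right) => ({ type: "BinaryExpression", operator, left, right });

// Assumption: an earlier pass found that `k` holds a constant.
const known = new Set(["k"]);

const e1 = bin("+", lit(1), bin("*", id("k"), lit(2))); // 1 + k * 2
const e2 = bin("+", lit(1), id("userInput"));           // 1 + userInput

console.log(bindingTime(e1, known)); // static
console.log(bindingTime(e2, known)); // dynamic
```

Treating unknown node types as dynamic keeps the classifier conservative: it may miss a reduction, but it should never mark a runtime-dependent expression as static.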
AST > SubTree Reduction > Deobfuscated code
1. Use JS for JS: Node + Esprima
2. Esprima Parser > AST > http://esprima.org/demo/parse.html#
3. Traverse the AST (tree walking) as the interpreter would
4. Reduce subtrees by applying:
– Constant folding
– Encapsulation
– Virtual dispatch
– ...
5. Rewrite the code w/ escodegen
6. Hopefully enjoy the new code
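The traverse-and-reduce steps can be sketched with constant folding alone. The snippet below is an illustration, not the tool shown in the talk. It assumes ESTree-shaped nodes built by hand, where a real pipeline would get them from Esprima. The `fold` and `print` helpers are hypothetical, and `print` is only a stand-in for a real code generator such as escodegen.

```javascript
// Binary operators we are willing to fold when both operands are literals.
const BINARY = {
  "+": (a, b) => a + b,
  "-": (a, b) => a - b,
  "*": (a, b) => a * b,
  "/": (a, b) => a / b,
  "^": (a, b) => a ^ b,
};

// Bottom-up reduction: fold the children first, then the node itself.
function fold(node) {
  if (node.type !== "BinaryExpression") return node;
  const left = fold(node.left);
  const right = fold(node.right);
  const op = BINARY[node.operator];
  if (op && left.type === "Literal" && right.type === "Literal") {
    return { type: "Literal", value: op(left.value, right.value) };
  }
  return { ...node, left, right };
}

// Tiny printer standing in for a real code generator.
function print(node) {
  switch (node.type) {
    case "Literal":
      return JSON.stringify(node.value);
    case "Identifier":
      return node.name;
    case "BinaryExpression":
      return `(${print(node.left)} ${node.operator} ${print(node.right)})`;
    default:
      throw new Error("unsupported node: " + node.type);
  }
}

// Hypothetical node builders, used only to keep the example short.
const lit = (value) => ({ type: "Literal", value });
const id = (name) => ({ type: "Identifier", name });
const bin = (operator, left, right) => ({ type: "BinaryExpression", operator, left, right });

// ("he" + "llo") + (x * (2 + 3))
const ast = bin("+", bin("+", lit("he"), lit("llo")), bin("*", id("x"), bin("+", lit(2), lit(3))));

console.log(print(ast));       // (("he" + "llo") + (x * (2 + 3)))
console.log(print(fold(ast))); // ("hello" + (x * 5))
```

The sub-tree containing `x` is left in place because its value is unknown. Only the fully constant sub-trees collapse into literals.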
Start from Scratch, oh wait ^_^’!
❖ @M1el already wrote an AST-based deobfuscator for JSObfu:
– https://github.com/m1el/esdeobfuscate
– https://github.com/m1el/esdeobfuscate/blob/master/esdeobfuscate.js#L109
❖ Super cool! Alas, it is strictly tied to JSObfu. We have:
– Constant folding w/ binary ops (+, -, *, /, ^) and partial unary ops (~, -, ..) on simple types
– String.fromCharCode execution
– Functions returning constants are “evaluated” and reduced to their return value
– Partial “scope wise” implementation
❖ A very good starting point!
What we want
❖ Improve global variables management
– "console","window","document","String","Object","Array","eval"...
❖ Operations on native data (JSFuck …): +[] ..
❖ Global functions execution
– escape, unescape, String.*, Array.*..
❖ Variable substitution w/ constants or globals
– var win=window; …. t=win > var win=window; …. t=window
❖ Scoping and function evaluation
– Function evaluation according to variable scoping.
❖ Objects management:
– var t={a:2}; var b=t.a;
❖ Possibly deobfuscate all known obfuscators
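As an illustration of the "operations on native data" item, the sketch below evaluates JSFuck-style atoms such as `+[]` by mapping AST nodes to real JavaScript values and applying the real operators. It is an assumption about how such a reduction could look, not the talk's implementation. The `evalNative` name, the `UNKNOWN` sentinel, and the hand-built nodes are hypothetical.

```javascript
// Sentinel for "this sub-tree cannot be reduced to a value".
const UNKNOWN = Symbol("unknown");

const UNARY = {
  "+": (v) => +v,
  "-": (v) => -v,
  "!": (v) => !v,
  "~": (v) => ~v,
};

// Evaluate literals, array literals and a few operators on them.
function evalNative(node) {
  if (!node) return UNKNOWN; // e.g. array holes
  switch (node.type) {
    case "Literal":
      return node.value;
    case "ArrayExpression": {
      const items = node.elements.map(evalNative);
      return items.includes(UNKNOWN) ? UNKNOWN : items;
    }
    case "UnaryExpression": {
      const arg = evalNative(node.argument);
      const op = UNARY[node.operator];
      return arg === UNKNOWN || !op ? UNKNOWN : op(arg);
    }
    case "BinaryExpression": {
      if (node.operator !== "+") return UNKNOWN;
      const left = evalNative(node.left);
      const right = evalNative(node.right);
      return left === UNKNOWN || right === UNKNOWN ? UNKNOWN : left + right;
    }
    default:
      return UNKNOWN;
  }
}

// Hypothetical node builders, used only to keep the example short.
const arr = (...elements) => ({ type: "ArrayExpression", elements });
const un = (operator, argument) => ({ type: "UnaryExpression", operator, argument });
const bin = (operator, left, right) => ({ type: "BinaryExpression", operator, left, right });

console.log(evalNative(un("+", arr())));                                         // +[]         -> 0
console.log(evalNative(bin("+", un("!", un("!", arr())), un("!", un("!", arr()))))); // !![] + !![] -> 2
console.log(evalNative(bin("+", un("!", arr()), arr())));                        // ![] + []    -> "false"
```

Because the operators are applied to plain arrays, booleans and numbers only, nothing from the analysed script is executed. Identifiers and calls fall through to `UNKNOWN`.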
Function Evaluation
❖ Check for literal returned value
– function xx(){ return String.fromCharCode(0x61)+"X" }
– if (return val is constant) substitute the value for the whole sub tree.
– (JSObf DEMO)
❖ Check for independent scope (closed scope)
– If function is a closure > execute function in a JS environment.
– (Fun.js DEMO)
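The literal-return check can be sketched as follows, using the `xx` function from the slide. This is a simplified illustration under stated assumptions. The AST is built by hand in ESTree shape, and only functions whose body is a single `return` are handled. The `PURE_CALLS` whitelist and the helper names are hypothetical.

```javascript
// Sentinel for "not reducible to a constant".
const NOT_CONSTANT = Symbol("not constant");

// Whitelist of side-effect-free globals we allow ourselves to call.
const PURE_CALLS = new Map([["String.fromCharCode", String.fromCharCode]]);

function evalConst(node) {
  switch (node.type) {
    case "Literal":
      return node.value;
    case "BinaryExpression": {
      if (node.operator !== "+") return NOT_CONSTANT;
      const left = evalConst(node.left);
      const right = evalConst(node.right);
      return left === NOT_CONSTANT || right === NOT_CONSTANT ? NOT_CONSTANT : left + right;
    }
    case "CallExpression": {
      const callee = node.callee;
      if (callee.type !== "MemberExpression" || callee.computed ||
          callee.object.type !== "Identifier") {
        return NOT_CONSTANT;
      }
      const fn = PURE_CALLS.get(callee.object.name + "." + callee.property.name);
      const args = node.arguments.map(evalConst);
      return !fn || args.includes(NOT_CONSTANT) ? NOT_CONSTANT : fn(...args);
    }
    default:
      return NOT_CONSTANT;
  }
}

// Returns a Literal node to substitute for calls to the function, or null.
function constantReturn(fnNode) {
  const stmts = fnNode.body.body;
  if (stmts.length !== 1 || stmts[0].type !== "ReturnStatement" || !stmts[0].argument) {
    return null;
  }
  const value = evalConst(stmts[0].argument);
  return value === NOT_CONSTANT ? null : { type: "Literal", value };
}

// function xx(){ return String.fromCharCode(0x61)+"X" }
const xx = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "xx" },
  params: [],
  body: {
    type: "BlockStatement",
    body: [{
      type: "ReturnStatement",
      argument: {
        type: "BinaryExpression",
        operator: "+",
        left: {
          type: "CallExpression",
          callee: {
            type: "MemberExpression",
            computed: false,
            object: { type: "Identifier", name: "String" },
            property: { type: "Identifier", name: "fromCharCode" },
          },
          arguments: [{ type: "Literal", value: 0x61 }],
        },
        right: { type: "Literal", value: "X" },
      },
    }],
  },
};

console.log(constantReturn(xx)); // { type: 'Literal', value: 'aX' }
```

Every call site of `xx()` could then be replaced by the literal `"aX"`. The sketch assumes `String` has not been shadowed or redefined in the analysed script, which a real implementation would have to verify through scope analysis.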
Dealing W/ Complex Data
❖ Hardest task so far
❖ Similar to Variable Substitution but harder
❖ Deal w/ Arrays and Objects
❖ Deal with dynamic properties
----------------------------
❖ Ended up creating a scope wise state machine. :O
❖ Partially implemented
var h={w:2}; var t="a"; h[t]=3; var b=h.w+h[t]
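The slide's example can be followed with a small scope-wise state tracker. The sketch below is only an assumption about what such a state machine might look like, not JStillery's code. Objects are modelled as `Map`s of known properties, the statements are hand-built ESTree-shaped nodes, and all helper names are hypothetical.

```javascript
const UNKNOWN = Symbol("unknown");
const isPrimitive = (v) => ["string", "number", "boolean"].includes(typeof v);

// h.w -> "w"; h[t] -> the value of t, if known and usable as a key.
function propertyKey(member, scope) {
  if (!member.computed) return member.property.name;
  const key = evaluate(member.property, scope);
  return typeof key === "string" || typeof key === "number" ? String(key) : UNKNOWN;
}

function evaluate(node, scope) {
  switch (node.type) {
    case "Literal":
      return node.value;
    case "Identifier":
      return scope.has(node.name) ? scope.get(node.name) : UNKNOWN;
    case "ObjectExpression": {
      const obj = new Map(); // tracked object: property name -> known value
      for (const prop of node.properties) {
        const value = evaluate(prop.value, scope);
        if (value === UNKNOWN) return UNKNOWN;
        obj.set(prop.key.type === "Identifier" ? prop.key.name : String(prop.key.value), value);
      }
      return obj;
    }
    case "MemberExpression": {
      const obj = evaluate(node.object, scope);
      const key = propertyKey(node, scope);
      return obj instanceof Map && key !== UNKNOWN && obj.has(key) ? obj.get(key) : UNKNOWN;
    }
    case "BinaryExpression": {
      const left = evaluate(node.left, scope);
      const right = evaluate(node.right, scope);
      return node.operator === "+" && isPrimitive(left) && isPrimitive(right) ? left + right : UNKNOWN;
    }
    default:
      return UNKNOWN;
  }
}

// Walk the statements in order, updating the scope state as we go.
function run(statements) {
  const scope = new Map();
  for (const stmt of statements) {
    if (stmt.type === "VariableDeclaration") {
      for (const d of stmt.declarations) {
        scope.set(d.id.name, d.init ? evaluate(d.init, scope) : undefined);
      }
    } else if (stmt.type === "ExpressionStatement" &&
               stmt.expression.type === "AssignmentExpression" &&
               stmt.expression.operator === "=" &&
               stmt.expression.left.type === "MemberExpression") {
      const { left, right } = stmt.expression;
      const obj = evaluate(left.object, scope);
      const key = propertyKey(left, scope);
      const value = evaluate(right, scope);
      if (obj instanceof Map && key !== UNKNOWN && value !== UNKNOWN) {
        obj.set(key, value);
      } else if (left.object.type === "Identifier") {
        scope.set(left.object.name, UNKNOWN); // unknown write: stop trusting the object
      }
    }
  }
  return scope;
}

// Hypothetical node builders, used only to keep the example short.
const lit = (value) => ({ type: "Literal", value });
const id = (name) => ({ type: "Identifier", name });
const member = (object, property, computed) => ({ type: "MemberExpression", object, property, computed });
const decl = (name, init) => ({ type: "VariableDeclaration", declarations: [{ id: id(name), init }] });

// var h={w:2}; var t="a"; h[t]=3; var b=h.w+h[t]
const program = [
  decl("h", { type: "ObjectExpression", properties: [{ key: id("w"), value: lit(2) }] }),
  decl("t", lit("a")),
  { type: "ExpressionStatement",
    expression: { type: "AssignmentExpression", operator: "=", left: member(id("h"), id("t"), true), right: lit(3) } },
  decl("b", { type: "BinaryExpression", operator: "+",
              left: member(id("h"), id("w"), false), right: member(id("h"), id("t"), true) }),
];

console.log(run(program).get("b")); // 5
```

The dynamic property `h[t]` resolves only because `t` is already known to be `"a"`. When a write cannot be resolved, the whole object is marked unknown, which keeps later reads from being folded incorrectly.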
JStillery DEMO
Conclusions
This research aims to prove that although AST-based deobfuscation is not an easy task, it could lead to quite interesting results.
❖ Offline approach (multipass + binding-time analysis) could solve particular anti-deobfuscation techniques.
❖ BTW Function Hoisting was not covered! In case someone wondered.
❖ Does it work? Depends on the goals, of course ;)
❖ ActionScript would be mostly covered (as it is ECMAScript compatible)
Contacts + Q&A
Mail: [email protected]
Twitter: @wisecwisec
Global Corporate Site: http://www.mindedsecurity.com
Blog: http://blog.mindedsecurity.com
Twitter: http://www.twitter.com/mindedsecurity
YouTube: http://www.youtube.com/user/mindedsecurity
Thanks!