![Page 1: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/1.jpg)
JAST: Fully Syntactic Detection of Malicious (Obfuscated) JavaScript
Blackhoodie‘18 | 16/11/18
A. Fass, R. Krawczyk, M. Backes, B. Stock
![Page 2: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/2.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Ø Ex: crypto-trojans using JavaScript as payload
• Increase in the number / impact of cyber attacks
Context
2
![Page 3: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/3.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Ø Offloads the work to the user’s browserDrive-by download attacksRedirections
Crypto mining
Why JavaScript?
Can provide a basis for attacks
Enhance the interactivity of websites
Ø Windows hosts: execution after double-click
Malicious email attachments
3
![Page 4: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/4.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
AST-based analysis
Static analysis
Machine learning
Manual analysis➢ Plethora of attacks
➢ Obfuscation techniques
Motivations
4
![Page 5: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/5.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
• Identifies programmatic and structural constructs
• Provides code abstraction
• Identifies specific and recurrent malicious patterns
• Bypasses obfuscation
• Detects unknown malicious JS variants
Why an AST-Based Analysis?
5
![Page 6: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/6.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Methodology
Syntactic Units Frequency
Analysis
Parser 4-Grams Model
Random Forest
Features Extraction
Machine Learning
Model,
Predictions
Malicious Benign
6
![Page 7: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/7.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Extraction of Syntactic Units
var Euur1V = this [ "l9D" ] ("ev#333399al")
Identifier
ThisExpression
Literal
MemberExpression
Literal
CallExpression
VariableDeclarator
VariableDeclaration
Program
Source: esprima parserhttps://github.com/jquery/esprima
VariableDeclarator
Program
VariableDeclaration
Identifier Call
Literal
Expression
Member
Literal
Expression
This
Expression
var Euur1V = this [ "l9D" ] ("ev#333399al")
Identifier
Expression
Literal
Expression
Literal
Expression
VariableDeclarator
Declaration
Program
7
![Page 8: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/8.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Methodology
Syntactic Units Frequency
Analysis
Parser 4-Grams Model
Random Forest
Features Extraction
Machine Learning
Model,
Predictions
Malicious Benign
8
![Page 9: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/9.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
• Set of 4-grams: reflect short patterns
Frequency Analysis of 4-Grams
[ 15 , 0 , 3 , 0 , 3 , 0 , 12 , 4 , 8 ]
( 15 , 0 , 3 , 0 )
( 0 , 3 , 0 , 3 )
( 3 , 0 , 3 , 0 )
( 0 , 3 , 0 , 12 )
( 3 , 0 , 12 , 4 )
( 0 , 12 , 4 , 8 )2,000
143
677
152
806
257
[ …, 1/6, …, 1/6, … , 1/6, …,1/6, …,1/6, …,1/6, … ]
Serialisation
Frequency
Position in vector
1/6
1/6
1/6
1/6
1/6
1/6
Storage in a vector
4-grams production
var Euur1V = this [ "l9D" ] ("ev#333399al")
9
![Page 10: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/10.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Methodology
Syntactic Units Frequency
Analysis
Parser 4-Grams Model
Random Forest
Features Extraction
Machine Learning
Model,
Predictions
Malicious Benign
10
![Page 11: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/11.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
• Classifier: Random Forest, 500 trees
• Build a generic model to distinguish benign from
malicious JS samples
• Classification of unknown JS instances
Learning and Classification
11
![Page 12: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/12.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Experimental Data Sets
JS type Creation #JS Label Obfuscated
Emails 2017-2018 85,059 Malicious y
Microsoft 2015-2018 Benign y
Games N/A 2,007 Benign n
Frameworks N/A 434 Benign N/A
Atom 2011-2018 137 Benign y
17,668
12
![Page 13: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/13.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Experimental Data Sets
JS type Creation #JS Label Obfuscated
Emails 2017-2018 85,059 Malicious y
Microsoft 2015-2018 Benign y
Games N/A 2,007 Benign n
Frameworks N/A 434 Benign N/A
Atom 2011-2018 137 Benign y
17,668Why email attachments?
➢Common and effective way to spread JScript-Loader
➢Unique copy of the attachment for each recipient
12
![Page 14: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/14.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Experimental Data Sets
JS type Creation #JS Label Obfuscated
Emails 2017-2018 85,059 Malicious y
Microsoft 2015-2018 Benign y
Games N/A 2,007 Benign n
Frameworks N/A 434 Benign N/A
Atom 2011-2018 137 Benign y
3,500 3,500
81,559
Model
x 5
17,66814,168
Why email attachments?
Ø Common and effective way to spread JScript-Loader
Ø Unique copy of the attachment for each recipient
Why these benign samples?
Ø Does not confound obfuscation with maliciousness
Ø Nor unseen / unusual syntactic structures with maliciousness
12
![Page 15: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/15.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Detection Rate
Detection rate =
TP + TN
TP + FP + TN + FN
TP: True Positives TN: True Negatives
FP: False Positives FN: False Negatives
13
![Page 16: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/16.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Detection Rate
JS type Detection rate
Emails 99.46%
Microsoft 99.50%
Games 99.52%
Frameworks 99.03%
Atom 98.98%
Average benign 99.48%
13
![Page 17: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/17.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Detection Rate
Detection rate = TP + TN
TP + FP + TN + FN
TP: True PositivesTN: True Negatives
FP: False PositivesFN: False Negatives
JS type Detection rate
Emails 99.46%
Microsoft 99.50%
Games 99.52%
Frameworks 99.03%
Atom 98.98%
Average benign 99.48%
JAST shows that:
Ø Obfuscation ⇏ maliciousØ Plain text ⇏ benign
JAST leverages the differences in the obfuscation process for an accurate detection
13
![Page 18: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/18.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Related Work
Project Lexical AST Static Dynamic
JAST - ✓ ✓ -
CUJO [Rieck et al.]
✓ - ✓ ✓
PJSCAN [Laskov and Šrndić] ✓ - ✓ -
ZOZZLE [Curtsinger et al.]
- ✓ ✓ ✓
14
![Page 19: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/19.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Comparison with Related Work
Project FPR FNR Static Dynamic
JAST 0.52% 0.54% ü -
CUJO[Rieck et al.]
0.002% 5.6% ü ü
PJSCAN[Laskov and Šrndić]
16% 15% ü -
ZOZZLE [Curtsinger et al.]
0.00031% 9.2% ü ü
15
![Page 20: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/20.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Youden’s J Statistic: J = TPR - FPR
Ø Max value: optimal cut-off point between FP & FN
16
![Page 21: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/21.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Youden’s J Statistic: J = TPR - FPR ➢ Max value: optimal cut-off point between FP & FN
16
![Page 22: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/22.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Comparison with Other Approaches
Index FPR FNR FPR-Alexa 10k
0.7 0.00119% 2.16% 0.0126%
0.8 0 2.91% 0.00168%
0.9 0 4.34% 0.000671%
Comparison with Related Work
Project FPR FNR Classifier Approach
JAST 0.52% 0.54% RF AST
CUJO [Rieck et al.]
0.002% 5.6% SVM Lexical
ZOZZLE [Curtsinger et al.]
0.00031% 9.2% B-NB AST
17
![Page 23: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/23.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Comparison with Other Approaches
Index FPR FNR FPR-Alexa 10k
0.7 0.00119% 2.16% 0.0126%
0.8 0 2.91% 0.00168%
0.9 0 4.34% 0.000671%
Comparison with Related Work
Project FPR FNR Classifier Approach
JAST 0.52% 0.54% RF AST
CUJO [Rieck et al.]
0.002% 5.6% SVM Lexical
ZOZZLE [Curtsinger et al.]
0.00031% 9.2% B-NB AST
17
![Page 24: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/24.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
• Malicious instances with features not in the
training set
• Could be impacted by pollution attacks
Limitations
18
![Page 25: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/25.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
• Combining several classifiers
• Number of (different) AST nodes / 4-grams
• 4-grams presence/absence in addition to
frequencies
Further Defenses
19
![Page 26: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/26.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Outlook: Email-Based Model for Web JS
20
![Page 27: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/27.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Outlook: Email-Based Model for Web JS
20
HTML document / EK:
➢malicious iff at least one inline JS is classified as malicious ➢benign iff all inline JS are classified as benign
![Page 28: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/28.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Outlook: Email-Based Model for Web JS
20
![Page 29: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/29.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Outlook: Email-Based Model for Web JS
➢ Syntax features: core in classifying JS
20
Email-JS Web-JS
Unique copy to each recipient
Identify/exploit software vulnerabilities on the client’s side
- Variable/function name randomization - Data obfuscation - Encoding obfuscation
- Variable/function name randomization
![Page 30: J S : Fully Syntactic Detection of Malicious (Obfuscated ... · Why JavaScript? Can provide a basis for attacks Enhance the interactivity of websites ØWindows hosts: execution after](https://reader034.vdocument.in/reader034/viewer/2022052023/603883d1502a372d4761de1c/html5/thumbnails/30.jpg)
A. Fass, R. Krawczyk, M. Backes, B. Stock – Blackhoodie’18: JaSt – 16/11/2018
Ø Extraction of syntactic units
Ø Frequency analysis of 4-grams
Ø Learning and classification with random forest
Ø Performance evaluation
ü Accuracy of the predictions: 99.46%ü Low false negative rate: 0.54%ü Resilience to common obfuscation transformationsü Resilience to environment-dependent malware
• AST-based analysis to detect malicious JS
Conclusion
Thank you
21
Fass et al. JaSt: Fully Syntactic Detection of Malicious (Obfuscated) JavaScript. In DIMVA 2018
https://github.com/Aurore54F/JaSt @AuroreFass