thwarting emotet email conversation thread …...•scikit-learn python machine learning library...

30
Thwarting Emotet email conversation thread hijacking with clustering Olivier Coutu & Pierre-Luc Vaudry

Upload: others

Post on 03-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Thwarting Emotet email conversation thread hijacking with clusteringOlivier Coutu & Pierre-Luc Vaudry

Page 2: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

2

Olivier CoutuOperations Manager

Pierre-Luc VaudryR&D Director

Page 3: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

PART 1

A War Story:Email conversation thread hijacking

Page 4: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

4

IvanCargo Inc.

JaneShipping Company

Page 5: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

5

Page 6: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

6

Page 7: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

SUSPICIOUS

7

Page 8: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

8

IvanCargo Inc.

TomPharma Vision

Page 9: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

9

Page 10: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

10

SUSPICIOUS

Page 11: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Email conversation thread hijacking: how it works

● Compromised user,

emails siphoned out

● Content of emails replayed

● Attachment name

11

Page 12: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Email thread hijacking: A History

12

2017-05 2018-04 2018-07 2019-04 2019-08

NorthKoreanhackers

First time MO is

observed by ZEROSPAM

Emotetmakes

the news

Emotetis reportedto be using

this MO

We detect ourfirst sizeablewave usingclustering

Page 13: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Erratum regarding Emotet on 2019-08-02

The email conversation thread hijacking campaign samples we had from the

beginning of August turned out, after verification, not to be carrying Emotet. Over

time, we observed various other malware to be using email conversation thread

hijacking, such as Ursnif, QBot, and PredatorTheThief.

13

Page 14: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Suspicious cluster alert!

14

Page 15: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Time To Block

15

NEW

CAMPAIGN

LAUNCH

CLIENT

CONTACTS

US

30

MINUTES

Page 16: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Reducing the Time To Block

16

NEW

CAMPAIGN

LAUNCH

CLIENT

UNAWARE

& SECURE

CLUSTERING REACTION BY

ADJUSTING FILTERING

IN LESS THAN 20 MINUTES

Page 17: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Defending against email thread hijacking

17

Don’t whitelist

your contacts

Quarantine

suspicious files

Enable DMARC

verification

Page 18: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

PART 2

Under the hood:Clustering-basedcampaign detection

Page 19: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Why clustering for email filtering?

19

Clustering is the task of grouping objects that are similar (in some sense)

together into groups (called clusters).

An email campaign is a coordinated set of individual email messages that are

deployed across a specific period of time with one specific purpose.

A spam campaign is a particular case of an email campaign where the means

and/or purpose are not legitimate.

Page 20: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Our email clustering implementation

20

Page 21: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Spam cluster: filtering view

21

Page 22: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Spam cluster: clustering view

22

Page 23: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Inside the clustering pipeline

23

Input:

metadata

from last

12 minutes

Linear

combination

of similarity

scores

DBSCAN

clustering

algorithm

Cluster

analysis

Alignment

with

previous

clusters

Output

to GUI

CYCLE EVERY 2-3 MINUTES

Page 24: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Why DBSCAN over other algorithms?

● Input: precomputed

distance matrix

● Outliers: emails not part

of a campaign

● Chains of similarity inside

a cluster

24DBSCAN visualisation © Naftali Harris, 2012-2019

Page 25: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

How to evaluate and optimize

25

Adjust similarity

and clustering

parameters

manually

Collect a

metadata

corpus

Cluster with

manually

adjusted

parameters

Correct

clustering

results

Compute

Rand score

Optimize

parameters for

best Rand score

on corrected

clustering

Page 26: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

What’s coming?

26

IMPROVE

FILTERING

PUBLISH INDICATORS

OF COMPROMISE

BLOCK ENTIRE

CAMPAIGN

IMPROVE MONITORING OF

CAMPAIGNS OVER TIME

IMPROVE CAMPAIGN

CHARACTERISATION

Page 27: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Clustering: a generic option in your toolset

27

Detect diverse

emerging threats

Applicable to

many data

streams

Quickly

implemented

Page 28: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Acknowledgements

28

Vlad StamateSecurity Analyst

Luca NagyThreat Researcher

Page 29: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

References

DBSCAN• https://www.naftaliharris.com/blog/visualizing-dbscan-clustering/

• Scikit-learn Python machine learning library

Clustering emails• Qian et al. (2010). A Case for Unsupervised-learning-based Spam Filtering

• Chen et al. (2014). Clustering Spam Campaigns with Fuzzy Hashing

• Sarantopoulos, C. (2015). Identification and on-line incremental clustering of spam campaigns

• Patidar et al. (2016). Activity Detection from Email Meta-Data Clustering

• Smirnov (2018). Clustering and classification methods for spam analysis

Emotet timeline• https://unit42.paloaltonetworks.com/unit42-freemilk-highly-targeted-spear-phishing-campaign/

• https://www.zdnet.com/article/emotet-hijacks-email-conversation-threads-to-insert-links-to-malware/

29

Page 30: Thwarting Emotet email conversation thread …...•Scikit-learn Python machine learning library Clustering emails • Qian et al. (2010). A Case for Unsupervised-learning-based Spam

Questions?www.zerospam.ca