HTTP(S)-Based Clustering for Assisted Cybercrime Investigations
DESCRIPTION
Over the past several years there has been a noticeable rise in the number of reported targeted attacks, which are also commonly referred to as advanced persistent threats (APTs). This is seen by security experts as a landscape shift from a world dominated by widespread malware that infects indiscriminately, to a more selectively targeted approach with higher gain. One thing that is clear about targeted attacks is that they are difficult to detect, and not much research has been conducted so far on detecting these attacks. In this paper, we propose a novel system called SPuNge that processes threat information collected on the users' side to detect potential targeted attacks for further investigation. We use a combination of clustering and correlation techniques to identify groups of machines that share a similar behavior with respect to the malicious resources they access and the industry in which they operate (e.g., oil & gas). We evaluated our system against real data collected by an antivirus vendor from over 20 million customer installations worldwide. Our results show that our approach works well in practice and is helpful in assisting security analysts in cybercrime investigations.
TRANSCRIPT
HTTP(S)-Based Clustering for Assisted Cybercrime Investigations
Marco `embyte` Balduzzi, Vincenzo Ciancaglini and Robert McArdle
OWASP AppSec Research Europe 2013 2
Ingredients
Who am I?
● Italian, (?)
● M.Sc. in Comp. Engineering, Ph.D. in System Security
● 10+ years experience
● Now with Trend Micro Research
● Bridge academia and industry
@embyte
RoadMap
● Intro
● Targeted Attacks
● Detection
● System Overview
● SPuNge
● Experiments
● Conclusions
Security is hot.. is burning!
● Symantec's Internet Security Threat Report:
● Spam volume has decreased, but...
● Web-based attacks increased 30%
● 5,291 new vulnerabilities discovered in 2012
● The number of phishing sites spoofing social networks increased 125%
● 42% increase in targeted attacks in 2012
Targeted
● Shift from a world dominated by widespread malware that infects indiscriminately, to a more selectively targeted approach
● Just-for-fun era is over?
● Espionage, nation-driven, criminal organizations
● Specific target / industry, e.g. the energy sector
How to detect them?
● Difficult to detect
● Generic detection → difficult to distinguish from “traditional” widespread attacks
● Same techniques, different methodologies
● Assist cybercrime investigations
● To reduce the number of normal incidents to a more manageable amount for further in-depth analysis.
Idea!
● Identify groups of similar machines
● Share a common network behavior
– With respect to the malicious resources they access/request
– e.g. exploit kits, drive-by-downloads, C&C servers
● Correlate location and industry information
● Build “context”
General Overview
Working Data
● HTTP & HTTPS network traces
– Population of ~20,000,000 installations
● Collected at proxy-level, client-side
● Already-known malicious URLs
– Drive-by / web-based malware, fakeAVs, C&C servers, etc.
Pre-Processing
● Classification: Ignore parental-control URLs
● Network sampling: Keep a single “candidate” event per network (Class B)
● Event sampling: Remove multiple identical requests from a single machine
– E.g. botnet-controlled machines
● Duplicates identification: Remove URLs widely requested (e.g. >50 networks) → Widespread
● Whitelisting: Remove entries known to be useless (by previous iterations)
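The sampling, duplicates and whitelisting steps above could be sketched roughly as follows in Python. This is only an illustration, not the SPuNge implementation: the function names `class_b` and `preprocess`, the `(client_ip, url)` event format, and the choice to keep one event per (network, URL) pair are all assumptions. The classification step is left out because the slides do not say how parental-control URLs are identified.

```python
from collections import defaultdict
from ipaddress import ip_network


def class_b(ip):
    """Return the /16 (Class B sized) network of an IPv4 address."""
    return str(ip_network(f"{ip}/16", strict=False))


def preprocess(events, whitelist=frozenset(), max_networks=50):
    """Filter (client_ip, url) events; hypothetical helper for illustration.

    events: iterable of (client_ip, url) pairs for already-known malicious URLs.
    """
    # Whitelisting, then event and network sampling:
    # keep the first event seen for each (network, URL) pair.
    sampled = {}
    for ip, url in events:
        if url in whitelist:
            continue
        sampled.setdefault((class_b(ip), url), (ip, url))

    # Duplicates identification: count the distinct networks requesting each URL.
    networks_per_url = defaultdict(set)
    for net, url in sampled:
        networks_per_url[url].add(net)

    # Drop URLs requested from more than max_networks networks (widespread).
    return [event for (net, url), event in sampled.items()
            if len(networks_per_url[url]) <= max_networks]
```

The default `max_networks=50` mirrors the ">50 networks" example on the slide; a real deployment would presumably tune it.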
Step 1: Clustering
● Clustering: given a set of arbitrary elements and no prior information, identify groups (called clusters) and assign each element to one of them
● Find patterns in the collected data (URLs)
● Group malicious URLs according to similar Hostname or Request (Path + Query String) – or both
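To make the Hostname versus Request split concrete, here is a minimal Python sketch based on the standard library's `urllib.parse`. The function name `split_url` is hypothetical, and representing the query string by its parameter names only is an assumption taken from the "ignore values" note on the distance-function slide.

```python
from urllib.parse import urlsplit, parse_qsl


def split_url(url):
    """Split a URL into the features used for clustering (illustrative only).

    Returns (hostname, path, parameter_names); parameter values are discarded.
    """
    parts = urlsplit(url)
    # keep_blank_values=True so that parameters such as "?a=" are not lost.
    names = frozenset(name for name, _ in parse_qsl(parts.query,
                                                    keep_blank_values=True))
    return parts.hostname, parts.path, names
```

Host clustering would then compare only the first element, while request clustering would compare the path and the parameter names.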
Host vs Request Clustering
Distance Function
● Hostname
– Levenshtein = distance between strings
– Robert → Roger: Robert → Rogert, Rogert → Roger
● Request
– Path: Levenshtein
– Query String: Jaccard, # parameters in common (ignore values)
– http://[hostname]/path1.php?a=10&b=20&c=30
– http://[hostname]/path2.php?a=100&b=200
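The two distances named above can be sketched in plain Python as follows. This is a textbook formulation, not code from the SPuNge prototype; the function names are hypothetical, and how the path and query-string distances are combined into one request distance is not stated on the slides, so they are computed separately here.

```python
def levenshtein(s, t):
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                 # delete cs
                            curr[j - 1] + 1,             # insert ct
                            prev[j - 1] + (cs != ct)))   # substitute (or match)
        prev = curr
    return prev[-1]


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over two sets of query parameter names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # two empty parameter sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)


# The slide's examples:
print(levenshtein("Robert", "Roger"))                 # 2
print(levenshtein("/path1.php", "/path2.php"))        # 1
print(jaccard_distance({"a", "b", "c"}, {"a", "b"}))  # 1 - 2/3
```

For the two example URLs, the paths differ by one substitution and the parameter sets share two of three names, so both distances are small and the requests would likely land in the same cluster.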
Bubble View?
● Red = Hostname
● Blue = Request
● Violet = Both (?)
Step 2: Labeling and Merging
● Merge “similar” clusters, subsets
● Assign label to clusters (H/R)
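One possible reading of this step, sketched in Python: a cluster whose URLs are a subset of another cluster's URLs is absorbed into it, and the labels (H for hostname, R for request) are carried along. The function name `merge_subsets`, the `(labels, urls)` representation, and the decision to give a merged cluster both labels are assumptions; the slides do not define "similar" beyond the subset case.

```python
def merge_subsets(clusters):
    """Absorb clusters whose URL set is contained in a larger cluster.

    clusters: list of (labels, urls), with labels drawn from {"H", "R"}.
    Returns a list of (frozenset(labels), frozenset(urls)).
    """
    merged = []
    # Visit larger clusters first so that subsets find their superset.
    for labels, urls in sorted(clusters, key=lambda c: len(c[1]), reverse=True):
        urls = set(urls)
        for entry in merged:
            if urls <= entry[1]:
                entry[0] |= set(labels)   # the merged cluster keeps both labels
                break
        else:
            merged.append([set(labels), urls])
    return [(frozenset(l), frozenset(u)) for l, u in merged]
```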
Step 3: Machines Mapping
● Map URLs into machines → IP addresses
● Exercise:
– Which cluster does M1 belong to? M2?
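The mapping can be pictured as a simple lookup from each machine's requested URLs to the clusters those URLs were assigned to. The Python sketch below assumes a hypothetical `url_to_cluster` dictionary produced by the earlier steps and `(ip, url)` events; neither name comes from the slides.

```python
from collections import defaultdict


def map_machines(events, url_to_cluster):
    """Return {machine_ip: set of cluster ids} for the clustered URLs."""
    machine_clusters = defaultdict(set)
    for ip, url in events:
        cluster = url_to_cluster.get(url)
        if cluster is not None:          # URLs outside any cluster are skipped
            machine_clusters[ip].add(cluster)
    return dict(machine_clusters)
```

Answering the exercise then amounts to reading `map_machines(...)["M1"]` and `map_machines(...)["M2"]`.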
Step 4: Grouping
● Identify machines that belong to the same cluster (>=1).
● Machines that share a similar malicious behavior
● Scenario: Drive-by-download infection
– 1. The victim is redirected to the malicious page
– 2. Served with the right exploit.
● 2 Clusters
Step 4: Grouping
● Looking for similar victims
● Groups of machines (IPs) and clusters (URLs)
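A brute-force way to find such groups, sketched in Python under stated assumptions: invert the machine-to-clusters mapping, then intersect the machine sets of every combination of clusters. The function names, the string cluster ids, and the default thresholds of two machines and two clusters (matching the drive-by scenario of a redirect cluster plus an exploit cluster) are illustrative choices, not details from the slides.

```python
from collections import defaultdict
from itertools import combinations


def cluster_members(machine_clusters):
    """Invert {machine: clusters} into {cluster: machines}."""
    members = defaultdict(set)
    for machine, clusters in machine_clusters.items():
        for cluster in clusters:
            members[cluster].add(machine)
    return dict(members)


def groups(machine_clusters, min_machines=2, min_clusters=2):
    """Return {cluster combination: machines that reached all of them}."""
    members = cluster_members(machine_clusters)
    found = {}
    # Exhaustive over combinations; fine for a sketch, costly at scale.
    for combo in combinations(sorted(members), min_clusters):
        shared = set.intersection(*(members[c] for c in combo))
        if len(shared) >= min_machines:
            found[combo] = shared
    return found
```

For example, two machines that both hit a "redirect" cluster and an "exploit" cluster form one group, while a third machine that only hit the redirect does not join it.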
Last step: Analysis & Reporting
● Correlation: industry, country, etc.
● Two types of analysis:
– Clusters: N+ machines, operating in the same industry or country, reaching out to a cluster of similar URLs (1 cluster)
– Groups: N+ machines sharing C+ clusters
● 2 <= N,C <= 5
● Automated reporting for threat analysts
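The cluster-level analysis could look roughly like the Python sketch below: for each cluster, machines are bucketed by industry and by country, and a finding is reported when a bucket holds at least N machines. The function name `cluster_analysis`, the `machine_info` metadata dictionary, and the tuple format of a finding are assumptions made for illustration.

```python
from collections import defaultdict


def cluster_analysis(members, machine_info, n=2, keys=("industry", "country")):
    """Report clusters reached by n+ machines sharing an industry or country.

    members: {cluster_id: set of machines}
    machine_info: {machine: {"industry": ..., "country": ...}} (hypothetical)
    """
    findings = []
    for cluster, machines in members.items():
        for key in keys:
            by_value = defaultdict(set)
            for machine in machines:
                value = machine_info.get(machine, {}).get(key)
                if value is not None:    # machines without metadata are skipped
                    by_value[value].add(machine)
            for value, matched in by_value.items():
                if len(matched) >= n:
                    findings.append((cluster, key, value, frozenset(matched)))
    return findings
```

With `n` between 2 and 5, as on the slide, the output is a short list that an automated report could present to threat analysts.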
Findings
Experiment settings
● Python 2.7 prototype, multi-core
● Process data in daily batches (nighttime)
● Two machines: Processing and Final Analysis
● Evaluation run over 1 week of data
● Victims:
– 3 Organizations, 3 distinct countries
– Operating in the technology sector (manufacture)
● Malware injection into memory space to avoid easy detection (persistent)
● Netblock → Russian Business Network, known to provide support for targeted attacks
● Victims:
– 2 Organizations, Malaysia
– Oil & Gas Industry
● C&C servers contacted by r18nwn.exe
● Malware for Industrial Environments
● Domains → Registered by a person in China, associated with Targeted Attacks Operations
Conclusions
● Increasing number of Targeted Attacks
● Difficult to spot, similarities with traditional attacks
● SPuNge: Clustering-based techniques to identify potential targeted attacks from threat data
● Future work
– On-line processing
– GPU-assisted processing
– Enhance clustering, more features (e.g. process name, hash)
Thanks!
@embyte