![Page 1: Measuring the speed of the Red Queen’s Race · 2018-08-09 · Measuring the speed of the Red Queen’s Race Richard Harang and Felipe Ducau Sophos Data Science Team. Who we are](https://reader034.vdocument.in/reader034/viewer/2022042300/5ecb043e64bfc95c8a4c8c25/html5/thumbnails/1.jpg)
Measuring the speed of the Red Queen’s Race
Richard Harang and Felipe Ducau
Sophos Data Science Team
Who we are
• Rich Harang @rharang
o Research Director at Sophos – PhD UCSB; formerly scientist at U.S. Army Research Laboratory; 8 years working at the intersection of machine learning, security, and privacy
• Felipe Ducau @fel_d
o Principal Data Scientist at Sophos – MS NYU Center for Data Science; specializes in design and evaluation of deep learning models
Data Science @ Sophos
You?
The talk in three bullets
• The threat landscape is constantly changing; detection strategies decay
• Knowing something about how fast and in what way the threat landscape is changing lets us plan for the future
• Machine learning detection strategies decay in interesting ways that tell us useful things about these changes
Important caveats
• A lot of details are omitted for time
• We’re data scientists first and foremost, so…
o Advance apologies for any mistakes
o Our conclusions are machine-learning centric
"Now here, you see, it takes all the running you can do to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"
(Lewis Carroll, 1871)
Faster and faster…
Vandalism
1986 – Brain virus
1988 – Morris worm
1990 – 1260 polymorphic virus
1991 – Norton Antivirus, EICAR founded, antivirus industry starts in earnest
1995 – Concept virus
RATs, loggers, bots
2002 – Beast RAT
2003 – Blaster worm/DDoS
2004 – MyDoom worm/DDoS
2004 – Cabir: first mobile phone worm
2004 – Nuclear RAT
2005 – Bifrost RAT
2008-2009 – Conficker variants
Crimeware, weapons
2010 – Koobface
2011 – Duqu
2012 – Flame, Shamoon
2013 – Cryptolocker, ZeuS
2014 – Regin
2016 – Locky, Tinba, Mirai
2017 – WannaCry, Petya
Two (static) detection paradigms
Signatures
• Highly specific, often to a single family or variant
• Often straightforward to evade
• Low false positive rate
• Often fail on new malware
Machine learning
• Looks for statistical patterns that suggest “this is a malicious program”
• Evasive techniques not yet well developed
• Higher false positive rate
• Often does quite well on new malware
Complementary, not mutually exclusive, approaches
A crash primer on deep learning
A toy problem
[Figure: each dot is a PE file; benign files and malicious files are marked]
What we want
[Figure: feature space divided into benign regions and a malware region]
Training the model
• Randomly sample the training data
• Inputs (e.g. entropy, string length) → malware score
• Compare with ground truth
• Error-correct all weights
• …and repeat until it works
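The loop on this slide can be sketched in a few lines. This is a minimal illustration, not the model from the talk: it assumes a single-layer logistic model over two features (entropy and string length), and the toy data, learning rate, and step count are all invented for the example.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def malware_score(weights, bias, features):
    """Score in (0, 1); higher means 'more likely malicious'."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(samples, labels, steps=5000, learning_rate=0.5, seed=0):
    """Stochastic gradient descent on a one-layer model (an assumption)."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(steps):                         # ...and repeat until it works
        i = rng.randrange(len(samples))            # randomly sample the training data
        score = malware_score(weights, bias, samples[i])
        error = score - labels[i]                  # compare with ground truth
        weights = [w - learning_rate * error * x   # error-correct all weights
                   for w, x in zip(weights, samples[i])]
        bias -= learning_rate * error
    return weights, bias


# Hypothetical toy data: (entropy, string length), both scaled to [0, 1].
toy_samples = [(0.9, 0.2), (0.8, 0.1), (0.85, 0.3),   # malicious
               (0.2, 0.7), (0.3, 0.9), (0.1, 0.8)]    # benign
toy_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train(toy_samples, toy_labels)
print(malware_score(weights, bias, (0.9, 0.2)))  # a malicious-looking point
print(malware_score(weights, bias, (0.2, 0.8)))  # a benign-looking point
```

A deep network repeats the same sample, score, compare, correct cycle; the difference is that the correction is propagated back through several layers of weights rather than one.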
Recipe for an amazing ML classifier
A lot of training time
But.
…and six weeks later, we have this.
Our model performance begins to decay
Some clusters have significant errors
Some clusters still mostly correct
Machine learning models decay in informative ways
• Decay in performance happens because the data changes
• More decay means larger changes in data
Model confidence
Alice replied: 'what's the answer?' 'I haven't the slightest idea,' said the Hatter.
(Lewis Carroll, 1865)
Intuition: “borderline” files are likely misclassified
Intuition: “distant” files are likely misclassified
Do it automatically
• “Wiggle the lines” a bit
• Do the resulting classifications agree or disagree on a region?
• Amount of agreement = “Confidence”
Navid Kardan, Kenneth O. Stanley. Fitted Learning: Models with Awareness of their Limits. https://arxiv.org/pdf/1609.02226.pdf
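One way to picture "wiggling the lines" is to perturb a decision boundary at random many times and measure how often the perturbed classifiers agree. The sketch below is a simple stand-in under stated assumptions, not the fitted-learning method from the cited paper: it uses a linear boundary, Gaussian noise on the weights, and hypothetical values for the boundary, the noise level, and the two example files. It captures only the "borderline" intuition; it does not by itself detect files that are far from the training data.

```python
import random


def classify(weights, bias, point):
    """Linear decision rule: 1 = malicious side of the line, 0 = benign side."""
    return 1 if sum(w * x for w, x in zip(weights, point)) + bias > 0 else 0


def confidence(weights, bias, point, n_wiggles=500, noise=0.3, seed=0):
    """Fraction of randomly perturbed classifiers that agree with the majority.

    1.0 means every wiggled line gives the same answer; 0.5 means an even split.
    """
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_wiggles):
        wiggled_weights = [w + rng.gauss(0.0, noise) for w in weights]
        wiggled_bias = bias + rng.gauss(0.0, noise)
        votes += classify(wiggled_weights, wiggled_bias, point)
    share = votes / n_wiggles
    return max(share, 1.0 - share)


# Hypothetical decision boundary and two hypothetical files.
weights, bias = [1.0, -1.0], 0.0
borderline_file = (0.1, 0.1)   # sits exactly on the original line
clear_cut_file = (3.0, -3.0)   # far on the malicious side

print(confidence(weights, bias, borderline_file))
print(confidence(weights, bias, clear_cut_file))
```

The borderline file flips sides under small perturbations, so agreement stays near an even split, while the clear-cut file keeps the same label almost every time.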
Do it automatically
Key takeaway:
• High confidence ≈ Model has seen data like this before!
• Low confidence ≈ This data “looks new”!
Looking at historical data with confidence
"It's a poor sort of memory that only works backwards," the Queen remarked.
(Lewis Carroll, 1871)
Our model
Go from this… To this…
1024 Inputs
512 Nodes
512 Nodes
512 Nodes
512 Nodes
Output
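The layer widths on this slide are enough to sketch the shape of the network. Everything else below is an assumption made for illustration: fully connected layers with biases, ReLU activations on the hidden layers, a sigmoid on the single output, and the random initialization. The slides do not state any of these.

```python
import math
import random

# Layer widths read off the slide: 1024 inputs, four 512-node layers, one output.
LAYER_SIZES = [1024, 512, 512, 512, 512, 1]


def count_parameters(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def init_network(layer_sizes, seed=0):
    """Small random weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, inputs):
    """ReLU on hidden layers, sigmoid on the output (both assumed)."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            activations = [max(0.0, s) for s in sums]
        else:
            activations = [1.0 / (1.0 + math.exp(-s)) for s in sums]
    return activations[0]


print(count_parameters(LAYER_SIZES))

# Forward pass on a scaled-down network with the same depth, to keep the demo fast.
small_net = init_network([8, 4, 4, 4, 4, 1])
print(forward(small_net, [0.5] * 8))
```

Under these assumptions the full-size network has roughly 1.3 million weights and biases, most of them in the first layer that maps the 1024 inputs down to 512 nodes.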
Using confidence to examine changes in malware distribution
• Collect data for each month of 2017 (3M samples, unique sha256 values)
• Train a model on one month (e.g. January)
• Evaluate it on data from all future months and record the number of high/low confidence samples
[Figure: months Jan through Dec of 2017, with the Jan model applied to the following months (etc. for later models)]
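The procedure in these bullets amounts to a double loop over months. The sketch below shows only that bookkeeping. The `train` and `confidence` callables, the `low`/`high` thresholds, and the toy data are hypothetical placeholders, since the slides do not give the thresholds or the training code.

```python
import math


def confidence_drift(monthly_data, train, confidence, low=0.25, high=0.75):
    """Count low/high-confidence samples for every (training month, later month) pair.

    monthly_data: dict mapping month name -> list of samples, in calendar order.
    train:        function(samples) -> model.
    confidence:   function(model, sample) -> value in [0, 1].
    """
    months = list(monthly_data)
    counts = {}
    for i, train_month in enumerate(months):
        model = train(monthly_data[train_month])
        for eval_month in months[i + 1:]:          # future months only
            scores = [confidence(model, s) for s in monthly_data[eval_month]]
            n_low = sum(1 for s in scores if s <= low)
            n_high = sum(1 for s in scores if s >= high)
            counts[(train_month, eval_month)] = (n_low, n_high)
    return counts


# Toy stand-ins: a "model" is just the mean of its training month, and
# confidence falls off with distance from that mean.
def toy_train(samples):
    return sum(samples) / len(samples)


def toy_confidence(model, sample):
    return math.exp(-abs(sample - model))


toy_data = {
    "Jan": [0.0, 0.1, -0.1, 0.05],
    "Feb": [0.0, 0.2, 1.5, 0.1],
    "Mar": [0.1, 1.6, 1.7, 2.0],
}

drift = confidence_drift(toy_data, toy_train, toy_confidence)
for pair, (n_low, n_high) in drift.items():
    print(pair, "low:", n_low, "high:", n_high)
```

In the toy data the later months drift away from January, so the January model's low-confidence count rises and its high-confidence count falls as the evaluation month moves further out.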
Look at change in high/low confidence samples
• Train January model; count low-confidence samples for following months
• And for February
• And so on
• Remember:
o Low-confidence = “Looks new”
Same thing for high confidence samples
• Remember:
o High confidence = “Looks like original data”
Both forms of decay show noisy but clear trends
Estimate the rates with a best-fit line
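A best-fit line turns the noisy month-by-month counts into a single rate: the slope. The sketch below assumes ordinary least squares, which the slides do not confirm, and the data points are hypothetical rather than measurements from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical measurements: fraction of low-confidence samples seen
# 1, 2, 3, ... months after the model's training month.
months_after_training = [1, 2, 3, 4, 5, 6]
low_confidence_fraction = [0.010, 0.014, 0.013, 0.019, 0.021, 0.024]

slope, intercept = fit_line(months_after_training, low_confidence_fraction)
print(f"estimated change per month: {slope:.4f}")
```

The slope is the estimated change in the low-confidence fraction per month; multiplying by three gives a per-quarter figure.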
Examining changes within a single family
“I wonder if I've been changed in the night? Let me think. Was I the same when I got up this morning?”
(Lewis Carroll, 1865)
Confidence over time for individual families
[Figure: a collection of WannaCry/HWorld samples is scored by deep learning models trained on January 2017, February 2017, and March 2017 training data; each model yields a number of low confidence samples and a number of high confidence samples]
Confidence over time for individual families
• Proportion of samples for family scoring as high/low confidence vs model month
Samples first appear in training data
WannaCry/high confidence: dips as low as 70% after appearing in training data
HWorld/high confidence: never less than 84% after appearing in training data
WannaCry/low confidence: 9% down to 0.2% after appearing in training data
HWorld/low confidence: 1.3% down to 0.0008% after appearing in training data
56% of WannaCry samples score as high-confidence before the family first appears in training data; the detection rate in this subset is 99.98%
Distance measures from training data
• Large distances = larger change in statistical properties of the sample
o New family? Significant variant of existing one?
• Look at distances from one month to a later one for samples from the same family
January 2017 to May 2017
• Changes in the feature representation of samples lead to changes in distance
Distances to closest family member in training data
Distance and new family detection
“This thing, what is it in itself, in its own constitution?”
(Marcus Aurelius, Meditations)
Distance measures from training data
• Distance to the nearest point of any type in the training data
• Examine against model confidence
• Don’t need labels!
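The nearest-point distance described here needs only feature vectors, which is why no labels are required. The sketch below assumes Euclidean distance over a hypothetical 2-D feature space and a made-up threshold; the slides do not specify the metric, the feature representation, or any cutoff.

```python
import math


def nearest_training_distance(sample, training_points):
    """Euclidean distance from one sample to its closest training point."""
    return min(math.dist(sample, point) for point in training_points)


def flag_distant(samples, training_points, threshold):
    """Return (sample, distance) pairs farther than `threshold` from all training data."""
    flagged = []
    for sample in samples:
        distance = nearest_training_distance(sample, training_points)
        if distance > threshold:
            flagged.append((sample, distance))
    return flagged


# Hypothetical 2-D feature vectors; no labels are used anywhere.
january = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
july = [(0.1, 0.1), (3.0, 4.0), (1.0, 0.2)]

flagged = flag_distant(july, january, threshold=1.0)
for sample, distance in flagged:
    print(sample, round(distance, 3))
```

Samples that land far from everything in the training month are candidates for a closer look, and pairing the distance with the model's confidence score helps separate genuinely new material from familiar samples. This brute-force search is linear in the size of the training set per query; at millions of samples an approximate nearest-neighbor index would be the more practical choice.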
Distances – July data to nearest point in January data
Drill into clusters potentially worth examining further.
• Mal/Behav-238 – 1468 samples
• Mal/VB-ZS – 7236 samples
• Troj/Inject-CMP – 6426 samples
• Mal/Generic-S – 318 samples
• ICLoader PUA – 124 samples
… And several clusters of apparently benign samples
“Begin at the beginning,” the King said gravely, “and go on till you come to the end: then stop.”
(Lewis Carroll, 1865)
Conclusion
• ML models decay in interesting ways: this makes them useful as analytic tools, not just as simple classifiers
o Confidence measures – population and family drift
o Distance metrics – family stability, novel family detection
Practical takeaways
• ML and “old school” malware detection are complementary
o ML can sometimes detect novel malware; compute and use confidence metrics
• The rate of change of existing malware – from the ML perspective – is slow
o Retiring seems to be more common than innovation
• There are large error bars on these estimates, and they will vary by model and data set, but…
o Expect to see a turnover of about 1% per quarter of established samples being replaced by novel (from the ML perspective) samples
o About 4% per quarter of your most identifiable samples will be retired
Additional thanks to…
• Richard Cohen and Sophos Labs
• Josh Saxe and the rest of the DS team
• BlackHat staff and support
• … and John Tenniel for the illustrations
• Code + tools coming soon: https://github.com/inv-ds-research/red_queens_race