![Page 1: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/1.jpg)
Malware Attribution: Theory, Code and Result
![Page 2: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/2.jpg)
Who am I?
• Michael Boman, M.A.R.T. project
• Have been “playing around” with malware analysis “for a while”
• Working for FireEye
• This is a HOBBY project that I use my SPARE TIME to work on
![Page 3: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/3.jpg)
Agenda
• Theory behind Malware Attribution
• Code to conduct Malware Attribution analysis
• Result of analysis
![Page 4: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/4.jpg)
Theory
![Page 5: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/5.jpg)
• Malware Attribution: tracking cyber spies - Greg Hoglund, Blackhat 2010
http://www.youtube.com/watch?v=k4Ry1trQhDk
![Page 6: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/6.jpg)
What am I trying to do?
Binary → Human (move this way)
![Page 7: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/7.jpg)
What am I trying to do?
Binary → Human
• Blacklists
• Net Recon, Command and Control
• Developer Fingerprints
• Tactics, Techniques, Procedures
• Social Cyberspace (DIGINT)
• Physical Surveillance (HUMINT)
![Page 10: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/10.jpg)
• Blacklists
• Net Recon, Command and Control
• Developer Fingerprints
• Tactics, Techniques, Procedures
• Social Cyberspace (DIGINT)
• Physical Surveillance (HUMINT)

• Actions / Intent
• Installation / Deployment
• CNA (spreader) / CNE (search & exfil tool)
• COMS
• Defensive / Anti-forensic
• Exploit
• Shellcode
• DNS, Command and Control Protocol, Encryption
![Page 12: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/12.jpg)
Steps
• Step 0: Gather malware
• Step 1: Extract metadata from binary
• Step 2: Store metadata and binary in MongoDB
• Step 3: Analyze collected data
![Page 13: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/13.jpg)
Step 0: Gather malware
• VirusShare (virusshare.com)
• OpenMalware (www.offensivecomputing.net)
• MalShare (www.malshare.com)
• CleanMX (support.clean-mx.de/clean-mx/viruses)
• Malware Domain List (www.malwaredomainlist.com/mdl.php)
![Page 14: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/14.jpg)
Step 1: Extract metadata from binary
![Page 15: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/15.jpg)
Development Steps
Source → Machine → Binary
• Core “backbone” sourcecode
• Tweaks & Mods
• 3rd party sourcecode
• 3rd party libraries
• Compiler
• Runtime libraries
• Time
• Paths
• MAC Address
• Malware
• Packing
![Page 18: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/18.jpg)
Step 1: Extract metadata from binary
• Hashes (for sample identification): md5, sha1, sha256, sha512, ssdeep etc.
• File type / Exif / PEiD
• Compiler / Packer etc.
• PE Headers / Imports / Exports etc.
• Virustotal results
• Tags
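The hash part of this step can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the function name `file_hashes` and the placeholder bytes are assumptions, and ssdeep is left out because it needs a third-party binding.

```python
import hashlib

def file_hashes(data: bytes) -> dict:
    """Return the cryptographic digests named on the slide for one sample."""
    return {
        name: hashlib.new(name, data).hexdigest()
        for name in ("md5", "sha1", "sha256", "sha512")
    }

# Placeholder bytes standing in for a sample read from disk (hypothetical).
sample = b"MZ" + b"\x00" * 62

record = file_hashes(sample)
record["size"] = len(sample)  # extra field, an assumption for illustration
```

The resulting dictionary is the kind of per-sample record that could later be stored alongside the binary.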
![Page 19: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/19.jpg)
Identifying compiler / packer
• PEiD
• Python
• peutils.SignatureDatabase().match_all()
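PEiD-style signatures are byte patterns with `??` wildcards, usually checked at the entry point. The sketch below shows that matching idea in plain Python. It is not how `peutils` is implemented, and the signature names and byte patterns are hypothetical, invented only for the example.

```python
def parse_signature(sig: str) -> list:
    """Turn '60 BE ?? 8D' into [0x60, 0xBE, None, 0x8D]; None is a wildcard."""
    return [None if tok == "??" else int(tok, 16) for tok in sig.split()]

def matches(pattern: list, data: bytes, offset: int = 0) -> bool:
    """Check whether the pattern matches the bytes starting at offset."""
    window = data[offset:offset + len(pattern)]
    if len(window) < len(pattern):
        return False
    return all(p is None or p == b for p, b in zip(pattern, window))

def match_signatures(signatures: dict, data: bytes, ep_offset: int) -> list:
    """Return the names of every signature that matches at the entry point."""
    return [
        name for name, sig in signatures.items()
        if matches(parse_signature(sig), data, ep_offset)
    ]

# Hypothetical signatures, NOT taken from a real PEiD database.
SIGNATURES = {
    "ExamplePacker v1 (hypothetical)": "60 BE ?? ?? 8D BE",
    "ExampleCompiler (hypothetical)": "55 8B EC 6A FF",
}

# Synthetic bytes: 4 bytes of padding, then an "entry point" at offset 4.
code = bytes.fromhex("00000000" "60BE11228DBE" "9090")
hits = match_signatures(SIGNATURES, code, ep_offset=4)
```

With the pefile package, the equivalent step is loading a PEiD signature file into `peutils.SignatureDatabase` and calling `match_all()` on a parsed PE.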
![Page 20: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/20.jpg)
PE Header information
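As a rough illustration of what the PE header yields, the sketch below reads the COFF file header with `struct` and extracts the machine type, section count and compile timestamp. The function name and the synthetic header bytes are assumptions for the example; a real pipeline would more likely use a PE parsing library.

```python
import struct
from datetime import datetime, timezone

def parse_coff_header(data: bytes) -> dict:
    """Extract a few COFF file header fields from raw PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points to the 'PE\0\0' signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": machine,
        "sections": n_sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "optional_header_size": opt_size,
        "characteristics": characteristics,
    }

# Synthetic header (not a real sample): DOS stub with e_lfanew = 64,
# followed by the PE signature and a 20-byte COFF header.
dos = b"MZ" + b"\x00" * 58 + struct.pack("<I", 64)
coff = struct.pack("<HHIIIHH", 0x14C, 3, 1384819200, 0, 0, 224, 0x102)
info = parse_coff_header(dos + b"PE\0\0" + coff)
```

The `compiled` field is the value that feeds the compile-time analysis; it can be forged, so it is a hint rather than proof.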
![Page 21: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/21.jpg)
VirusTotal Results
![Page 22: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/22.jpg)
Tags
• User-supplied tags to identify sample source and behavior
• analyst / analyst-system supplied
![Page 23: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/23.jpg)
Step 2: Store metadata and binary in MongoDB
![Page 24: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/24.jpg)
Components
• Modified VXCage server
• Collects a lot more metadata than the original
• Stores malware & metadata in MongoDB instead of FS / ORDBMS
![Page 25: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/25.jpg)
VXCage REST API
• /malware/add: Add sample
• /malware/get/<filehash>: Download sample. If no local sample, search other repos
• /malware/find: Search for sample by md5, sha256, ssdeep, tag, date
• /tags/list: List tags
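A client for these endpoints could look roughly like the sketch below. Only the paths come from the slide. The base URL, the HTTP verbs and the form-encoded search body are assumptions, and the requests are built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumption: address of a local VXCage server

def get_request(file_hash: str) -> Request:
    """Build a download request for /malware/get/<filehash>."""
    return Request(f"{BASE_URL}/malware/get/{file_hash}")

def find_request(key: str, value: str) -> Request:
    """Build a search request for /malware/find (assumed to be a form POST)."""
    body = urlencode({key: value}).encode()
    return Request(f"{BASE_URL}/malware/find", data=body, method="POST")

def tags_request() -> Request:
    """Build a request for /tags/list."""
    return Request(f"{BASE_URL}/tags/list")

# Example: search by tag; 'banker' is a hypothetical tag value.
req = find_request("tag", "banker")
# urllib.request.urlopen(req) would send it to a running server.
```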
![Page 26: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/26.jpg)
Step 3: Analyze collected data
![Page 27: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/27.jpg)
Identifying development environments
• Compiler / Linker / Libraries
• Strings
• Paths
• PE Translation header
• Compile times
• Number of times the software has been built
![Page 28: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/28.jpg)
Cataloging behaviors
• Packers
• Encryption
• Anti-debugging
• Anti-VM
• Anti-forensics
![Page 29: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/29.jpg)
Result
![Page 30: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/30.jpg)
Have I seen you before?
• Detects similar malware (based on SSDEEP fuzzy hashing)
![Page 31: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/31.jpg)
Different MD5, 100% SSDeep match
![Page 32: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/32.jpg)
SSDEEP Analysis (3007)
![Page 34: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/34.jpg)
SSDEEP Analysis (851)
![Page 35: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/35.jpg)
Challenges
• Party handshake problem:
• 707k samples analyzed and counting (resulting in over 250 billion compares!)
• Need a better target (pre-)selection
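The handshake count grows as n(n-1)/2, which is why roughly 707k samples land at about 250 billion comparisons. One possible pre-selection, offered here as an assumption and not as the approach used in the talk, relies on the fact that ssdeep only compares hashes whose block sizes are equal or differ by a factor of two. The hash strings below are made up for the example.

```python
from collections import defaultdict
from itertools import combinations

def handshake_pairs(n: int) -> int:
    """Number of pairwise comparisons among n samples."""
    return n * (n - 1) // 2

def block_size(ssdeep_hash: str) -> int:
    """ssdeep hashes have the form 'blocksize:hash1:hash2'."""
    return int(ssdeep_hash.split(":", 1)[0])

def candidate_pairs(hashes):
    """Yield only pairs whose block sizes are equal or differ by a factor of 2."""
    buckets = defaultdict(list)
    for h in hashes:
        buckets[block_size(h)].append(h)
    for size, members in buckets.items():
        # Same block size.
        yield from combinations(members, 2)
        # Neighbouring block size (double); the half case is covered
        # when the smaller bucket is visited.
        for other in buckets.get(size * 2, []):
            for member in members:
                yield (member, other)

# Made-up hash strings, only the block-size prefix matters here.
hashes = ["96:abc:def", "96:abd:deg", "192:xyz:uvw", "3072:qqq:rrr"]
pairs = list(candidate_pairs(hashes))
```

In this toy set the filter keeps 3 of the 6 possible pairs; on a real corpus the saving depends on how file sizes are distributed.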
![Page 36: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/36.jpg)
What compilers / packers are common?
1. "Borland Delphi 3.0 (???)", 54298
2. "Microsoft Visual C++ v6.0", 33364
3. "Microsoft Visual C++ 8", 28005
4. "Microsoft Visual Basic v5.0 - v6.0", 26573
5. "UPX v0.80 - v0.84", 22353
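A ranking like this can be produced by tallying the signature matches stored per sample. The sketch below uses `collections.Counter` on in-memory records. The `peid` field name and the records themselves are hypothetical, and in the real setup the tally would presumably run against MongoDB.

```python
from collections import Counter

# Hypothetical metadata records; only the 'peid' field matters here.
samples = [
    {"sha256": "aa", "peid": ["Microsoft Visual C++ v6.0"]},
    {"sha256": "bb", "peid": ["UPX v0.80 - v0.84"]},
    {"sha256": "cc", "peid": ["Microsoft Visual C++ v6.0"]},
    {"sha256": "dd", "peid": []},  # nothing identified
]

counts = Counter(name for doc in samples for name in doc["peid"])
ranking = counts.most_common(5)  # top five, like the list above
```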
![Page 37: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/37.jpg)
Are there any unidentified packers?
• How to identify a packer
• A PE section that is empty in the binary (no raw data on disk) but is marked writable and executable
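That heuristic can be expressed directly against the PE section table. The sketch below parses 40-byte section headers with `struct` and flags sections with no raw data that are both writable and executable. The function names and the two synthetic section headers are assumptions for illustration.

```python
import struct

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes per section

def parse_sections(table: bytes, count: int):
    """Yield (name, virtual_size, raw_size, flags) for each section header."""
    for i in range(count):
        name, vsize, _vaddr, raw_size, *_rest, flags = SECTION_HEADER.unpack_from(
            table, i * SECTION_HEADER.size
        )
        yield name.rstrip(b"\x00").decode("ascii", "replace"), vsize, raw_size, flags

def looks_like_unpack_target(raw_size: int, flags: int) -> bool:
    """Empty on disk, yet writable and executable at runtime."""
    wx = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE
    return raw_size == 0 and (flags & wx) == wx

# Two synthetic section headers (not from a real sample).
table = (
    SECTION_HEADER.pack(b"UPX0", 0x8000, 0x1000, 0, 0x400, 0, 0, 0, 0, 0xE0000080)
    + SECTION_HEADER.pack(b".text", 0x1000, 0x9000, 0x1000, 0x400, 0, 0, 0, 0, 0x60000020)
)
suspicious = [
    name for name, _vsize, raw_size, flags in parse_sections(table, 2)
    if looks_like_unpack_target(raw_size, flags)
]
```

A hit with no matching packer signature is a candidate for an unidentified packer.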
![Page 38: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/38.jpg)
How common are anti-debugging techniques?
• 31,622 out of 531,182 PE binaries use IsDebuggerPresent (6%)
• Packed executables not counted
![Page 39: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/39.jpg)
Analysis Coverage
Source → Machine → Binary
• Core “backbone” sourcecode
• Tweaks & Mods
• 3rd party sourcecode
• 3rd party libraries
• Compiler
• Runtime libraries
• Time
• Paths
• MAC Address
• Malware
• Packing
![Page 40: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/40.jpg)
Future
![Page 41: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/41.jpg)
What am I trying to do in the future
Binary → Human
• Blacklists
• Net Recon, Command and Control
• Developer Fingerprints
• Tactics, Techniques, Procedures
• Social Cyberspace (DIGINT)
• Physical Surveillance (HUMINT)
Expand scope of analysis: +network, +memory, +OS changes, +behavior
![Page 42: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/42.jpg)
What am I trying to do in the future
• More automation
• More modular design
• Solve the “Big Data” issue I am getting myself into (Hadoop?)
• More pretty graphs
![Page 43: DEEPSEC 2013: Malware Datamining And Attribution](https://reader033.vdocument.in/reader033/viewer/2022042700/554ffba3b4c90579108b4f70/html5/thumbnails/43.jpg)
Thank you
• Michael Boman
• @mboman
• http://blog.michaelboman.org
• Code available at https://github.com/mboman/vxcage