abusing file processing in malware detectors for fun and proﬁt

Abusing File Processing in Malware Detectors for

Fun and ProfitSuman Jana and Vitaly Shmatikov

The University of Texas at Austin

• All about sophisticated detection and evasion techniqueso Polymorphism, metamorphism, obfuscation…

Modern malware research

• All about sophisticated detection and evasion techniqueso Polymorphism, metamorphism, obfuscation..

the world’s best malware detector

the world’s simplest virus

no changes to virus content

against

Topic of this talk

Malware researcher’s view of malware detection

malware detectorhoneypot

malware

internet gateway

malware detector

parseinfer file type

How malware detectors work in practice

internet

intranet

• Detection algorithms are type-specific• Parsing depends on file type

• Detectors may skip less vulnerable types like MPEG

efficiency

Why must malware detectors infer file types?

correctness

Parsing in malware detectors

macros

Archive files HTML filesWord documents

find macros extract filesremove whitespace

characters

Parsing in malware detectors

MS CAB

MS CHMJavaScript

PDFMS EXE

Adobe Flash

RTFMS PPT(X)

MS DLL7-zip JAR

Why must malware detectors parse before detection?

• Identify executable contento Macros in Word fileso Code segments in PE, ELFo JavaScript in CHM

• Normalize input to a form suitable for detectiono Decompresso Preprocess HTML

• Separate metadata from content

detectors must parse lots of file

formats

File-type inference and parsingtake place in two different places

internet

infer file type

if uninfected

malware detector user application/OS

parse difference =

potentialevasion

infer file type

Exhibit A (CVE-2012-1419)

• TAR files: ustar at offset 257• mirc.ini files: [aliases] at offset 0

TAR archive

eicar.com\0

header

initial 100 bytes contains the name

of first file ustar

eicar.com

• TAR files: ustar at offset 257• mirc.ini files: [aliases] at offset 0

TAR archive

[aliases].com\0

header

filename changes but the content is

unmodified ustar

Exhibit A (CVE-2012-1419)

eicar.com

Vulnerable detectors

Exhibit B (CVE-2012-1463)

Executable and Linkable Format (ELF)

offset 51 : little-endian2 : big-endian

header

Exhibit B (CVE-2012-1463)

offset 51 : little-endian2 : big-endian

header

2Linux ELF loader does not use this

byte but most malware detectors

Executable and Linkable Format (ELF)

Exhibit C (CVE-2012-1461)

eicar.tar

eicar.tar.1 eicar.tar.2

eicar.tar.gz

most detectors cannot parse such files correctly but

gunzip does

Exhibit D (CVE-2012-1459)

TAR archive layout

most detectors ignore

checksum field

length

header 1

header 2

uninfected file

checksum

Exhibit D (CVE-2012-1459)

TAR archive layout

length

header 1

wrong checksum

header 2

uninfected file most detectors ignore

checksum field

GNU tar ignores header with wrong checksum, extracts

malware

Many more attacks

• 45 different CVE reports for previously unknown evasion exploits

• 9 file formats• 13 applications

36 tested detectors – ALL vulnerable

You might be thinking…

Aren’t these well-known bugs?

Response from AV vendors

• OMG! These exploits completely bypass our detection engineso Patches are being pushed out

Aren’t these the same as browser content-sniffing bugs?

Content-sniffing bugs in browsers

• MIME content sniffing in Web browsers can be exploited for XSS attacks o First reported by Palant (2007) and Nazario (2009)

• Defense for browsers [Barth et al.]: prefix-disjoint signatures… does not work for malware detectorso Signatures for many formats that detectors must deal

with are not prefix-disjoint

These attacks affect only archive formats

Does this affect only archive formats?

• No, we have attacks against ELF, PE, MS CHM, MS Word, etc.

• What is an archive format anyway?o Many modern formats (e.g. PDF, MS Word) allow

embedding different types of content

Behavioral detection will save us

No, behavioral detection will not save you

infer file type

malware detector user application/OS

they must be exactly the same

For behavioral detection to work

here ….

infer file type

Possible solutions

• Write better parsers • On-access scanning

o Does not work in network/cloud detectors

• Better integration of malware detectors with applicationso Applications can share intermediate state after

parsing with cloud/network detectors

nonstarter

only works for archive files

abusing file processing in malware detectors for fun and proﬁt

file type detectors

content detectors

file types

file processing

file ustareicar

tar archive aliases

vulnerable types

tar archive eicar

Documents

abusing duqu, flame, miniflame

abusing target

abusing javascript for fun and profit

the dark side of the forsshe - botconf 2019 · the dark...

firm’s decision - proﬁt...

abusing twitter api & oauth implementation -...

abusing history · abusing history: 4 abusing history: a...

taesoo kim abusing performance optimization weaknesses to...

covert channel by abusing x509...

abusing, & exploiting dhcp

abusing firefox extensions

abusing belkin home automation devices

abusing marital relation20.09.11

extract me if you can: abusing pdf parsers in malware...

· 2016. 11. 3. · the team at cadbury fundraiser...

abusing firefox extensions - def con

abusing firefox extensions - security assessment

taesoo kim abusing performance optimization weaknesses...

abusing html 5 client-side storage

intrusion detection and malware analysis - malware...