
ABSTRACT

MULUKUTLA, VIKRAM. Wolfsting: Extending Online Dynamic Malware Analysis Systems by Engaging Malware. (Under the direction of Dr. Douglas S. Reeves).

Malware has evolved into a major threat to both personal and business computing, with professional

programmers being hired to create highly customizable and targeted malware packages. Malware analysis

research has also evolved from simple signature based schemes to sophisticated static, dynamic and

hybrid analysis methods that can detect and classify malicious behavior. A useful class of tools, termed

online dynamic malware analyzers, is used by researchers to generate detailed reports of malware binary

execution. These tools consist of virtualized or emulated environments wherein a binary executable is

run and every system call invoked by processes spawned by the executable within the analysis system

is recorded. It is desirable to combine the positive aspects of low overhead and ease of use found in

online dynamic malware analyzers with the key goal of eliciting more identifiable and traceable malicious

behavior. To this end, we present Wolfsting, a fully automated dynamic analysis system that incorporates

virtualization technology, kernel level system call hooking and a unique method of comparing system

calls from isolated executions of a malware instance. Wolfsting augments bare-bones operating system

installations normally used in online dynamic malware analysis systems to contain the exact environment

that malware processes look for, in terms of OS objects such as files, processes and system configuration

settings. Wolfsting also simulates a user attempting to remove malware from a system with the intent

of eliciting defensive behavior from malware such as disabling antivirus and anti-malware functionality.

Results show that by presenting such an environment to malware processes, Wolfsting is able to force

recent and infamous malware instances such as ZBot to execute more of their malicious codebase, with

a fixed analysis time and low runtime overhead.

Wolfsting: Extending Online Dynamic Malware Analysis Systems by Engaging Malware

by Vikram Mulukutla

A thesis submitted to the Graduate Faculty of North Carolina State University

in partial fulfillment of the requirements for the Degree of

Master of Science

Computer Science

Raleigh, North Carolina

2010

APPROVED BY:

Dr. Douglas S. Reeves, Chair of Advisory Committee

Dr. Peng Ning Dr. Xuxian Jiang

Dedication

To my family and friends.


Biography

Vikram Mulukutla did most of his schooling in Bangalore, a city in the southern state of Karnataka

in India. He received his Bachelor of Engineering in Computer Science degree from the BMS College of

Engineering at Bangalore. He then worked for one year as an associate software engineer with IBM’s

India Software Lab, before arriving at the North Carolina State University, where he is currently pursuing

a Master of Science degree in Computer Science. His interests lie in low level programming, security and

mobile OS development.


Acknowledgements

I thank Dr. Douglas S. Reeves, my advisor, for his inspiring thoughts and ideas, unwavering guidance,

and most of all for the creative freedom he granted in allowing me to come up with an idea and pursue it

to completion. I thank Dr. Peng Ning and Dr. Xuxian Jiang for agreeing to be on my thesis committee. I

thank all the members of the Cyber Defense Lab at NCSU, especially Young Hee Park, whose mentorship

and guidance have proven invaluable to me. I thank the Department of Computer Science and NCSU for

creating and maintaining such a beautiful and professional environment that is conducive to work and

research.

I thank my friends for their support and the good times we’ve had throughout this process. I

thank Priyank Kumar, who is also pursuing the Master’s thesis option here at NCSU, for all the lively

discussions during coffee and lunch breaks during the time when we were writing our thesis drafts. I

thank all those who have been involved with the creation and maintenance of LyX, the LaTeX frontend

that made writing this thesis a breeze.

I sincerely acknowledge the work of giants whose shoulders this thesis stands upon.

I especially thank my family for their support and commitment to my education all the way to my

Master’s degree.


Table Of Contents

List Of Tables

List Of Figures

Chapter 1 Introduction

1.1 Malware and Malware Defense

1.2 Online Dynamic Malware Analysis Systems

1.3 Definitions

1.3.1 Malware

1.3.2 Malware Behavior

1.3.3 Honeypots

1.3.4 Malware Analysis

1.3.4.1 Static Analysis

1.3.4.2 Dynamic Analysis

1.3.5 Online Dynamic Malware Analysis Systems

1.3.6 Virtualization terminology

1.4 Thesis Organization

Chapter 2 Background And Related Work

2.1 Overview

2.2 Recent and related work in static analysis

2.2.1 Techniques to counter static analysis - and countering methods

2.3 Recent and related work in dynamic analysis

2.3.1 Categories of dynamic analysis

2.3.1.1 Monitoring un-tampered malware execution

2.3.1.2 Tampering with malware execution to force malicious behavior

2.3.2 Using the output of dynamic analysis

2.4 Online Dynamic Malware Analysis Systems and Wolfsting

Chapter 3 Wolfsting Motivation and Methodology

3.1 Overview

3.2 Observations

3.2.1 Malware looking for certain resources

3.2.2 Malware exploiting outdated or very recent software

3.2.3 Malware persisting on the user’s machine

3.2.4 Focus on certain system calls

3.2.5 Advantages of multiple executions

3.2.5.1 Randomized identifier strings

3.2.5.2 Process injection behavior

3.3 Engaging Malware

3.3.1 Simulating a user attempting to remove the malware

3.3.2 Creating fake resources and settings

Chapter 4 Wolfsting Design

4.1 Virtual Environment

4.2 Guest OS Components

4.3 Analysis Engine

4.3.1 New Behavior Extractor

4.3.2 Housekeeping/Future-Success System Call Filtering Component

4.3.3 Resource Creation

4.4 Summary

Chapter 5 Wolfsting Execution

5.1 Wolfsting Processing

5.2 Two Runs

Chapter 6 Wolfsting Implementation

6.1 Virtualized Environment

6.2 Guest OS Components

6.2.1 Kernel device driver

6.2.2 Userland program to simulate user actions

6.3 Analysis Engine

6.3.1 New Behavior Extractor

6.3.2 Filtering Component

6.3.3 Resource Creator

Chapter 7 Results

7.1 Overview

7.2 Trojan Dropper

7.2.1 Analysis by Resource Creation

7.2.2 Simulating user actions

7.3 AVKiller (Agent2)

7.4 Vundo

7.4.2 Random Identifiers

7.5 TDSS

7.6 Zeus/ZBot

7.6.2 Random Identifiers

7.7 Observations and Summary

7.7.1 Overhead analysis

7.7.2 Trace output

7.7.3 Registry activity bias

7.7.4 Summary

Chapter 8 Limitations

8.1 Limitations Of Dynamic Analysis

8.2 Limitations Of Wolfsting

8.3 Techniques to thwart Wolfsting

Chapter 9 Conclusions And Future Work

9.1 Conclusions

9.1.1 Observations

9.1.2 Summary

9.2 Future Work

9.2.1 Scope for future work

9.2.2 Usefulness in Corporate Environments

9.2.3 Usefulness to Individual or Home users

9.3 Summary


List of Tables

Table 7.1 Windows registry keys requested by ZBot

Table 7.2 New behavior, i.e., new registry values (boldface) accessed by ZBot after creation of keys in Table 7.1


List of Figures

Figure 1.1 A sample CWSandbox Analysis Report

Figure 1.2 A sample Anubis Report

Figure 1.3 Registry changes recorded in CWSandbox

Figure 1.4 File changes recorded in Anubis

Figure 1.5 Host, Virtual Machine, and the Guest OS

Figure 3.1 CWSandbox output displaying queries made by malware for information on certain folders

Figure 4.1 Design overview of Wolfsting

Figure 4.2 ZBot injects code into different processes on every execution, thus system calls from two different threads may be identical otherwise

Figure 4.3 Certain parameters of the NtCreateFile Windows API are used as part of a key to differentiate between two NtCreateFile calls

Figure 5.1 Wolfsting Processing

Figure 5.2 A Wolfsting VM guest with the malware binary, userland component and kernel driver loaded into memory

Figure 5.3 Two or more system calls with the same purpose

Figure 6.1 Resource creation component creating resources as per malware requests

Figure 7.1 New behavior recorded in Trojan-Dropper.Win32.VB after Wolfsting created the COM3\Debug key.

Figure 7.2 Trojan-Dropper.Win32.VB prevents task manager from executing by modifying a registry key that sets a dummy debugger process for task manager. The dummy debugger process does absolutely nothing; task manager is never launched.

Figure 7.3 XML Output from Wolfsting shows how Trojan-Dropper.Win32.VB modifies a registry key to prevent Task Manager from executing

Figure 7.4 Security descriptor of the Windows folder, displaying permissions for users in the system

Figure 7.5 Wolfsting XML output displaying some of the folders queried (in the <filename> tags) by the Trojan.Win32.Agent2 malware

Figure 7.6 Wolfsting XML output shows Agent2 setting bad security descriptors for antivirus folders, making them inaccessible. The exact security descriptor has not been shown for brevity.

Figure 7.7 Security descriptor of an antivirus (AVG) folder nullified after modification by the malware instance

Figure 7.8 Registry key modified by Vundo to disable KAVP’s update service (Value of zero disables updates)

Figure 7.9 Wolfsting output displaying modifications made to Internet Explorer settings, specifically the way IE connects to the network and the Internet

Figure 7.10 Random identifier string (boldface) used by Vundo - the string changes during every execution

Figure 7.11 Wolfsting analysis approach for TDSS

Figure 7.12 Trace output (filtered to show only relevant activity) when running TDSS

Figure 7.13 ZBot Infection Rate in terms of number of reported infections worldwide as measured by Microsoft

Figure 7.14 ZBot worker thread in explorer.exe stealing username and password information. The API displayed here is ZwQueryRegistryValueKey, which retrieves a value corresponding to a given registry key. This behavior can only be observed if the requested keys exist in the registry.

Figure 7.15 Random identifier strings used by ZBot


Chapter 1

Introduction

1.1 Malware and Malware Defense

Malware has evolved to become a major threat to both business and personal computing,

with the Internet as its primary medium for spreading. Malware research has also advanced to

counter this emerging threat, specifically in the areas of static and dynamic analysis. However,

end users - whether in a company network or at home - still largely depend on antivirus

software to eliminate malware from their systems. Antivirus companies hire analysts who

reverse engineer malware that has been trapped in honeypots to update antivirus databases

with signatures and automatic removal scripts. One useful class of tools for this process

is the online dynamic malware analysis system: a virtualized environment where malware binaries

are run and closely monitored, generating detailed reports that include traces of almost every

operation executed by the malware binary. The aim of this thesis is to improve the output

of such systems by engaging malware during its execution. In this introductory chapter, we

explore online dynamic malware analysis systems, list some definitions that will help clarify the

context, and provide a short description of how the rest of the thesis is organized.


Figure 1.1: A sample CWSandbox Analysis Report

1.2 Online Dynamic Malware Analysis Systems

The purpose of online dynamic malware analysis systems is primarily to assist human malware

analysts in reverse engineering malware binaries. They consist of closely monitored, virtualized

or emulated environments in which user submitted binaries are executed for a fixed

amount of time. These systems trace the execution of the binaries in their instrumented environments.

The output of these systems consists of human and/or machine readable reports

that include comprehensive information on all traceable actions performed by the malware,

from low level native API calls to high level network activity. More recent tools have also

begun to represent information graphically to further aid analysts. Two well-known examples

of such tools are CWSandbox [1] and Anubis [2]. Figs. 1.1 and 1.2 show sample output traces

from these two systems respectively.

The motivation for these systems is that static analysis is limited by the progress in binary

encryption and packing technology, disassembly overhead and the inherent complexity involved

in creating a picture of malware intent from static code. The term packing is defined in section

1.3.4.1. With the advent of cheap virtualization and cloud computing technology, actually

running malware in an isolated environment to observe its behavior has become a viable option.


Figure 1.2: A sample Anubis Report

While the information provided in these reports is invaluable, the analysis systems themselves

are vanilla machines that usually have bare-bones installations of operating systems. The

motivation for this work is that more useful work can be done within the isolated analysis

system, providing tangible benefits to malware analysts and end users.

1.3 Definitions

1.3.1 Malware

This thesis defines malware as any program that, without the user’s knowledge or permission,

executes on the user’s machine with one or more of the following intents:

1. Stealing information: With the rapid expansion of personal computing and the Internet,

users frequently store their personal data on their machines. This includes usernames,

passwords, authentication tokens, financial and credit information. All such data is targeted

by malware; new malware families are increasingly tailored to a particular

type of data.

2. Causing harm to the system by modifying or destroying resources and settings: While

most malware prefers to use stealth techniques to hide from the user or antivirus software,

some destructive variants are designed to disable or completely destroy the functionality

of a system.

3. Using the system’s resources without authorization: Malware may also use a computer’s

resources for its own purposes without the knowledge of the user. A large network of

such compromised machines is called a botnet. Botnets are now listed as the top threat

to security on the Internet. Hundreds to thousands of machines are forcefully taken over

by malware and then slaved to either a peer to peer or centralized command network.

1.3.2 Malware Behavior

All programs interact with the operating system (OS) using the Application Programming

Interface (API) provided by the OS. We define malware behavior as the unique set of system calls

made by malware processes and threads on a system. These calls represent the interaction of the

malware with the system and user environment; hence, malicious intent is described in this

thesis as the set of system calls that enable malware processes to extract information without

permission, to persist on the user’s system, to attempt to infect the network, to download

other malware, or to use, modify or delete resources and settings without permission.
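The definition above can be illustrated with a minimal sketch. It assumes each recorded call is reduced to an API name plus one identifying parameter; the `SysCall` type, the helper names and the registry paths are hypothetical and are not taken from the Wolfsting implementation. Behavior is modeled as a set, so calls that appear only in a second run fall out as a set difference.

```python
from typing import FrozenSet, Iterable, NamedTuple, Set


class SysCall(NamedTuple):
    """One recorded call: API name plus the parameter that identifies its target."""
    api: str     # e.g. "NtOpenKey"
    target: str  # e.g. a file path or registry key (hypothetical values below)


def behavior(trace: Iterable[SysCall]) -> FrozenSet[SysCall]:
    """Collapse a trace into its unique set of calls; order and repeats are dropped."""
    return frozenset(trace)


def new_behavior(first_run: Iterable[SysCall], second_run: Iterable[SysCall]) -> Set[SysCall]:
    """Calls observed in the second run that never appeared in the first."""
    return set(behavior(second_run) - behavior(first_run))


# Two toy traces: the second run performs one extra registry query.
run1 = [
    SysCall("NtOpenKey", r"HKLM\Software\Example"),
    SysCall("NtOpenKey", r"HKLM\Software\Example"),
]
run2 = run1 + [SysCall("NtQueryValueKey", r"HKLM\Software\Example\Password")]

print(new_behavior(run1, run2))
```

Treating behavior as a set rather than a sequence makes two runs comparable even when the calls are issued in a different order.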

1.3.3 Honeypots

A honeypot, as defined in [26], is a closely monitored system that attracts malware by openly

advertising unprotected services and potential attack vectors to fool malware into believing

that the honeypot is a legitimate target. With the rapid advent of virtualization and cheap

commodity hardware in recent years, honeypots have become increasingly cost-effective to build

and maintain. Several universities and security corporations maintain large networks of these

machines, aptly termed honeynets.


1.3.4 Malware Analysis

1.3.4.1 Static Analysis

Static analysis involves analyzing malware binaries without executing them, using binary

disassembly or some other technique to reveal the intent of the malware binary code. Mathematical

and statistical tools may be used to glean some pattern out of binary code to help in

malware classification or detection. Recent work using these methods is detailed in section 2.2.

The implementation of static analysis systems may require the knowledge of the following:

1. Underlying Hardware: Object code - when disassembled - is a set of processor instructions

and data. For any static analysis method, comprehensive knowledge of the instruction set of

the processor that the malware binary has been compiled for is required.

2. Operating System Executable Format: Comprehensive knowledge of the executable

format including storage layouts, data structures as well as precise knowledge of how the loader

component in the OS loads the binary image into memory is essential to static analysis systems.
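As a small illustration of the executable-format knowledge listed above, the sketch below reads the first fields of a Windows PE image. PE is chosen here as an assumption because the systems discussed run Win32 binaries; the function name and the synthetic test stub are illustrative only. It relies on the documented layout: the `MZ` magic at offset 0, the `e_lfanew` pointer at offset 0x3C, and the `PE\0\0` signature followed by the COFF file header.

```python
import struct


def read_pe_header(image: bytes) -> dict:
    """Return the COFF machine type and section count of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' magic")
    # e_lfanew: 32-bit little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF file header follows the signature: Machine, then NumberOfSections.
    machine, n_sections = struct.unpack_from("<HH", image, pe_offset + 4)
    return {"machine": machine, "sections": n_sections}


# Synthetic 0x80-byte stub, built only to exercise the parser.
stub = bytearray(0x80)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x40)
stub[0x40:0x44] = b"PE\0\0"
struct.pack_into("<HH", stub, 0x44, 0x014C, 3)  # 0x014C = IMAGE_FILE_MACHINE_I386

print(read_pe_header(bytes(stub)))
```

A real static analyzer would go on to parse the optional header and section table; this sketch stops at the two fields needed to show the idea.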

Disassembly of object code is inherent to most static analysis implementations. This presents

a challenging problem, namely that of binary packing and encryption. Binary packing, as

defined in [14], is the compression and/or encryption of executable code in malware binaries using

various algorithms. These algorithms, which are called packers when implemented, may also

include anti-debugging and anti-virtualization techniques preventing easy analysis. Unpacking

is the reverse process, wherein the compressed executable code is restored to its original form,

as defined in [13]. A lot of research has been done to discover a generic method to unpack

binaries that are packed with any algorithm as discussed in section 2.2. Static analysis may

involve the use of statistical algorithms to define parameters and indicators that are then

measured to assess the system.
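The packing and unpacking steps described above can be sketched with a toy packer. This is an illustration under stated assumptions, not any real packer: compression uses `zlib`, the "encryption" is a one-byte XOR, and the payload is a stand-in string rather than executable code. It shows why byte signatures fail on packed binaries: the same payload packed under two keys yields different bytes, yet both unpack to the original.

```python
import hashlib
import zlib


def pack(code: bytes, key: int) -> bytes:
    """Compress, then XOR every byte with a one-byte key (toy 'encryption')."""
    return bytes(byte ^ key for byte in zlib.compress(code))


def unpack(packed: bytes, key: int) -> bytes:
    """Reverse of pack(): undo the XOR, then decompress."""
    return zlib.decompress(bytes(byte ^ key for byte in packed))


payload = b"push ebp; mov ebp, esp; call payload; " * 20  # stand-in for executable code
packed_a = pack(payload, 0x5A)
packed_b = pack(payload, 0xC3)

# Same payload, different keys: the packed bytes (and hashes of them) differ,
# yet both restore the identical original.
print(hashlib.sha256(packed_a).hexdigest() != hashlib.sha256(packed_b).hexdigest())
print(unpack(packed_a, 0x5A) == unpack(packed_b, 0xC3) == payload)
```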

1.3.4.2 Dynamic Analysis

Dynamic analysis involves executing malware binaries in isolated environments (usually

virtualized or emulated) and observing their behavior in real time. Binaries or environments


are instrumented to provide detailed reports or traces of the binary’s operation. Data tainting

may also be used to track data as it passes between function calls, over the network, etc. There

are several types of dynamic analysis methods; some categories are listed below:

1. Un-tampered execution tracing: These methods involve running malware instances and

monitoring their execution, generating a trace report that details every operation performed

by the malware instance. The malware program’s execution path is not tampered

with or modified in any way once the binary instance begins execution. However, the

environment in which the malware is run (for example, network settings) may be varied,

as discussed in section 2.3.1.1.

2. Extracting or forcing malicious behavior: These methods involve forcing malware instances

to execute more of their binary code than they would in a normal execution

environment. This involves modifying the execution path of malware instances; several

approaches have been defined and are discussed in section 2.3.1.2. For example, this may

be accomplished by executing every branch path in a control flow graph of the malware’s

execution.
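To make the idea of covering every branch path concrete, the sketch below enumerates all entry-to-exit paths of a small control flow graph. The graph, node names and function are hypothetical; real forcing systems manipulate a running binary and must handle loops, whereas this toy example assumes an acyclic graph.

```python
from typing import Dict, List

# Hypothetical control flow graph: node -> successor nodes.
CFG: Dict[str, List[str]] = {
    "entry": ["check_env"],
    "check_env": ["payload", "exit"],  # conditional branch with two outcomes
    "payload": ["exit"],
    "exit": [],
}


def all_paths(cfg: Dict[str, List[str]], node: str, end: str) -> List[List[str]]:
    """Depth-first enumeration of every path from node to end (acyclic graphs only)."""
    if node == end:
        return [[node]]
    return [[node] + rest for nxt in cfg[node] for rest in all_paths(cfg, nxt, end)]


for path in all_paths(CFG, "entry", "exit"):
    print(" -> ".join(path))
```

An un-tampered run follows only one of these paths; a forcing approach aims to observe the others as well, including the one through `payload`.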

The output of dynamic analysis, i.e., the traces generated by the above approaches, may then

be converted to an intermediate format and stored in databases that can be used to classify

malware or detect new malware. These systems may be categorized as follows:

1. Clustering: The output of dynamic analysis approaches as mentioned above is fed to

clustering algorithms that create clusters according to certain parameters. These clusters

may then be used to detect and/or classify new malware.

2. Graph Matching: The traces output by dynamic analysis approaches above are used to

construct call flow graphs (CFGs) that are subsequently used to detect new malware using

graph comparison algorithms.

3. Generation of intrusion detection signatures: Dynamic analysis output may also be used

to generate signatures for host based or network based intrusion detection, as discussed

in section 2.3.1.1.

Figure 1.3: Registry changes recorded in CWSandbox

A more detailed discussion of examples of these systems is presented in section 2.3.2. It is the

aim of this thesis to provide better traces to improve the accuracy of these systems.
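As a rough sketch of the clustering category above, traces can be reduced to sets of API names and grouped by Jaccard similarity. The greedy single-pass scheme, the threshold and the sample traces are all assumptions made for illustration; the systems cited in this thesis use their own features and algorithms.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection over size of the union (1.0 for two empty sets)."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def cluster(traces: Dict[str, Set[str]], threshold: float) -> List[List[str]]:
    """Greedy single pass: join the first cluster whose first member is similar enough."""
    clusters: List[List[str]] = []
    for name, calls in traces.items():
        for members in clusters:
            if jaccard(calls, traces[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical traces, each reduced to the set of API names it contains.
traces = {
    "sample_a": {"NtCreateFile", "NtWriteFile", "NtSetValueKey"},
    "sample_b": {"NtCreateFile", "NtWriteFile", "NtSetValueKey", "NtClose"},
    "sample_c": {"NtOpenProcess", "NtWriteVirtualMemory"},
}
print(cluster(traces, threshold=0.6))
```

Richer traces give such a comparison more to work with, which is the sense in which better traces can improve the accuracy of downstream classification.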

1.3.5 Online Dynamic Malware Analysis Systems

Online dynamic malware analysis systems have been defined in section 1.2. The output of

these systems usually consists of a detailed execution trace and possibly other graphical aids

to highlight the behavior of an executable binary file in a particular operating system. For

example, CWSandbox runs Win32 binaries on virtual machines running Windows XP SP3 and

generates reports that include everything from network accesses to Windows registry operations

as shown in Fig. 1.3. Fig. 1.4 shows a trace displaying file activities performed by an executable

named sample.exe. This trace was generated by Anubis.


Figure 1.4: File changes recorded in Anubis

1.3.6 Virtualization terminology

1. Host: The underlying physical hardware and operating system on which the virtualization

software runs one or more virtualized machines.

2. Guest OS: The operating system installed in the virtual machines.

3. Snapshot: A virtual machine snapshot is an instantaneous image of the guest OS memory

and the state of the virtual hardware that is saved on the host machine. The snapshot

can be reloaded onto the virtual machine to resume execution from when it was taken.

Fig. 1.5 illustrates how the above components of a virtualized environment fit together.


[Diagram: Hardware, Host Operating System, and several Virtual Machines, each containing a Guest Operating System]

Figure 1.5: Host, Virtual Machine, and the Guest OS

1.4 Thesis Organization

This thesis is organized into several modular chapters. The next chapter lists recent advances

in malware research as well as relevant work. Chapter 3 describes the methodology of Wolfsting,

the tool that this thesis describes. Each component of the Wolfsting design is described in

Chapter 4, and a single run of the entire Wolfsting process is described in Chapter 5. Chapter

6 describes the current prototype implementation of Wolfsting. Chapter 7 details some of the

results of running malware binary samples from various families through the Wolfsting process.

The thesis is concluded in Chapters 8 and 9, which list the limitations of this work, and provide

a concluding discussion of Wolfsting and future work respectively.


Chapter 2

Background And Related Work

2.1 Overview

The field of malware analysis and defense has grown rapidly in this decade.

Advances in the area of static and dynamic analysis have resulted in better detection and

classification of malware, as well as making reverse engineering malware easier for security

professionals. This chapter will explore the current state of the art in the field and the relevance

of recent work to this thesis. We first delve into static and dynamic analysis, show how they

have led to the evolution of online dynamic malware analysis systems and finally bring Wolfsting

into the context, providing a basis for the next chapter, which details the motivation for this

work.

2.2 Recent and related work in static analysis

In their seminal work, Wagner and Dean [21] constructed call flow graphs from application

source code to create models of application behavior that could be used for intrusion detection.

Ganapathy et al. [45] statically analyze C source code to detect possible buffer overrun

vulnerabilities; this is done by modeling C string manipulations (input/output string functions)

as a linear program.


Most static analysis methods today analyze malware binaries directly, due to the unavailability

of malware source code. Christodorescu et al. [23] discussed a code-obfuscation resilient

architecture to detect malware - the detection mechanism takes as its input a generic automaton

model of malicious code together with an annotated call flow graph of an executable program

and determines if the same malicious patterns exist within the automaton and the CFG. In

a subsequent work, Christodorescu et al. [24] use static analysis to create semantic behavior

models that are immune to code obfuscation techniques; these models are then used for detection.

In her PhD thesis, Zhang [36] uses static analysis to create behavioral patterns out

of instruction sequences in malware binaries and then performs clustering to provide detection

capabilities.

This research has also led malware authors to counter the aforementioned methods with the

increasingly complex techniques listed in section 2.2.1. Static analysis remains

an important tool simply because it does not require the binary to be executed at any point;

thus the analysis system itself remains trustworthy.

Static analysis has also been used in non-malware research. With regard to fingerprinting,

for example, Brumley et al. [46] translate binary code to symbolic equations which are then

solved to find deviations between implementations of the same protocol.

2.2.1 Techniques to counter static analysis - and countering methods

1. Packing - Polymorphic Code: As defined in [47], this technique involves compressing

and/or encrypting the executable code in a malware binary. A small code fragment

termed an unpacker runs when the binary is loaded on a victim machine and unpacks the

executable, restoring the executable code in memory. Because the packed bytes can change with each copy,

no two binaries of the same malware implementation will share a signature. Fogla et al. [49] showed that detecting

an attack that uses polymorphic blending - using polymorphic techniques to blend in

with normal traffic - is an NP-complete problem. However, there exist approaches to

detection that are resilient to some degree to polymorphism. Chinchani and van den


Berg [50] detect malicious code inside network flows by tracing network traffic looking for

typical patterns such as NOP sleds; polymorphic code is detected by looking for evidence

of cycles, registers initialized outside and referenced inside loops etc. Li et al. [51] use a

heuristic approach to generate signatures for polymorphic worms by analyzing invariant

content in the worm binaries. [24], mentioned previously, describes a generic template to

capture the decrypting loop of a polymorphic malware instance. C. Kruegel et al. [52]

use structural information in executables to detect polymorphic worms; they construct

CFGs from executables and use graph isomorphism concepts to match these with CFGs

constructed from binary traffic in network flows.

2. Metamorphic code: As defined in [48], these techniques involve a malware program rewrit-

ing itself - and its payload if present - by inserting NOP instructions, swapping registers,

reordering control flow with indirect jumps and other such tweaks; thus each time a mal-

ware program copies itself to another location, it will look different both as a binary and

in memory during runtime, preventing signature based detection and hampering other

forms of static analysis. Chouchane and Lakhotia [53] use code scoring to identify the

metamorphic engine - i.e., the code that produces metamorphic variants of the same bi-

nary code; their method works with rule based engines. Zhang and Reeves [35] describe

MetaAware, a system that identifies metamorphic malware by using an algorithm that

quantitatively compares the similarity of instruction sequences that lead to a system call.
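To make the signature problem behind item 1 concrete, the following minimal Python sketch packs the same payload with two different keys. It is not taken from any of the cited systems: the single-byte XOR "packer", the names `pack`, `unpack` and `signature`, and the use of a SHA-256 digest as a stand-in for a byte-level signature are all assumptions made purely for illustration.

```python
import hashlib


def pack(payload: bytes, key: int) -> bytes:
    """Toy packer (assumption): XOR each byte with a one-byte key, prepend the key."""
    return bytes([key]) + bytes(b ^ key for b in payload)


def unpack(packed: bytes) -> bytes:
    """Toy unpacker stub: read the key from the first byte and undo the XOR."""
    key = packed[0]
    return bytes(b ^ key for b in packed[1:])


def signature(data: bytes) -> str:
    """Stand-in for a static signature: a hash over the raw bytes on disk."""
    return hashlib.sha256(data).hexdigest()


# The same payload packed under two keys yields two different on-disk images.
payload = b"\x90\x90example-payload"
variant_a = pack(payload, 0x41)
variant_b = pack(payload, 0x7F)
```

The two variants have different digests even though both unpack to the same bytes in memory, which is why the approaches cited above look at structure or runtime behavior rather than at raw byte signatures.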

Notably, considerable research effort has gone into accurately disassembling binaries that use the

above methods by means of dynamic analysis. Kang et al. [13] describe Renovo, an unpacking

system that monitors real time malware execution to determine when a binary has completed

its unpacking routine. Martignoni et al. [14] describe OmniUnpack, a similar system. Dinaburg

et al. [44] describe Ether, an analysis system completely external to the guest OS in which the

malware executes and that is capable of unpacking binaries.

The limitations of static analysis and the availability of cheap commodity hardware and

analysis tools have led to a great drive in research using dynamic analysis methods, as will be

discussed in section 2.3.

2.3 Recent and related work in dynamic analysis

2.3.1 Categories of dynamic analysis

Dynamic analysis has been defined in section 1.3.4.2. Tremendous work has gone into this

area of malware analysis, driven by the limits of static analysis, the advent of virtualization technology

and the easy availability of cheap commodity hardware. We first discuss two categories

of dynamic analysis as defined in section 1.3.4.2.

2.3.1.1 Monitoring un-tampered malware execution

In this section we discuss methods that do not directly tamper with or modify the execution

path of the malware instance once such execution has begun. CWSandbox and Anubis fall into

this category as they run malware binaries in monitored virtualized and emulated environments

respectively and do not tamper with the malware execution.

Rieck et al. [16] execute botnet binaries repeatedly with varying network and host settings,

allowing the binaries to phone home, i.e., contact the command server specified in their configu-

ration. Signatures for such communication are then generated and used for intrusion detection

at the network level. Repetitive execution is done with the reasoning that a large spectrum

of the kinds of bot traffic will be captured. The fundamental idea is that the host execution

environment is modified to create new variations in the network traffic environment. This the-

sis emphasizes a similar concept in terms of how only the environment in which the malware

binary is executed is modified; Wolfsting tries to modify the system to render to the malware

process those resources and settings it is looking for, so that additional malicious behavior can

be traced directly on the host.

To contrast our work, we explore an interesting PhD thesis by Cui [20] that describes a

system capable of detecting a variety of malware by identifying non-user initiated actions, such


as the first connection made by worms to the Internet, by tracking user actions such as mouse

and keyboard input and matching them to network connections. Thus the system attempts

to infer user intent to detect mismatching process behavior; Wolfsting, on the other hand,

tries to infer malware intent by simulating user actions and presenting the malware with the

environment it is looking for.

[16, 20] are examples of work that force malicious activity without modifying the execution

of the malware instance, without brute force disassembly, static trigger-discovery or branch ex-

ploration - techniques that may lead to possibly impractical overheads. While the aim of these

papers was to create better network intrusion detection signatures and provide advanced mal-

ware detection capabilities at the end host, the same concept of forcing malicious activity only

by modifying the environment in which it executes, while retaining practical and autonomous

aspects, is used by Wolfsting to augment online dynamic analysis systems, as discussed in

section 1.2.

2.3.1.2 Tampering with malware execution to force malicious behavior

This section discusses methods that directly tamper with or modify the execution path (in

memory) of a malware program once it has begun execution. Extensive research has gone into

automatically executing all possible code paths in a malware process. An oft-cited work by

Moser et al. [11] involves creating snapshots of malware processes when they encounter, for

example, a branch instruction. To explore both branches, the process snapshot is restored to

the original branch point after executing a branch. This allows for a comprehensive execution

of the malware binary code, with gains of up to 3000% in terms of code base coverage.

Another interesting and very relevant work, Brumley et al. [10], discusses the use of mixed

symbolic and concrete execution in a system that attempts to explore trigger based malware

behavior, i.e. malicious activity that is launched based on some trigger condition, such as the

arrival of a particular date or time. Crandall et al. [40] also attempt to detect time based triggers

using a concept called virtual time. The possibility of exploring nearly every execution path even


with the overhead experienced by implementations of these approaches is greatly attractive, and

complete code coverage is the aim of many dynamic analysis systems; this thesis is offered as a

cheaper, more practical means of offering more insight into malware activity.

The above approaches exhibit a high overhead in terms of both time as well as processing.

[11] uses up to 20 seconds as the timeout for a single branch of execution; there are several

hundred such branches in a single malware executable. [10] reports analysis time of 28 minutes

for a single binary of the MyDoom virus. Wolfsting executes binaries with a set time limit

of 2 minutes of execution time (with a little extra for overhead as explained in section 7.7.1).

Note, however, that these systems provide far greater code coverage than Wolfsting can

possibly provide. Also, [11] uses virtual machine snapshotting to restore the process to its

original state at every branch after subsequent paths have been explored; Wolfsting needs to

restore the guest snapshot only once. Therefore, Wolfsting offers a more practical means of

eliciting malware behavior while compromising on code coverage.

2.3.2 Using the output of dynamic analysis

We now provide a discussion on the research effort put into using the output of the dynamic

analysis approaches in section 2.3.1. Kolbitsch et al. extend their work [15] on Anubis to construct

behavior graphs out of malware execution traces output by the emulation environment

of Anubis. Their work uses data taint analysis to track arguments between system calls, and

graph comparison algorithms that match behavior graphs to detect and classify malware. In a

similar work [12] by Hu et al., the authors construct function call graphs and develop a database

that uses graph matching algorithms - with optimizations such as pruning - to detect malware.

This thesis may contribute to these systems by providing alternative execution paths that lead

to more unique call flow graphs or function call graphs that would help increase the accuracy

of malware classification as well as reduce false positive rates.

Another approach to using the output of dynamic analysis involves the use of various clustering

algorithms on intermediate representations of malware execution traces to construct

clusters that are later used to detect and classify malware. In a technical report that extends

their work on CWSandbox, Rieck et al. [19] transform the trace output by CWSandbox to

an intermediate instruction set. The set of instructions for a single malware binary is termed a

behavioral profile, and clustering is performed on behavioral profiles for large samples of mal-

ware binaries to generate clusters that can be used for malware detection and classification.

Wolfsting may contribute to such a system by exposing more malicious behavior in the trace

output, thereby increasing the accuracy of classification and reducing false positives.

Both graph matching and clustering are highly scalable and easy to implement; they do not

focus on a single malware instance’s particular execution run. We now delve into how the

above-mentioned systems may be made more accurate and efficient by improving the dynamic

analysis approach that they depend on.

2.4 Online Dynamic Malware Analysis Systems and Wolfsting

This thesis is built upon the ideas and work specifically in the area of dynamic malware

analyzers that operate in a virtualized or emulated environment, specifically providing a method

to force more malicious behavior out of an executable. We have already introduced CWSandbox

- a tool that is capable of running malware in an isolated environment, providing detailed traces

and reports on the malware binary’s execution and Anubis - similar to CWSandbox except that

binaries are run in an emulated environment built using QEMU [9]. CWSandbox inserts hooks

into Windows user API runtime dynamic libraries and loads these libraries into virtual machines

forming part of a cluster. User submitted binaries are run on these virtual machines in which

the hooks trace a wide range of calls in the Windows User API. There is no interaction between

the analysis system and the malware’s execution in Anubis and CWSandbox; this is logical, as

the primary goal of these systems is to be completely automatic.

Section 2.3 highlighted dynamic analysis approaches as well as the systems that use the

output of these approaches. Wolfsting is meant to provide a useful compromise between the

techniques that tamper with in-flight execution of malware versus those that run the executable


without any sort of modifications to the execution or environment, by setting up the resources,

settings and other objects on a system that a malware instance is looking for.

The motivation of this thesis is to build upon the work done in systems such as CWSandbox

and Anubis to offer a practical means of extracting more malicious behavior while retaining the

desirable features of requiring no human interaction and exhibiting relatively low overhead in

terms of processing power and analysis time. This motivation and the methodology used are

detailed in Chapter 3.


Chapter 3

Wolfsting Motivation and

Methodology

3.1 Overview

As discussed in Chapter 1, online dynamic malware analyzers have proven useful to both

analysts and end users by providing a complete trace of a malware binary’s execution. These

traces have also been successfully used as input to subsequent malware detection methods such

as the clustering algorithm described in [19]. The motivation behind this work is to improve

these systems by actively attempting to engage malware to extract malicious behavior from

it. This will not only prove useful to analysts in terms of possessing more relevant information

about the malware’s behavior, but will also provide better input to subsequent malware analysis

mechanisms. This chapter will describe the motivation behind and the methodology used by

Wolfsting to achieve this goal.

3.2 Observations

The motivation and idea behind this thesis are based on observations of 1) the execution

of malware in a virtualized environment, and 2) the trace output generated by CWSandbox

and Anubis using certain malware samples. Examples of these observations that led to the

motivation for Wolfsting are:

Figure 3.1: CWSandbox output displaying queries made by malware for information on certain folders

3.2.1 Malware looking for certain resources

Malware authors write malicious software with specific payloads that target certain users,

software and private information. This translates to malware processes looking for certain

resources (files, registry keys, processes), which can be observed in the output of the online

dynamic malware analyzers; subsequent malware behavior is dependent on these objects being

present on the system. To bring out this additional behavior, the system calls that attempt to

locate these resources need to succeed, to allow new system calls to be invoked. For example,

Fig. 3.1 shows the output of CWSandbox for a malware sample that attempts to locate certain

antivirus installation folders.

3.2.2 Malware exploiting outdated or very recent software

A malware author frequently chooses not to re-invent the wheel. This is a phenomenon

that is exploited by [35]; malware authors re-use publicly available code or code from popular

malware families. Thus a new malware instance may in fact be only a slight variation of another

malware family. This implies that malware binaries frequently look for older versions of software

that contain exploitable vulnerabilities. This makes sense from the perspective of the malware

author as well; exploitable systems are frequently ones that are not updated with the latest

software or software updates.

On the other hand, malware may also exploit features in recent software or recent updates

in software. Analysis systems may not have this software or these updates installed. This can be observed

in CWSandbox traces; for example, a trojan called Zeus looks for the phishing filter setting of

Internet Explorer version 8; most analysis systems will have Internet Explorer 6 installed since

this version has the greatest number of vulnerabilities.

3.2.3 Malware persisting on the user’s machine

The aim of malware is to steal information or use the resources on a victim machine for as

long as it possibly can without being eliminated or in some cases even without being detected.

To this end, malware must react to attempts made by the user or antivirus software on the

user’s machine to remove the malware instance from the victim machine. Several malware

binaries have payloads that are specifically designed to hide malware activities, disable antivirus

functionality and security components in browsers. This behavior can only be observed if the

target software (or those specific software artifacts that the malware accesses) is present on

the system.

3.2.4 Focus on certain system calls

Systems such as those mentioned in section 2.3.1.2 that explore every possible branch path

a malware may take may be improved upon by making them context aware; i.e. selecting those

branches that will lead to malicious behavior. It was observed in the traces output by Anubis

and CWSandbox that the most interesting - i.e., most indicative of potential malicious behavior

- system calls are ones that try to modify resources and settings on the system, such as those

that set values in the Windows registry, those that write to a particular file, those that kill


processes etc., and read-queries on certain paths in the registry and certain locations on the file

system. If an analysis system could be made to focus on such resource-interaction system calls,

by filtering out other calls, unique behavior may be extracted. This was partly the motivation

for creating the filtering component as discussed in section 4.3.2.
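The idea of focusing on resource-interaction calls can be sketched as a simple predicate over trace records. This is only an illustration under stated assumptions: the record layout (a dict with `call` and `path` fields), the particular call names placed in each set, and the watched path prefixes are hypothetical choices and are not the filter list actually used by Wolfsting.

```python
# Calls that modify system state (hypothetical selection of NT system calls).
MODIFYING_CALLS = {"NtSetValueKey", "NtWriteFile", "NtTerminateProcess"}

# Read-style calls that are only interesting on certain paths (assumption).
QUERY_CALLS = {"NtOpenKey", "NtQueryValueKey", "NtOpenFile"}

# Lower-cased path prefixes treated as interesting for read queries (assumption).
WATCHED_PREFIXES = (
    "\\registry\\machine\\software\\microsoft\\windows\\currentversion\\run",
    "c:\\program files\\",
)


def is_interesting(record: dict) -> bool:
    """Keep modifying calls, plus read queries that touch a watched location."""
    call = record["call"]
    if call in MODIFYING_CALLS:
        return True
    if call in QUERY_CALLS:
        return record.get("path", "").lower().startswith(WATCHED_PREFIXES)
    return False


def filter_trace(trace: list) -> list:
    """Return only the resource-interaction calls, preserving trace order."""
    return [record for record in trace if is_interesting(record)]


example_trace = [
    {"call": "NtQueryValueKey",
     "path": "\\REGISTRY\\MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run"},
    {"call": "NtOpenFile", "path": "C:\\WINDOWS\\Fonts\\arial.ttf"},
    {"call": "NtSetValueKey", "path": "\\REGISTRY\\USER\\Software\\Example"},
    {"call": "NtQueryInformationProcess"},
]
```

In this example `filter_trace(example_trace)` keeps the autorun-key query and the registry write and drops the other two records.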

3.2.5 Advantages of multiple executions

CWSandbox, Anubis and other such systems execute malware once and generate a trace

report. By submitting the same malware binary more than once, some behavior was observed

that could not have been seen in one execution. This behavior is listed below:

3.2.5.1 Randomized identifier strings

Malware uses randomized identifier names to prevent users and antivirus software from

locating the resources it creates. For example, the Vundo trojan uses a random 6 character

name for a Dynamic-Link Library (DLL) that it creates. This name is different each

time the Vundo instance executes. Wolfsting is capable of comparing two executions of a

malware binary to determine which settings and resources that the malware creates have random

names or identifiers, i.e., these settings and resources are usually identical in content between

different executions, but have different names or identifiers. This is accomplished by allowing

the analysis engine to trace the difference in system calls between two identical unmodified

malware executions. This information will benefit analysts and end users in tracing malware

activity on their machines.
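One way to picture this comparison is the sketch below, which pairs resources created in two runs by their content and flags those whose names differ. It is an illustration only: representing each run as a mapping from resource name to a content digest, and the function name `find_randomized_names`, are assumptions; the thesis itself describes the comparison in terms of differences between system call traces.

```python
def find_randomized_names(run_a: dict, run_b: dict) -> list:
    """Report resources with identical content but different names across runs.

    Each run maps a created resource's name to a digest of its content
    (hypothetical representation). Returns (name_in_a, names_in_b, digest).
    """
    names_by_digest_b = {}
    for name, digest in run_b.items():
        names_by_digest_b.setdefault(digest, set()).add(name)

    randomized = []
    for name_a, digest in sorted(run_a.items()):
        names_b = names_by_digest_b.get(digest, set())
        # Same content exists in run B, but never under the name used in run A.
        if names_b and name_a not in names_b:
            randomized.append((name_a, sorted(names_b), digest))
    return randomized


run_1 = {"C:\\WINDOWS\\system32\\abcdef.dll": "d1", "C:\\log.txt": "d2"}
run_2 = {"C:\\WINDOWS\\system32\\qwerty.dll": "d1", "C:\\log.txt": "d2"}
```

Here only the DLL is reported, since `C:\log.txt` keeps the same name in both runs.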

3.2.5.2 Process injection behavior

Process injection is a very common technique used by malware to inject code into other

processes for purposes of stealth. For example, a trojan may inject network communication

code into Internet Explorer to connect to malicious servers. Since the user’s firewall may be

configured to allow outgoing connections initiated from within browser processes, this malicious


activity goes undetected. The traces of CWSandbox showed that some malware vary the

distribution of such malicious code amongst processes in each run - in one run a bit of code

may be injected into process A, while in another the same code is injected into another process

B. An example of this behavior seen in the ZBot trojan is discussed in section 4.3.1.

3.3 Engaging Malware

Online malware analyzers such as CWSandbox and Anubis have proven useful to analysts

by providing comprehensive behavioral reports detailing almost every system call executed by

malware in their virtualized environments. The only variation in the traces across multiple

runs is usually caused by malware using randomized strings for filenames, process names and

other object names. Wolfsting is intended to further improve these systems by engaging malware

during its execution. The term ‘Engagement’ as used here translates to the following operations:

• Allowing malware to execute more of its code base by invoking everyday applications,

simulating a user attempting to remove the malware.

• Forcing malware to execute more of its code base by creating resources that it attempts

to locate but that are not usually found on bare bones analysis systems.

Wolfsting also attempts to recognize identifiers with random names. This and the above oper-

ations are explained in more detail in the following subsections.

3.3.1 Simulating a user attempting to remove the malware

The Internet, while the primary source of malware, is also a vast resource of information

on malware defense and protection. A user who realizes her machine is infected or an antivirus

that has detected the infection will normally follow a sequence of steps to remove the malware.

These include the killing of malware processes, modification of registry keys and files and other

actions. These actions are anticipated by malware authors and code is included in malware

binaries to ensure that the malware persists unaffected on the user’s system. This is interesting

malicious behavior that Wolfsting aims to invoke by simulating the user’s or antivirus software’s actions.

The current implementation of Wolfsting - during the malware instance’s execution - simply

launches programs that a user would launch to monitor the system or to attempt to remove

malware processes, files and other objects from the system.

3.3.2 Creating fake resources and settings

The trace output from the Wolfsting driver when any application - benign or malicious -

is executed inside the guest reveals that the program attempts to locate several settings and

resources that do not exist at that point of time inside the guest environment. For example,

an FTP client program may look for previously stored settings in a local configuration file or

at a certain path in the Windows registry. Malware, when run in the guest OS, also exhibits

similar behavior, attempting to access specific settings or resources that normally do not exist

in analysis systems. Wolfsting attempts to collect all such requests made by a suspected or

known-to-be malicious binary and then proceeds to create these settings and resources in a

fresh guest OS snapshot. The aim of this exercise is to reveal more malicious behavior by

presenting the malware with exactly what it is looking for. The system call filtering mechanism

described in section 4.3.2 ensures that uninteresting calls (calls that do not indicate malicious

behavior, such as housekeeping calls and those that succeed at a later point in time) are excluded

from consideration when creating or simulating the required resources or settings.
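The collection step described above can be sketched as follows. The record layout (`kind`, `path`, `status` fields), the status value `"not_found"`, and the function name `collect_missing_resources` are hypothetical and chosen only for this illustration; the `excluded` argument stands in for whatever the filtering mechanism of section 4.3.2 rules out.

```python
def collect_missing_resources(trace: list, excluded=frozenset()) -> list:
    """List each resource whose lookup failed in the baseline run, once.

    `excluded` holds lower-cased paths that must not be created (assumption:
    this is how the output of the filtering step is handed over).
    """
    missing = []
    seen = set()
    for record in trace:
        if record["status"] != "not_found":
            continue
        path_key = record["path"].lower()
        key = (record["kind"], path_key)
        if key in seen or path_key in excluded:
            continue
        seen.add(key)
        missing.append((record["kind"], record["path"]))
    return missing


baseline_trace = [
    {"kind": "file", "path": "C:\\Program Files\\ExampleAV\\av.exe", "status": "not_found"},
    {"kind": "file", "path": "C:\\WINDOWS\\system32\\kernel32.dll", "status": "success"},
    {"kind": "registry", "path": "\\REGISTRY\\MACHINE\\SOFTWARE\\ExampleFTP", "status": "not_found"},
    {"kind": "file", "path": "c:\\program files\\exampleav\\av.exe", "status": "not_found"},
]
```

The resulting list is what a resource creation step would then materialize in a fresh guest snapshot before the second run.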


Chapter 4

Wolfsting Design

The goal of Wolfsting’s design is to modularize each component to allow the use of multiple

platforms and tools while implementing the system. To this end, this chapter describes each

component of the Wolfsting environment separately, while Chapter 5 will illustrate how the

components fit together to create a working system.

4.1 Virtual Environment

The Wolfsting environment consists of a set of virtual machines each of which has a virtual

machine snapshot image that can be restored once a run is complete. Wolfsting uses one

virtual machine to perform a baseline run in which the execution path of the malware instance

is not modified in any way. Wolfsting may then be configured to run the malware instance in

one or more virtual machines in which the malware execution may be modified as necessary.

The virtual machines are controlled by a centralized component that performs the following

operations.

1. Starts and stops the guest operating systems

2. Restores snapshots after a single run: To ensure that malware execution is not affected

by previous executions of the same or different malware, each virtual machine is restored


to a snapshot image after a single run completes.

3. Installs the Wolfsting guest OS components into each virtual machine: The Wolfsting

guest OS components including the kernel module, logger, configuration and userland

component are loaded into the guest OS before malware is copied and executed.
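The controller operations listed above could be scripted along the following lines. This is a hedged sketch, not the thesis's controller: it assumes VMware's `vmrun` command-line tool, and only builds the command lines rather than executing them. The function name, argument names, file paths and the ordering of steps are hypothetical; loading the kernel module and enforcing the execution time limit are not shown.

```python
def build_run_plan(vmx, snapshot, uploads, guest_sample, guest_log, host_log,
                   user, password):
    """Return the vmrun command lines for one analysis run, in order.

    uploads: list of (host_path, guest_path) pairs covering the guest OS
    components and the malware sample (hypothetical layout).
    """
    base = ["vmrun", "-T", "ws"]
    auth = base + ["-gu", user, "-gp", password]  # guest credentials

    plan = [
        base + ["revertToSnapshot", vmx, snapshot],  # start from a clean image
        base + ["start", vmx, "nogui"],
    ]
    for host_path, guest_path in uploads:
        plan.append(auth + ["copyFileFromHostToGuest", vmx, host_path, guest_path])
    plan.append(auth + ["runProgramInGuest", vmx, guest_sample])
    plan.append(auth + ["copyFileFromGuestToHost", vmx, guest_log, host_log])
    plan.append(base + ["stop", vmx, "hard"])
    return plan


plan = build_run_plan(
    vmx="guest.vmx",
    snapshot="clean",
    uploads=[("logger.sys", "C:\\ws\\logger.sys"), ("sample.exe", "C:\\ws\\sample.exe")],
    guest_sample="C:\\ws\\sample.exe",
    guest_log="C:\\ws\\trace.xml",
    host_log="trace.xml",
    user="analyst",
    password="secret",
)
```

Each entry of `plan` could be passed to `subprocess.run` on a host where `vmrun` is installed; keeping plan construction separate from execution makes the sequence easy to inspect and test.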

4.2 Guest OS Components

The components of Wolfsting that operate in the guest OS running on the virtual machines

are as follows.

1. Kernel Hook Module: This is the core module of Wolfsting consisting of system call

hooks that are loaded into the guest OS kernel memory by the controller. The guest OS

snapshot does not have this module pre-loaded, allowing an updated version to be loaded

every time a new malware binary is to be processed. The Kernel hook module should be

capable of tracking malware process injection and creation, to separate malware activity

from benign process or system process activity.

2. Logger Component: This is a simple logger that also resides in kernel memory. It traps

messages from the kernel hook module and writes them to a file that is copied out of

the guest once the malware execution has completed. The logger component separately

records requests for resources or settings that do not exist during the baseline execution.

This is an important step, since it is these calls that will later result in resources and

settings being created for the second run.

3. Userland program launcher component: This is a user level application that runs programs

and tools and simulates actions that a user would take in order to remove malware from

a system. This includes running tools that display and modify system information, and

monitoring software such as Registry Monitor and Process Monitor.


4.3 Analysis Engine

The analysis engine consists of two main components:

1. New behavior extractor: Various combinations of traces may be fed into the analyzer.

For example, one trace may include output of a virtual machine with no modifications to

the malware instance’s execution path, while another trace may be from a run where a

certain set of resources were created beforehand. The new behavior extractor compares

these traces and extracts new system calls that were generated as a result of modifying

the malware’s execution path.

2. Housekeeping/Future-success system call filtering component: Not all resource requests or

settings requests will lead to malicious behavior. Calls that are made with slightly differing

arguments that fail at one stage and succeed later (see section 4.3.2), housekeeping calls

and other non-malware specific calls should not be considered by the resource creation

component. These calls must be filtered out to ensure the quality and uniqueness of the

extracted new behavior.

The above components are illustrated in Fig. 4.1 and explained in greater detail in the

following subsections.

4.3.1 New Behavior Extractor

Malware behavior has been defined in Chapter 1 as the set of system calls that a malware’s

processes and threads execute during a single execution run. The Analysis Engine takes as input

the system calls listed in the trace output of a single run of a malware instance in the Wolfsting

guest OS. Each run may have a different initial configuration - for example, in one run certain

resources may be created; in another an attempt to load a kernel driver may be rejected, while

a third run may allow unmodified malware execution. The new behavior extractor component

extracts new behavior, i.e., new system calls from one trace taking behavior from another trace

as a baseline.

Figure 4.1: Design overview of Wolfsting. [Diagram labels: Controller (vmrun scripts, userland programs); VM (Unmodified Execution) and VM (Modified Execution), each with the Wolfsting driver and a Trace (XML) output; Analysis Engine containing the Future-Success/Housekeeping system call filter, a manually created filter list (system calls that succeeded at a later point or whose exclusive purpose is housekeeping), the New Behavior Extractor, and the Resource/Setting Creator with a No-create List; Create/Simulate resources/settings.]

The algorithm for new behavior extraction is not trivial, due to the following

factors:

1. Multi-threaded, divide-and-work strategies employed by modern malware. For example,

ZBot distributes several tasks into several different threads, and each thread is executed in

a random process’s address space. Thus Wolfsting must recognize that system calls made

by different processes in different executions may actually indicate the same behavior.

Fig. 4.2 illustrates the challenge of recognizing that two system calls are the same.

2. Randomness of malware strings. The most common way to identify that a system has

been infected used to be to check for certain strings such as filenames, registry key names,

process names etc. However, most modern malware uses randomized names that change

every time a new system is infected. Wolfsting does not attempt to recognize this random

behavior. Instead, any new system calls with randomized parameters are reported as

new behavior; this will allow the analyst/user to recognize which resources or settings

may have randomized identifiers or what other string-randomness is associated with a

particular malware instance.

Figure 4.2: ZBot injects code into different processes on every execution; thus system calls from two different threads may otherwise be identical. [Diagram: in one guest run the malware injects code into Explorer.exe, in another into Svchost.exe; in both, the injected code makes the system call that sets the autorun key (ZwRegistrySetValue).]

3. Changes in environment: This is the most difficult factor in identifying behavior that

is not new, i.e., additional system calls invoked due to a software environment or the

creation of a fake resource or setting. While the memory image of the OS, the hard disk

contents and even the system time information is exactly the same in two runs of the

same malware, interaction between the malware instance and the network or the Web

itself may cause some change in the behavior of the malware. Wolfsting cannot currently

recognize the difference between new behavior caused by this factor and new behavior due

to Wolfsting-caused modifications. This problem has not been observed in most of the

experiments carried out with the Wolfsting prototype; malware samples have a far longer

lifetime than the servers they connect to since malware supporting domains are taken

down daily; and those that do connect to servers do not change their behavior between

two executions that are spaced only a few minutes apart.

Thus the new behavior extracting algorithm needs to take into account the aforementioned

factors. To match system calls between separate executions, a unique key is created for each

call that depends on parameters whose combination can be expected to be unique to a single

execution. For example, in the current implementation of Wolfsting, for the Windows NtCreateFile

call, Wolfsting uses a combination of parameters as shown in Fig. 4.3 to create a unique

key that allows Wolfsting to differentiate a system call between two isolated executions. Thus,

to extract new behavior, the algorithm simply compares keys to see if any parameter is different,

i.e. it identifies new system calls invoked during an execution when compared against a

baseline, unmodified execution.

NtCreateFile(

FileHandle, DesiredAccess,

ObjectAttributes, FileAttributes,

ShareAccess, CreateDisposition,

CreateOptions,

);

Figure 4.3: Certain parameters of the NtCreateFile Windows API are used as part of a key to differentiate between two NtCreateFile calls
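The key-based comparison can be sketched as a set difference over per-call keys. The following is a minimal illustration under stated assumptions: the record layout, the flattened `ObjectName` field, and the exact choice of key fields for `NtCreateFile` are hypothetical (the source does not list which parameters form the key). The process name and the returned file handle are deliberately left out of the key, so that the same call issued from a different injected process still matches.

```python
# Per-call key fields (assumption). Values that vary per run without changing
# the behavior, such as the returned handle or the calling process, are omitted.
KEY_FIELDS = {
    "NtCreateFile": ("ObjectName", "DesiredAccess", "FileAttributes",
                     "ShareAccess", "CreateDisposition", "CreateOptions"),
}


def call_key(record: dict) -> tuple:
    """Build a hashable key identifying one system call across executions."""
    params = record["params"]
    fields = KEY_FIELDS.get(record["call"])
    if fields is None:
        # Fallback for calls without a configured key: use all parameters.
        return (record["call"],) + tuple(sorted(params.items()))
    return (record["call"],) + tuple(params.get(name) for name in fields)


def extract_new_behavior(baseline: list, modified: list) -> list:
    """Return calls from `modified` whose key never occurs in `baseline`."""
    baseline_keys = {call_key(record) for record in baseline}
    new_calls, seen = [], set()
    for record in modified:
        key = call_key(record)
        if key not in baseline_keys and key not in seen:
            seen.add(key)
            new_calls.append(record)
    return new_calls


def create_file(process, name, handle):
    """Helper producing an example NtCreateFile record (hypothetical layout)."""
    return {"call": "NtCreateFile", "process": process,
            "params": {"FileHandle": handle, "ObjectName": name,
                       "DesiredAccess": "GENERIC_WRITE", "FileAttributes": 0,
                       "ShareAccess": 0, "CreateDisposition": "FILE_CREATE",
                       "CreateOptions": 0}}


baseline_run = [create_file("explorer.exe", "C:\\a.txt", 0x10)]
modified_run = [create_file("svchost.exe", "C:\\a.txt", 0x24),
                create_file("svchost.exe", "C:\\b.txt", 0x28)]
```

With these inputs, `extract_new_behavior(baseline_run, modified_run)` reports only the creation of `C:\b.txt`; the creation of `C:\a.txt` is treated as the same behavior even though it came from a different process with a different handle.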

4.3.2 Housekeeping/Future-Success System Call Filtering Component

The Analysis engine requires a filtering component that determines which resources and

settings need to be created and which do not. While the initial motivation for this component

was discussed in section 3.2.4, other factors necessitate its existence as follows:

1. The fact that even benign programs seem to attempt to access several resources and

settings that do not exist on every system. This is because every OS needs to do some

housekeeping before and during the execution of programs. These calls do not relate

to malware activity. Examples of such calls are those that look for execution options

for a particular executable, look for language versions of an executable, compatibility

information checking etc.

2. Some calls succeed at a later point during the malware’s execution in a single run. For example,

operating systems have configurable paths where files and settings may be located.

A program may follow a search path order to locate a setting or resource. It is necessary

for the filtering component to recognize that a resource that has not been found in one

location may have been found in another at some point later in the trace, as the program

parses all possible search paths. The filtering component simply parses the trace output

of a baseline run and eliminates calls that fit into the categories explained above, allowing

the resource creation component to create only those resources or settings that may lead

to new malware execution behavior.

It is difficult to recognize which calls are housekeeping calls and which are not. The filtering

component contains a manually created list against which it matches system calls found in the

traces. This list was created by observing calls made during the execution of benign applica-

tions such as those found in the Windows system32 folder, browsers etc., and through some

research on Microsoft’s Developer Network [54]. Note that this list is minimalistic in the cur-

rent implementation of Wolfsting; very deep expertise in Windows would be required to create

an accurate list, due to the sheer number of documented and undocumented Windows system

calls.

4.3.3 Resource Creation

The resource creation component is responsible for setting up the environment in terms of

system settings as well as expected software installations by using as its input the trace from an

unmodified malware execution run. As explained in section 4.3.2, the filtering component is

required by the resource creation algorithm to prevent creating unnecessary resources that would

either modify the execution of the malware in a way that would not result in additional malicious

behavior or would prevent the malware from functioning properly. For example, whenever an

executable image, whether a program or a dynamic library, is executed in Windows, a check is

made to see whether there are any special execution options or compatibility issues that need to

be addressed before the program loader takes over. These options do not always exist; in fact,


most programs run with default settings that apply to all programs when no additional options

are specified. Wolfsting must not create fake entries in these settings; the malware’s execution

may be unnecessarily modified without producing any new malicious behavior. Wolfsting must

deal with requests for various types of files, directories, processes and other OS objects. The

assumption here is that most malware will request and manipulate resources without checking

their authenticity. As this falls into the category of implementation issues, a more detailed

description can be found in section 6.3.3.

4.4 Summary

This chapter dealt with the design aspects of Wolfsting, and described in detail the design

philosophy behind the various components that make up the Wolfsting system. To bring focus

over the whole system and provide more detail as to how everything fits together before delving

into implementation issues, we explore how a malware binary is processed in the Wolfsting

system in Chapter 5.


Chapter 5

Wolfsting Execution

5.1 Wolfsting Processing

The Wolfsting process consists of at least two executions of the malware binary as shown

in Fig. 5.1. After a single execution, the guest OS snapshot is restored, wiping all traces of the

execution. During the first execution, the malware binary is run without any modification to

the environment; this is termed a baseline run. This is a data collection run where the logging

component records all system calls made by malware processes and threads and outputs them

in a trace file. The second execution is preceded by the activation of the resource creation

component of Wolfsting in the guest OS. This component creates all the resources (registry

keys, files, directories) requested by the malware in the first run.

1. The Wolfsting driver is copied into the guest virtual machine and is loaded into memory.

The system call hooks are installed and the logging mechanism is set up to record system

call information.

2. If this is a second run for the malware instance, all non-existent resources requested in

the previous run are created or simulated.

3. The malware instance is run in the guest machine. Internet access is unrestricted; the

local network is protected by restricting the guest’s local access to the gateway and DNS server alone.

4. Various tools are run to simulate user activity by the Wolfsting userland program launcher.

5. All malware process activity is recorded, including threads injected into other processes.

If this is the first run, all resources requested but not present on the system are noted in

the trace.

6. All calls leading to kernel code execution such as the NtLoadDriver API in Windows

machines are rejected. Wolfsting may be configured so that the NtLoadDriver call may

be allowed to succeed in the baseline run, and then rejected in the second run, if it is

necessary to observe the difference in the traces due to the malicious driver being loaded

into memory.

7. The trace output is communicated to the host machine and the guest machine is reset to

its clean snapshot.

8. If this is the first run, the trace output is run through an analysis engine that decides what

resources need to be created to evoke additional responses from the malware instance. A

program is invoked on the guest machine to create or simulate these resources and steps

1 to 7 are repeated as a second run.

9. The trace output from two runs is fed into the new behavior extractor that is part of the

analysis engine.
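The control flow of the steps above can be summarised in a short sketch. It is written in Python purely for illustration; the function names, the string-based trace representation and the toy stand-ins for the guest and the analysis engine are all hypothetical and do not describe the actual controller.

```python
from typing import Callable, Iterable, List, Set

Trace = List[str]  # each entry stands for one system call key


def wolfsting_process(
    run_guest: Callable[[Iterable[str]], Trace],
    plan_resources: Callable[[Trace], List[str]],
) -> Set[str]:
    """Two serial runs followed by extraction of new behavior."""
    # Steps 1-7, first pass: baseline run in an unmodified guest.
    baseline = run_guest([])
    # Step 8: the analysis engine decides which resources to create.
    resources = plan_resources(baseline)
    # Steps 1-7, second pass: the guest starts again from the clean
    # snapshot, this time with the planned resources in place.
    second = run_guest(resources)
    # Step 9: calls seen only in the second run are the new behavior.
    return set(second) - set(baseline)


# Toy stand-ins so the sketch can be executed without a virtual machine.
def fake_guest(resources: Iterable[str]) -> Trace:
    trace = ["open:HKLM\\Software\\Example"]
    if "HKLM\\Software\\Example" in resources:
        trace.append("setvalue:HKLM\\Software\\Example")
    return trace


def fake_planner(baseline: Trace) -> List[str]:
    return [key.split(":", 1)[1] for key in baseline if key.startswith("open:")]


new_behavior = wolfsting_process(fake_guest, fake_planner)
print(new_behavior)
```

The toy guest only modifies the registry key once it exists, so the modification appears solely in the second run and is the single item reported.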

Two serial runs are required to analyze the malware, due to the reasons mentioned in section

5.2 – specifically, calls to locate, query or modify certain settings or resources may succeed at

a later time, and hence the entire unmodified run must be analyzed to filter out such calls

before creating those resources or settings. Fig. 5.2 presents a detailed view of the guest OS

running inside the virtual machine. System calls executed by the malware are intercepted by

Wolfsting’s kernel driver component. The logger component is invoked during each system call

interception to record important and relevant information about that call. At the end of a run,


Figure 5.1: Wolfsting Processing. The controller drives a baseline execution and a second execution of the malware, each in a guest with the driver hooks loaded: (1) execute malware; (2) driver XML output; (3) resource creation information from the analysis engine; (4) restore snapshot; (5) create resources and re-execute malware; (6) driver XML output; (7) new behavior (new system calls).

this trace information is transferred out of the guest and into the host machine for analysis by

Wolfsting’s analyzer component. All attempts made by the malware to load a driver into kernel

space are intercepted and rejected. However, for the sake of providing a trivial analysis to the

user or analyst, the baseline execution instance may allow the driver to load, while the second

execution blocks the driver as normal. The traces from these two executions may be compared

by an analyst to see if there is some behavior that is hidden in the baseline execution, and that

shows up in the second execution.


Figure 5.2: A Wolfsting VM guest with the malware binary, userland component and kernel driver loaded into memory. Diagram labels: Malware and Userland program launcher (guest OS userland); System Calls; Driver with Hook Functions and Logger (guest OS kernel); XML Output; Load Driver Attempt (NtLoadDriver in Windows), BLOCKED.

5.2 Two Runs

Wolfsting executes the malware twice serially because it needs to collect certain information

whose state may change over the course of a single execution. Figs. 5.3a and 5.3b illustrate the

necessity of a second run. Most operating systems use a set of environment settings that system

calls need to parse before loading libraries or executing other programs. For instance, to load

a certain library into memory, the system needs to first locate the library in secondary storage.

This requires the parsing of possible locations where the library may be located. To Wolfsting,

this means that a system call that fails due to a missing resource may succeed subsequently;

however, whether it does succeed and the timing of this success is impossible to predict. This

is shown in Fig. 5.3. The Windows operating system uses a store of system and application

information and settings called a registry. Applications need to parse the registry - organized as

a tree - to search for certain settings; these may be entered at multiple locations in the registry

directory depending on the context they apply to. For example, the registry contains settings

for individual users as well as the entire system. Applications may try to read or modify settings

at all such locations; some of these attempts may fail and others may succeed. Again, it is not


possible to determine which of the calls will succeed and when; two non-parallel executions are

required.
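A minimal sketch of why the complete baseline trace is needed is given below. It assumes, purely for illustration, that two lookups refer to the same resource when the last component of their paths matches; the thesis does not state the exact matching rule, and the record type and sample paths are hypothetical.

```python
import ntpath
from typing import List, NamedTuple


class Lookup(NamedTuple):
    path: str        # file path or registry path that was queried
    succeeded: bool  # return status of the call


def unresolved_failures(trace: List[Lookup]) -> List[str]:
    """Return failed paths whose resource never turns up later in the trace."""
    missing = []
    for i, entry in enumerate(trace):
        if entry.succeeded:
            continue
        leaf = ntpath.basename(entry.path).lower()
        # A later success on the same leaf name means the search-path
        # walk eventually found the resource, so nothing must be created.
        found_later = any(
            later.succeeded and ntpath.basename(later.path).lower() == leaf
            for later in trace[i + 1:]
        )
        if not found_later:
            missing.append(entry.path)
    return missing


trace = [
    Lookup(r"C:\App\example.dll", False),
    Lookup(r"C:\Windows\example.dll", False),
    Lookup(r"C:\Windows\System32\example.dll", True),
    Lookup(r"HKLM\Software\Vendor\Antivirus", False),
]
print(unresolved_failures(trace))
```

Only the last lookup is kept as a candidate for resource creation; the two failed DLL lookups are discarded because the third one succeeds. This decision cannot be made while the first run is still in progress, which is the reason for analyzing the whole unmodified run first.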

Having more than two runs in the Wolfsting process seems like a good idea; at each step,

more malicious behavior may be extracted. However, looking at traces from multi-run exper-

iments with real world malware, we noted that most resource or setting queries are terminal,

i.e. the first query for a registry key or a file or other resources usually has all the information

to locate the resource. Once the resource has been found in the second execution, all behavior

due to the presence of the resource is captured, rendering little benefit from additional runs.

This point is discussed in further detail in section 8.3.

We have discussed in this chapter how a single malware binary is put through the Wolfsting

process. In the next chapter, we delve into the implementation of Wolfsting.

(a) An application thread attempts to locate a system library (Dynamically Linked Library) and load it into memory. It searches sequentially for the DLL in the paths listed in the PATH environment variable: <Application Directory>\<SystemDLL.DLL> (failed), Windows\<SystemDLL.DLL> (failed), Windows\System32\<SystemDLL.DLL> (success). Each attempt to locate the DLL is actually a single system call that tries to open the DLL (ZwOpenFile) using a possible path. Wolfsting must note the successful third call and ignore the first two. Thus during the first execution run, Wolfsting should not attempt to create the DLL, because the actual DLL is present and will be loaded later on.

(b) An application thread attempts to locate a registry key which may exist either in the HKEY_CURRENT_USER or HKEY_LOCAL_MACHINE subtrees of the Windows Registry: HKCU\Software\Policies\Microsoft\Windows\Safer\CodeIdentifiers\TransparentEnabled (failed), then HKLM\Software\Policies\Microsoft\Windows\Safer\CodeIdentifiers\TransparentEnabled. The second call to locate and open a handle to the key succeeds. If Wolfsting attempted to create the key during the first run, it would cause unnecessary deviation in the execution path of the malware binary.

Figure 5.3: Two or more system calls with the same purpose


Chapter 6

Wolfsting Implementation

The current implementation of Wolfsting consists of the following components, correspond-

ing to the design specified in Chapter 4.

6.1 Virtualized Environment

The host machine had a 3 GHz Intel Core 2 Duo processor and 4 GB of RAM, with Windows

Vista as the host OS. VMware images with Windows XP SP3 were used as guest machines.

This was the hardware and software used to evaluate Wolfsting.

6.2 Guest OS Components

6.2.1 Kernel device driver

System call hooks implemented by Wolfsting were installed as part of a device driver based

on the Regmon source code released by SysInternals [41]. The system call hooks are installed by

swapping pointers in the Windows System Service Dispatch Table (SSDT). Over 40 Windows

Native API functions are hooked inside the driver. Note that using inline hooks - a technique

wherein the first few bytes of in-memory system functions are overwritten to jump to hook

functions - is more secure (but less portable) than using the SSDT pointers technique. System


calls that fail during the baseline run due to a resource or setting not being present in the guest

OS are recorded separately (in separate XML tags) to allow the resource creation component

to create them for the second execution.

Most modern malware programs spawn multiple processes and threads and even inject code

into other processes. The driver hooks all the native API calls that can lead to creation of new

processing contexts such as NtCreateProcessEx, NtCreateThread etc., allowing Wolfsting to

track the malware’s execution across all these threads. Comparing system calls between

process and thread contexts from two different, isolated executions is accomplished by abstract-

ing away process and thread information and using only resource identifiers (file names, registry

key names) to compare system calls - this is explained in section 4.3.1. Process injection in

Windows is almost invariably done using the CreateRemoteThread userland API, which in the

kernel translates to NtCreateThread with a target process id specified that is different from the

process id of the process that invoked the userland API call. This information is used to trace

remotely injected threads.
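The injection check can be sketched as below. The sketch is in Python for readability only (the real check runs inside the kernel driver), it tracks whole processes rather than individual threads as a simplification, and the event type, field names and process ids are hypothetical.

```python
from typing import Iterable, NamedTuple, Set


class ThreadCreate(NamedTuple):
    caller_pid: int  # process that issued NtCreateThread
    target_pid: int  # process that will own the new thread


def tracked_pids(malware_pid: int, events: Iterable[ThreadCreate]) -> Set[int]:
    """Follow injections so that every affected process is traced."""
    tracked = {malware_pid}
    for ev in events:  # events are assumed to be in chronological order
        if ev.caller_pid in tracked and ev.target_pid != ev.caller_pid:
            # A tracked process created a thread in a different process:
            # treat the target as carrying malware code from now on.
            tracked.add(ev.target_pid)
    return tracked


events = [
    ThreadCreate(100, 100),  # ordinary local thread
    ThreadCreate(100, 200),  # injection into process 200
    ThreadCreate(200, 300),  # the injected code injects again
    ThreadCreate(400, 500),  # unrelated process, ignored
]
print(sorted(tracked_pids(100, events)))
```

Thread creation by processes that were never touched by the malware is ignored, which keeps unrelated system activity out of the trace.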

6.2.2 Userland program to simulate user actions

This component is a Win32 program that runs applications that would normally be executed

by a user trying to analyze or disinfect a malware-infected machine. A separate program to

launch these applications renders it easy for the kernel hooks to isolate system calls made by

these applications that are not malware related. Section 7.2 details a malware binary that

prevents such applications from launching using a subtle Windows registry trick.

6.3 Analysis Engine

6.3.1 New Behavior Extractor

This component is implemented as a .NET executable that takes in two XML traces as

input and reports any new system calls as described in section 4.3.1. The program uses the


following algorithm to accomplish this task.

1. The XML output trace of the first execution consists of entries, each of which cor-

responds to a single system call. As explained in section 4.3.1, these system calls are

converted to keys that are unique to a single execution of the malware instance.

2. All the keys generated in step 1 are added to a hashtable H1.

3. The XML output trace of the second execution is converted to keys and added to a

hashtable H2.

4. For each key Ki in H2, if Ki is not present in H1, the system call represented by Ki is

added to the output as new behavior.
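The four steps above amount to a hash-based set difference. A minimal sketch follows, in Python rather than .NET and with keys represented as hypothetical tuples; parsing of the XML traces is omitted.

```python
from typing import Hashable, List


def extract_new_behavior(first_run: List[Hashable],
                         second_run: List[Hashable]) -> List[Hashable]:
    """Return the keys that occur in the second run but not in the first."""
    h1 = dict.fromkeys(first_run, True)    # steps 1-2: hashtable H1
    h2 = dict.fromkeys(second_run, True)   # step 3: hashtable H2
    # Step 4: every key of H2 that is absent from H1 is new behavior.
    return [key for key in h2 if key not in h1]


first = [
    ("NtOpenKey", "hklm\\software\\example"),
    ("NtCreateFile", "c:\\a.txt"),
]
second = first + [
    ("NtSetValueKey", "hklm\\software\\example"),
    ("NtSetValueKey", "hklm\\software\\example"),  # repeated call, one key
]
print(extract_new_behavior(first, second))
```

Because both traces are reduced to hashtables, each lookup takes constant time on average and repeated calls with identical keys are reported once.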

6.3.2 Filtering Component

This component is implemented as a .NET executable that parses the XML trace output

of a baseline Wolfsting execution and eliminates housekeeping system calls that look for

non-existent resources, or calls that succeed at a later point in the trace as illustrated in figure

5.3. This is accomplished using the following algorithm:

1. The XML output trace of the first execution consists of entries, each of which cor-

responds to a single system call. As explained in section 4.3.1, these system calls are

converted to keys that are unique to a single execution of the malware instance. The keys

also include the return status of the system calls.

2. All keys generated in step 1 are added to a hashtable H1.

3. For each key Ki in H1, a check is made if the key matches a system call present in a

manually created filter list F. String matching and regular expression matching are used to

accomplish comparison between the parameters in keys. If the system call corresponding

to Ki is found in the filter list, Ki is deleted from H1.

Step 3 above mentions a filter list, the creation of which is described in section 4.3.2.
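Step 3 can be sketched as follows. The two patterns in the filter list are illustrative entries only, chosen because the Image File Execution Options and AppCompatFlags registry locations hold per-executable execution options and compatibility settings; they are not the list used in Wolfsting, and the key layout is a hypothetical simplification.

```python
import re
from typing import Dict, Tuple

Key = Tuple[str, str, bool]  # (call name, resource path, call succeeded)

# Illustrative housekeeping patterns; the real list F is created manually.
FILTER_LIST = [
    re.compile(r"\\Image File Execution Options\\", re.IGNORECASE),
    re.compile(r"\\AppCompatFlags\\", re.IGNORECASE),
]


def apply_filter(h1: Dict[Key, bool]) -> Dict[Key, bool]:
    """Drop every key whose resource path matches an entry of the filter list."""
    return {
        key: value
        for key, value in h1.items()
        if not any(pattern.search(key[1]) for pattern in FILTER_LIST)
    }


h1 = {
    ("NtOpenKey",
     r"HKLM\Software\Microsoft\Windows NT\CurrentVersion"
     r"\Image File Execution Options\sample.exe", False): True,
    ("NtOpenKey", r"HKLM\Software\Vendor\Antivirus", False): True,
}
remaining = apply_filter(h1)
print(list(remaining))
```

Only the second key survives and is passed on to the resource creator; the first is treated as operating system housekeeping.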


6.3.3 Resource Creator

This component is implemented as a .NET executable that reads the output of the filtering

component and generates scripts to create resources and settings in the guest OS for the second

run. The resource creator implementation is not trivial; we must ensure that the resources and

settings malware is looking for are created (or simulated) in such a manner as to force more

malicious activity, and not cause exceptions or cut short the malware program’s execution.

For example, it is not possible to create fake libraries; malware usually loads libraries with

the express purpose of accessing some function in them. The idea behind responding to other

requests is that malware will frequently attempt to terminate processes, steal or modify files,

change registry settings etc. without exploring their authenticity. Fig. 6.1 lists how Wolfsting’s

resource creation component responds to recorded malware requests.
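As a rough sketch of script generation, the following Python fragment turns a list of missing resources into Windows command lines. The thesis does not specify the script format, so the use of `reg add`, `mkdir` and `type nul` is an assumption, as are the record type, the resource kinds and the sample paths (apart from the COM3\Debug key discussed in section 7.2.1).

```python
from typing import List, NamedTuple


class MissingResource(NamedTuple):
    kind: str  # "registry_key", "directory" or "file" (hypothetical kinds)
    path: str


def build_script(resources: List[MissingResource]) -> List[str]:
    """Emit one command per resource that can safely be simulated."""
    lines = []
    for res in resources:
        if res.path.lower().endswith(".dll"):
            # Fake libraries are never created: the malware would call
            # into them and fail.
            continue
        if res.kind == "registry_key":
            lines.append(f'reg add "{res.path}" /f')
        elif res.kind == "directory":
            lines.append(f'mkdir "{res.path}"')
        elif res.kind == "file":
            lines.append(f'type nul > "{res.path}"')  # empty placeholder file
    return lines


resources = [
    MissingResource("registry_key", r"HKLM\Software\Microsoft\COM3\Debug"),
    MissingResource("directory", r"C:\Program Files\ExampleAV"),
    MissingResource("file", r"C:\Windows\System32\example.dll"),
]
script = build_script(resources)
print("\n".join(script))
```

The DLL request is skipped, in line with the restriction on fake libraries noted above.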


Figure 6.1: Resource creation component creating resources as per malware requests


Chapter 7

Results

7.1 Overview

The hardware and software in the experimental setup used to evaluate Wolfsting are described

in section 6.1. This chapter discusses the additional behavior discovered by Wolfsting in some

malware families. Over 100 malware binaries belonging to various malware categories such as

botnets, trojans, worms etc. were run through the Wolfsting process. These binaries were

mostly sourced from VxHeavens [37] and Offensive Computing [38]. VirusTotal.com [39] was

used to ensure that the malware samples were labeled accurately. The following families were

selected as a comprehensive set encompassing the type of results obtained from the experiments.

7.2 Trojan Dropper

Trojan Droppers are a class of malware that act as droppers, i.e., they download other

malware from the web onto the victim machine. An analysis by Wolfsting on one instance

(referred to as Trojan-Dropper.Win32.VB) of this malware family is presented in the following

sections.


Figure 7.1: New behavior recorded in Trojan-Dropper.Win32.VB after Wolfsting created the COM3\Debug key.

7.2.1 Analysis by Resource Creation

Microsoft’s Component Object Model (COM) as defined in Wikipedia [27], is a binary-

interface standard for software componentry, and is used to enable interprocess communication

and dynamic object creation in a large range of programming languages. COM is used in

browsers and other Windows applications to interact with system or network wide objects, and

possesses a debugging component that allows call tracing and security logging. The settings

for debugging are located at HKLM\Software\Microsoft\COM3\Debug, a key in the Windows

registry.

Trojan-Dropper.Win32.VB uses COM to interact with Internet Explorer, to enable mali-

cious activity that would allow recording of user activity on banking sites, as well as injecting

additional fields into web forms to extract more personal confidential data from unsuspect-

ing users. Wolfsting determined that Trojan-Dropper.Win32.VB attempts to locate debugging

settings by recording - in the first, baseline run - the Windows Native API call ZwQueryReg-

istryKey, which opens a handle to a registry key. The call failed since the debugging key did

not exist in the registry of the analysis machine, i.e. the guest OS. In the second run, Wolfsting

created this key in the registry and observed new behavior as illustrated in Fig. 7.1.

These settings illustrate that Trojan-Dropper.Win32.VB attempts to disable COM debug-

ging to hide its interaction with COM objects that allow it to monitor and/or modify browser

sessions. This is new, malicious behavior that would otherwise have not been detected with the


missing registry key. Clustering based on the system call trace obtained from Wolfsting would

result in more accurate clusters, and behavioral graphs would have more malicious activity

recorded, if constructed from the new trace.

7.2.2 Simulating user actions

Trojan-Dropper.Win32.VB uses several methods to persist on a victim’s system. This in-

cludes preventing the execution of certain processes that users run to attempt to kill mal-

ware processes or investigate modifications made by malware to the system. For example,

in Windows systems, a program called the task manager lists running processes; users may

run the task manager to attempt to kill Trojan-Dropper.Win32.VB processes. Thus it is in

Trojan-Dropper.Win32.VB’s interest to prevent this user action from successful completion.

Wolfsting was able to extract this new behavior by running the task manager and other pro-

cesses during Trojan-Dropper.Win32.VB’s execution. This is illustrated in Fig. 7.2; Trojan-

Dropper.Win32.VB sets a registry key to configure win32d.exe as the debugger program for the

Windows Task Manager (taskmgr.exe). This information is retrieved during the first call to

query the file attributes of taskmgr.exe, and prevents Task Manager from executing.

Trojan-Dropper.Win32.VB also prevents other processes from launching, such as a program that allows

users to edit the registry and another that provides an interface to edit startup settings for

Windows. Wolfsting is able to record the actions taken by the user level component to launch

task manager, and how Trojan-Dropper.Win32.VB is able to intercept and modify those actions

to disable the program as shown in Fig. 7.2. The registry entry modified is shown in Fig. 7.3.

Figure 7.3: XML Output from Wolfsting shows how Trojan-Dropper.Win32.VB modifies a registry key to prevent Task Manager from executing


Figure 7.2: Trojan-Dropper.Win32.VB prevents task manager from executing by modifying a registry key that sets a dummy debugger process for task manager. The dummy debugger process does absolutely nothing; task manager is never launched. Panels: "Running Task Manager without Zbot infection", a sequence of ZwQueryFileAttributes, ZwOpenFile, ZwCreateSection and ZwCreateProcessEx calls that creates the process Taskmgr.exe; and "Running Task Manager during Zbot execution", a similar sequence that queries win32d.exe and creates the process win32d.exe instead.


7.3 AVKiller (Agent2)

AVKiller is a generic name given to any family of malware that attempts to disable or destroy

antivirus installations. Several methods are used to accomplish this; an analysis by Wolfsting

of an instance of this type of malware, labeled as Trojan.Win32.Agent2.cqgi by Kaspersky

antivirus [55], is presented in the following subsections.


The Windows operating system has a security model that allows objects on the system

to have Access Control Lists or ACLs. Files, registry keys, processes and other objects have

security descriptors associated with them containing ACLs that specify user permissions. Fig.

7.4 shows the security descriptor for the Windows folder on the guest. The upper white box

displays the list of users and groups in the system that have permissions assigned for the

Windows folder.

This particular instance of malware attempts to locate certain antivirus installations on

the filesystem - these queries are intercepted and recorded by Wolfsting. These folders have

associated security descriptors that contain ACLs for users in the system. The API used by

this malware instance is ZwQueryAttributesFile (succeeds only if the queried folder exists)

that retrieves various types of information including the security descriptor associated with the

installation folder. Fig. 7.5 shows the folders queried by the malware.

Wolfsting creates the folders listed in Fig. 7.5 in the guest OS before the second execution of

the malware instance. The output of Wolfsting as shown in Fig. 7.6 shows additional behavior

displayed by the malware instance.

The calls to ZwSetSecurityObject - a native Windows API responsible for setting security

descriptors on OS objects - are intercepted by Wolfsting’s hook, called HookSetSecurityObject,

as shown in Fig. 7.6. These calls remove all permissions associated with the antivirus folders,

effectively setting a null security descriptor on the folders, rendering them inaccessible to the

user. Hence the antivirus is disabled, as it cannot access any of its files. Fig. 7.7 shows the


Figure 7.4: Security descriptor of the Windows folder, displaying permissions for users in the system

Figure 7.5: Wolfsting XML output displaying some of the folders queried (in the <filename> tags) by the Trojan.Win32.Agent2 malware


Figure 7.6: Wolfsting XML output shows Agent2 setting bad security descriptors for antivirus folders, making them inaccessible. The exact security descriptor has not been shown for brevity.

security descriptor associated with the folder after the ZwSetSecurityObject system call made

by the malware instance.

Thus, Wolfsting was able to extract behavior that is crucial to classifying this particular

binary as malicious. If the folders were not present as would be the case in a bare bones analysis

machine, the new behavior of setting bad security descriptors would have never been observed.


Figure 7.7: Security descriptor of an antivirus (AVG) folder nullified after modification by the malware instance


7.4 Vundo

Vundo is a family of trojan horses that are known to infect a user’s computer through

websites that exploit vulnerabilities. Vundo then proceeds to present a fake antivirus UI to

the user, claiming that there are several malware infections on the user’s system. If the user

clicks on a link on the UI, she is taken to a page where she may be asked to pay for antivirus

software that is either fake or completely unnecessary. Vundo has several other payloads and

exhibits other malicious actions as described in [33]. Wolfsting analyzed several Vundo variants

and found the following new behavior.


Kaspersky’s Antivirus Pro software (KAVP), like most other antivirus software, has an

update mechanism that allows it to update its virus signatures. This allows KAVP to detect

the latest malware samples. KAVP uses the Windows registry to store its configuration, with

registry keys indicating where KAVP is installed, how frequently KAVP should scan the machine

for viruses and other settings. One particular registry key determines if KAVP should enable

the component responsible for updating its signature database. This key is requested in the

baseline execution by the Vundo instance. Fig. 7.8 displays the registry key that the Vundo

instance requests, and Wolfsting subsequently creates. In the second execution, Vundo proceeds

to set a value in the key that disables KAVP’s update service. This allows Vundo to drop other

new malware onto the victim’s machine without being detected by KAVP.

Figure 7.8: Registry key modified by Vundo to disable KAVP’s update service (Value of zero disables updates)


Figure 7.9: Wolfsting output displaying modifications made to Internet Explorer settings, specifically the way IE connects to the network and the Internet

The Vundo malware sample also attempts to access settings that configure how Internet

Explorer, a popular browser from Microsoft, connects to the Internet. These settings do not

exist in every version of Internet Explorer. Wolfsting observes these queries during the first

execution and creates the keys in the registry that the malware instance is attempting to access.

The keys listed in Fig. 7.9 in the <key> tag were created by Wolfsting. Benign applications

that are using Internet Explorer settings may access these keys to check internet connection

settings. However, changing these settings may be considered malicious behavior. Vundo changes

these settings with the assumption that they have been set to values that allow antivirus software

to analyze Internet Explorer browsing sessions for suspicious behavior. Subsequent methods

based on the output of online dynamic malware analyzers, especially those that use clustering

and classification, will give more accurate results with the output in Fig. 7.9 rather than the

output from an unmodified execution.


Figure 7.10: Random identifier string (boldface) used by Vundo - the string changes during every execution

7.4.2 Random Identifiers

Vundo creates registry keys with a random identifier string and a DLL with the same random

identifier string as illustrated in Fig. 7.10. This behavior was captured by feeding two unmodified

baseline execution traces to the Wolfsting new behavior extractor. Note that it is not necessary

that the same processes or threads create these files or keys in the two executions of the Vundo

instance.

7.5 TDSS

TDSS is a recently developed stealthy trojan horse that displays unsolicited advertising and

redirects websites by hijacking browsers on the victim’s machine. TDSS is a good example

of how malware is evolving to use stealth (rootkit) techniques to hide from or even disable

anti-malware software on end-user machines today. We include this trojan in our results not to

claim that we are able to analyze malware that uses rootkits - in fact, Wolfsting cannot analyze

rootkits as explained in section 8.2. The analysis that follows illustrates how Wolfsting can


Figure 7.11: Wolfsting analysis approach for TDSS. The traces of a baseline execution (TDSS driver allowed to load) and a second execution (TDSS driver not allowed to load; the NtLoadDriver call is blocked by Wolfsting) are both fed to the Wolfsting analysis engine.

block malware from loading Windows drivers and help an analyst get an initial idea of some of the

operations that the rootkit is performing. This is done by comparing the execution of TDSS in

the baseline run versus a second execution in which the rootkit Windows driver is not allowed

to load as illustrated in Fig. 7.11.

During an execution of malware in the Wolfsting guest OS, the driver loaded by Wolfsting

has its hooks loaded into memory for the entire duration of the execution. This implies that

there is trace output generated for this entire period, and the last time stamp in the trace

output corresponds to the time when the snapshot of the clean guest OS is restored in the

virtual machine. However, during the execution of TDSS in the baseline run, it was observed

that the time stamp of the last trace output was much earlier than the time when the VM

snapshot was restored. This implied that the Wolfsting driver was unable to trace output for

the whole duration of TDSS’s execution; something disabled its tracing functionality. However,

this problem was not seen in the second execution; the last timestamp of the trace matched the

time when the VM snapshot was restored.
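The timestamp check described in this paragraph is simple enough to automate. A minimal sketch follows; the tolerance value and the example timestamps are hypothetical and chosen only to make the comparison concrete.

```python
from datetime import datetime, timedelta


def tracing_cut_short(last_entry: datetime,
                      snapshot_restored: datetime,
                      tolerance: timedelta = timedelta(seconds=10)) -> bool:
    """True if the trace ends well before the guest was reset."""
    return snapshot_restored - last_entry > tolerance


restore = datetime(2010, 1, 1, 12, 10, 0)
# Trace stops several minutes early: the hooks were probably disabled.
print(tracing_cut_short(datetime(2010, 1, 1, 12, 3, 30), restore))
# Trace runs up to the snapshot restore: tracing stayed active.
print(tracing_cut_short(datetime(2010, 1, 1, 12, 9, 58), restore))
```

A gap larger than the tolerance in the baseline run, combined with no gap in the run where the driver is blocked, points to the loaded driver as the cause.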

Fig. 7.12 displays the last moments of the trace output during the first baseline execution

of the TDSS instance as well as the second execution during which the driver is blocked from

loading. The </scan> tag indicates that Wolfsting trace has ended. The call that loads the

driver on lines 8 to 14 in Fig 7.12a shows that the TDSS driver disabled Wolfsting’s hooks and

prevented further tracing of its activity. The reason this was possible is that the Wolfsting


(a) Last moments of trace output of TDSS instance - Driver allowed to load (result = 0 on line 13 indicates success)

(b) Last moments of trace - TDSS driver blocked from loading. (result != 0 on line 30 indicates failure)

Figure 7.12: Trace output (filtered to show only relevant activity) when running TDSS

implementation is based on Regmon, a tool developed for monitoring registry activity. Regmon

uses Windows System Service Dispatch Table (SSDT) hooking to install system call hooks,

a functionality also used by Wolfsting. The SSDT is a table of pointers to system calls; these

are swapped by the Regmon/Wolfsting driver to point to hooks that capture trace information.

The TDSS instance was coded to prevent Regmon and other monitoring tools from functioning

correctly by restoring the system call pointers in the SSDT to their original values (or to point

to TDSS’s own hooks); the difference in the Wolfsting output traces in Fig. 7.12 can help an

analyst realize such functionality exists in the malware instance. This new behavior cannot be

observed in the trace output of a single execution.
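The effect of swapping and restoring table entries can be illustrated with a toy model in user-space Python. It is only an analogy for the kernel mechanism: the class `ToyDispatchTable`, the helper `install_hook` and the stand-in handler `nt_open_key` are hypothetical names invented for this sketch, and no real SSDT is touched.

```python
def nt_open_key(path):
    """Stand-in for an original system call handler."""
    return f"opened {path}"

class ToyDispatchTable:
    """Maps a system call name to the function that currently services it."""
    def __init__(self):
        self.entries = {"ZwOpenKey": nt_open_key}

    def call(self, name, *args):
        return self.entries[name](*args)

def install_hook(table, name, trace):
    """Replace a table entry with a wrapper that records the call, then forwards it."""
    original = table.entries[name]

    def hook(*args):
        trace.append((name, args))
        return original(*args)

    table.entries[name] = hook
    return original

table = ToyDispatchTable()
trace = []
original = install_hook(table, "ZwOpenKey", trace)

table.call("ZwOpenKey", r"HKLM\SOFTWARE\A")  # recorded by the hook
table.entries["ZwOpenKey"] = original         # pointer restored, as a rootkit driver might do
table.call("ZwOpenKey", r"HKLM\SOFTWARE\B")  # no longer recorded

print(trace)
```

Once the entry points back at the original handler, calls still succeed but nothing reaches the trace, which matches the early end of the baseline trace in Fig. 7.12a.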

7.6 Zeus/ZBot

As described in [28], Zeus/ZBot is a family of password stealing trojans that targets the

online account information stored on a user’s machine, such as usernames and passwords to

banking websites, ftp sites or other accounts. This family of malware has also formed large

botnets all over the world. ZBot uses process injection and various methods of maintaining

presence on a victim machine. Fig. 7.13, retrieved from Technet [29], shows ZBot’s infection

rate starting in April 2007 and continuing into 2010.

ZBot is difficult to analyze because it uses process injection to an extreme. Malicious

activities are divided into tasks that are then spread amongst threads of work. These threads

are injected into other processes, and the pattern of injection - i.e., where each thread of

work goes - changes with every execution. However, Wolfsting’s analysis engine is capable of

taking this behavior into account while comparing two executions as discussed in section 6.2.1.

Wolfsting found additional behavior using instances of ZBot, detailed in the following sections.


Wolfsting recorded requests for several registry keys during baseline executions of Zeus/ZBot

- henceforth referred to as ZBot - instances. Table 7.1 lists the requests made by ZBot

Figure 7.13: ZBot Infection Rate in terms of number of reported infections worldwide as measured by Microsoft

instances during execution.

Wolfsting created the additional keys requested by ZBot and observed new behavior as

recorded in Table 7.2. Note that these system calls illustrate malicious behavior more explic-

itly than those in Table 7.1, since the malware is not just checking for the presence of these

applications but also personal information stored by them. The calls made in Table 7.1 may

also be made by benign software parsing through the Windows registry.

Table 7.1: Windows registry keys requested by ZBot

HKEY_CURRENT_USER\SOFTWARE\Ghisler\Total Commander\

HKEY_LOCAL_MACHINE\SOFTWARE\FlashFXP\3

HKEY_CURRENT_USER\SOFTWARE\ipswitch\ws_ftp

HKEY_CURRENT_USER\SOFTWARE\Far\Plugins\ftp\hosts

HKEY_CURRENT_USER\SOFTWARE\Far2\Plugins\ftp\hosts

HKEY_CURRENT_USER\SOFTWARE\martin prikryl\winscp 2\

HKEY_CURRENT_USER\SOFTWARE\smartftp\client 2.0\settings\backup

Table 7.2: New behavior, i.e., new registry values (boldface) accessed by ZBot after creation of keys in Table 7.1

HKEY_CURRENT_USER\SOFTWARE\Ghisler\Total Commander\installdir

HKEY_LOCAL_MACHINE\SOFTWARE\FlashFXP\3\datafolder

HKEY_CURRENT_USER\SOFTWARE\ipswitch\ws_ftp\datadir

HKEY_CURRENT_USER\SOFTWARE\smartftp\client 2.0\settings\backup\folder

HKEY_CURRENT_USER\SOFTWARE\martin prikryl\winscp 2\sessions
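The relationship between Tables 7.1 and 7.2 can be sketched as a difference between the registry paths seen in two executions. This is a simplified illustration that assumes each trace has already been reduced to a list of registry paths; the function name `new_behavior` is hypothetical, and the real analysis engine compares full system call records rather than bare paths.

```python
def new_behavior(baseline_paths, second_run_paths):
    """Paths accessed in the second run that never appeared in the baseline.

    Registry paths are compared case-insensitively and without a trailing backslash.
    """
    seen = {p.lower().rstrip("\\") for p in baseline_paths}
    return [p for p in second_run_paths if p.lower().rstrip("\\") not in seen]

# Two keys from Table 7.1, requested in the baseline run.
baseline = [
    "HKEY_CURRENT_USER\\SOFTWARE\\ipswitch\\ws_ftp",
    "HKEY_LOCAL_MACHINE\\SOFTWARE\\FlashFXP\\3",
]
# After the keys are created, the second run also reads values beneath them (Table 7.2).
second_run = baseline + [
    "HKEY_CURRENT_USER\\SOFTWARE\\ipswitch\\ws_ftp\\datadir",
    "HKEY_LOCAL_MACHINE\\SOFTWARE\\FlashFXP\\3\\datafolder",
]

for path in new_behavior(baseline, second_run):
    print(path)
```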

As an experiment, WinSCP2, a popular FTP program, was installed (manually) in the

snapshot of the guest OS before one ZBot instance’s baseline execution. The username and

password to a dummy FTP site (google.com) were stored in the program’s configuration. This

resulted in the additional behavior in Fig. 7.14. ZBot accessed the username and password

from the registry - the actual work is done by a worker thread injected into explorer.exe. This

additional set of system calls is valuable to subsequent clustering or classification as mentioned

previously. As explained in section 9.2.1, future work on Wolfsting may include automatically

installing software that malware is looking for in response to requests such as in Table 7.1.

7.6.2 Random Identifiers

ZBot creates registry keys, folders, files and a process with random identifier strings as

illustrated in Fig. 7.15. This behavior was captured by feeding two unmodified baseline execu-

tion traces to Wolfsting’s new behavior extractor. Note that it is not necessary that the same

processes or threads create these files or keys in the two executions of the ZBot instance.
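One way to picture how two unmodified baseline traces expose random identifiers is shown below. It is a rough sketch under stated assumptions: each trace is reduced to a list of created paths, the creating process is ignored (as noted above), and the function name and sample paths are hypothetical rather than taken from Fig. 7.15.

```python
import ntpath
from collections import defaultdict

def random_identifier_candidates(run_a, run_b):
    """Find parent locations where each run created a name the other run did not."""
    def by_parent(paths):
        grouped = defaultdict(set)
        for path in paths:
            parent, leaf = ntpath.split(path)
            grouped[parent.lower()].add(leaf.lower())
        return grouped

    a, b = by_parent(run_a), by_parent(run_b)
    result = {}
    for parent in a.keys() & b.keys():
        only_a = a[parent] - b[parent]
        only_b = b[parent] - a[parent]
        if only_a and only_b:
            result[parent] = (sorted(only_a), sorted(only_b))
    return result

# Hypothetical creations from two executions of the same sample.
run_a = [r"HKCU\Software\Microsoft\Ylko", r"C:\Temp\fixed.log"]
run_b = [r"HKCU\Software\Microsoft\Qebu", r"C:\Temp\fixed.log"]

print(random_identifier_candidates(run_a, run_b))
```

Names that are identical in both runs, such as the fixed log file here, are not reported; only locations whose child names differ between runs are flagged.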

Figure 7.14: ZBot worker thread in explorer.exe stealing username and password information. The API displayed here is ZwQueryRegistryValueKey, which retrieves a value corresponding to a given registry key. This behavior can only be observed if the requested keys exist in the registry.

Figure 7.15: Random identifier strings used by ZBot

7.7 Observations and Summary

7.7.1 Overhead analysis

Binaries are executed in Wolfsting with a fixed timeout period of two minutes. With

two executions and the guest OS snapshot restoration overhead (before and after the first run),

the total execution (clock) time for the entire process as listed in section 5.1 - measured on the

hardware mentioned in section 6.1 - is 300 seconds. Note that the amount of time for a single

run, specifically the amount of time malware processes are allowed to run in the guest OS is

customizable and hence 300 seconds represents the maximum time as specified by our design.

The overhead involved in running a binary in the guest OS with the Wolfsting driver loaded

into kernel memory was measured as follows. A userland application was created that launched

the Firefox executable (firefox.exe). The userland application measured the time taken for the

CreateProcess Windows User API call to complete. This single userland API call causes several

native API calls in the Windows kernel, including file, registry and process operations (covering

the major categories of native API calls) most of which are hooked by the Wolfsting driver.

The total time taken for CreateProcess to complete with no driver loaded into memory, i.e. a

clean guest OS snapshot, averaged over 25 trials was 375 milliseconds. The total average time

taken with the kernel driver loaded into memory was 401.5 milliseconds. To prevent effects such as

caching etc. from affecting the overhead analysis, the guest OS snapshot was restored before

each execution of the userland program. Thus the runtime overhead of the Wolfsting driver

measured within the guest OS is 6.61%.
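The overhead arithmetic can be reproduced from the two averages, assuming both are in milliseconds. The sketch below computes the added time relative to the clean baseline and relative to the hooked run; which denominator the reported figure uses is not stated in the text, so both are shown.

```python
baseline_ms = 375.0  # average CreateProcess time, no driver loaded
hooked_ms = 401.5    # average CreateProcess time, Wolfsting driver loaded

added_ms = hooked_ms - baseline_ms
overhead_vs_baseline = 100 * added_ms / baseline_ms
overhead_vs_hooked = 100 * added_ms / hooked_ms

print(f"added time: {added_ms:.1f} ms")
print(f"relative to baseline: {overhead_vs_baseline:.2f}%")
print(f"relative to hooked run: {overhead_vs_hooked:.2f}%")
```

With these inputs the added time is 26.5 ms, which is about 7.07% of the baseline and about 6.60% of the hooked run; the reported 6.61% is close to the second figure.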

7.7.2 Trace output

Most of the malware executables we experimented upon had some common trace output

that was the result of additional housekeeping or non-malicious activity executed by system

libraries. It is not possible for a system to, without human intervention, decide if a single

system call is definitely indicative of malicious or non-malicious behavior. The housekeeping

call filtering system depends on the manual filter list; more work needs to be done on this

component to achieve higher accuracy in the trace output. Note, however, that the filter list

even in its current form provides a context awareness (in terms of the specific guest OS details)

that focuses the additional behavior towards malicious activity. Also, subsequent methods based

on Wolfsting-like trace output may use data mining methods to filter out non-malicious

activity.
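A manual filter list of the kind described here might be applied as in the following sketch. The wildcard patterns, the trace tuples and the function name are assumptions made for illustration; the actual filter list and trace format used by Wolfsting are not reproduced here.

```python
import fnmatch

def filter_housekeeping(trace, filter_patterns):
    """Drop trace entries whose target matches any pattern in the manual filter list."""
    patterns = [p.lower() for p in filter_patterns]
    kept = []
    for call, target in trace:
        lowered = target.lower()
        if any(fnmatch.fnmatchcase(lowered, p) for p in patterns):
            continue  # treated as housekeeping for this guest OS
        kept.append((call, target))
    return kept

# Hypothetical filter list tied to one guest OS installation.
filter_list = [
    r"c:\windows\system32\*.dll",
    r"hklm\system\currentcontrolset\*",
]
trace = [
    ("ZwOpenFile", r"C:\WINDOWS\system32\kernel32.dll"),
    ("ZwOpenKey", r"HKCU\SOFTWARE\ipswitch\ws_ftp"),
]

print(filter_housekeeping(trace, filter_list))
```

Because the patterns name paths of a specific installation, the list carries the guest OS context mentioned above, and it has to be maintained by hand as that installation changes.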

7.7.3 Registry activity bias

Quantitatively, the most additional behavior was seen in registry activity rather than in file,

network or process activity. This is logical: the largest information base on a Windows system

is the registry; everything from binary execution options to usernames and passwords is saved

in this information store. Searching for software installations mostly involves checking for

the presence of registry keys or software folders. Thus this is considered as a normal bias in

Wolfsting traces.

7.7.4 Summary

We have shown that the Wolfsting process is able to automatically elicit new behavior -

as specified in Chapter 3 - in malware binaries with a fixed analysis time and low runtime

overhead.

Chapter 8

Limitations

8.1 Limitations Of Dynamic Analysis

All dynamic analysis approaches, including Wolfsting, suffer from certain limitations inher-

ent to this type of analysis, as listed below:

1. Trace Dependence: As listed in [18], dynamic analysis systems depend on the trace output

of a single specific execution of a malware binary. While Wolfsting uses a minimum of

two executions to gain additional trace output, it simply cannot force malware execution

that is strictly trigger based, such as malware executing a payload on a specific date, or

malware that requires specific human interaction, such as visiting a specific bank website.

Methods that attempt to cover all possible execution paths are discussed in section 2.3.1.2.

In spite of their overhead, a comprehensive analysis may mandate the use of one of these

methods.

2. Detection of analysis environment: Malware families may attempt to detect an analysis

environment such as a virtual machine or an emulated environment. Once such an

environment is detected, malware may then proceed to disable its own functionality, stop

execution of all its threads and processes, or even erase itself from the system completely.

This would lead to analysis systems such as Wolfsting being thwarted, preventing them

from generating accurate trace output. No implementation of such detection is perfect;

thus malware authors may choose to not incorporate such detection into their code - they

might miss potential victims that come up as false positives. Methods to prevent such

detection, with a focus on VMware, are discussed in [30], while the undecidability of

self-determination by a program on the question of whether it is running in a virtual

machine is described in [31].

8.2 Limitations Of Wolfsting

1. Trigger based behavior: As mentioned in section 8.1, it is difficult to explore trigger

based malware behavior, and even more so in systems such as CWSandbox, Anubis and

Wolfsting simply because of the practical limits that these systems aim to adhere to. For

instance, malware binaries are run in the CWSandbox system for a maximum of two

minutes. Any malware activity that is initiated after this time period will be missed, as is

the case with Wolfsting. The argument for restricting execution time is that malware is

highly opportunistic; malware must attempt to execute its payload before the user ends

his computer session, before the malware’s command server is taken down by authorities,

or even before antivirus software begins a scan of the victim machine. Techniques such

as manipulating the clock in the guest OS may be used, but only if the malware is using

event based triggers (an event is fired indicating to the malware that a certain time period

has passed or a certain timestamp has been reached); if the malware is polling the time

- infrequently - and will execute only at a specified time, only static analysis of the code

may provide a solution. With regard to time triggers, the amount of time for a single

malware execution is customizable in Wolfsting’s design, and [11, 40] describe methods

to discover such triggers.

2. Rootkits: Wolfsting cannot analyze kernel based rootkits, i.e., rootkits that install hooks

or modify kernel data structures to hide malicious activity. This is because Wolfsting

considers the virtual machine guest OS kernel as a trusted component of its design, and

any compromise of kernel integrity will mean that Wolfsting’s trace output cannot be

trusted. Note that Wolfsting makes an attempt to block kernel code from executing by

preventing any attempt by malware to load drivers into the guest OS kernel. However,

there are other more subtle ways to achieve kernel privilege, as illustrated by the Windows

Kernel Exception Handler vulnerability in [32]. Thus Wolfsting cannot currently analyze

this category of malware.

8.3 Techniques to thwart Wolfsting

Malware authors have at their disposal several tricks to thwart any analysis system, ranging

from implementing simple trigger based behavior to thwart several dynamic analysis systems to

more complex statistical or heuristic based approaches to detect an analysis system and change

or stop execution. We now proceed to list some techniques that may be used by malware

authors to subvert analysis by Wolfsting:

1. String matching-search: Instead of searching for a specific setting or resource, malware

may choose to locate all resources or settings at a particular location, and then use internal

string matching algorithms to locate a particular match in this list of resources. Wolfsting

depends on malware querying for a resource or setting using a specific identifier, and thus

analysis would be thwarted if malware implements this kind of string matching search

algorithm. This behavior, however, has not been observed in malware samples analyzed

using Wolfsting; other approaches such as disassembly to determine the internal strings

in a malware binary may be required to overcome this obstacle. Note that malware may

choose not to use such an algorithm simply because of the overhead involved.

2. Compromising Wolfsting kernel hooks: The current implementation of Wolfsting is based

on Regmon (now evolved into Process Monitor), a popular tool amongst malware ana-

lysts. Several malware authors have incorporated code into their binaries to terminate

the Regmon process, unload its driver or uninstall its kernel hooks. These techniques may

also be used to compromise Wolfsting on the guest OS. Some precautions can be taken;

for instance, it is possible for Wolfsting to poll continuously to check that its hooks are

installed. However, there is no guarantee that Wolfsting cannot be compromised in this

manner.

3. Unnecessary or poisoned queries: Malware may make large volumes of queries for resources

or settings that do not exist, causing Wolfsting to have to create those resources or

settings without benefit and thus providing an unnecessarily large amount of information

to the user. However, this kind of behavior may be avoided by malware since it would

defeat the goal of being stealthy. Malware may also test for the presence of Wolfsting

by requesting a resource with a randomized name that cannot possibly be present on

any system. Since Wolfsting will attempt to fake the existence of this resource, this may

be used as an indicator of Wolfsting’s presence by the malware which may proceed to

stop or alter execution. There is no easy way to prevent this method of detection; any

protocol replay system may also be compromised in this manner. One brute force method

is to selectively create resources or settings spread across several isolated executions to

see which one causes deviation due to the resource creation component of Wolfsting -

this will involve a very large overhead in terms of analysis time. Another technique to

thwart Wolfsting using the same approach is to perform queries in multiple steps. For

example, a file located at C:\ABC\XYZ\PQR.txt may be accessed using three system

calls, one to open C:\ABC, another to open C:\ABC\XYZ, and finally the third to open

the file. This would need multiple sequential Wolfsting runs before the query for PQR.txt

is finally observed; each step would involve the creation of one directory in the path. This

kind of tedious programming is not seen in the real world, but malware authors may use it to

specifically subvert Wolfsting-like systems.
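The cost of the multi-step query technique in item 3 can be estimated with a small simulation. It assumes a sample that opens each component of a path in order and stops at the first one that is missing, and an analysis system that creates exactly one missing resource per run; the function names are hypothetical.

```python
import ntpath

def path_prefixes(path):
    """Split a Windows path into the successive prefixes a sample might open."""
    drive, rest = ntpath.splitdrive(path)
    prefixes, current = [], drive
    for part in (p for p in rest.split("\\") if p):
        current = current + "\\" + part
        prefixes.append(current)
    return prefixes

def runs_until_observed(path):
    """Count the runs needed before the query for the final component is seen."""
    prefixes = path_prefixes(path)
    existing = set()
    for run, _ in enumerate(prefixes, start=1):
        # In each run the sample stops at the first component that does not exist.
        first_missing = next(p for p in prefixes if p not in existing)
        if first_missing == prefixes[-1]:
            return run
        existing.add(first_missing)  # resource created before the next run

print(path_prefixes(r"C:\ABC\XYZ\PQR.txt"))
print(runs_until_observed(r"C:\ABC\XYZ\PQR.txt"))  # 3
```

For the example path, two directories must be created in two earlier runs, so the query for PQR.txt is first observed in the third run; each extra path component adds one more run.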

Chapter 9

Conclusions And Future Work

9.1 Conclusions

Malware has evolved from simple worms that displayed annoying messages or deleted a few

files to an industry wherein professional programmers are hired to create targeted malware

packages that infiltrate computers belonging to home and corporate users alike. This threat

has caused great financial and economic loss in terms of hard currency as well as time spent on

combating or repairing the damage caused by malicious activity. Research in countering this

threat has also accelerated at an exponential pace, with universities and industry responding

with ever greater speed to new and sophisticated malware techniques. Research focus has

shifted from traditional signature based detection schemes to more complex static, dynamic or

hybrid malware detection techniques that are automated and highly scalable. Online dynamic

malware analyzers help in providing a detailed trace of a malware instance’s execution without

human intervention; in this thesis we attempt to improve upon the work done on these systems.

In this thesis, we presented Wolfsting, a system that improves upon previous work on on-

line dynamic analysis systems. This was followed by a discussion of related work in the field,

with special focus on malware analysis including static and dynamic analysis. Systems like

CWSandbox and Anubis, useful tools to analysts attempting to understand malware behavior

and reverse engineer malware instances, were described in detail and presented as the founda-

tions of this work. The motivation that led to Wolfsting was discussed; extracting additional

behavior from malware instances by presenting them with exactly those environments - files,

keys, etc. - they are looking for. The design of the Wolfsting system was presented, discussing in

detail the main components - the guest OS components, the controller and the analysis engine.

This was followed by the steps in running a malware binary through the Wolfsting process - a

discussion intended to provide a complete end to end picture of the system. Implementation

details were discussed, with OS and language specific details provided. We then presented some

results from experiments in which infamous malware binaries were subjected to the Wolfsting

process. Finally, we discussed the limitations of our work, and techniques to thwart the

Wolfsting implementation. We now proceed to note some observations on our approach as well as the

results seen.

9.1.1 Observations

Malware, being targeted to users or software installations, tends to look for certain settings

or resources on the system. We noted that several of these queries originate due to non-

malicious, housekeeping or system activity. An automated process that follows a system like

Wolfsting, such as one that performs clustering or constructs call flow graphs based on trace

output, can use various data mining algorithms to extract only unique behavior from such

traces, but for a detailed analysis including reverse engineering, a human analyst needs to parse

through the trace. It is not our intent to completely replace the human analyst; rather this

thesis focuses on augmenting a useful set of tools that 1) help the analyst understand and

possibly reverse engineer malware behavior, and 2) increase the efficiency of

automated systems by providing traces with more unique malware behavior.

Determining what software a malware will interact with before actually executing the

malware is almost impossible. Wolfsting attempts to set up the environment to enable more malware

behavior by noting what the malware does, not by the obvious brute-force method of installing

several popular software packages. While this may initially seem like an attractive idea, es-

pecially a decade ago when popular software was limited to a few hundred titles, the software

industry today has grown exponentially, and the vast number of software packages and ver-

sions of each of these software packages make it impossible to have installations of all popular,

targeted-by-malware software on an analysis machine.

9.1.2 Summary

Wolfsting is a highly configurable tool to improve the trace output of online dynamic mal-

ware analysis systems such as CWSandbox and Anubis. As shown in Chapter 7, Wolfsting

was able to extract additional behavior - i.e., behavior that may not be seen in an unmodified

execution - in very recently observed and widely known real world malware instances. This

additional behavior included attempts to change registry keys to disable monitoring programs,

disable antivirus functionality, and exploit browser vulnerabilities. We also presented a case

where a manual installation in the guest OS of a popular software package resulted in augmented trace

output that is valuable to both automated analysis systems and human analysts. This extra

trace output is proof that scope exists to create specialized software environments in online

dynamic malware analysis systems to extract further malware behavior. The Wolfsting im-

plementation presented is in a prototype stage; much scope for improvement and future work

exists as will be discussed in the next section.

9.2 Future Work

The Wolfsting system engages malware by presenting malware binaries with exactly the

software environment (files, registry settings etc.) that they are looking for. This results in

additional, unique malware behavior that would otherwise not be traced without this special

environment. We now present some ideas preceded by a discussion of the scope for future work

using the Wolfsting approach and implementation as a foundation.

9.2.1 Scope for future work

We believe that the Wolfsting implementation is in an infant stage and can be improved

upon greatly. To address its limitations as specified in Chapter 8, we propose the following

ideas to exemplify the scope for improvement in our work:

1. Combination with other schemes: Using schemes that involve static analysis or hybrid

(static and runtime) disassembly, such as [4], more information about a particular malware

instance may be gleaned out of the binary executable itself, allowing greater accuracy in

setting up the software environment that will force execution of more code in the malware.

This includes strings in the binary, statistically significant byte or instruction patterns

that indicate presence of code from previously known malware, etc.

2. Tackling rootkits: Rootkits are a particularly hard subject for automated analysis systems

because of the lack of trust involved when the guest OS kernel is compromised. Tracing

done externally, i.e., outside the guest system is the solution to this; constructing the view

of the guest OS from outside the guest OS is a problem that has been tackled in Jiang et

al. [43]. The Wolfsting design would have to be modified to use such an external tracing

mechanism to analyze rootkits.

3. Software Artifact Database: Although the brute force method of installing software in the

Wolfsting guest OS is not attractive as discussed in section 9.1.1, it may be possible to

have a database of software artifacts (files, registry keys, processes) that normally indicate

the presence of a particular software package. When Wolfsting encounters a query for one

of these artifacts, it may then be possible to install (or simulate) either the whole software

or all the artifacts associated with it; malware may then execute the payload associated

with the presence of the software package.

4. Non malicious software: Wolfsting may also be used to analyze non malicious software.

Unwanted behavior as defined by several security companies is behavior that is not

necessarily categorized as malicious, but raises issues of ethics, privacy, etc. Usually this

involves doing something (intentionally or unintentionally) on a user’s system that the user

would not allow if she had knowledge of the action. Wolfsting may be used to monitor an

application and provide the software environment it is looking for, to see if the resultant

actions are in accordance with user wishes, corporate policy etc.
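The software artifact database proposed in item 3 could take a form like the one sketched below. This is only an illustration of the idea: the dictionary layout, the package labels and the lookup function are assumptions, and the registry paths are borrowed from Tables 7.1 and 7.2.

```python
# Hypothetical database: package label -> registry artifacts that indicate its presence.
ARTIFACT_DB = {
    "WinSCP": [
        r"HKEY_CURRENT_USER\SOFTWARE\martin prikryl\winscp 2",
    ],
    "FlashFXP": [
        r"HKEY_LOCAL_MACHINE\SOFTWARE\FlashFXP\3",
    ],
}

def packages_for_query(query, db=ARTIFACT_DB):
    """Return packages whose artifacts equal the queried path or contain it as a sub-path."""
    q = query.lower().rstrip("\\")
    matches = []
    for package, artifacts in db.items():
        for artifact in artifacts:
            a = artifact.lower()
            if q == a or q.startswith(a + "\\"):
                matches.append(package)
                break
    return sorted(matches)

print(packages_for_query("HKEY_CURRENT_USER\\SOFTWARE\\martin prikryl\\winscp 2\\"))
print(packages_for_query(r"HKEY_LOCAL_MACHINE\SOFTWARE\FlashFXP\3\datafolder"))
```

A match would tell the resource creation component which package to install or simulate before the next execution, instead of creating only the single key that was requested.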

9.2.2 Usefulness in Corporate Environments

The recent and widely publicized attack on Google [42] was accomplished using simple social

engineering; an employee clicking on a malicious link in an instant messenger window was all

that was required to compromise Google’s systems, allowing hackers to steal the source code to

Google’s password management system. A company’s source code is perhaps its most valuable

asset. We propose that Wolfsting can be configured as a honeypot-like defense mechanism for

such attacks; a user in a corporate environment may submit a link or a file (the user’s system

may also automatically submit downloaded files) to Wolfsting for processing; the Wolfsting

guest OS may have a fake source code management client and fake source code trees that

imitate the real software product source code tree. Any actions taken by the malware can then

be monitored and suspicious activity reported to IT security. It may be that the malware

instance targets several code management systems such as SVN, CVS, Perforce, etc.; in this case

Wolfsting’s resource creation mechanism may step in to create these software environments.

9.2.3 Usefulness to Individual or Home users

The Wolfsting system presents a detailed trace of a malware binary’s execution, with aug-

mented output that includes additional behavior as a result of providing an environment to

the malware that forces it to execute more of its code base. This output can also be modified

to provide the exact sequential steps to remove the malware from a user’s machine. This is

helped by the fact that Wolfsting is capable of automatically detecting random identifiers used

by malware as detailed in section 3.2.5.1. With the tremendous penetration of broadband

connectivity in the world, we also propose this radical idea: let the user’s environment be copied

into Wolfsting’s guest OS; this will force malware to execute any payload that is specific to that

user or software environment.

9.3 Summary

In this final chapter, we discussed some of the observations we made working on Wolfsting

as well as the conclusions drawn from these observations. We presented the scope for future work

as well as some ideas to extend the functionality and applicability of Wolfsting. This concludes

the thesis.

Bibliography

[1] Carsten Willems, Thorsten Holz, Felix Freiling, "Toward Automated Dynamic Malware

Analysis Using CWSandbox," IEEE Security and Privacy, vol. 5, no. 2, pp. 32-39,

Mar./Apr. 2007, doi:10.1109/MSP.2007.45

[2] ANUBIS. http://anubis.iseclab.org, 2009.

[3] Weidong Cui, Vern Paxson, Nicholas C. Weaver, and Randy H. Katz. Protocol-independent

adaptive replay of application dialog. In Network and Distributed System Security Sym-

posium, San Diego, CA, February 2006.

[4] Susanta Nanda, Wei Li, Lap-Chung Lam, and Tzi-cker Chiueh. BIRD: Binary interpre-

tation using runtime disassembly. In International Symposium on Code Generation and

Optimization, NY, March 2006.

[5] Xu Chen, Jon Andersen, Z. Morley Mao, Michael Bailey, and Jose Nazario. Towards an

understanding of anti-virtualization and anti-debugging behavior in modern malware. In

International Conference on Dependable Systems and Networks, Anchorage, AK, June

2008.

[6] A. Moser, C. Kruegel, and E. Kirda. Limits of Static Analysis for Malware Detection. In

ACSAC, pages 421–430. IEEE Computer Society, 2007.

[7] Bayer, U., Moser, A., Kruegel, C., & Kirda, E. (2006). Dynamic analysis of malicious code.

Journal in Computer Virology, 2, 67–77.

[8] U. Bayer, C. Kruegel, and E. Kirda. TTAnalyze: A Tool for Analyzing Malware. In 15th

European Institute for Computer Antivirus Research (EICAR 2006) Annual Conference,

April 2006.

[9] F. Bellard. Qemu, a Fast and Portable Dynamic Translator. In Usenix Annual Technical

Conference, 2005.

[10] D. Brumley, C. Hartwig, Z. Liang, J. Newsome, P. Poosankam, D. Song, and H. Yin.

Automatically identifying trigger-based behavior in malware. Book chapter in “Botnet

Analysis and Defense”, editors Wenke Lee et al., 2007.

[11] Andreas Moser, Christopher Kruegel, Engin Kirda, Exploring Multiple Execution Paths

for Malware Analysis, Proceedings of the 2007 IEEE Symposium on Security and Privacy,

p.231-245, May 20-23, 2007, doi:10.1109/SP.2007.17

[12] X. Hu, T. Chiueh, and K. Shin, “Large-scale malware indexing using function-call graphs,”

in ACM Conf. on Computer and Communications Security (CCS), 2009.

[13] M. G. Kang, P. Poosankam, and H. Yin, “Renovo: a hidden code extractor for packed

executables,” in ACM Workshop on Recurring malcode (WORM), 2007.

[14] L. Martignoni, M. Christodorescu, and S. Jha, “Omniunpack: Fast, generic, and safe

unpacking of malware,” in Annual Computer Security Applications Conf. (ACSAC), 2007.

[15] Clemens Kolbitsch, Paolo Milani Comparetti, Christopher Kruegel, Engin Kirda, Xiaoyong

Zhou, and Xiaofeng Wang. Effective and efficient malware detection at the end host. In

USENIX Security Symposium, Montréal, Canada, August 2009.

[16] Konrad Rieck, Guido Schwenk, Tobias Limmer, Thorsten Holz and Pavel Laskov. Botzilla:

Detecting the "Phoning Home" of Malicious Software. Proc. of 25th ACM Symposium on

Applied Computing (SAC), 1978-1984, March 2010.

[17] Juan Caballero, Noah M. Johnson, Stephen McCamant and Dawn Song. Binary Code

Extraction and Interface Identification for Security Applications. In Proceedings of the 17th

Annual Network and Distributed System Security Symposium, San Diego, CA, February

2010.

[18] Ulrich Bayer, Paolo Milani, Clemens Hlauschek, Christopher Kruegel, and Engin Kirda.

Scalable, Behavior-Based Malware Clustering. 16th Annual Network and Distributed Sys-

tem Security Symposium (NDSS 2009), San Diego, February 2009

[19] Konrad Rieck, Automatic Analysis of Malware Behavior using Machine Learning,

http://honeyblog.org/junkyard/paper/malheur-TR-2009.pdf

[20] Weidong Cui. Automating Malware Detection by Inferring Intent. PhD thesis, University

of California, Berkeley, September 2006.

[21] David Wagner and Drew Dean. Intrusion detection via static analysis. In Proceedings of

the 2001 IEEE Symposium on Security and Privacy, May 2001.

[22] M. Gheorghescu. An Automated Virus Classification System. In Virus Bulletin conference,

2005.

[23] Christodorescu, M., And Jha, S. Static Analysis of Executables to Detect Malicious Pat-

terns. In Usenix Security Symposium (2003).

[24] Christodorescu, M., Jha, S., Seshia, S., Song, D., And Bryant, R. Semantics-Aware Mal-

ware Detection. In IEEE Symposium on Security and Privacy (2005).

[25] Kruegel, C., Robertson, W., And Vigna, G. Detecting Kernel-Level Rootkits Through

Binary Analysis. In Annual Computer Security Applications Conference (ACSAC) (2004).

[26] Provos, N. A Virtual Honeypot Framework. Tech. Rep. 03-1, CITI (University of Michigan),

Oct. 2003.

[27] Wikipedia, Component Object Model, http://en.wikipedia.org/wiki/Component_Object_Model

[28] F-Secure Description of the ZBot Trojan,

http://www.f-secure.com/v-descs/trojan-spy_w32_zbot.shtml

[29] Microsoft Technet blog on ZBot,

http://blogs.technet.com/b/mmpc/archive/2010/03/11/got-zbot.aspx

[30] Matthew Carpenter, Tom Liston, Ed Skoudis, "Hiding Virtualization from Attackers

and Malware," IEEE Security and Privacy, vol. 5, no. 3, pp. 62-65, May/June 2007,

doi:10.1109/MSP.2007.63

[31] Shay Gueron, Jean-Pierre Seifert: On the Impossibility of Detecting Virtual Machine Mon-

itors. SEC 2009: 143-151

[32] Vulnerabilities in Windows Kernel Could Allow Elevation of Privilege (977165). Microsoft

Security Bulletin MS10-015.

http://www.microsoft.com/technet/security/Bulletin/MS10-015.mspx

[33] Wikipedia, Vundo Description, http://en.wikipedia.org/wiki/Vundo

[34] Microsoft Technet Blog on Top Windows malware,

http://blogs.technet.com/b/mmpc/archive/2009/05/20/860000-computers-cleaned-from-password-stealer-infections-in-one-week.aspx

[35] Zhang, Q., Reeves, D.S.: MetaAware: Identifying Metamorphic Malware. In: Choi, L.,

Paek, Y., Cho, S. (eds.) ACSAC 2007. LNCS, vol. 4697, Springer, Heidelberg (2007)

[36] Zhang Q., Polymorphic and Metamorphic Malware Detection, PhD thesis, North Carolina

State University, May 2008.

[37] VxHeavens Malware Collection, http://vx.netlux.org/

[38] Offensive Computing Malware Collection, http://www.offensivecomputing.org/

[39] Virustotal malware scanner, http://www.virustotal.com/

[40] Crandall, J. R., Wassermann, G., de Oliveira, D. A., Su, Z., Wu, S. F., and Chong, F. T. 2006. Temporal search: detecting hidden malware timebombs with virtual machines. SIGARCH Comput. Archit. News 34, 5 (Oct. 2006), 25-36.

[41] Windows SysInternals, http://technet.microsoft.com/en-us/sysinternals/default.aspx

[42] Wired Magazine report on online attack targeting Google,

http://www.wired.com/threatlevel/2010/04/google-hackers/

[43] X. Jiang, X. Wang, and D. Xu. Stealthy Malware Detection Through VMM-Based “Out-of-the-Box” Semantic View Reconstruction. In CCS, pages 128–138, 2007.

[44] A. Dinaburg, P. Royal, M. Sharif, and W. Lee. Ether: Malware analysis via hardware virtualization extensions. In Proceedings of the 15th ACM Conference on Computer and Communications Security.

[45] Vinod Ganapathy, Somesh Jha, David Chandler, David Melski, David Vitek, Buffer overrun detection using linear programming and static analysis, Proceedings of the 10th ACM conference on Computer and communications security, October 27-30, 2003, Washington D.C., USA [doi>10.1145/948109.948155]

[46] BRUMLEY, D., CABALLERO, J., LIANG, Z., NEWSOME, J., AND SONG, D. Towards automatic discovery of deviations in binary implementations with applications to error detection and fingerprint generation. In Proceedings of USENIX Security Symposium (USENIX Security 2007).

[47] Wikipedia, Definition of Polymorphic code, http://en.wikipedia.org/wiki/Polymorphic_code

[48] Wikipedia, Definition of Metamorphic code, http://en.wikipedia.org/wiki/Metamorphic_code

[49] P. Fogla, M. Sharif, R. Perdisci, O. M. Kolesnikov, and W. Lee. Polymorphic blending

attack. In Proceedings of the 15th USENIX Security Symposium (Security’06), 2006.

[50] Chinchani, R., Berg, E.V.D.: A fast static analysis approach to detect exploit code inside

network flows. In: Proceedings of the International Symposium on Recent Advances in

Intrusion Detection (RAID), (2005)

[51] Li, Z., Sanghi, M., Chen, Y., Kao, M.-Y., Chavez, B.: Hamsa: fast signature generation

for zero-day polymorphic worms with provable attack resilience. In: Proceedings of the

2006 IEEE Symposium on Security and Privacy, pp. 32–47, 2006

[52] C. Krügel, E. Kirda, D. Mutz, W. Robertson, and G. Vigna. Polymorphic worm detection

using structural information of executables. In RAID, 2005.

[53] Mohamed R. Chouchane, Arun Lakhotia, Using engine signature to detect metamorphic malware, Proceedings of the 4th ACM workshop on Recurring malcode, November 03-03, 2006, Alexandria, Virginia, USA [doi>10.1145/1179542.1179558]

[54] Microsoft Developer Network (MSDN), http://msdn.microsoft.com/en-us/default.aspx

[55] Kaspersky Antivirus, http://www.kaspersky.com/
