Capturing Information Flow with Concatenated Dynamic Taint Analysis

Hyung Chan Kim, Angelos D. Keromytis, Michael Covington, Ravi Sahita

IEEE ARES 2009



Authors

• Hyung Chan Kim

– He is currently an Expert Researcher at the National Institute of Information and Communications Technology.

– Before February 2009, he was in the Network Security Lab at Columbia University in the City of New York.


• Angelos Keromytis

– He is an associate professor in the Computer Science department at Columbia University, in New York.

– He is also the director of the Network Security Lab.

– His main research interests are in computer security, cryptography, and networking.


• Michael Covington

– Senior Research Scientist, Adjunct Professor of Computer Science

– Associate Director of the Institute for Artificial Intelligence, The University of Georgia

– Research areas: natural language understanding, computational psycholinguistics, information retrieval and extraction, logic programming, microcontroller applications, image processing


• Ravi Sahita

– He is a Senior Researcher in the Communication Technology Lab in Intel’s Corporate Technology Group.

– He is currently working on platform approaches to address computer security issues.


IEEE ARES 2009


Background

• Dynamic taint analysis (DTA) is a technique for tracking information flow by propagating taint tags across memory locations during program execution.

• Most implementations of DTA are based on dynamic binary instrumentation (DBI) frameworks or whole-system emulators/virtual machine monitors.


Background

• The boundary of information tracking with DBI frameworks is a single process.

• However, there is an increasing need for tracking information flow across single-system boundaries and across the whole enterprise.

• They describe an architecture for tracking multiple mixed-information flows among several processes across a distributed enterprise.

• Their DTA tool is based on PIN and the concatenated DTA processing is realized with per-host flow managers.


Single-Process DTA

• Information Flow Model

– Data dependency

• They identify information flow from actions that copy or transform data, i.e., information flows from memory source(s) to memory destination(s).

– Control dependency

• Dynamic dependency, based on program control structure (indirect information flow).

• They do not currently consider such control dependencies.
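The difference between the two dependency kinds can be sketched in a few lines. This is only an illustration, not SeeC code: the `Tainted` class and the helper functions are hypothetical names, and taint is reduced to a single boolean flag.

```python
# Hypothetical illustration: each value carries a taint flag next to its data.
class Tainted:
    def __init__(self, value, tainted=False):
        self.value = value
        self.tainted = tainted


def copy(src):
    # Data dependency (copy): the destination is derived directly from the
    # source, so the taint flag travels with the value.
    return Tainted(src.value, src.tainted)


def add(a, b):
    # Data dependency (transformation): the result depends on both operands.
    return Tainted(a.value + b.value, a.tainted or b.tainted)


def branch_on(secret):
    # Control dependency: `out` is only ever assigned constants, so a tracker
    # that follows data dependencies alone leaves it untainted, even though
    # its value reveals whether secret.value is non-zero.
    if secret.value != 0:
        out = Tainted(1)
    else:
        out = Tainted(0)
    return out


secret = Tainted(42, tainted=True)
public = Tainted(1)

copied = copy(secret)
summed = add(secret, public)
leaked = branch_on(secret)
```

Here `copied` and `summed` end up tainted, while `leaked` does not, which is the kind of indirect flow a data-dependency-only tracker misses.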


Design and Implementation

• The architecture keeps track of the association between a taint tag in shadow memory and the memory/registers handled by program instructions.


Shadow Memory for Tag Management

• Previous work

– Each bit of process memory has a corresponding unit of shadow memory (tag).

• Their work

– Each byte of application memory is mapped to a unit in shadow memory.

– A change of any bit in a byte results in tainting the tag for that byte as a whole.
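Byte-granular tagging can be sketched as follows. The `ShadowMemory` class, the dictionary-backed layout, and `write_bit` are assumptions made for illustration; the slides do not describe how SeeC actually lays out its shadow memory.

```python
class ShadowMemory:
    """Maps each application byte address to one tag (hypothetical layout)."""

    def __init__(self):
        self._tags = {}  # address -> True; a missing entry means "clear"

    def taint_byte(self, addr):
        self._tags[addr] = True

    def clear_byte(self, addr):
        self._tags.pop(addr, None)

    def is_tainted(self, addr):
        return self._tags.get(addr, False)


def write_bit(memory, shadow, addr, bit, value, source_tainted):
    """Set or clear one bit of memory[addr]; the tag covers the whole byte."""
    mask = 1 << bit
    if value:
        memory[addr] |= mask
    else:
        memory[addr] &= ~mask & 0xFF
    if source_tainted:
        # Even a single-bit change taints the tag for the entire byte.
        shadow.taint_byte(addr)


memory = bytearray(4)
shadow = ShadowMemory()
write_bit(memory, shadow, addr=2, bit=5, value=1, source_tainted=True)
```

After the call, only byte 2 is marked tainted, although just one of its eight bits came from a tainted source.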


Colored Tainting

• In SeeC, each byte in application memory can be associated with a 4-byte unit in shadow memory; the 4-byte tag is used as a bitmap.

• The tool can therefore record up to 32 values (colors), one per bit, for any process memory byte, realizing colored tainting (tracking combinations of data from different sources).
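A 4-byte tag used as a bitmap might look like the sketch below. The helper names (`color_bit`, `colors_in`) and the dictionary standing in for shadow memory are hypothetical; only the 32-colors-per-byte idea comes from the slides.

```python
MAX_COLORS = 32  # one bit per color in a 4-byte (32-bit) tag


def color_bit(color):
    """Return the bitmap with only the bit for `color` set."""
    if not 0 <= color < MAX_COLORS:
        raise ValueError("color must be in the range 0..31")
    return 1 << color


def colors_in(tag):
    """List every color recorded in a tag."""
    return [c for c in range(MAX_COLORS) if tag & (1 << c)]


# Hypothetical shadow memory: application address -> 32-bit tag.
shadow = {}
shadow[0x1000] = color_bit(0)                  # derived from source 0 only
shadow[0x1001] = color_bit(0) | color_bit(5)   # derived from sources 0 and 5
```

Because colors are independent bits, one byte can remember several sources at once, which a single taint bit cannot do.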


Tag Propagation

• Propagation and clearance policies:

– If at least one source operand is tainted, then the destination operand(s) should also be tainted.

– If all the inputs to an operation are clear, the destination operand(s) should also be cleared.

• buf1 and buf2 may come from different data sources.

• As SeeC supports colored tainting, they identify both sources by referring to the tag associated with the buffer named result.
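The two policies and the buf1/buf2/result case can be sketched as below. The `propagate` helper, the addresses, and the choice to combine colors with a bitwise OR are assumptions for illustration; the slides state the policies but not the implementation.

```python
def propagate(shadow, dst, srcs):
    """Apply the propagation/clearance policy to one destination byte."""
    tag = 0
    for src in srcs:
        tag |= shadow.get(src, 0)   # assumed: colors combine by bitwise OR
    if tag:
        shadow[dst] = tag           # at least one source tainted -> taint
    else:
        shadow.pop(dst, None)       # all sources clear -> clear destination
    return tag


# Hypothetical addresses for the three buffers and one untainted constant.
BUF1, BUF2, RESULT, CONST = 0x100, 0x200, 0x300, 0x400
COLOR_A, COLOR_B = 1 << 0, 1 << 1   # two different data sources

shadow = {}
for i in range(4):
    shadow[BUF1 + i] = COLOR_A
    shadow[BUF2 + i] = COLOR_B

# result[i] = f(buf1[i], buf2[i]) for any byte-wise operation f.
for i in range(4):
    propagate(shadow, RESULT + i, [BUF1 + i, BUF2 + i])

both = shadow[RESULT]               # tag of result[0] before it is overwritten

# Overwriting result[0] from an untainted constant clears its tag.
propagate(shadow, RESULT, [CONST])
```

Reading the tag of any byte of `result` then shows both color bits, which is how both sources remain identifiable.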


Taint Sources

• Each color value is associated with a description of the data source.

• The main source points are system calls where read-like operations are performed to introduce data from outside the process.

• Moreover, a user can specify a certain memory area to be tainted for specific applications.

• SeeC can apply regular expressions to incoming data to select only part of it for tracking.
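Tagging at a source point, with optional regular-expression selection, might be sketched as follows. The function `taint_on_read`, the sample data, the pattern, and the color-to-description table are all hypothetical; the slides do not show SeeC's actual interface.

```python
import re

# Hypothetical table: color number -> description of the data source.
source_descriptions = {3: "file:/etc/passwd"}


def taint_on_read(shadow, buf_addr, data, color, pattern=None):
    """Tag the bytes a read-like call placed at buf_addr.

    With a pattern, only the bytes inside regex matches receive the color;
    the rest of the freshly read buffer is left clear.
    """
    bit = 1 << color
    if pattern is None:
        spans = [(0, len(data))]
    else:
        spans = [m.span() for m in re.finditer(pattern, data)]
    # New data replaces the old buffer contents, so drop stale tags first.
    for off in range(len(data)):
        shadow.pop(buf_addr + off, None)
    for start, end in spans:
        for off in range(start, end):
            shadow[buf_addr + off] = bit


shadow = {}
data = b"user=alice;card=1234;"           # made-up incoming data
taint_on_read(shadow, 0x2000, data, color=3, pattern=rb"card=\d+")
tainted_offsets = sorted(addr - 0x2000 for addr in shadow)
```

Only the bytes of the matched field are tracked, which keeps the amount of tagged memory small when just part of the input is of interest.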


Taint Sinks

• Taint sinks are data destination points, mostly on write-like system calls, where SeeC performs some assertion or validity checking for outgoing file or network stream data.

• Normal usage of SeeC at a sink point is to check tag information corresponding to the (buffer) memory of the outgoing data.
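A sink-point check could be sketched like this. The `check_sink` helper and the description strings are hypothetical; the sketch only shows the idea of inspecting the tags that cover an outgoing buffer.

```python
def check_sink(shadow, buf_addr, length, descriptions):
    """Return the description of every source whose color appears in the
    tags covering the outgoing buffer [buf_addr, buf_addr + length)."""
    combined = 0
    for off in range(length):
        combined |= shadow.get(buf_addr + off, 0)
    return [
        descriptions.get(color, f"unknown color {color}")
        for color in range(32)
        if combined & (1 << color)
    ]


# Hypothetical state just before a write-like system call.
descriptions = {0: "file:/etc/passwd", 1: "file:*.MYD"}
shadow = {0x3000: 0b01, 0x3001: 0b10}    # two tagged bytes, two clear bytes
found = check_sink(shadow, 0x3000, 4, descriptions)
clean = check_sink(shadow, 0x4000, 4, descriptions)
```

An empty result means the outgoing data carries no tracked colors; a non-empty one names the sources that reached the sink.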


DTA Across System Boundaries

• System Design and Implementation

– To realize information flow tracking across multiple applications and even across system boundaries, we need to observe the native data exchanged via inter-process communication (IPC) between interacting applications.

– Here, they limit their discussion to IPC via TCP connections.


• The purpose of the mechanism is to deliver additional tag data whenever a data transfer happens.

• The receiver side (App2) translates the tag data received and reflects them to its shadow memory.

• In App2, the information about the data source coming from App1 can still be maintained and finally be spotted at the sink point(s) of App2.

• They place a flow manager in each host to handle multiple DTA processes and connections.


Session Management

1. If a process is launched with SeeC to participate in an information tracking session, SeeC registers its process ID with the local flow manager, establishing an IPC channel with it.

2. SeeC queries the peer manager to decide whether to further perform concatenated DTA processing for the given channel.

3. The flow manager located in the peer host responds to the membership query.

4. Channel information also needs to be registered with the flow manager.
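The four steps can be sketched with an in-memory stand-in for the per-host flow managers. Everything here is an assumption for illustration: the `FlowManager` class, identifying endpoints by port number, and the direct method calls that replace the real IPC and network messages.

```python
class FlowManager:
    """Per-host registry (hypothetical in-memory stand-in)."""

    def __init__(self, host):
        self.host = host
        self.tracked = {}      # pid -> set of local ports owned by the process
        self.channels = set()  # (local_port, peer_host, peer_port)

    def register_process(self, pid, ports=()):
        # Step 1: a process launched under the DTA tool registers itself.
        self.tracked[pid] = set(ports)

    def is_member(self, port):
        # Step 3: answer a membership query from a peer host.
        return any(port in ports for ports in self.tracked.values())

    def register_channel(self, local_port, peer_host, peer_port):
        # Step 4: remember channels that carry tag data.
        self.channels.add((local_port, peer_host, peer_port))


def open_channel(local, peer, local_port, peer_port):
    """Step 2: query the peer manager, then register the channel if the
    remote endpoint also takes part in the tracking session."""
    if not peer.is_member(peer_port):
        return False           # plain TCP, no concatenated DTA processing
    local.register_channel(local_port, peer.host, peer_port)
    return True


web = FlowManager("web-host")
db = FlowManager("db-host")
web.register_process(pid=100, ports=[40000])
db.register_process(pid=200, ports=[3306])

tracked = open_channel(web, db, local_port=40000, peer_port=3306)
untracked = open_channel(web, db, local_port=40001, peer_port=5432)
```

Only the first connection is registered for tag delivery; the second goes to an endpoint that no tracked process owns and is left alone.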


Tag Data Delivery

• Tag data should be delivered with native data to the peer recipient so that the peer process can reflect the tag information in its shadow memory at its source points.

• However, when dealing with a TCP channel, we cannot expect synchronized write()/read() pairs of system calls on both sides for a given socket descriptor.


• Because the TCP channel is an ordered byte stream, they use a simple FIFO queue structure (write_q) to hold tag data.


• Since the TCP channel is ordered, when a read() system call returns r bytes, we can expect that tag data of size w′ has already been pushed, accumulated from multiple earlier write() calls in the sender, with w′ >= r.
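The queue behaviour described above can be sketched as follows. The `TagQueue` class and its method names are hypothetical, and the network transport between sender and receiver is left out; only the FIFO matching of unsynchronized write() and read() sizes is shown.

```python
from collections import deque


class TagQueue:
    """FIFO of per-byte tags for one direction of a TCP channel
    (a hypothetical stand-in for write_q)."""

    def __init__(self):
        self._q = deque()

    def on_write(self, tags):
        # Sender side: push one tag for each byte passed to write().
        self._q.extend(tags)

    def pending(self):
        # w': number of tag entries pushed but not yet consumed.
        return len(self._q)

    def on_read(self, r):
        # Receiver side: read() returned r bytes, so pop the first r tags.
        if r > len(self._q):
            raise RuntimeError("expected w' >= r, but tag data is missing")
        return [self._q.popleft() for _ in range(r)]


q = TagQueue()
q.on_write([0b01] * 3)        # first write(): 3 bytes with color 0
q.on_write([0b10] * 4)        # second write(): 4 bytes with color 1

first = q.on_read(5)          # one read() spanning both writes
second = q.on_read(2)         # a later read() takes the remainder
```

Because both the byte stream and the queue preserve order, the i-th byte read always lines up with the i-th tag pushed, regardless of how the calls are split.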


Experiments

• Identifying Information Leakage by a SQL Injection Attack

– Taint sources: We have specified taint sources on the database server host to be any database file (*.MYD) and the /etc/passwd file.

– Taint sinks: Any network output streams in the web server are sink points.


• With colored tainting, the sources on the database server host can be distinguished at the sink point of the web server.


Performance


Related Work

• DTA Implementations:

– TaintCheck, LIFT, Dytan and Flayer are implemented using DBI frameworks such as Valgrind, StarDBT, and PIN.

– TaintBochs, Argos, and Panorama are implemented on whole-system emulators such as Bochs or QEMU.


References

• Chi-keung Luk, Robert S. Cohn, Robert Muth, Harish Patil, Artur Klauser, P. Geoffrey Lowney, Steven Wallace, Vijay Janapa Reddi, Kim M. Hazelwood: Pin: building customized program analysis tools with dynamic instrumentation, PLDI, 2005 (Citations: 267)

• Nicholas Nethercote, Julian Seward: Valgrind: a framework for heavyweight dynamic binary instrumentation, PLDI, 2007 (Citations: 118)

• Cheng Wang, Shiliang Hu, Ho-seop Kim, Sreekumar R. Nair, Mauricio Breternitz, Zhiwei Ying, Youfeng Wu: StarDBT: An Efficient Multi-platform Dynamic Binary Translation System, ACSAC, 2007 (Citations: 4)

• Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Timothy L. Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield: Xen and the art of virtualization, SOSP, 2003 (Citations: 612)

• Fabrice Bellard: QEMU, a Fast and Portable Dynamic Translator, USENIX, 2005 (Citations: 77)

• G. Edward Suh, Jae W. Lee, David Zhang, Srinivas Devadas: Secure program execution via dynamic information flow tracking, ASPLOS, 2004 (Citations: 159)

• Guru Venkataramani, Ioannis Doudalis, Yan Solihin, Milos Prvulovic: FlexiTaint: A programmable accelerator for dynamic taint propagation, HPCA, 2008 (Citations: 12)

• Heng Yin, Dawn Xiaodong Song, Manuel Egele, Christopher Kruegel, Engin Kirda: Panorama: capturing system-wide information flow for malware detection and analysis, CCS, 2007 (Citations: 47)


Thank you!