Post on 08-Jan-2018

Automated Worm Fingerprinting

Authors: Sumeet Singh, Cristian Estan, George Varghese and Stefan Savage

Published: OSDI '04

Presenter: YanYan Wang

Introduction

Recent large-scale Internet worms pose a profound threat.

Traditional detection methods are usually expensive and slow.

This paper investigates Early Bird, a method that automatically detects and contains new worms on the network using precise signatures.

Existing Detection Techniques

Scan detection

Example: Code Red.

Network telescope: passive network monitors that observe large ranges of unused, yet routable, address space.

Assumption: worms select target victims at random.

Limitation: not suited to worms that spread non-randomly.

Existing Detection Techniques

Honeypots

Monitor idle hosts with untreated vulnerabilities.

Limitations: require a significant amount of slow manual analysis, and depend on the honeypot being infected quickly.

Existing Detection Techniques

Behavioral techniques at end hosts

Dynamically analyze the patterns of system calls for anomalous activity.

Limitations: expensive, and only detect an attack against a single host.

Characterization

A priori vulnerability signatures: match known exploitable vulnerabilities in deployed software.

Automated signature extraction: let the malware infect decoy programs in a controlled environment, extract the infected regions, and identify invariant code strings.

Autograph: a related system for automated signature generation (compare Early Bird).

Containment

Goal: to slow or stop the spread of an active worm.

Host quarantine: prevents an infected host from communicating with other hosts.

String matching: matches network traffic against particular strings, or signatures.

Connection throttling: limits the rate of all outgoing connections made by a machine; this slows the spread but does not stop it.
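Connection throttling can be illustrated with a token bucket. This is only a minimal sketch under stated assumptions: the class name `ConnectionThrottle`, the rate and burst values, and the choice of a token bucket are illustrative and are not taken from the paper.

```python
class ConnectionThrottle:
    """Token bucket limiting outgoing connection attempts from one host (hypothetical helper)."""

    def __init__(self, max_per_second=1.0, burst=5):
        self.rate = max_per_second      # tokens added per second (assumed value)
        self.capacity = float(burst)    # maximum number of stored tokens
        self.tokens = float(burst)
        self.last = None                # time of the previous attempt

    def allow(self, now):
        """Return True if a new outgoing connection may start at time `now` (seconds)."""
        if self.last is not None:
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A scanning worm that opens connections far faster than the configured rate is held to that rate once the burst is used up, which matches the "slows but does not stop" behavior.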

Worm Behavior

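Content invariance: the program is identical across every host it infects, though some worms have limited polymorphism.

Content prevalence: worm content appears frequently on the network; content that is not prevalent is not useful for constructing signatures.

Address dispersion: the number of infected hosts, and therefore of distinct addresses carrying the content, grows over time.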

Finding Worm Signature: Content Sifting

For each network packet:

Extract the content and process its substrings.

Index each substring into a prevalence table.

Each table entry includes the IP addresses seen with that content.

Sort the table.
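The naive procedure above can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function name `content_sift`, the tuple layout of `packets`, the window length `k`, and the threshold values are all assumptions.

```python
from collections import defaultdict


def content_sift(packets, k=4, prevalence_threshold=3, dispersion_threshold=3):
    """Naive content sifting over (source, destination, payload) tuples.

    Every k-byte substring is indexed into a prevalence table that records
    how many packets carried it and which addresses were involved.
    """
    table = defaultdict(lambda: {"count": 0, "sources": set(), "destinations": set()})
    for src, dst, payload in packets:
        # Count each distinct substring once per packet.
        windows = {payload[i:i + k] for i in range(len(payload) - k + 1)}
        for sub in windows:
            entry = table[sub]
            entry["count"] += 1
            entry["sources"].add(src)
            entry["destinations"].add(dst)

    suspects = [
        (sub, e["count"], len(e["sources"]), len(e["destinations"]))
        for sub, e in table.items()
        if e["count"] >= prevalence_threshold
        and len(e["sources"]) >= dispersion_threshold
        and len(e["destinations"]) >= dispersion_threshold
    ]
    # Most prevalent content first.
    suspects.sort(key=lambda row: row[1], reverse=True)
    return suspects
```

Storing every substring together with full address sets is what makes this version impractical at line rate, which motivates the memory and CPU optimizations on the following slides.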

Finding Worm Signature: Content Sifting

Problem: the prevalence table consumes a huge amount of memory.

Solution: multistage filters.

Finding Worm Signature: Content Sifting

Address dispersion: scaled bitmaps trade precision for dramatic reductions in memory requirements.

Example: to count up to 64 sources using 32 bits, one might hash sources into a space from 0 to 63 yet only set bits for values that hash between 0 and 31, thus ignoring half of the sources.
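The 64-sources-in-32-bits example can be sketched as follows. This is a simplified illustration: the class name `ScaledBitmap`, the use of SHA-256 as the hash, and the linear-counting estimator are assumptions, and the paper's full scaled-bitmap scheme is not reproduced here.

```python
import hashlib
import math


class ScaledBitmap:
    """Approximate count of distinct sources using a sampled bitmap (hypothetical helper)."""

    def __init__(self, bits=32, hash_space=64):
        self.bits = bits              # number of bits actually stored
        self.hash_space = hash_space  # range the sources are hashed into
        self.bitmap = 0

    def add(self, source):
        digest = hashlib.sha256(source.encode()).digest()
        h = int.from_bytes(digest[:8], "big") % self.hash_space
        # Only hash values that fall inside the bitmap are recorded; with the
        # defaults, roughly half of the sources are ignored.
        if h < self.bits:
            self.bitmap |= 1 << h

    def estimate(self):
        """Estimate the number of distinct sources added so far."""
        set_bits = bin(self.bitmap).count("1")
        zeros = self.bits - set_bits
        if zeros == 0:
            return float("inf")  # bitmap saturated, count is beyond its range
        # Linear counting over the sampled sources, then scale up by the
        # inverse of the sampling fraction.
        sampled = self.bits * math.log(self.bits / zeros)
        return sampled * self.hash_space / self.bits
```

Adding the same source repeatedly sets the same bit, so duplicates do not inflate the estimate. The price is precision: the result is an estimate rather than an exact count.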

Finding Worm Signature: Content Sifting

Problem: processing every payload substring requires significant computation.

Value sampling: select only those substrings for which the fingerprint matches a certain pattern.

Example: if $f$ is the fraction of the substrings that are tracked (e.g. $f = 1/64$ if we track the substrings whose Rabin fingerprint ends in six 0 bits) and $\beta$ is the substring length in bytes, then the probability of detecting a worm with a signature of length $x$ is $p_{track}(x) = 1 - e^{-f(x - \beta + 1)}$.

Finding Worm Signature: Content Sifting

If $f = 1/64$ and $\beta = 40$, the probability of tracking a worm with a signature of 100 bytes is 55%, but for a worm with a signature of 200 bytes it increases to 92%, and for 400 bytes to 99.64%.
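Value sampling and the tracking probability can be sketched together. Two assumptions are made here: a simple polynomial rolling hash stands in for a true Rabin fingerprint, and the tracking probability is taken to be 1 - e^(-f(x - β + 1)), with f the sampled fraction and β the substring length. The names `sampled_substrings` and `p_track` are illustrative.

```python
import math

BETA = 40              # substring length in bytes
SAMPLE_BITS = 6        # keep fingerprints whose low 6 bits are zero, so f = 1/64
BASE = 257             # rolling-hash base (assumed; not a real Rabin polynomial)
MOD = (1 << 61) - 1    # rolling-hash modulus (assumed)


def sampled_substrings(payload, beta=BETA, bits=SAMPLE_BITS):
    """Yield (offset, fingerprint) for each beta-byte window selected by value sampling."""
    if len(payload) < beta:
        return
    mask = (1 << bits) - 1
    top = pow(BASE, beta - 1, MOD)   # weight of the byte leaving the window
    fp = 0
    for byte in payload[:beta]:
        fp = (fp * BASE + byte) % MOD
    if fp & mask == 0:
        yield 0, fp
    for i in range(beta, len(payload)):
        # Slide the window one byte: drop the oldest byte, append the newest.
        fp = (fp - payload[i - beta] * top) % MOD
        fp = (fp * BASE + payload[i]) % MOD
        if fp & mask == 0:
            yield i - beta + 1, fp


def p_track(x, f=1 / 64, beta=BETA):
    """Probability that at least one substring of an x-byte signature is sampled."""
    if x < beta:
        return 0.0
    return 1.0 - math.exp(-f * (x - beta + 1))
```

Because selection depends only on the fingerprint value, the same worm substring is either always tracked or never tracked, wherever it appears in a packet. Evaluating `p_track` reproduces the 92% and 99.64% figures for 200 and 400 bytes; for 100 bytes this expression gives roughly 61%, so the quoted 55% is best read as approximate.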

Practical Content Sifting: Early Bird

Processing is done at packet granularity.

Early Bird

As each packet arrives, its content (or substrings of its content) is hashed and appended with the protocol identifier and destination port to produce a content hash code.

Whole-packet hash: 32-bit cyclic redundancy check (CRC).

Substring hashes: 40-byte Rabin fingerprints.
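The whole-packet case can be sketched with the standard library's CRC-32. The function name `content_key` and the choice of a tuple as the combined key are assumptions made for illustration; the slides only state that the hash is combined with the protocol identifier and destination port.

```python
import zlib
from typing import Tuple


def content_key(payload: bytes, protocol: int, dst_port: int) -> Tuple[int, int, int]:
    """Build a content hash code from the payload CRC, protocol and destination port."""
    return (zlib.crc32(payload), protocol, dst_port)
```

For example, `content_key(b"GET /default.ida?NNNN", 6, 80)` and the same payload sent to port 8080 produce different keys, so identical bytes aimed at different services are tracked separately.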

Early Bird

If the content hash is not found in the dispersion table, it is indexed into the content prevalence table.

Four independent hash functions create indexes into four counter arrays.
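A multistage filter with four counter arrays can be sketched as below. This is a basic version under stated assumptions: the class name `MultistageFilter`, the array size, and the use of salted BLAKE2b digests as the four hash functions are illustrative, and refinements such as conservative update are left out.

```python
import hashlib


class MultistageFilter:
    """Approximate per-key counters spread over several hashed arrays (hypothetical helper)."""

    def __init__(self, stages=4, counters_per_stage=1024):
        self.stages = stages
        self.size = counters_per_stage
        self.counters = [[0] * counters_per_stage for _ in range(stages)]

    def _indexes(self, key):
        # One differently salted hash per stage approximates independent hash functions.
        for stage in range(self.stages):
            digest = hashlib.blake2b(key, digest_size=8, salt=bytes([stage])).digest()
            yield stage, int.from_bytes(digest, "big") % self.size

    def add(self, key):
        """Increment one counter per stage and return the smallest counter value.

        The minimum never undercounts: it is an upper bound on how many times
        `key` has been added.
        """
        values = []
        for stage, idx in self._indexes(key):
            self.counters[stage][idx] += 1
            values.append(self.counters[stage][idx])
        return min(values)
```

Content would be promoted to the dispersion table only once the returned minimum crosses the prevalence threshold. Rare content must collide with frequent content in all four arrays to be promoted by mistake, which keeps the table small.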

Prototype System : Early Bird

Sensor: sifts through traffic on configurable address space “zones” of responsibility and reports anomalous signatures.

Aggregator: coordinates real-time updates from the sensors, coalesces related signatures, activates any network-level or host-level blocking services, and is responsible for administrative reporting and control.

The sensor is single-threaded, executes at user level, and captures packets using the libpcap library.

What’s the paper’s contribution?

A combination of existing and novel algorithms for content sifting

Low memory and CPU requirements

What’s the paper’s weakness?

Depends on invariant content: attackers can design worms whose content varies.

Attackers can evade detection with metamorphic worms and traditional IDS evasion techniques.

Assumes a maximum growth time.

Automated containment can be abused: attackers can deliberately trigger a worm defense.

How to improve the paper?

Hybrid pattern matching: separate non-code strings from potential exploits.

Investigate traffic normalization.

Maintain triggering data across multiple time scales.

Develop efficient mechanisms for comparing signatures with an existing traffic corpus.