Hash-Based IP Traceback. Alex C. Snoeren, Craig Partridge, Luis A. Sanchez, Christine E. Jones, Fabrice Tchakountio, Stephen T. Kent, and W. Timothy Strayer. SIGCOMM, Aug. 2001, San Diego, CA. Presented by Chris Dion.

Page 1: Hash-Based IP Traceback

Hash-Based IP Traceback

Alex C. Snoeren, Craig Partridge, Luis A. Sanchez, Christine E. Jones, Fabrice Tchakountio, Stephen T. Kent, and W. Timothy Strayer

SIGCOMM, Aug. 2001, San Diego, CA

Presented by Chris Dion

Page 2: Hash-Based IP Traceback

Tonight’s Outline

• Introduction to the problem

• What is IP Traceback?

• Some Previous Work

• Overview of the Proposed Solution

• Implementation/Simulation

Page 3: Hash-Based IP Traceback

Internet Anonymity

• Not all attacks are large flooding DoS attacks

• Well-placed single-packet attacks can be just as effective

• These packets can be spoofed to appear from almost anywhere

• How can we track these attacks and find their origin?

Page 4: Hash-Based IP Traceback

Current Methods

• Use of ingress filtering to limit source addresses
  – Not all routers can look at every packet's source address

• Spoofed addresses are all too often found
  – NAT
  – Mobile IP
  – Hybrid satellite architectures

Page 5: Hash-Based IP Traceback

IP Traceback

• Some assumptions about the network
  – Packets may be multicast or broadcast
    • Tracing system must be prepared for multiple packets
  – Attackers can get into routers
    • Tracing must not be confounded by a motivated attacker
  – Routing behavior of the network can be unstable
    • Tracing must be prepared to handle divergent information
  – Packet size should not grow due to tracing
  – End hosts may be resource constrained
  – Tracing is an infrequent operation
    • Can use a router's control path vs. data path

Page 6: Hash-Based IP Traceback

Attack Path

[Figure: paths taken by attack packet #1 and attack packet #2 through possibly compromised routers to the victim]

Page 7: Hash-Based IP Traceback

Packet Transformations

• Packets may be modified for a number of valid reasons
  – Packet fragmentation
  – IP option processing
  – ICMP processing
  – Packet duplication
  – NAT
  – IPsec tunneling

• These account for less than 3% of Internet traffic in 2000

• Attackers can use these!

Page 8: Hash-Based IP Traceback

Some Previous Work

• Two approaches to determining the route:
  – Audit of the flow as it traverses the network
    • Can grow the packet with route information, use fields in the header, or use out-of-band signaling
  – Inference of the flow based on its impact on the state of the network
    • Systematically floods the network and watches for variations in the received packet flow
    • Becomes infeasible when flow sizes approach a single packet

Page 9: Hash-Based IP Traceback

Packet Digests

• We do not need the entire packet
  – Reduces storage requirements
  – Need only the packet header to determine the attacker
  – Still need to uniquely determine the packet
  – Security concerns

• Mask out fields that are modified along a packet's route:
  – Type of Service
  – TTL
  – Checksum
  – IP Options

Page 10: Hash-Based IP Traceback

IP Packet fields for Hash Input

Page 11: Hash-Based IP Traceback

Why 28 bytes?

• WAN trace from an OC-3 gateway router

• LAN trace from an active 100 Mb/s segment

• Collision rate for a 28-byte prefix
  – 0.00092% WAN
  – 0.139% LAN
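
The masking and prefix selection from the last two slides can be sketched as below. This is only an illustration: the function name `digest_input` is hypothetical, and it assumes the 28 bytes are a 20-byte IPv4 header without options followed by the first 8 bytes of payload, which the slides do not spell out. IP options, which the slides also mask, are rejected here rather than handled.

```python
def digest_input(packet: bytes, prefix_len: int = 28) -> bytes:
    """Return the masked packet prefix that would be fed to the hash functions.

    Assumption (not from the slides): a 20-byte IPv4 header with no options,
    followed by the first 8 bytes of payload.
    """
    if len(packet) < 20:
        raise ValueError("packet shorter than a minimal IPv4 header")
    header_len = (packet[0] & 0x0F) * 4  # IHL field, in bytes
    if header_len != 20:
        raise ValueError("this sketch does not handle IP options")

    buf = bytearray(packet[:prefix_len])
    buf[1] = 0                 # Type of Service
    buf[8] = 0                 # TTL
    buf[10:12] = b"\x00\x00"   # header checksum
    return bytes(buf)


# The same packet seen at two hops: TTL and checksum differ, the rest does not.
payload = b"ABCDEFGHIJ"
at_hop_1 = bytes([0x45, 0x00, 0x00, 0x1E, 0x12, 0x34, 0x00, 0x00,
                  64, 17, 0xAB, 0xCD, 10, 0, 0, 1, 10, 0, 0, 2]) + payload
at_hop_2 = bytes([0x45, 0x00, 0x00, 0x1E, 0x12, 0x34, 0x00, 0x00,
                  63, 17, 0xAC, 0xCD, 10, 0, 0, 1, 10, 0, 0, 2]) + payload

print(digest_input(at_hop_1) == digest_input(at_hop_2))  # True
```

Because the mutable fields are zeroed before hashing, every router on the path would compute its digests over the same bytes.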

Page 12: Hash-Based IP Traceback

Bloom filters

• Used to store digests in router

• From Communications of the ACM, July 1970

• Computes k distinct packet digests for each packet using hash functions

• Uses results to index into a bit array

• Could potentially create false positives
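
A minimal Bloom filter sketch, to make the bullets above concrete. The class and method names are hypothetical, and the k hash functions are simulated here by prefixing an index byte to SHA-256, which is a stand-in chosen for brevity and not the hash family used by the authors.

```python
import hashlib


class BloomFilter:
    """Bit array indexed by k hash values per item; no deletions."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _indexes(self, item: bytes):
        # Simulate k hash functions by salting one hash with the index i.
        for i in range(self.num_hashes):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item: bytes) -> bool:
        # True if all k bits are set: either the item was added,
        # or other items happened to set the same bits (false positive).
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indexes(item))


table = BloomFilter(num_bits=1024, num_hashes=3)
table.add(b"masked packet prefix")
print(b"masked packet prefix" in table)  # True
```

A stored item is always found, so the filter has no false negatives; only the "yes" answers can be wrong.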

Page 13: Hash-Based IP Traceback

Bloom filter

[Figure: each received packet is run through k hash functions, producing n-bit digests that index into the bit array]

Page 14: Hash-Based IP Traceback

Bloom Filters (cont)

• Restrictions on the hash family
  – Must distribute a highly correlated set of inputs (packet digests) evenly
  – Independent collision events (a false positive at one router is independent of neighboring routers)
    • Called universal hash families
  – Must be easy to compute at high link speeds

Page 15: Hash-Based IP Traceback

Source Path Isolation Engine

Page 16: Hash-Based IP Traceback

SPIE System

• DGA – Data Generation Agent
  – Produces packet digests of each departing packet and stores them in a digest table
  – Each digest table represents the traffic forwarded in a given time interval

• SCAR – SPIE Collection and Reduction Agent
  – When an attack is detected, the SCAR produces the attack graph for its region

• STM – SPIE Traceback Manager
  – Interface to the intrusion detection system
  – Gathers the complete attack graph

Page 17: Hash-Based IP Traceback

Traceback processing

• The IDS will signal a potential attack and give the STM:
  – Packet P
  – Victim V, which must be expressed in terms of the last-hop router
  – Time of attack T, which must be reported in a timely fashion

• The STM immediately asks all SCARs in the domain to poll their DGAs for digests

• Each SCAR returns an attack graph, then the STM works backwards to identify the source

Page 18: Hash-Based IP Traceback

What if Packet is Transformed?

• Need a TLT (Transform Lookup Table), with an entry for each transformed packet's digest:
  – IP packet digest
  – Type of transform (ICMP, NAT, etc.)
  – Indirect flag
  – Packet data needed to invert the transform
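
One way to picture a TLT is as a small table keyed by digest. The sketch below is hypothetical: the names `TLTEntry` and `TransformLookupTable` are invented for illustration, field widths are not specified because the slide gives none, and the meaning of the indirect flag is left open since the slide does not define it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TLTEntry:
    digest: int           # IP packet digest
    transform_type: str   # e.g. "ICMP", "NAT"
    indirect: bool        # indirect flag (semantics not given on the slide)
    data: bytes           # packet data needed to invert the transform


class TransformLookupTable:
    """Maps a packet digest to the transforms recorded for it."""

    def __init__(self) -> None:
        self._entries: Dict[int, List[TLTEntry]] = {}

    def record(self, entry: TLTEntry) -> None:
        self._entries.setdefault(entry.digest, []).append(entry)

    def lookup(self, digest: int) -> List[TLTEntry]:
        # Return a copy so callers cannot modify the table's own list.
        return list(self._entries.get(digest, []))


tlt = TransformLookupTable()
tlt.record(TLTEntry(digest=0x1A2B3C4D, transform_type="NAT",
                    indirect=False, data=bytes([10, 0, 0, 1])))
print(len(tlt.lookup(0x1A2B3C4D)))  # 1
```

During a trace, a digest hit with a matching TLT entry would tell the tracer to continue the search upstream using the packet as it looked before the transform.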

Page 19: Hash-Based IP Traceback

Graph Construction

• Each SCAR is responsible for its region

• After gathering all digest tables, the SCAR simulates reverse-path flooding (RPF)

• If the packet is found at a router, the node is marked and its arrival time there becomes the latest possible time to search
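
The simulated reverse-path flooding can be sketched as a breadth-first search from the victim's last-hop router. This is a simplified illustration under stated assumptions: the function name `build_attack_graph` and the router names are hypothetical, digest tables are plain Python sets instead of Bloom filters (so there are no false positives), and the time-window bookkeeping from the last bullet is omitted.

```python
from collections import deque
from typing import Dict, List, Set


def build_attack_graph(last_hop: str,
                       upstream: Dict[str, List[str]],
                       digest_tables: Dict[str, Set[int]],
                       digest: int) -> Set[str]:
    """Mark every router, reachable upstream from last_hop through marked
    routers, whose digest table contains the packet digest."""
    if digest not in digest_tables.get(last_hop, set()):
        return set()

    marked = {last_hop}
    queue = deque([last_hop])
    while queue:
        router = queue.popleft()
        for neighbor in upstream.get(router, []):
            if neighbor not in marked and digest in digest_tables.get(neighbor, set()):
                marked.add(neighbor)
                queue.append(neighbor)
    return marked


# R5 saw the digest too, but is only reachable through R3, which did not.
upstream = {"R1": ["R2", "R3"], "R2": ["R4"], "R3": ["R5"], "R4": [], "R5": []}
digest_tables = {"R1": {0xAB}, "R2": {0xAB}, "R3": {0x11},
                 "R4": {0xAB}, "R5": {0xAB}}

print(sorted(build_attack_graph("R1", upstream, digest_tables, 0xAB)))
# ['R1', 'R2', 'R4']
```

The search stops expanding at any router that did not see the packet, which is why the isolated hit at R5 does not enter the graph.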

Page 20: Hash-Based IP Traceback

Graph Construction Example

[Figure: example network showing attack paths and SPIE queries]

Page 21: Hash-Based IP Traceback

Implementation

• The universal hash family is simulated using MD5 hashing (128-bit output)

• A random number is prepended to each packet for independence

• The output is taken as four 32-bit digests

• The size of the digest table varies with the total traffic capacity of the router
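
The MD5-based digesting described above might look like the following. It is a sketch only: the function name `packet_digests` and the parameter names are hypothetical, and how the random value is chosen and how each digest is reduced to a table index are assumptions not covered by the slide.

```python
import hashlib
import struct
from typing import Tuple


def packet_digests(salt: bytes, masked_prefix: bytes) -> Tuple[int, int, int, int]:
    """Prepend a random value to the packet bytes, hash with MD5,
    and split the 128-bit output into four 32-bit digests."""
    md5 = hashlib.md5(salt + masked_prefix).digest()  # 16 bytes
    return struct.unpack(">4I", md5)                  # four big-endian uint32


router_salt = bytes.fromhex("0123456789abcdef")  # illustrative random value
digests = packet_digests(router_salt, b"masked 28-byte packet prefix")

# Assumed indexing scheme: reduce each digest modulo the table size in bits.
table_bits = 2 ** 20
indexes = [d % table_bits for d in digests]
print(len(digests), all(0 <= i < table_bits for i in indexes))  # 4 True
```

Using a different random value at each router changes every digest, which is what keeps a false positive at one router from repeating at its neighbors.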

Page 22: Hash-Based IP Traceback

Possible DGA in hardware

Page 23: Hash-Based IP Traceback

False Positive Analysis

• Use a false-positive probability of p = 1/(8d) as a theoretical limit (d = the router's degree, i.e. its number of neighbors)
  – Assuming a 32-node path length, approaching the diameter of the Internet

• For the simulation, used the topology of a major ISP
  – 70 backbone routers with links from T-1 (1.54 Mbps) to OC-3 (155 Mbps)

• Sent 1000 attack packets at a constant rate to one victim, with background traffic set to a fixed false-positive rate P

Page 24: Hash-Based IP Traceback

Simulation Result

• Low values were due to link utilizations

• Considerable gap between theoretical and simulated results

Page 25: Hash-Based IP Traceback

Time and Memory Analysis

• Give one minute to identify the attack packet

• Memory will be linear with link capacity
  – We will consider a Bloom filter with 3 digesting functions and a capacity factor of 5, for a false-positive rate of P = 0.092 when full
  – Average-sized packets (1000 bits)

• Using this we get a rule of thumb
  – SPIE requires 0.5% of total link capacity in digest storage
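
Both numbers on this slide can be checked with the standard Bloom filter approximation, P ≈ (1 − e^(−k/c))^k for a full filter. The sketch below assumes that "capacity factor" means bits of table per stored packet (c = m/n); the function names are hypothetical.

```python
import math


def bloom_false_positive_rate(k: int, capacity_factor: float) -> float:
    """Approximate false-positive rate of a full Bloom filter with k hash
    functions and capacity_factor bits of table per stored item."""
    return (1.0 - math.exp(-k / capacity_factor)) ** k


def storage_fraction(capacity_factor: float, avg_packet_bits: float) -> float:
    """Digest-table bits needed per bit of traffic forwarded."""
    return capacity_factor / avg_packet_bits


print(round(bloom_false_positive_rate(k=3, capacity_factor=5), 3))  # 0.092
print(storage_fraction(capacity_factor=5, avg_packet_bits=1000))    # 0.005
```

With 3 hash functions and 5 bits per packet the filter is wrong about 9.2% of the time when full, and 5 bits of table per 1000-bit packet is the 0.5% rule of thumb.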

Page 26: Hash-Based IP Traceback

Time and Memory Analysis (cont)

• 4 OC-3 links = 47 MB of storage

• 32 OC-192 links = 23.4 GB for one minute

• Access time is also important
  – Given a DRAM cycle time of 50 ns, routers processing more than 1 OC-192 will need SRAM (only 16 Mb, which must be paged)

Page 27: Hash-Based IP Traceback

Some Issues

• Traceback may be requested when the network is unstable
  – Possibly from the attack itself
  – Best solution would be out-of-band management
  – Priority handling may work for in-band

• ISP-to-ISP deployment
  – Possible sharing of SPIE infrastructure?
  – Grant STM requests to other domains

Page 28: Hash-Based IP Traceback

Conclusions

• Traceback of a single packet is very difficult

• SPIE's key contribution is that it is feasible
  – Low storage
  – Does not aid in eavesdropping
  – Complete system

• In the future, packet digests could be discarded probabilistically as they age to allow for longer traceback times