protecting cyber-ta contributors: risks and challenges vitaly shmatikov the university of texas at...

Protecting Cyber-TAProtecting Cyber-TAContributors:Contributors:

Risks and ChallengesRisks and Challenges

Vitaly Shmatikov The University of Texas at Austin

• Intrusion detection data• Security alerts• Firewall data

How to do collaborative analysis if networks don’t trust each other?

Goal: stop attackers from abusing these data

Big Picture

Sample Intrusion Detection Alert

may contain victim’s IP address reveals relationships with other networks

reveals target’s IP address reveals topology of targeted network and attack propagation

leaks information stored on targeted systems

may reveal organization that owns it

Basic Tradeoffs

tradeoffs

privacy and anonymity

utility efficiency

• Do not enable attackers to track attack propagation• Do not announce site defenses• Do not reveal network topology, configuration, enabled services

Support (at least) coarse-grained analysis: event trends, identification of common attack sources, connection patterns, blacklisting, etc.

Low overhead; no complicated crypto

Alerts may be used to track progress of attacks and find new vulnerabilities

Hard to tell the difference between an attacker and a legitimate researcher

Sometimes, the only difference is intent-Hard to tell by looking at data requests

Fundamental Problem

alert database

Example: Probe-Response Attack

attack a particularIP address

attack is detected andalert reported to repository

alert

attacker looks up the alert and learns the address of the detecting IDS sensor

IP hashing doesn’t help! Attacker knows targeted subnet,stages simple dictionary attack with small (<256) dictionary

repository

repository

Unique attack signature• Port combinations• Rare IDS rules• Multiple scans (to cross statistical thresholds)

Attack is detected andalert reported to repository

alert

Attacker completely mapsout network defenses andavoids them in the future

“Fingerprinting” Attacks

[E.g., see Bethencourt et al., USENIX Security 2005]

Attacker wants attack to be detected

A and B can compare their observations of events on C’s network

Dictionary attack possible, but address space is large

Enables detection of widely observed IP addresses

Current IP Address Sanitization

Is this IP address on my network? Yes: use HMAC with secret key No: use SHA-1

Can only be compared for equality with IP addresses reported by IDS on the same network

Dictionary attack not feasible

Current Alert Sanitization

Content fields scrubbed- InfectedFile, CapturedData, etc.

Timestamps rounded- Tradeoff: limit sequence analysis

High port numbers rounded- Tradeoff: limit port analysis possibilities

Unique contributor IDs (not stored)-Rely on source anonymity to hide identity

Formalization of fingerprinting attacks + secure alert correlation schemes

IP address virtualization that preserves topological structure of address space without revealing true addresses-Reconstruct topology of attack graphs

Protocols that reveal attack data only if similar attack has been observed by a threshold number of contributors

Data Sanitization Challenges

Internet

Overlay peer-to-peerrandomized routing(robust even if somenodes are compromised)

Based on Tor (low-latencyTCP-level anonymity)

Protecting Source Identity

Internet

Overlay peer-to-peerrandomized routing

Future Work: Backpropagation

Propagate analysis results back to contributors (e.g., hashed IP addresses for filtering)

Dataset poisoning and denial of service-Deliberate attacks or accidental flooding

Pre-registration and vetting are neededGroup membership credentials

- Issued through “blind” registration; unlinkable to contributor’s true identity

-Hard to guess, easy to check- Linkability of same-source contributions?

Possible attacks on registration process

Source Anonymity Issues

Contributor IDs issued by Cyber-TA Coordination Center-Random IDs unlinkable to true identity

Repositories can blacklist certain contributor IDs

Current research:- Prevention of flooding and data poisoning-Revocation mechanisms-Reputation systems

Contributor Registration

Timing Attack

Internet

Observe outgoing connection (sniff or attack 1st overlay node)

De-anonymize alert origin bycorrelating message timings

Overlay peer-to-peerrandomized routing

Additional Protection

Re-keying by alert repository-Additional keyed hashing of IP addresses

Randomized hot list thresholds- Publish only the hot list of reported alerts

that have something in common Need randomness to prevent flushing attacks

Delayed alert publication… all of these rely on repository

integrity!

Source address: can be used as a marker to learn sensor coverage

Port number: rare port numbers can be used as markers to link alerts to sensors

Destination address: reveals sensor coverage, capabilities, network topology

Port number: reveals network services

Timestamp: can be used to link an alert to the sensor that produced it

SensorID: reveals defensive services and capabilities, organization that owns sensor

EventID: reveals defensive services, capabilities, policies

Outcome: reveals target site’s vulnerabilities, topologies, policies, etc.

Captured data, Infected file: reveals private user data, topology and applications, vulnerabilities.

Sample Intrusion Detection Alert

protecting cyber-ta contributors: risks and challenges vitaly shmatikov the university of texas at...

Documents

particular ip address

austin slide

data big picture slide

proberesponse attack

repository alert attacker

complicated crypto slide

network topology

targeted subnet