protecting cyber-ta contributors: risks and challenges vitaly shmatikov the university of texas at...
TRANSCRIPT
Protecting Cyber-TAProtecting Cyber-TAContributors:Contributors:
Risks and ChallengesRisks and Challenges
Vitaly Shmatikov The University of Texas at Austin
• Intrusion detection data• Security alerts• Firewall data
How to do collaborative analysis if networks don’t trust each other?
Goal: stop attackers from abusing these data
Big Picture
Sample Intrusion Detection Alert
may contain victim’s IP address reveals relationships with other networks
reveals target’s IP address reveals topology of targeted network and attack propagation
leaks information stored on targeted systems
may reveal organization that owns it
Basic Tradeoffs
tradeoffs
privacy and anonymity
utility efficiency
• Do not enable attackers to track attack propagation• Do not announce site defenses• Do not reveal network topology, configuration, enabled services
Support (at least) coarse-grained analysis: event trends, identification of common attack sources, connection patterns, blacklisting, etc.
Low overhead; no complicated crypto
Alerts may be used to track progress of attacks and find new vulnerabilities
Hard to tell the difference between an attacker and a legitimate researcher
Sometimes, the only difference is intent-Hard to tell by looking at data requests
Fundamental Problem
alert database
Example: Probe-Response Attack
attack a particularIP address
attack is detected andalert reported to repository
alert
attacker looks up the alert and learns the address of the detecting IDS sensor
IP hashing doesn’t help! Attacker knows targeted subnet,stages simple dictionary attack with small (<256) dictionary
repository
repository
Unique attack signature• Port combinations• Rare IDS rules• Multiple scans (to cross statistical thresholds)
Attack is detected andalert reported to repository
alert
Attacker completely mapsout network defenses andavoids them in the future
“Fingerprinting” Attacks
[E.g., see Bethencourt et al., USENIX Security 2005]
Attacker wants attack to be detected
A and B can compare their observations of events on C’s network
Dictionary attack possible, but address space is large
Enables detection of widely observed IP addresses
Current IP Address Sanitization
Is this IP address on my network? Yes: use HMAC with secret key No: use SHA-1
Can only be compared for equality with IP addresses reported by IDS on the same network
Dictionary attack not feasible
Current Alert Sanitization
Content fields scrubbed- InfectedFile, CapturedData, etc.
Timestamps rounded- Tradeoff: limit sequence analysis
High port numbers rounded- Tradeoff: limit port analysis possibilities
Unique contributor IDs (not stored)-Rely on source anonymity to hide identity
Formalization of fingerprinting attacks + secure alert correlation schemes
IP address virtualization that preserves topological structure of address space without revealing true addresses-Reconstruct topology of attack graphs
Protocols that reveal attack data only if similar attack has been observed by a threshold number of contributors
Data Sanitization Challenges
Internet
Overlay peer-to-peerrandomized routing(robust even if somenodes are compromised)
Based on Tor (low-latencyTCP-level anonymity)
Protecting Source Identity
Internet
Overlay peer-to-peerrandomized routing
Future Work: Backpropagation
Propagate analysis results back to contributors (e.g., hashed IP addresses for filtering)
Dataset poisoning and denial of service-Deliberate attacks or accidental flooding
Pre-registration and vetting are neededGroup membership credentials
- Issued through “blind” registration; unlinkable to contributor’s true identity
-Hard to guess, easy to check- Linkability of same-source contributions?
Possible attacks on registration process
Source Anonymity Issues
Contributor IDs issued by Cyber-TA Coordination Center-Random IDs unlinkable to true identity
Repositories can blacklist certain contributor IDs
Current research:- Prevention of flooding and data poisoning-Revocation mechanisms-Reputation systems
Contributor Registration
Timing Attack
Internet
Observe outgoing connection (sniff or attack 1st overlay node)
De-anonymize alert origin bycorrelating message timings
Overlay peer-to-peerrandomized routing
Additional Protection
Re-keying by alert repository-Additional keyed hashing of IP addresses
Randomized hot list thresholds- Publish only the hot list of reported alerts
that have something in common Need randomness to prevent flushing attacks
Delayed alert publication… all of these rely on repository
integrity!
Source address: can be used as a marker to learn sensor coverage
Port number: rare port numbers can be used as markers to link alerts to sensors
Destination address: reveals sensor coverage, capabilities, network topology
Port number: reveals network services
Timestamp: can be used to link an alert to the sensor that produced it
SensorID: reveals defensive services and capabilities, organization that owns sensor
EventID: reveals defensive services, capabilities, policies
Outcome: reveals target site’s vulnerabilities, topologies, policies, etc.
Captured data, Infected file: reveals private user data, topology and applications, vulnerabilities.
Sample Intrusion Detection Alert