Detecting Malware Domains at the Upper DNS Hierarchy
Andreas Charalampous
University Of Cyprus
Professor: Dr. Elias Athanasopoulos
Motivation
• Internet miscreants have been leveraging the DNS to build malicious network infrastructures for malware command and control.
• Proposal: Kopis, a novel detection system that finds malware-related domains by monitoring traffic at the upper levels of the DNS hierarchy.
Related Work
• D. Wessels, M. Fomenkov, N. Brownlee, and K. Claffy. Measurements and Laboratory Simulations of the Upper DNS Hierarchy. In PAM, 2004.
– Examines the DNS caching behavior of recursive DNS servers from the point of view of AuthNS and TLD servers.
• S. Hao, N. Feamster, and R. Pandrangi. An Internet Wide View into DNS Lookup Patterns. http://labs.verisigninc.com/projects/malicious-domain-names.html, 2010.
– DNS lookup patterns measured from the .com TLD servers.
– Drawback: does not discuss how the findings may be leveraged for detection purposes.
Related Work (cont.)
• M. Antonakakis, R. Perdisci, D. Dagon, W. Lee, and N. Feamster. Building a Dynamic Reputation System for DNS. In Proceedings of the 19th USENIX Security Symposium (USENIX Security ’10), 2010.
• L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi. Exposure: Finding Malicious Domains Using Passive DNS Analysis. In Proceedings of NDSS, 2011.
Drawback: partial visibility; both rely heavily on features based on IP reputation.
System Overview
• Kopis monitors streams of DNS queries to and responses from the upper DNS hierarchy, and detects malware domain names based on the observed query/response patterns.
• Training Mode: Uses a set of known malware-related domains and their feature vectors (each associated with a label).
• Operation Mode: Kopis monitors the streams of DNS traffic and maps each domain extracted from the streams into a feature vector.
• The classifier assigns a label and a score.
Statistical Features
• Kopis extracts the statistical features from the monitored DNS traffic.
Requester Diversity: Characterizes whether the machines that query a given domain name are localized or globally distributed.
Requester Profile: Assigns higher weight to requesting servers that have many clients.
Malware-related domains tend to be queried from a diverse set of requesters with a higher weight more often than legitimate domains.
Resolved-IPs Reputation: Describes whether the IP addresses a domain resolves to have been historically linked with malicious activities or legitimate services (Malware Evidence, SBL Evidence, Whitelist Evidence).
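A minimal sketch of how the first two feature families could be computed for one domain. The slides do not give the formulas, so the normalized-entropy measure, the country labels, and the `client_weight` table are illustrative assumptions, not the features Kopis actually uses.

```python
import math
from collections import Counter


def requester_diversity(requester_countries):
    """Normalized Shannon entropy of requester locations.

    0.0 means every requester is in one place (localized);
    1.0 means requesters are spread evenly over all observed places.
    Entropy is an assumed stand-in for the diversity statistics.
    """
    counts = Counter(requester_countries)
    if len(counts) <= 1:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


def requester_profile(requesters, client_weight):
    """Sum the (hypothetical) weights of the distinct requesters of a domain.

    client_weight maps a requester to a weight that grows with the number
    of clients behind it; unknown requesters count as 0.
    """
    return sum(client_weight.get(r, 0.0) for r in set(requesters))
```

For example, `requester_diversity(["US", "US", "US"])` is 0.0, while four requesters in four different countries score 1.0.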
Statistical Features (cont.)
• The Requester Diversity and Requester Profile features are novel and differ substantially from the statistical features proposed in Notos and Exposure, which are based on IP reputation.
• Kopis extracts statistical features to harvest “malware signal” as seen from the upper DNS hierarchy
Detect malware-related domains even when no IP reputation information is available
Model Selection
• Random Forest classifier.
• Training window of 2 to 5 days (150 datasets).
• Achieved TPrate = 98.4% and FPrate = 0.3% with a five-day observation window.
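For reference, the two rates quoted above can be computed from labeled predictions as follows. This is a generic sketch of the standard definitions (1 = malware-related, 0 = legitimate), not code from the Kopis evaluation.

```python
def tp_fp_rates(y_true, y_pred):
    """Return (TPrate, FPrate) for binary labels, where 1 marks malware."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # TPrate: share of malware domains that were flagged.
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # FPrate: share of legitimate domains that were wrongly flagged.
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return tp_rate, fp_rate
```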
Evaluation
• 150 datasets
• Goal: assess how Kopis would operate if trained on datasets assembled from the features of each family, first separately and then combined.
• Isolated the Requester Diversity (RD), Requester Profile (RP) and Resolved-IPs Reputation (IPR) features into three additional datasets.
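The per-family datasets can be pictured as projections of the full feature vector. The sketch below assumes features are stored in a dict whose keys start with the family prefix (`RD_`, `RP_`, `IPR_`); that naming scheme and the example feature names are hypothetical.

```python
from itertools import combinations

FAMILIES = ("RD", "RP", "IPR")


def select_families(feature_vector, families):
    """Keep only the features whose name prefix is in `families`."""
    return {
        name: value
        for name, value in feature_vector.items()
        if name.split("_", 1)[0] in families
    }


def family_subsets():
    """Every non-empty combination of families, single families first."""
    return [
        combo
        for size in range(1, len(FAMILIES) + 1)
        for combo in combinations(FAMILIES, size)
    ]
```

With a vector such as `{"RD_countries": 12, "RP_weight_mean": 0.4, "IPR_malware_evidence": 3}`, `select_families(vector, ("RD", "RP"))` drops the IP reputation feature, which mirrors the RD+RP experiment.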
Evaluation (cont.)
• The combination of the RD and RP feature sets performs almost as well as the IPR features.
• Using all features performs much better than using any single feature subset in isolation.
Correct classification of domains where IP reputation is absent
Evaluation (cont.)
• Evaluated Kopis's ability to detect new and previously unclassified malware domains.
• Excluded 20% of both benign and malware-related domains from the training data.
• Used the next three weeks of data as an evaluation dataset (it contained the excluded domains plus newly seen domains).
• Detection Threshold
Detect new and previously unclassified domains with an average TPrate of 73.62% and average FPrate of 0.53%.
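A minimal sketch of how a detection threshold turns classifier scores into detections. The score range (0 to 1), the default value 0.5, and the example domain names are assumptions; the slides do not state the threshold that was used.

```python
def flag_domains(scores, threshold=0.5):
    """Return the domains whose classifier score meets the threshold.

    scores: dict mapping domain name -> score in [0, 1], higher = more
    likely malware-related. Raising the threshold lowers both the number
    of detections and the number of false positives.
    """
    return sorted(domain for domain, score in scores.items() if score >= threshold)
```

For instance, with scores `{"a.example": 0.9, "b.example": 0.4}` the default threshold flags only `a.example`; lowering it to 0.3 flags both.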
Evaluation (cont.)
• Canadian TLD
– Greater global visibility
– Used a data reduction tool
– Average FPrate = 0% and TPrate = 94.44% (after filtering)
• DDoS Botnet Detection
– Kopis was able to identify a commercial DDoS botnet in the first few weeks of its propagation in China, well before it began propagating within other countries, including the US.
Conclusions
• Kopis, a system that can operate at the upper DNS hierarchy and detect malware-related domains based on global DNS resolution patterns.
• Kopis models three key signals at the DNS authorities:
– the daily domain name resolution patterns
– the significance of each requester for an epoch
– the domain name’s IP address reputation
• Identifying malware-related domains on the rise can give DNS operators the ability to remove growing botnets at an early stage.
ZMap: Fast Internet-wide Scanning and Its Security Applications
Andreas Charalampous
University Of Cyprus
Professor: Dr. Elias Athanasopoulos
Motivation
• Internet-wide network scanning has numerous security applications, including exposing new vulnerabilities and tracking the adoption of defensive mechanisms, but probing the entire public address space with existing tools is both difficult and slow.
• ZMap scanner: performs Internet-wide scans and can survey the entire IPv4 address space in under 45 minutes from user space on a single machine.
Related Work
• Gordon Fyodor Lyon. Nmap Network Scanning: The Official Nmap Project Guide to Network Discovery and Security Scanning. Insecure, USA, 2009.
– Drawback: Too slow.
• D. Leonard and D. Loguinov. Demystifying service discovery: Implementing an Internet-wide scanner. In 10th ACM SIGCOMM conference on Internet measurement (IMC), pages 109–122, 2010.
– Drawback: Does not process responses and needs custom network drivers.
ZMap
• Optimized probing
– Attempts to send probes as quickly as the source’s NIC can support.
– Can send probes at gigabit line speed.
• No per-connection state
– Does not maintain state for each connection to track which hosts have been scanned.
• No retransmission
– Does not retransmit probes that are lost due to packet loss.
– Sends a fixed number of probes per target (one by default).
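A back-of-the-envelope check of what "gigabit line speed" means for scan duration. The sketch assumes each probe is a minimum-size Ethernet frame (64 bytes plus 8 bytes of preamble and a 12-byte inter-frame gap, 84 bytes on the wire) and that the link is fully saturated; both are assumptions, not figures from the slides.

```python
LINK_BITS_PER_SECOND = 1_000_000_000  # 1 Gbit/s Ethernet
FRAME_BYTES_ON_WIRE = 84              # assumed: 64 B frame + 8 B preamble + 12 B gap


def max_probes_per_second(link_bps=LINK_BITS_PER_SECOND,
                          frame_bytes=FRAME_BYTES_ON_WIRE):
    """Upper bound on probes per second for a saturated link."""
    return link_bps / (frame_bytes * 8)


def scan_minutes(num_addresses, probes_per_second):
    """Minutes needed to send one probe to each address."""
    return num_addresses / probes_per_second / 60


def probes_in(minutes, probes_per_second):
    """Number of probes that fit in the given time budget."""
    return minutes * 60 * probes_per_second
```

Under these assumptions the bound is about 1.49 million probes per second, so one probe to each of the 2^32 addresses would take roughly 48 minutes, and a 45-minute budget fits about 4.0 billion probes. That is consistent with the under-45-minute figure if reserved and otherwise non-public ranges are skipped, which is an inference rather than something the slides state.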
Nmap vs ZMap
• Nmap adapts its transmission rate to avoid saturating the source or target networks, whereas ZMap assumes that the source network is well provisioned.
• While Nmap maintains state for each connection to track which hosts have been scanned and to handle timeouts and retransmissions, ZMap forgoes any per-connection state.
• Nmap detects connection timeouts and adaptively retransmits probes that are lost due to packet loss.
Interpretation components
Output Handlers
Evaluation
• Determine whether ZMap and the upstream network can handle scanning at gigabit line speed.
• Determine whether scan rate has any effect on the hit rate.
Slower scanning does not reveal additional hosts.
Evaluation
• What fraction of target hosts does ZMap actually find using its default single-packet probing strategy?
Sending just 1 SYN will achieve about 98% coverage.
Evaluation
• Find the optimal time of day to perform a scan.
The highest response rates were at approximately 7:00 AM EST
Applications and Security Implications
• Scanning the IPv4 address space in under an hour gives the ability to:
– gain visibility into previously opaque distributed systems
– understand protocol adoption
– uncover security phenomena
• High-speed scanning also has potentially malicious applications:
– finding and attacking vulnerable hosts
– disrupting existing security models
Visibility into Distributed Systems
• E-commerce and secure web transactions inherently depend on browser-trusted TLS certificates.
• Most CAs do not publish lists of the certificates they have signed and, due to delegation of authority to intermediate CAs, it is unknown who can sign browser-trusted certificates at any given time.
• Wide-scale visibility into CA behavior can help to identify security problems.
– A CA certificate was accidentally issued to a Turkish transit provider.
– 1,300 CA certificates were misissued by the Korean Government.
Enumerating Vulnerable Hosts
• With rapid Internet-wide scans comes the potential to quickly enumerate hosts that suffer from specific vulnerabilities.
– Powerful defensive tool
– Powerful offensive tool
• UPnP vulnerabilities
– 2.56 million (16.5%) were running vulnerable versions of the Intel SDK
– 817,000 (5.2%) used vulnerable versions of MiniUPnPd
• Weak Public Keys
– 44,600 unique certificates utilized factorable RSA keys
– 2,743 unique certificates contained Debian weak keys
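One well-known reason an RSA key becomes factorable is that its modulus shares a prime with another key's modulus, in which case a greatest common divisor reveals the prime. The slides do not say how the factorable keys were found, so the pairwise-GCD sketch below is an assumed illustration on toy numbers, not the method behind the 44,600 figure.

```python
from itertools import combinations
from math import gcd


def find_shared_factors(moduli):
    """Map each modulus that shares a prime with another to its factors.

    Pairwise GCD is quadratic in the number of keys, so this only suits
    small collections; `moduli` is assumed to hold distinct integers.
    """
    factored = {}
    for a, b in combinations(moduli, 2):
        g = gcd(a, b)
        if 1 < g < a and 1 < g < b:  # a non-trivial common factor
            factored[a] = (g, a // g)
            factored[b] = (g, b // g)
    return factored
```

With toy moduli `[11 * 13, 11 * 17, 19 * 23]`, the first two share the prime 11 and are factored, while the third is left alone.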
Monitoring Service Availability
• Identify Internet outages and disruptions to service availability without an administrative perspective.
Scanning can help service providers identify networks and physical regions that have lost access to their service.
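A minimal sketch of how two scans could be compared to spot networks that went dark. Grouping by /24 prefix and the `min_hosts` and `max_ratio` cut-offs are illustrative assumptions; the addresses in the example are documentation ranges.

```python
import ipaddress
from collections import Counter


def hosts_per_prefix(addresses, prefix_len=24):
    """Count responsive hosts per network prefix."""
    counts = Counter()
    for addr in addresses:
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


def dropped_prefixes(before, after, min_hosts=2, max_ratio=0.5):
    """Prefixes whose responsive-host count fell below max_ratio of the baseline."""
    baseline = hosts_per_prefix(before)
    current = hosts_per_prefix(after)
    return sorted(
        prefix
        for prefix, n in baseline.items()
        if n >= min_hosts and current.get(prefix, 0) / n < max_ratio
    )
```

If three hosts in 192.0.2.0/24 answered the first scan and only one answered the second, that prefix is reported, while prefixes with unchanged counts are not.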
More Security Applications
• Tracking Protocol Adoption
– Researchers have attempted to understand the adoption of new protocols, address depletion, common misconfigurations, and vulnerabilities through active scanning.
– Gain instantaneous visibility into the deployment of multiple protocols by performing many ZMap scans of different ports.
• Discovering Unadvertised Services
– Uncover unadvertised services that were previously only accessible with explicit knowledge of the host name or address.
– e.g., Tor bridges
Privacy and Anonymous Communication
• High-speed scanning raises potential new privacy threats, such as the possibility of tracking user devices between IP addresses.
– A company could track home Internet users between dynamically assigned IP addresses based on the HTTPS certificate.
• Could provide the basis for a system of anonymous communication.
– Rather than using the scanner to send probes, it could be used to broadcast a short encrypted message to every public IP address.
Scanning and Good Internet Citizenship
• Internet-wide scanning involves interacting with an enormous number of hosts and networks worldwide.
– It would be impossible to request permission in advance from the owners of all these systems.
• Give traffic recipients the ability to opt out of further probes.
• To avoid overwhelming destination networks, ZMap scans addresses according to a random permutation.
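One way to visit every address exactly once in a scattered order, without storing the whole permutation, is to step through a multiplicative group modulo a prime. The toy sketch below covers an 8-bit "address space" with the prime 257 and generator 3; these small values are chosen for illustration, and a real scanner would need a prime just above 2^32 and a randomly chosen generator and starting point.

```python
def permuted_addresses(prime=257, generator=3, start=1):
    """Yield each value in 0 .. prime-2 exactly once, in a scattered order.

    `generator` must be a primitive root modulo `prime`, so that repeated
    multiplication cycles through every element of 1 .. prime-1.
    Only the current element is kept in memory.
    """
    x = start
    for _ in range(prime - 1):
        yield x - 1  # shift 1..prime-1 down to 0..prime-2
        x = (x * generator) % prime
```

Consecutive outputs land far apart in the address range, so no single destination network receives a burst of probes.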
Conclusions
• ZMap, a network scanner specifically architected for performing fast, comprehensive Internet-wide surveys.
• ZMap is capable of scanning the public IPv4 address space on a single port in under 45 minutes with an estimated 98% coverage of publicly available hosts.
• High-speed scanning enables security applications but also provides new attack vectors that we must consider when defending systems.
ZMap: Fast Internet-wide Scanning and Its Security Applications
Thank you! Questions?