
Revealing Botnets Using Network Traffic Statistics

Pavel Čeleda

[email protected] Institute of Computer Science,

Masaryk University, Brno Czech Republic

Radek Krejčí

[email protected] Faculty of Informatics,


Vojtěch Krmíček

[email protected] Faculty of Informatics,


Abstract

This paper presents a state-of-the-art overview of Unix-like embedded malware. We describe botnets that use network-connected embedded devices for illicit activities. No suitable security solution (anti-virus or anti-malware) exists for these devices. We propose an approach that uses network traffic statistics to reveal Unix-like embedded malware and its activities. The paper is based on our observation of real-world malware attacking vulnerable embedded devices worldwide.

Keywords: PSYB0T, Chuck Norris botnet, Kaiten, Hydra, malware, botnet, NetFlow.

1 Introduction

Nowadays malware (viruses, worms, botnets) attacks any vulnerable system. In the past, malware mostly attacked large numbers of hosts sharing a single widespread vulnerability. New malware variants are increasingly designed to exploit high-value targets. At the same time, they are looking for new kinds of vulnerable devices.

Embedded network devices have become a ubiquitous part of today's homes and offices. ADSL modems, WiFi routers, set-top boxes, music centers, etc. sit in their place, quietly doing their job and being regarded as pure hardware (see Figure 1).

Figure 1: Modern home with network-enabled devices.

Security and Protection of Information 2011 7

A Unix-like operating system often runs inside them. Linux variants such as BusyBox-based systems [1] are able to run on very limited hardware while offering capabilities comparable to an ordinary computer. This is often overlooked, and embedded devices tend to be accompanied by a “plug-in-and-do-not-care” mentality, which causes many security problems. Widely deployed and often misconfigured, network-connected embedded devices constitute highly attractive targets for exploitation [2]. No anti-virus or anti-malware software is available to protect these devices.

Any network-connected device leaves communication traces, e.g., connection establishment, data downloads and uploads. Mechanisms such as NetFlow/IPFIX generation are used to record these traces. Flow data does not include the payload but contains information about who communicates with whom, when, for how long, how often, using which protocol and service, and how much data was transferred. Such information reveals details about device behavior and can be used to detect embedded malware activities. In this paper we present an approach that uses traffic statistics to detect Unix-like embedded malware.

The paper is organized as follows. After a short overview of malware detection methods, we describe the most well-known Unix-like embedded botnets. Then we describe the typical architecture and behavior of Unix-like embedded botnets. Finally, we conclude with a real-world example discussing the detection of the Chuck Norris botnet using network traffic statistics.
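The kind of information carried by a flow record can be illustrated with a small Python sketch. The field names and the helper `bytes_per_pair` are hypothetical and chosen only for illustration; they do not follow the exact NetFlow or IPFIX field definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One unidirectional flow record (illustrative field names, no payload)."""
    start: float       # when: start time in seconds since the epoch
    duration: float    # how long: seconds
    proto: str         # protocol, e.g. "TCP" or "UDP"
    src_ip: str        # who
    src_port: int
    dst_ip: str        # with whom
    dst_port: int      # service
    packets: int
    byte_count: int    # how much data was transferred
    tcp_flags: str = ""  # cumulative TCP flags seen in the flow, e.g. "SA"


def bytes_per_pair(flows):
    """Sum the transferred bytes per (source, destination) address pair."""
    totals = {}
    for f in flows:
        key = (f.src_ip, f.dst_ip)
        totals[key] = totals.get(key, 0) + f.byte_count
    return totals
```

Even this reduced record is enough to answer questions such as which device talks to which server and how much, without inspecting any packet content.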

2 Related Work

Various botnet detection techniques have been introduced in past years. Most of them focused on PC-based malware targeting Microsoft Windows operating systems. Public availability of some malware source code (e.g., Agobot, SDBot, Q8bot, Kaiten, ...) written in C and C++ stimulated the development of further functionality and cross-platform variants. New malware variants based on public code appear steadily. Unix-like embedded malware is a special kind of cross-platform malware.

The first publicly disclosed embedded malware is called PSYB0T and was detected manually by Terry Baume [3]. The manual detection of PSYB0T was accidental and cannot be used systematically.

Honeypots [4] are devices that are deliberately vulnerable to malicious attacks. They can be used to capture and understand malware activities; e.g., Chuck Norris botnet 2 was detected with the Kojoney honeypot [5]. Both approaches require that the malware attacks a vulnerable host and attracts attention by its activities.

To track malware activities on a large scale, passive traffic monitoring can be used. Instead of waiting for an attack against a honeypot, traffic from and to all hosts can be observed. The two principal approaches are signature-based and anomaly-based detection. Signature-based detection uses signatures of known malware. Tools like Snort, Bro or Suricata use predefined detection rules and patterns to find malware, but they are limited to detecting only known malware.

Anomaly-based detection uses algorithms based on network behavioral analysis (NBA) to distinguish legitimate from malicious traffic. In comparison with signature-based methods, NBA makes it possible to recognize unknown malware. CAMNEP [6] is a generic NBA system. BotMiner [7] is a botnet-oriented system that can detect real-world botnets and has a very low false positive rate.

3 Unix-like Embedded Malware

The following list describes selected bots targeting embedded devices with Unix-like operating systems. Despite the age and the original purpose of some of the mentioned tools, many variants of them can still be found inside embedded devices around the world.

Kaiten

Kaiten (formally named Linux.Backdoor.Kaiten) was detected on 14 February 2006 [8]. It is a simple IRC client that, besides IRC functionality, also provides DDoS attack capability through several types of floods. Its source code is publicly available [9] for further details. Kaiten was used as a base for part of the Chuck Norris botnet tools and for numerous other botnets with IRC C&C centers.

Hydra

Hydra appeared in 2008. It is another IRC bot with flooding capabilities which, in addition to what Kaiten offers, also scans for other vulnerable devices. Despite the similarity of functions, Hydra's code is completely different from Kaiten's. The source code of the Hydra tool is publicly available [10].

There is another tool with a similar name, THC-Hydra [11]. This tool is not an IRC client. It is a login cracker supporting numerous protocols and services. THC-Hydra has been available since 2001 as a helpful tool for security researchers and consultants.

Kaiten as well as Hydra were originally Linux bots used on the standard x86 (PC) architecture. Later they were widely used cross-platform to infect embedded (MIPSEL) devices.

PSYB0T

PSYB0T is considered to be the first widespread worm targeting SOHO (Small Office/Home Office) devices. It was disclosed by security researcher Terry Baume [3] in January 2009. It became better known in March 2009, when it was identified as the source of the DDoS attack against the DroneBL site [12]. The botnet was shut down by its master shortly after this incident. The botmaster announced that it had been fun but that the research was over. At that time the size of the botnet was estimated at about 80,000 to 100,000 bots.

Chuck Norris Botnet

The Chuck Norris (CN) botnet was first detected in December 2009 [13]. The detection was achieved thanks to continuous network monitoring at Masaryk University, Brno. Public disclosure followed in February 2010, and on 22 February 2010 the activity of the botnet’s C&C centers was suspended. The estimated size of the botnet at that time was over 30,000 devices.

Ongoing development of the Chuck Norris botnet was documented in November 2010. Marco van Berkum announced [14] detection of a new version of the Chuck Norris botnet. Detailed analysis of the development progress can be found in [15].

4 Architecture of Unix-like Embedded Botnets

The behavior of the botnets mentioned in the previous section is very similar. Their activity can be divided into the following phases:

1. Scanning for other vulnerable devices in the (selected) network.

2. Infection of found vulnerable devices.

3. Bot initialization and update.

4. Connecting to the C&C center via IRC protocol and listening to commands.

5. Performing attacks (DDoS, DNS spoofing, etc.) ordered by the C&C center.

The common threat lifecycle reflects the needs and aims of the threat itself. Generally speaking, it needs to spread to as many vulnerable devices as possible. When vulnerable devices are found, it exploits them. The next step is to download the newest bot code. When the up-to-date bot code has been downloaded and executed, the bot connects to the botnet C&C center and waits for further commands. Once it receives a specific command, it starts to perform the requested attack. The set of a botnet's inner processes needed for these purposes can be seen in Figure 2.

Figure 2: The scheme of a botnet's inner processes.

The following paragraphs describe each phase of the botnet lifecycle in detail from the networking point of view. The described characteristics are used for botnet detection using NetFlow technology. The specific detection methods are described later in Section 5.

Scanning for Vulnerable Devices

The goal of the propagation phase is to find as many vulnerable hosts as possible that could be infected. Depending on the particular threat type, it performs large-scale scans against the devices' service ports:

port 22 – SSH,

port 23 – TELNET or

port 80 – HTTP.

The scans do not target completely random hosts. The scanners usually have predefined lists of IP prefixes [13, 15] of networks expected to contain a number of possibly vulnerable devices. The prefixes usually include networks of Internet service providers (ISPs) that provide SOHO modems and routers with a default configuration to their customers.

Some of the infected devices are used 24 hours a day, 7 days a week and are continually available to the botmaster. But a significant part of the botnet changes rapidly. In contrast to common PC botnets, most botnets for embedded devices do not infect the devices permanently. The botnet executables are stored in RAM, so disinfection can be performed simply by power cycling the device. In addition, some networks use dynamic IP address configuration, so after a reboot the devices appear with a different IP address. Therefore, scanning is performed regularly against the same networks to obtain an updated list of connected vulnerable devices.


Device Infiltration

The discovery of a possibly vulnerable device leads to TELNET or SSH brute-force dictionary attacks with the goal of infecting the device. TELNET is the protocol most frequently used for the attack, but the botnets also try to infiltrate the device through the SSH or web (HTTP) configuration interfaces. Some botnets implement exploits for specific devices, but the most widely used method to compromise a device is trying default logins and passwords (dictionary attack). In this way the botnet compromises poorly configured devices. Unfortunately, this approach is sufficient for a number of devices around the world [2].

Initialization and Update

After a successful login to the victim device, the download of the bot’s current code is initiated. There is usually a list of domain names or IP addresses where the bot code is hosted. The up-to-date botnet executables are simply downloaded using, e.g., wget or other download programs such as ftpget or tftp. Botnets frequently use public free web hosting services. It is therefore difficult to distinguish regular communication with web pages on a free web hosting server from a bot's downloads from the same web server. There is no anti-virus or other detection program on the device that could analyze the downloaded data.

This phase can include any other operation usually performed as a command in a device’s shell. An example of such an operation is setting up iptables to block any attempt to connect to the device configuration interface. But operations of this type usually do not generate any network traffic and therefore cannot be detected by NetFlow monitoring.

Communication with C&C Center

All botnets mentioned in Section 3 use IRC communication to control the botnet. Although many recent botnets have started using fast-flux technology or hybrid P2P systems, the embedded device botnets we were able to observe still use the simple IRC protocol and a centralized single-server structure (with possibly one or two backup C&C centers). They use a static domain name or IP address of the C&C center hard-coded inside the bot. Changes of the C&C center can be made only with bot updates.

Attacking

One of the most powerful capabilities of this botnet class is the ability to run arbitrary shell commands inside the infected device. Unfortunately, as mentioned earlier, this activity cannot generally be detected by NetFlow monitoring. We are able to detect only some consequences of selected operations that affect the network behavior of the device. These operations include, e.g., changes of the DNS servers used.

The most typical harmful activity of all botnets mentioned in the previous section is flooding resulting in a DDoS attack. The bots provide various types of floods including, e.g.,

UDP flood,

TCP SYN flood and

TCP PSH ACK flood.

In addition, they can spoof the source IP address of transmitted packets with random values (or values random within a predefined range).

5 Embedded Malware Detection – Chuck Norris Botnet Use Case

Using IP flow statistics (NetFlow/IPFIX) for network security awareness and monitoring is a widely used approach and can provide solid protection of the observed network [6]. We can also use NetFlow monitoring to reveal Unix-like embedded botnets, because the generic botnet lifecycle presented in the previous section leaves noticeable traces in the NetFlow monitoring data.

In this section, we identify a set of generic NetFlow signatures representing particular botnet lifecycle phases. Any occurrence of such signatures in the observed network could indicate a possible botnet infection. We validated these general detection signatures on the recent variant of the Chuck Norris botnet, its version 2 (CNBv2) [15]. The methods are presented as filters for the NFDUMP tool [16], which is frequently used for NetFlow data processing.

Scanning and Infiltration of Vulnerable Devices

In the case of scanning, we can observe a large number of flows from outside networks against devices in the observed network. The attackers perform a horizontal scan, i.e., they initiate a new connection to every possible IP address in the scanned subnet. These connections target one of the service ports, i.e., the SSH port (22), Telnet port (23) or HTTP port (80). This behavior results in a large number of unsuccessful connections, because not every scanned IP address is alive or accepts new connections at the scanned port. Such connections do not complete the TCP three-way handshake [17] and therefore contain only the SYN (S) or RESET (R) TCP flags, but not the ACKNOWLEDGE (A) flag. When we select such connections (TCP connections with TCP flags S or SR against port 22, 23 or 80), we get a list of devices outside our network that could be (but are not necessarily) part of a botnet.
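The selection described above can be sketched in Python over already parsed flow records. This is only an illustration: the `Flow` tuple, the function names and the `min_targets` threshold are assumptions introduced here, not part of the detection method itself.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """Minimal flow record (hypothetical layout)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str
    flags: str  # cumulative TCP flags, e.g. "S", "SR", "SA"


SERVICE_PORTS = {22, 23, 80}  # SSH, Telnet, HTTP


def is_failed_attempt(flow):
    """True for TCP flows to a service port carrying only S, or only S and R."""
    return (flow.proto == "TCP"
            and flow.dst_port in SERVICE_PORTS
            and set(flow.flags) in ({"S"}, {"S", "R"}))


def scan_suspects(flows, min_targets=3):
    """Map each source to the number of distinct hosts it failed to reach.

    Only sources with at least min_targets distinct targets are returned;
    the threshold is an assumed tuning knob for spotting horizontal scans.
    """
    targets = defaultdict(set)
    for f in flows:
        if is_failed_attempt(f):
            targets[f.src_ip].add(f.dst_ip)
    return {src: len(dsts) for src, dsts in targets.items()
            if len(dsts) >= min_targets}
```

Counting distinct destination addresses per source reflects the horizontal nature of the scan; a single failed connection is not treated as suspicious in this sketch.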

CNBv2 Detection Method

CNBv2 has changed its propagation mechanism and targets the Telnet (23) and SSH (22) services. It uses TCP scanning, so we can define a simple detection method revealing possible botnet scanning against devices in our local network (denoted as local_network) as:

(dst net local_network) and (dst port 22 or dst port 23) and (proto TCP) and ((flags S and not flags ARPUF) or

(flags SR and not flags APUF))

Figure 3: NFDUMP filter for CNBv2 scanning detection.
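To apply the filter from Figure 3, the placeholder local_network has to be replaced with a concrete prefix. A minimal Python sketch follows; the function name `cnbv2_scan_filter` is hypothetical, and the prefix is validated with the standard `ipaddress` module before it is substituted.

```python
import ipaddress


def cnbv2_scan_filter(local_network: str) -> str:
    """Return the Figure 3 filter text with a concrete local prefix.

    Raises ValueError if local_network is not a valid CIDR prefix.
    """
    net = ipaddress.ip_network(local_network)
    return (
        f"(dst net {net}) and (dst port 22 or dst port 23) and (proto TCP) and "
        "((flags S and not flags ARPUF) or (flags SR and not flags APUF))"
    )
```

The resulting string can then be passed to NFDUMP as its filter expression, for example `cnbv2_scan_filter("192.0.2.0/24")` for a monitored /24 network.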

Initialization and Update

In the case of bot code download, we observe an HTTP connection from the local network to the web server hosting the update files. To download a bot update, a regular HTTP connection to the HTTP port (80) has to be established, so we select only connections containing the TCP flags S and A and not the flag R. By analyzing the bot code we can obtain a list of the IP addresses or domain names used, so we are able to select only connections to these domain names/IP addresses. The main limitation of this approach is that it cannot distinguish a regular request for WWW pages from a connection downloading bot code from the same web server. Such a situation is frequent in the case of virtual web hosting, where a single IP address is used for hosting multiple web domains (including the domains with bot code).


CNBv2 uses a list of hard-coded IP addresses of web servers containing the botnet code (denoted as web_servers). Therefore the detection method reveals all successful HTTP connections going out of our local network (denoted as local_network) to the list of known botnet IP addresses:

(src net local_network) and (dst hosts web_servers) and (dst port 80) and (proto TCP) and (flags SA and not flags R)

Figure 4: NFDUMP filter for CNBv2 initialization and update detection.
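The same condition can be written as a Python predicate over parsed flow records. This is a sketch under assumptions: the `Flow` tuple and the function name are illustrative, and the web server list has to be supplied by the analyst from the bot code analysis.

```python
import ipaddress
from typing import NamedTuple


class Flow(NamedTuple):
    """Minimal flow record (hypothetical layout)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str
    flags: str  # cumulative TCP flags, e.g. "SA", "APSF"


def is_update_download(flow, local_network, web_servers):
    """True for an established HTTP connection from the local network
    to one of the known bot code servers (flags S and A set, R not set)."""
    local = ipaddress.ip_network(local_network)
    flags = set(flow.flags)
    return (flow.proto == "TCP"
            and flow.dst_port == 80
            and ipaddress.ip_address(flow.src_ip) in local
            and flow.dst_ip in web_servers
            and {"S", "A"} <= flags
            and "R" not in flags)
```

As noted above, a match only shows that a local device fetched something from a suspicious server; with shared web hosting it may still be a regular page request.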

Communication with C&C Center

Because a botnet member needs to communicate with the C&C server regularly, we can see a set of TCP connections to the IRC server. These connections use a set of predefined IRC ports (including the default IRC port 6667; the IRC port numbers can vary depending on the particular botnet). By analyzing the botnet code we can obtain the IRC port number and the domain names or IP addresses of the IRC C&C servers.


CNBv2 contains a list of three hard coded IP addresses of IRC C&C servers. These IP addresses can be changed only by the update of the botnet code. Therefore the detection method is defined by revealing all established IRC connections going out from the monitored network (denoted as local_network) and connecting to the list of known IRC C&C botnet servers. Depending on the IP address of the used C&C server, it uses TCP port 6667 or 12000 for IRC communication [15].


(src net local_network) and (((dst ip 62.211.73.229 or dst ip 62.211.73.230) and port 6667)

or (dst ip 93.184.100.76 and port 12000)) and (proto TCP) and (flags SA and not flags R)

Figure 5: NFDUMP filter for CNBv2 C&C communication detection.
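Once such connections are selected, it is useful to know which local devices produced them. The following Python sketch counts established connections per local host to the C&C endpoints listed in Figure 5. The `Flow` tuple and the function name are assumptions; unlike the filter, the sketch matches the port on the destination side only.

```python
import ipaddress
from collections import Counter
from typing import NamedTuple


class Flow(NamedTuple):
    """Minimal flow record (hypothetical layout)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str
    flags: str  # cumulative TCP flags


# CNBv2 IRC C&C endpoints as given in Figure 5.
CC_ENDPOINTS = {
    ("62.211.73.229", 6667),
    ("62.211.73.230", 6667),
    ("93.184.100.76", 12000),
}


def cc_contacts(flows, local_network):
    """Count established C&C connections per local source address."""
    local = ipaddress.ip_network(local_network)
    hits = Counter()
    for f in flows:
        flags = set(f.flags)
        if (f.proto == "TCP"
                and (f.dst_ip, f.dst_port) in CC_ENDPOINTS
                and {"S", "A"} <= flags
                and "R" not in flags
                and ipaddress.ip_address(f.src_ip) in local):
            hits[f.src_ip] += 1
    return dict(hits)
```

Every address returned by `cc_contacts` is a candidate infected device that can be checked or power cycled.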

Attacking

As mentioned in the previous section, the detection of bot attacks is difficult, because we can see only the flow activity generated by the infected device. We are not able to inspect commands and scripts executed on the device itself. Even so, we can detect several types of botnet attacks by analyzing NetFlow data from the monitored network.

A DNS spoofing attack is based on the misconfiguration/change of the DNS server IP addresses in the infected device. The user is then redirected to the spoofed botnet DNS server and becomes a target of a phishing attack. In such a case, we can detect connections to DNS servers (using UDP port 53) that are located outside the local network and are not ordinarily used in the local network. These connections indicate a possible infection of the inspected device.

Detection of the most typical harmful activity, (D)DoS attacks generated by the infected devices, is not trivial. We need to distinguish all regular connections leaving the inspected device from the attack traffic. (D)DoS attacks use, e.g., UDP floods, TCP SYN floods or TCP PSH ACK floods. Such an attack represents a significant change in the device behavior, so we can detect it, e.g., by using behavior profiles of each device. Another way is to detect spoofed source IP addresses in the monitored traffic (i.e., addresses not originating in the local network).
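The DNS check can be sketched in Python as follows. The list of resolvers that are ordinarily used (`known_resolvers`) is an assumption that each site has to supply; the `Flow` tuple and the function name are illustrative as well.

```python
import ipaddress
from typing import NamedTuple


class Flow(NamedTuple):
    """Minimal flow record (hypothetical layout)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str


def unexpected_dns(flows, local_network, known_resolvers):
    """Return (local host, resolver) pairs for DNS flows that leave the
    local network and go to a resolver not ordinarily used there."""
    local = ipaddress.ip_network(local_network)
    found = set()
    for f in flows:
        if (f.proto == "UDP"
                and f.dst_port == 53
                and ipaddress.ip_address(f.src_ip) in local
                and ipaddress.ip_address(f.dst_ip) not in local
                and f.dst_ip not in known_resolvers):
            found.add((f.src_ip, f.dst_ip))
    return found
```

The quality of the result depends on how complete the list of legitimate external resolvers is; users who configure a public resolver on purpose would appear in the output too.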
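Both ideas can be illustrated with a short Python sketch. It is a strong simplification: a fixed flow-count `threshold` stands in for a real per-device behavior profile, the flows are assumed to be collected on outbound traffic only, and all names are hypothetical.

```python
import ipaddress
from collections import Counter
from typing import NamedTuple


class Flow(NamedTuple):
    """Minimal flow record (hypothetical layout)."""
    src_ip: str
    dst_ip: str
    proto: str
    flags: str  # cumulative TCP flags, empty for UDP


def flood_kind(flow):
    """Name the flood type a single flow would fit, or None."""
    if flow.proto == "UDP":
        return "UDP flood"
    if flow.proto == "TCP" and set(flow.flags) == {"S"}:
        return "TCP SYN flood"
    if flow.proto == "TCP" and set(flow.flags) == {"P", "A"}:
        return "TCP PSH ACK flood"
    return None


def flood_suspects(flows, local_network, threshold):
    """Count flood-like flows per (local source, target, kind) and keep
    the combinations that reach the assumed threshold."""
    local = ipaddress.ip_network(local_network)
    counts = Counter()
    for f in flows:
        kind = flood_kind(f)
        if kind and ipaddress.ip_address(f.src_ip) in local:
            counts[(f.src_ip, f.dst_ip, kind)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


def spoofed_sources(outbound_flows, local_network):
    """Source addresses in outbound traffic that do not belong to the
    local network and are therefore likely spoofed."""
    local = ipaddress.ip_network(local_network)
    return {f.src_ip for f in outbound_flows
            if ipaddress.ip_address(f.src_ip) not in local}
```

In practice the threshold would be derived from each device's usual traffic volume in a given time window rather than fixed by hand.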


Devices infected by CNBv2 usually have their DNS servers changed. A forged botnet DNS server (denoted as botnet_DNS_server) is used as the primary DNS server and a publicly available OpenDNS server (denoted as OpenDNS_server) as the secondary DNS server. Therefore we can identify the DNS communication of botnet members by detecting DNS flows going from the local network (denoted as local_network) to these two types of DNS servers.


(src net local_network) and ((dst ip OpenDNS_server) or (dst ip botnet_DNS_server)) and (proto UDP) and (dst port 53)

Figure 6: NFDUMP filter for CNBv2 DNS spoofing attack detection.

6 Conclusion

The presented approach for revealing Unix-like embedded botnets can be used to detect any botnet variant with the behavior presented in Section 4. Some of the signatures cannot be defined generally and must be adapted to the specific type of botnet (or even to a particular revision of the botnet).

The signatures and corresponding detection methods, applicable to any type of Unix-like embedded botnets, include:

The botnet scanning (the botnets are using a fixed set of ports).

The (D)DoS attacks (well-known behavior changes in the NetFlow data during the (D)DoS attack).

The DNS spoofing attacks (DNS connections going to the nonstandard DNS servers).

The communication with IRC C&C servers (in the case of network with a strict policy restricting IRC connections).

The signatures for revealing botnet download/initialization and the communication with the IRC C&C server (in the case of an academic/open network) have to be customized to the particular botnet. The IP addresses and ports of web servers and IRC C&C servers may vary depending on the botnet variant.

Acknowledgement

This work was supported by the Czech Ministry of Defence under Contract No. SMO02008PR980-OVMASUN200801.

References

[1] Erik Andersen et al.: BusyBox project, http://www.busybox.net, March 2011.

[2] Ang Cui and Salvatore J. Stolfo: A Quantitative Analysis of the Insecurity of Embedded Network Devices: Results of a Wide-Area Scan. In Proceedings of the 26th Annual Computer Security Applications Conference, ACSAC '10, pages 97–106, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0133-6.

[3] Baume, T.: Netcomm NB5 Botnet – PSYB0T 2.5L, available from http://baume.id.au/psyb0t/PSYB0T.pdf, January 2009.

[4] Bacher, P., Holz, T., Kotter, M. and Wicherski, G.: Know your Enemy: Tracking Botnets. http://www.honeynet.org/papers/bots, 2008.

[5] Coret, J.: Kojoney: A honeypot for the SSH Service, http://kojoney.sourceforge.net/.

[6] Čeleda, P., Rehák, M., Krmíček, V. and Bartoš, K.: Flow Based Security Awareness Framework for High-Speed Networks, In Security and Protection of Information 2009. Brno : University of Defence, 2009. p. 3-13, 11 s. ISBN 978-80-7231-641-0.

[7] Gu, G., Perdisci, R., Zhang, J. and Lee, W.: BotMiner: clustering analysis of network traffic for protocol- and structure-independent botnet detection. Proceedings of the 17th conference on Security symposium. USENIX Association. San Jose, CA. 2008.

[8] Symantec Corp.: Linux.Backdoor.Kaiten, available from http://www.symantec.com/security_response/writeup.jsp?docid=2006-021417-0144-99.

[9] kaiten.c IRC DDoS Bot, available from http://packetstormsecurity.org/irc/kaiten.c.

[10] Hydra – Mass DDoS Tool, available from http://data.nicenamecrew.com/papers/malwareforrouters/resources/dlink-automatic/hydra-2008.1.zip.

[11] The Hackers Choice: THC-Hydra, available from http://www.thc.org/thc-hydra/.

[12] Nenolod: Network Bluepill – stealth router-based botnet has been DDoSing dronebl for the last couple of weeks. 2010, http://www.dronebl.org/blog/8.

[13] Čeleda, P., Krejčí, R., Vykopal, J. and Drašar, M.: Embedded Malware – An Analysis of the Chuck Norris Botnet, Proceedings of the 2010 European Conference on Computer Network Defense (EC2ND’10). IEEE Computer Society, ISBN 1-58113-435-5. 2010.

[14] Marco van Berkum: SSH scans, I caught one, Full disclosure mailing list archives, available from http://seclists.org/fulldisclosure/2010/Nov/228, November 2010.

[15] Čeleda, P. and Krejčí, R.: An Analysis of the Chuck Norris Botnet 2, technical report, available from http://www.muni.cz/ics/research/cyber/files/cnb-2.pdf, March 2011.

[16] Haag, P.: NFDUMP – NetFlow processing tools. http://nfdump.sourceforge.net/, 2011.

[17] Wikipedia contributors: “Transmission Control Protocol.” Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 16 Mar. 2011. Web. 18 Mar. 2011.

