Understanding the Network-Level Behavior of Spammers
Authors: Anirudh Ramachandran, Nick Feamster
Published in: ACM SIGCOMM 2006
Presented by: Bharat Soundararajan
OUTLINE
Spam
- Basics of spam
- Spam statistics
- Spamming methods
- Spam filtering
Network-level behavior of spam
- Network-level spam filtering
- Data collection method
- Tools used for data collection
- Evaluations
- Drawbacks
SPAM
What is Spam?
E-mail spam, also known as "bulk e-mail" or "junk e-mail," is a subset of spam that involves nearly identical messages sent to numerous recipients by e-mail.
Spammers use unsecured mail servers to send out millions of illegitimate emails
February 2007: an estimated 90 billion spam e-mails per day
Spam statistics
Spamming Methods
Direct spamming
– By purchasing upstream connectivity from “spam-friendly ISPs”
Open relays and proxies
– Mail servers that allow unauthenticated Internet hosts to connect and relay mail through them
Botnets
– Using a worm (e.g. Bobax) to infect machines and send mail through them
BGP spectrum agility
– Short-lived BGP route announcements
Botnet command and control
Sinkhole
Previously captured command-and-control information is used to make the sinkhole act as the command-and-control center
All bots then try to contact the command-and-control sinkhole, where the authors collected a packet trace to determine the members of the botnet
Using the p0f passive fingerprinting tool, they observed that a significantly higher percentage of the infected hosts run Windows
The information collected this way is not accurate
DNS blacklisting
A list of open-relay mail servers or open proxies—or of IP addresses known to send spam
Data collected from Spam-trap addresses or honeypots
About 80% of all spam was received from mail relays that appear in at least one of eight blacklists
More than 50% of spam came from relays listed in two or more blacklists
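A DNSBL lookup is an ordinary DNS query: the octets of the relay's IPv4 address are reversed and prepended to the blacklist zone, and getting an answer back means the address is listed. A minimal Python sketch of that lookup follows. The zone names passed in are placeholders chosen by the caller, not the eight lists used in the paper, and the `resolve` parameter is a hypothetical hook added here so the logic can be tried without network access.

```python
import ipaddress
import socket
from typing import Callable, Iterable


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets + blacklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str,
              resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if the zone answers with an A record for the address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False  # no such name: not listed (or the lookup failed)


def count_listings(ip: str, zones: Iterable[str],
                   resolve: Callable[[str], str] = socket.gethostbyname) -> int:
    """Number of blacklists in `zones` that list the address."""
    return sum(is_listed(ip, zone, resolve) for zone in zones)
```

Counting listings per relay, as in `count_listings`, is what a statistic such as "listed in two or more blacklists" is computed from.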
Spam filtering
Spammers are able to easily alter the contents of the email
SpamAssassin: a spam filter whose filtering relies mainly on the source IP and other variables that spammers can easily change
Spammers have less flexibility when it comes to altering the network-level details of email
Spam filtering by this paper
- Comparing data with the logs from a large ISP
- Analyzing the network-level behavior using those logs and the sinkhole
- Updating the filter content using those comparisons
Network-level Spam Filtering
• Network-level properties are harder to change than content
• Network-level properties
– IP addresses and IP address ranges
– Change of addresses over time
– Distribution according to operating system, country and AS
– Characteristics of botnets and short-lived route announcements
• Help develop better spam filters
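One of the properties listed above, the distribution of spam across IP address ranges, can be illustrated by grouping relay addresses by their enclosing prefix. This is a minimal Python sketch, not the paper's analysis code; the function name, the /24 default and the sample addresses (reserved documentation ranges) are assumptions made for illustration.

```python
import ipaddress
from collections import Counter


def spam_by_prefix(relay_ips, prefix_len=24):
    """Count received messages per enclosing IPv4 prefix."""
    counts = Counter()
    for ip in relay_ips:
        # strict=False masks the host bits, giving the enclosing network
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


# Hypothetical input: one relay address per received message.
sample = ["192.0.2.10", "192.0.2.77", "198.51.100.5"]
for prefix, n in spam_by_prefix(sample).most_common():
    print(prefix, n)
```

Prefixes that account for a large share of messages are candidates for range-based filtering rather than per-address blacklisting.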
Data collected when the spam is received
• IP address of the mail relay
• Traceroute to that IP address, to help estimate the network location of the mail relay
• Passive “p0f” TCP fingerprint, to determine the OS of the mail relay
• Result of DNS blacklist (DNSBL) lookups for that mail relay at eight different DNSBLs
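The four items above amount to one record per received message. The sketch below shows one way such a record could be represented in Python; the class and field names are hypothetical and do not come from the paper or its tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpamRecord:
    """One received spam message and its network-level measurements."""
    relay_ip: str                                              # IP address of the mail relay
    traceroute_hops: List[str] = field(default_factory=list)   # hops toward the relay
    os_guess: Optional[str] = None                             # p0f result, None if unknown
    dnsbl_hits: Dict[str, bool] = field(default_factory=dict)  # blacklist zone -> listed?

    def listed_count(self) -> int:
        """Number of blacklists that list this relay."""
        return sum(self.dnsbl_hits.values())


# Hypothetical values, for illustration only.
rec = SpamRecord("192.0.2.10", dnsbl_hits={"a.example": True, "b.example": False})
print(rec.listed_count())
```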
Mail Avenger
A few of the environment variables Mail Avenger sets
CLIENT_NETPATH the network route to the client
SENDER the sender address of the message
CLIENT_SYNOS a guess of the client's operating system type
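A script invoked by Mail Avenger could read these variables from its environment. The Python sketch below is illustrative only: the function name and returned keys are hypothetical, and splitting `CLIENT_NETPATH` on whitespace is an assumption about its format that should be checked against the Mail Avenger documentation.

```python
import os


def read_client_info(env=os.environ):
    """Collect the three variables named above; missing ones become None or []."""
    netpath = env.get("CLIENT_NETPATH")
    return {
        "sender": env.get("SENDER"),
        "os_guess": env.get("CLIENT_SYNOS"),
        # Assumption: hops are separated by whitespace.
        "netpath": netpath.split() if netpath else [],
    }


if __name__ == "__main__":
    print(read_client_info())
```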
Distribution across ASes
About 40% of spam still comes from the U.S.
p0f fingerprinting
Passive Fingerprinting is a method to learn more about the enemy, without them knowing it
Specifically, you can determine the operating system and other characteristics of the remote host
TTL – what initial TTL the operating system uses
Window Size – what window size the operating system uses
DF – whether the operating system sets the Don't Fragment bit
TOS – whether the operating system specifies a type of service
OS guess from TTL values
OPERATING SYSTEM   VERSION                 TTL VALUE
Linux              Red Hat 9               64
FreeBSD            5.0                     64
Solaris            2.5.1, 2.6, 2.7, 2.8    255
Windows            98                      32
Windows            XP                      128
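Because each router hop decrements the TTL by one, the value seen at the receiver is at most the sender's initial TTL. A rough guess therefore rounds the observed TTL up to the nearest initial value in the table. The Python sketch below does only that; it is a simplification and not how p0f works internally, since p0f also uses window size, DF and other fields. The function name and return shape are hypothetical.

```python
# Initial TTL values taken from the table above.
INITIAL_TTL_TO_OS = {
    32: ["Windows 98"],
    64: ["Linux (Red Hat 9)", "FreeBSD 5.0"],
    128: ["Windows XP"],
    255: ["Solaris 2.5.1-2.8"],
}


def guess_os_from_ttl(observed_ttl: int):
    """Return (assumed initial TTL, estimated hop count, candidate systems)."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be in 1..255")
    # Smallest known initial TTL that is not below the observed value.
    initial = min(t for t in INITIAL_TTL_TO_OS if t >= observed_ttl)
    hops = initial - observed_ttl
    return initial, hops, INITIAL_TTL_TO_OS[initial]


print(guess_os_from_ttl(116))
```

TTL alone is ambiguous: Linux and FreeBSD share the value 64, and a distant host with a high initial TTL can be mistaken for a nearby host with a lower one, which is why the other fields matter.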
Distribution Among Operating Systems
About 4% of known hosts are non-Windows.
These hosts are responsible for about 8% of received spam.
Spam Distribution
[Figure: distribution of spam across IP space]
Advantages
• A key to better and more efficient filtering
• Reporting of information about spam helps in updating the blacklist
Weaknesses
• They cannot distinguish between spam sent using different techniques
• They did not measure the Bobax botnet precisely
THANK YOU