Deutsche Telekom Laboratories
An-Institut der Technischen Universität Berlin

LENS: LEveraging anti-social Network against Spam

Sufian Hameed, Pan Hui
Technical Report No 7, March 2010


Table of contents

1 Introduction

2 LENS design
2.1 Design Goals
2.2 Community Formation
2.3 GK Selection
2.3.1 Stage 1: GK selection in adjacent communities
2.3.2 Stage 2: GK selection beyond adjacent communities
2.3.3 Stage 3: Spontaneous GK selection to accommodate network dynamics

3 E-mail processing and LENS
3.1 Mail Server’s (MSA) Responsibilities
3.2 Mutual Authentication Protocol for GK selection
3.3 Forgery of from addresses
3.4 Signature from GKs
3.5 LENS incorporated and Email processing
3.6 Prevention of Spam Transmission
3.7 Concerns related to compromised user

4 Experiments and results
4.1 Social network data
4.2 Experiments on Facebook samples
4.3 Experiments on Flickr samples

5 Related work

6 Conclusion

7 List of abbreviations

8 Table of figures

9 Index of tables

10 References


Abstract

Spam is still an open problem from the network operator’s perspective. The common state-of-the-art strategy is to place spam filters at the recipient’s edge. Although this strategy largely solves the spam problem from the user’s perspective (false positives and negatives may still occur), it cannot prevent spam from traversing the Internet. Consequently, spam continues to consume large amounts of Internet bandwidth (around 200 billion spam messages are sent per day) and causes non-negligible financial loss to network operators. It is therefore imperative to mitigate spam much earlier than at the recipient’s edge. This goal has recently been accomplished, but only partially, by placing filters at the edge of a social circle within a social network.

In this paper we introduce LENS, a novel spam protection system based on the anti-social networking paradigm, which mitigates spam beyond social circles. The key idea of this paradigm in LENS is to let users select trusted users, called Gatekeepers (GKs), from outside their social circle and within predefined social distances. Unless a GK vouches for the emails of potential senders from outside the social circle of a particular recipient, those emails are prevented from transmission. In this way LENS drastically reduces the Internet bandwidth consumed by spam to that of control messages only. To evaluate the scalability of LENS we use publicly available online social network datasets and demonstrate that reliable email delivery from millions of potential users is possible using GKs on the order of hundreds.

1 Introduction

Collaboration (scientific, business or merely social) without email is almost impossible today. However, the explosive growth of unsolicited email (spam) in the past decade [15] has made it impossible for email communication to function without spam protection or filtering. Spam emails now largely outnumber legitimate ones, having increased from 65% [4] in 2005 to 81% (200 billion spam messages daily) in 2009 [24]. Although researchers and practitioners have developed and deployed a broad variety of systems intended to prevent spam [6, 32, 2, 3, 5, 25, 21, 17, 16], spam remains a pressing, large-scale problem. Existing approaches to combating spam fall roughly into four broad categories: a) content-based filtering, b) sender authentication, c) header-based approaches and d) social network and trust-based approaches.

Each of the above approaches has certain disadvantages, which are discussed in detail in Section 5. In short, the spam protection systems used today only filter spam from the user’s inbox (i.e. at the recipient’s edge), but by then the spam has already traveled across the network and imposes non-negligible cost on network operators in terms of bandwidth and infrastructure. Moreover, content-based filtering [2, 12], one of the most widely adopted defense mechanisms, has turned the spam problem into one of false positives and false negatives, which makes email delivery unreliable. A false negative occurs when spam is classified as legitimate and placed into the inbox. A false positive can cause a very serious problem, since an important and legitimate email may be misclassified and not be received on time [18].

In recent years several techniques [26, 19, 7] have been presented that use social networks to fight spam (from now on, social network, circle and community are used interchangeably). Unfortunately, their protection is limited to the social network of an email user. In this paper, we propose LENS, a new spam protection system that leverages the anti-social networking paradigm, based on an underlying trust infrastructure, both to extend spam protection beyond a user’s social circle and to fundamentally prevent the transmission of spam across the network in the first place. For any particular user, LENS handles emails in two different ways depending on their origin (i.e. within or outside the user’s community).

Emails within social network: All emails generated within the user’s community are authorized and trusted to be legitimate messages, and pass directly into the inbox. In LENS, the social components of a community consist of two levels, i.e. the friends of the user and its friends-of-friends (FoF). LENS allows a user A to add a user B as a friend. A friend addition roughly corresponds to the notion that “User A trusts user B not to send him spam and vice versa”. We also observe from [7] that adding FoF to a social community is useful for spam fighting when a sender and a recipient are not already friends but share a common friend. Suppose A and B are friends and B and C are friends; on this basis A may conclude that C is unlikely to be a spammer as well.

Emails beyond social network: In order to extend spam-free email beyond a recipient’s social network, we introduce a process called anti-social networking for selecting Gatekeepers (GKs) in LENS. Subtracting a user’s social network from the overall social graph reveals its anti-social network. In particular, the term anti-social networking is used to describe the process in LENS by which a user selects trusted GKs outside its social network. GKs are selected to be socially separated from the user by predefined or user-specified social distances. Once selected, the GKs are used to vouch for all incoming emails from outside a recipient’s social network. The selection process of GKs assures that they are not malicious users and that legitimate unknown senders can reach a recipient with the help of its GKs. The optimization problem in GK selection is to select a minimum number of GKs for maximum outside reachability. In this paper, we provide a scalable approximate solution for GK selection and integrate it into the current email system with popular email clients and SMTP servers.

We further demonstrate the scalability of LENS in terms of the number of email users using two popular online social network datasets: Facebook and Flickr. We show that with the help of hundreds of GKs, a recipient can potentially be reached by millions of users. The solution can be scalably extended to users at larger social distances by iterative GK selection.

The rest of the paper is organized as follows. Section 2 describes the design of LENS. In Section 3 we discuss how LENS is realized and incorporated into the existing email processing system. In Section 4 we demonstrate the scalability of LENS using online social network datasets. Section 5 describes the existing anti-spam approaches, and finally we conclude the paper in Section 6.

2 LENS design

LENS consists of two parts. The first part is the formation of a user’s social network, also termed community formation. All emails generated within the community are authorized and trusted as legitimate messages and are passed directly to the inbox. In order to receive legitimate messages from outside the community, a user performs anti-social networking to select Gatekeepers (GKs), which constitutes the second part of LENS. In this section we present only the conceptual design of LENS; the details of its realization in the email processing architecture are discussed in Section 3. We start this section by discussing the basic design goals of LENS, followed by the procedures of community formation and GK selection.

2.1 Design Goals

In order to be robust, secure and efficient, LENS follows several design principles:

1. Simple and efficient design: The simpler LENS is in terms of usability and design, the more efficiently the system can be used and adopted. We are only fighting spam, hence we do not require a very complex and robust cryptographic solution with huge infrastructure cost.

2. Decentralized Solution: It is difficult to scale a centralized solution. Every user works individually to form its own community and select its GKs.

3. Knowledge of the network: A user cannot obtain full information about the global properties of the whole social network, such as the network diameter, central nodes and node degree distribution.

4. Sender Address non-forgeability: Basic SMTP does not provide any from address authentication. White listing is therefore vulnerable to spam attacks using arbitrary from addresses. LENS utilizes standard sender authentication techniques to robustly verify that the from address in a received email is not forged.

5. Privacy of Community list information: The privacy of each user’s community list must always be protected against any external (outside of community) threats, and the lists should not be exchanged freely at the time of community formation. An individual node is not allowed to possess unnecessarily much information about the network, since this may create privacy and security holes.

6. Incrementally deployable: LENS should integrate easily into current SMTP servers. Inevitably, when deployed, some users will adopt LENS before others. The deployment of LENS should not worsen the spam problem for those who have not adopted it. Until every user is familiar with LENS, it is better to run it alongside the existing spam filter. Users who have fully adopted LENS will get its full benefits; others will benefit to the extent of their adoption (see Figure 1).

Figure 1 Complementary LENS, Sender Authentication and Existing Spam Filters.

2.2 Community Formation

Many of our important decisions in life are based on the information provided by our network of friends. The reliability of our decisions hence depends on the trustworthiness of our social network. In this way we develop a number of interaction strategies, leading to trustworthy social networks.

A number of studies of social networks have presented measures of the closeness of a community. They have shown that these measures can be used to distinguish empirically observed social networks from non-social networks [13, 22, 23]. The most distinctive property of social networks is their tendency to form clusters. For example, if A knows B and C, i.e. B and C are connected through A, then the likelihood of B knowing C is considerably higher than in a random network with a similar degree distribution. Based on this fact, we consider the social component of level two (i.e. friends and FoF) as the user community in the LENS design.

Figure 2 Community Structure (friends, FoF) of user 1.

Figure 2 depicts the community structure for user 1. User 1 can receive all the messages from its community directly into its inbox. The formation of a social community (which also serves as a white list, or CommList) is a simple two-step process. Figure 3 depicts both steps of the community formation process.

1. Adding Friends: The first step starts with a friend request. Anyone can send a friend request to anyone else. Adding a friend is a very basic yet extremely crucial step for LENS. It is assumed that only two nodes that trust each other will enter a friend relationship (as in MSN, Skype or Facebook). System security and defense against attacks from malicious users depend on the fact that a friend relationship always forms between two legitimate users with a proven record of social interaction.

2. Adding FoF: The idea of FoF addition is that friend lists are not exchanged among friends. Instead, any user can suggest that its (mutually exclusive) friends add each other to their communities as FoF. For instance (Figure 3), user 2 has two mutually exclusive friends, so user 2 suggests that users 3 and 1 join a FoF relationship. If both 1 and 3 accept the suggestion, they add each other to their communities.

Figure 3 Community Formation of user 1.

At the end of step 2, every node has a community structure consisting of friends and FoFs. Each community consists of only two levels of social components, which are considerably close. During the entire process of community formation only local information about direct neighbors is used, and the process is carried out in a decentralized manner at the level of each individual user. Furthermore, to protect the privacy of each user, friend lists are never exchanged among users without consent. By design, community formation is a selective process that requires some human involvement, in order to prevent unnecessary additions to communities and to preserve a high level of privacy.
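The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the report's implementation: the class name `Communities`, its method names and the `accept` callback (standing in for the human decision to accept a FoF suggestion) are hypothetical, and "mutually exclusive friends" is read here as friends of the introducer that are not already friends with each other.

```python
from collections import defaultdict


class Communities:
    """Toy model of community formation: friends plus accepted FoF."""

    def __init__(self):
        self.friends = defaultdict(set)  # user -> direct friends
        self.fof = defaultdict(set)      # user -> accepted friends-of-friends

    def add_friend(self, a, b):
        # Step 1: friendship is mutual, so record it in both directions.
        self.friends[a].add(b)
        self.friends[b].add(a)
        # A former FoF that becomes a friend is no longer listed as FoF.
        self.fof[a].discard(b)
        self.fof[b].discard(a)

    def suggest_fof(self, introducer, accept=lambda user, other: True):
        # Step 2: the introducer pairs up friends that are not yet friends
        # with each other. No friend list is handed over; each pair only
        # learns about the single suggested user.
        mine = sorted(self.friends[introducer])
        for i, x in enumerate(mine):
            for y in mine[i + 1:]:
                if y in self.friends[x]:
                    continue
                # Assumption: both sides must accept the suggestion.
                if accept(x, y) and accept(y, x):
                    self.fof[x].add(y)
                    self.fof[y].add(x)

    def community(self, user):
        # The CommList of a user: its friends and its FoF.
        return self.friends[user] | self.fof[user]


# The situation of Figure 3: user 2 is a friend of both 1 and 3.
c = Communities()
c.add_friend(1, 2)
c.add_friend(2, 3)
c.suggest_fof(2)
print(sorted(c.community(1)))  # [2, 3]
```

Only the introducer's own friend set is consulted in `suggest_fof`, which mirrors the requirement that community formation uses local information of direct neighbors only.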

2.3 GK Selection

In order to counter spam and receive legitimate messages from beyond a user’s social circle, LENS performs anti-social networking to select Gatekeepers (GKs). The term anti-social networking is used to describe the process of a user (recipient) selecting GKs outside its social network. GKs are selected to be socially separated from the user on the social graph. They are legitimate and authenticated email users lying outside the social community of the recipient. The role of a GK is to vouch for legitimate users outside the recipient’s community so that they can communicate with the recipient. An email from outside the recipient’s community can reach its inbox only if one of its authenticated GKs vouches for it. To maintain a reliable trust structure, a GK is only authorized to vouch for the nodes in its own community. Since a recipient node can only be reached from outside if one of its GKs has vouched for the communication, all we need to do is find enough GKs to make the recipient node highly reachable from outside. In order to keep LENS effective and scalable, the goal of the GK selection process is to select a minimum number of GKs for maximum coverage. In the email system, a GK-enabled user does not have to vouch manually for any other user; instead the mail server hosting it realizes the concept of the GK and the trust it inherits, and performs its duties transparently (Section 3).

Let us consider all the email users as a connected network and visualize it as a graph G = (V, E), with email users as vertices (V) and their relationships (i.e. friends) as edges (E). For every recipient node in G, we need to find a subset S of V (i.e. the GK nodes) such that nearly every vertex outside the social community of the recipient node lies within the community of at least one member of S, and we want the size of S to be as small as possible.
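To make the coverage objective concrete, the following sketch measures how well a given set S covers the users outside a recipient's community. It is illustrative only: the function names are hypothetical, the graph is a plain adjacency dictionary, a community is approximated as everything within two hops (i.e. assuming every FoF suggestion was accepted), and a GK is counted as covering itself.

```python
def community(adj, user):
    """Friends and friends-of-friends of `user` (two-hop neighborhood)."""
    friends = set(adj[user])
    fof = set()
    for f in friends:
        fof |= set(adj[f])
    return (friends | fof) - {user}


def coverage(adj, recipient, gatekeepers):
    """Fraction of users outside the recipient's community that lie in the
    community of at least one gatekeeper in S."""
    inside = community(adj, recipient) | {recipient}
    outside = set(adj) - inside
    if not outside:
        return 1.0
    covered = set()
    for gk in gatekeepers:
        covered |= community(adj, gk) | {gk}
    return len(covered & outside) / len(outside)


# A chain of seven users 0-1-2-3-4-5-6 with user 0 as the recipient.
n = 7
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
print(coverage(adj, 0, [3]))  # 0.75
```

In this chain, user 3 sits just outside the recipient's community and covers users 3, 4 and 5, but not user 6. Minimizing |S| subject to a coverage target is the hard part of the problem.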

The total number of email users today exceeds 1.4 billion [24]. Finding the smallest subset S with maximum coverage therefore raises a scalability question. The problem is similar to the minimum dominating set problem [14], a classical NP-complete problem in computational complexity theory; the only difference is that a GK is connected to its community instead of only its direct neighbors. The short discussion above raises the following questions:

1. Is it a good practice to figure out the optimal (smallest size but biggest coverage) S of GK for the entire network at any single instance?

2. What is the chance that a particular email user has to be reached by all the other users of the entire email network (for instance, a blacksmith in Africa trying to reach an Eskimo in the Arctic region)?

The answer to the first question is no, and the chance raised in the second is low. In the remaining part of this section we present our solution, which takes into account all the concerns raised regarding the use of the anti-social network to solve the spam problem. Instead of working on a global provisioning of an optimal GK subset of V for the entire email network, we discuss and present a scalable approximate solution.

One of the design goals of LENS is that a user cannot obtain information about the global properties of the social network. The best approach here is to restrict a user to its personal community information only. The GK selection procedure of LENS consists of the following three stages:

2.3.1 Stage 1: GK selection in adjacent communities

For any given recipient node, the GK selection process starts from the adjacent communities right outside the recipient node’s social circle. The process is described below:

1. Request: A recipient node uses its FoFs (also known as boundary nodes, as they are at the edge of the community) to help it find the locally optimal GKs outside its community (Figure 4). The recipient node simply requests all boundary nodes of its community to send their suggestions for good GKs.


2. Suggestion: Each boundary node suggests to the recipient, as a GK, the one of its friends with the largest community (outside the recipient’s community). It also informs the suggested user about the recipient. Figure 4 depicts the selection of GKs by two boundary nodes of the recipient. Once the GKs pass the authentication step (next step), they are able to vouch for all the users in their communities for communicating with the recipient. With the selected adjacent nodes added as GKs, the reachability of the recipient extends to social level 5: a GK sits at level 3, and its community spans two further levels. According to the small-world property of social networks, any two users can be connected by a small number of hops (six degrees of separation [28]). This suggests that if the email network exhibits social network behavior, the recipient node would be very highly reachable throughout.

Figure 4 Community structure of recipient node and GK coverage.

3. Authentication: Once a boundary node suggests a GK to the recipient, the recipient starts a handshake with the GK for mutual authentication, i.e. to verify that both are legitimate users, and to establish a secret key SK. Once the GK node has an SK, it uses it to issue signatures to all its community members, who use these signatures whenever they need to communicate with the recipient node (see Figure 5). All users within a social radius (level or hops) of 5 are then able to send emails to the recipient with an assurance of being free from spam. Distant users at a social distance greater than 5 are covered in stage 2 of the GK selection process.
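The request and suggestion steps can be sketched as follows. This is an illustration under assumptions rather than the report's algorithm verbatim: function names are hypothetical, a community is approximated as the two-hop neighborhood, "largest community (outside the recipient's community)" is read as the number of community members not already in the recipient's community, and the authentication step is omitted.

```python
def community(adj, user):
    """Friends and friends-of-friends of `user` (two-hop neighborhood)."""
    friends = set(adj[user])
    fof = set()
    for f in friends:
        fof |= set(adj[f])
    return (friends | fof) - {user}


def stage1_suggestions(adj, recipient):
    """Map each boundary node (FoF) of the recipient to the GK it suggests."""
    inside = community(adj, recipient) | {recipient}
    boundary = inside - set(adj[recipient]) - {recipient}
    suggestions = {}
    for b in sorted(boundary):
        # Only friends of the boundary node that lie outside the
        # recipient's community are eligible.
        candidates = [f for f in adj[b] if f not in inside]
        if not candidates:
            continue
        # Local decision: pick the friend whose community adds the most
        # users the recipient cannot reach yet.
        suggestions[b] = max(
            candidates, key=lambda c: len(community(adj, c) - inside)
        )
    return suggestions


# Recipient 0 has friend 1 and boundary node 2; node 2 knows 3 and 4.
adj = {
    0: [1], 1: [0, 2], 2: [1, 3, 4], 3: [2, 5],
    4: [2, 6, 7], 5: [3], 6: [4, 8], 7: [4], 8: [6],
}
print(stage1_suggestions(adj, 0))  # {2: 4}
```

Here node 4 is preferred over node 3 because its community adds four new users (3, 6, 7, 8) while node 3's adds only two (4, 5). Each boundary node decides using its own friends only, so no global view of the graph is needed.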

2.3.2 Stage 2: GK selection beyond adjacent communities

In order to provide reachability to more distant users in the email network, the GK selection procedure of LENS can easily be extended to select GKs in distant (beyond adjacent) communities. The process is very simple (see Figure 6). Once a recipient finalizes the selection of GKs in the adjacent communities (stage 1), it sends a request to the selected GKs to help it look for further GKs in their adjacent communities. As a result of this request, the GKs use their boundary nodes to find new locally optimal GKs and send their suggestions back to the recipient.

Finally, the recipient authenticates the new set of GKs from social level 6 and extends its reachability to level 8. Using the same procedure, GK selection can be extended to any further level. Of course, these extensions do not come for free; how far to go is solely a design choice of the LENS users, who can specify how far they want to extend GK selection.
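The iterative extension can be sketched as a loop in which the GKs chosen in one stage ask their own boundary nodes for the next set. This is a minimal sketch under assumptions: the names are hypothetical, a community is approximated as the two-hop neighborhood, authentication is skipped, and suggestions are restricted to users the recipient cannot reach yet.

```python
def community(adj, user):
    """Friends and friends-of-friends of `user` (two-hop neighborhood)."""
    friends = set(adj[user])
    fof = set()
    for f in friends:
        fof |= set(adj[f])
    return (friends | fof) - {user}


def suggest_from(adj, node, covered):
    """GKs suggested by the boundary (FoF) nodes of `node`."""
    boundary = community(adj, node) - set(adj[node])
    picks = set()
    for b in sorted(boundary):
        candidates = [f for f in adj[b] if f not in covered]
        if candidates:
            picks.add(max(candidates,
                          key=lambda c: len(community(adj, c) - covered)))
    return picks


def select_gatekeepers(adj, recipient, stages=2):
    """Run `stages` rounds; return the GK set and the users now reachable."""
    covered = community(adj, recipient) | {recipient}
    gatekeepers, frontier = set(), {recipient}
    for _ in range(stages):
        new = set()
        for node in sorted(frontier):
            new |= suggest_from(adj, node, covered)
        new -= gatekeepers
        if not new:
            break
        for gk in new:
            covered |= community(adj, gk) | {gk}
        gatekeepers |= new
        frontier = new  # the next stage starts from the newest GKs
    return gatekeepers, covered


# A chain 0-1-...-9, so a user's index equals its social level from user 0.
n = 10
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
gks, reach = select_gatekeepers(adj, 0, stages=2)
print(sorted(gks), max(reach))  # [3, 6] 8
```

On the chain, stage 1 picks the user at level 3 and reaches level 5; stage 2 picks the user at level 6 and reaches level 8, matching the levels quoted in the text.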

Figure 5 GK Authentication and certification.

Re: [26] correctly received 85% of email using just its social network levels (friends and FoF). With anti-social networking using GKs, LENS extends reliable and spam-free delivery of emails beyond the social network. The success of LENS depends on the successful formation of social communities and the continuous extension of the anti-social network between email users. Users with a larger social community benefit more from LENS than isolated and less socially connected users.

Figure 6 GK Authentication and certification.

2.3.3 Stage 3: Spontaneous GK selection to accommodate network dynamics

Up to this point, LENS covers most aspects of communication in protecting against spam. There will rarely be any communication outside the social levels covered in stages 1 and 2, but there is always a possibility. For example, new email users may join the network, or a really distant user may want to start a new collaboration. LENS provides spam-free email communication to distant and new users as follows. Instead of extending GK selection to the entire network, LENS restricts it to the social levels covered in stages 1 and 2. If a user wants to send an email to a recipient who is outside its community, and there is no GK for the recipient within the user’s community, the user performs the following two steps:

1. Announcement: A user will announce itself to the recipient that it wants to communicate and will start the authentication process.

2. Authentication: The user will start the mutual authentication process to prove that it is a legitimate user and not a spammer. As a result of this process, the user will establish a secret key with the recipient and the recipient will add the user as its GK. The user will further use the secret key to issue signatures to its entire community nodes and they will be able to use these signatures as well to communicate with the recipient node.

This process is performed only once, at the start. After the user is authenticated as a GK of the recipient, not only the user but its entire community can send email to the recipient. In this way, instead of having GKs for the entire network, LENS selects GKs on the fly after stage 2 whenever a communication need arises.
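The announce-then-authenticate flow of stage 3 can be sketched as below. Everything here is a simplified stand-in: the function names, the `authenticate` callback (a placeholder for the mutual authentication protocol of Section 3.2) and the use of a random 32-byte value as the secret key are assumptions made for illustration.

```python
import secrets


def find_voucher(sender, recipient_community, gatekeepers, community_of):
    """Return 'community', the vouching GK, or None if no path exists yet."""
    if sender in recipient_community:
        return "community"
    for gk in gatekeepers:
        if sender == gk or sender in community_of(gk):
            return gk
    return None


def spontaneous_gk(sender, gatekeepers, authenticate):
    """Stage 3: the sender announces itself and, once authenticated,
    is added by the recipient as a new GK with a fresh shared key."""
    if not authenticate(sender):
        return False
    gatekeepers[sender] = secrets.token_bytes(32)  # placeholder for SK
    return True


# Hypothetical users: dave is distant from the recipient; erin is in
# dave's community.
communities = {"dave": {"erin", "frank"}, "gk1": {"bob"}}
community_of = lambda user: communities.get(user, set())
recipient_community = {"carol"}
gatekeepers = {"gk1": secrets.token_bytes(32)}

print(find_voucher("erin", recipient_community, gatekeepers, community_of))
spontaneous_gk("dave", gatekeepers, authenticate=lambda user: True)
print(find_voucher("erin", recipient_community, gatekeepers, community_of))
```

The first lookup finds no voucher for erin; after dave has been authenticated once, the second lookup returns dave, so his whole community can reach the recipient without repeating the handshake.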

We do not use the classical dominating set or distributed dominating set approximation to select the GKs, for two main reasons. First, we do not want a common set of nodes to serve as GKs for the whole population, because these common GKs would hold too much information about everyone in the network and would become privacy and security weak points. Second, considering communication patterns, we do not expect everyone on the planet to communicate randomly with everyone else (for example, a blacksmith in Africa and an Eskimo in Alaska). The probability of communication actually decreases as social distance increases. Hence, we do not really need a dominating set for the whole population.

3 E-mail processing and LENS

In this section we discuss the architectural details, which include the realization of LENS within the current email architecture and the design details of the shared keys and the authentication protocol. Figure 7 depicts the overall flow of an email, i.e. from message creation through transport to delivery. The mail user agent (MUA), the sender’s email client, submits the email to its mail server (MSA) using SMTP. The sender’s MSA looks up the destination’s mail exchanger record (MX) in the DNS server. The DNS server finds the highest-preference mail server for the recipient and reports the name of the mail server by returning an MX resource record. After this point, a TCP connection is established between the sender’s and the receiver’s MSAs, and the sender’s MSA sends the MAIL FROM command to the receiver. After a successful acknowledgement from the receiver side, the complete email (header and body) is sent, and the TCP connection is released. The mail delivery agent (MDA) delivers the accepted email to a server for local mail delivery. Once delivered to the local mail server, the email is stored for batch retrieval by authenticated mail clients (MUAs) using IMAP or POP.
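As a small illustration of the MX lookup step, the sketch below orders a set of already-resolved MX records the way a sending server would try them. It performs no DNS query; the function name and the hostnames are hypothetical. In DNS, the "highest preference" server is the one with the lowest preference number.

```python
def order_mail_exchangers(mx_records):
    """Sort (preference, hostname) pairs so the preferred host comes first.

    A lower preference number means a more preferred mail exchanger.
    Simplification: real senders randomize among equal preferences,
    whereas ties here are broken alphabetically by hostname.
    """
    return [host for _, host in sorted(mx_records)]


records = [(20, "mx2.example.org"), (10, "mx1.example.org")]
print(order_mail_exchangers(records))  # ['mx1.example.org', 'mx2.example.org']
```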

3.1 Mail Server’s (MSA) Responsibilities

It is the mail server (MSA) that executes the LENS protocol on behalf of the email users. Each MSA may serve hundreds or thousands of email users, depending on the size of the organization. Each email user is only responsible for making decisions regarding its own community, i.e. adding friends and FoF. The community information of each user is stored in a CommList, which can only be accessed by that particular user or its MSA. All the remaining functionalities of LENS are handled by the MSA transparently to the email users. These include:

Figure 7 Email processing model

Executing the GK selection protocol, including the mutual authentication of the recipients and GKs and signature issuance by the GKs to the users of their communities.

Maintenance of the CommList (list of the UserIDs of all users in the community), the SKList (list of GKIDs, RecipientIDs and the shared secret keys SK established between them) and the SignList (list of signatures issued by the GK to its community users for communicating with the recipient, i.e. a list of Sign[(UserID)SK, GKID]).

Email filtering based on CommList and SignList.

The functionalities mentioned above can be integrated with Sendmail (MTA) or the Mail Avenger SMTP server, and large email providers can also implement them on their own email servers. All legitimate MSAs must have a valid certificate issued by a Trusted Authority. These certificates are later used to sign messages, proving that they originate from a legitimate MSA. Any MSA with an invalid certificate is assumed to be malicious, and all communication requests associated with such certificates are ignored. Furthermore, it is also safe to assume that it is very hard, if not impossible, for bots or malicious users to reside in a valid and legitimate MSA, because the addition of email users is very strictly moderated in companies, private institutes, universities, etc. However, anyone can create a large number of accounts on webmail providers such as Gmail, Yahoo, Hotmail and GMX. Our previous assumption, that all users within any certified MSA are considered legitimate, might therefore raise the question of human-based spamming, since a human spammer is able to create dummy accounts on webmail providers without any financial cost. In reality this is not the case, for the following reasons:

Creating and running spamming accounts on Yahoo, Hotmail, Gmail, etc. requires human effort, which incurs a cost that runs against the spamming model.

Almost all webmail and Internet service providers impose an email sending limit. Exceeding the limit results in the email account being blocked for a certain amount of time [1]. Table 1 lists the email sending limits of some of the major Internet service and webmail providers. Apart from imposing limits on sending emails, webmail providers also block an email account for a certain time if its emails contain a large number of non-existent or broken addresses that bounce back on failed delivery.

Nevertheless, studying the impact of human spamming is beyond the scope of this paper and will be part of our future work.

Webmail / Internet service provider --- Email sending limit
Gmail --- 500 recipients per message via the web interface and 100 using email client software
Hotmail --- 100 recipients per day
Yahoo --- 100 emails or recipients per hour
Lycos --- max 25 recipients per message and max 250 emails per day
AOL --- 100 recipients per message or 500 recipients per connection
Verizon --- 100 recipients per email and 500 emails (recipients) per hour
Comcast --- 1,000 recipients per day
EarthLink --- 1,000 recipients per day
Cablevision/Optimum (OOL) --- 50 recipients at one time
Road Runner --- 1,000 recipients per day per IP
AT&T Yahoo --- 100 recipients per email message
Charter --- 50 recipients / emails per hour

Table 1: Email sending limit by major Webmail and Internet service providers.

3.2 Mutual Authentication Protocol for GK selection

The mutual authentication protocol is one of the most significant parts of the GK selection process. It ensures that both the GK and the recipient are legitimate users and helps to establish a shared secret key SK between them. GKs use these SKs to issue signatures to their community members as a vouching mechanism for sending emails to the recipient. As mentioned earlier, the MSAs of the GK and the recipient carry out this protocol transparently to the email users, just as an MSA looks up the destination mail exchange record (MX) in the DNS while an email is being sent.

On successful completion of this protocol, the email users have established SKs with all their legitimate GKs, and the members of the GKs’ communities are able to send emails to the recipient using their GK’s signature.

We propose two variations of the protocol, depending on where the recipient and its GK are hosted. We would like to emphasize that the main focus of LENS is spam protection, so verifying the legitimacy of a user is enough to counter spam; there is no need to run costly protocols to authenticate the identity of each user.

Shared Secret Key:

If the recipient and its GK are hosted by the same MSA, LENS only needs to establish a shared secret key SK between the recipient and its GK. This is based on the fact that an MSA certified by a Trusted Authority is assumed to host and serve only non-malicious users, with all malicious activity taken care of.

We use the classical Diffie-Hellman key exchange protocol (D-H) [29], based on its ability to establish a shared secret key between two parties that have no prior knowledge of each other. In this setting all the communication stays within the MSA, i.e. it does not cross an insecure communication channel, so D-H’s vulnerability to man-in-the-middle attacks is not a concern. Here is a general description of the protocol (see [29] for details). The D-H protocol uses the multiplicative group of integers modulo q, where q is a prime and p is a primitive root mod q.

1. The recipient and its GK agree on a finite cyclic group G and a generating element p in G.
2. GK picks a random natural number x and sends p^x to the recipient.
3. The recipient picks a random natural number y and sends p^y to GK.
4. GK computes (p^y)^x.
5. The recipient computes (p^x)^y.

Figure 8 Diffie-Hellman key exchange protocol

Finally, both GK and the recipient are in possession of p^(xy), which can be used as SK. The security of D-H is based on the discrete logarithm problem. In LENS, the GK always initiates the protocol.
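The exchange in Figure 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group parameters q = 23 and p = 5 are toy values chosen because 5 is a primitive root mod 23, and the helper names `dh_keypair` and `dh_shared` are hypothetical, not part of LENS.

```python
import secrets

# Toy parameters for illustration only (assumption): q = 23 is prime and
# p = 5 is a primitive root mod 23. Real use needs a large, vetted group.
Q = 23
P = 5

def dh_keypair():
    """Pick a random exponent in [1, Q-2] and return (exponent, p^exponent mod q)."""
    secret = secrets.randbelow(Q - 2) + 1
    return secret, pow(P, secret, Q)

def dh_shared(secret, peer_exponential):
    """Raise the peer's exponential to our own secret exponent."""
    return pow(peer_exponential, secret, Q)

x, px = dh_keypair()              # GK picks x and sends p^x
y, py = dh_keypair()              # recipient picks y and sends p^y
sk_gk = dh_shared(x, py)          # GK computes (p^y)^x
sk_recipient = dh_shared(y, px)   # recipient computes (p^x)^y
print(sk_gk == sk_recipient)
```

Both sides end up with the same value p^(xy) mod q without ever sending x or y.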

Shared Secret Key and Authentication:

If the recipient and its GK are hosted by different MSAs, LENS runs an authenticated shared key establishment protocol to establish SK. In 1992 Whitfield Diffie, Paul C. van Oorschot and Michael J. Wiener presented the Station-to-Station (STS) protocol [11]. STS is based on the classical D-H protocol and provides entity authentication along with SK.

In LENS, we modify the STS protocol. Since the recipient and GK belong to different MSAs, we perform authentication at the server level to verify that both the recipient’s and the GK’s servers are legitimate and untampered. SK establishment is performed at the user level, the users being the recipient and the GK. The benefit of server-based authentication is that we do not need to authenticate at the user level using public key certificates, so a lot of complexity is avoided by not requiring a system-wide PKI. Authentication of the servers is enough to assume that the users hosted on them are also valid and legitimate, since in our application scenario the legitimacy of users is enough to fight spam and the identity of every user does not need to be authenticated. Following is a brief description of our modified protocol (see [11] for the original STS protocol in detail). The protocol uses the multiplicative group of integers modulo q, where q is a prime and p is a primitive root mod q.

1. The recipient and its GK agree on a finite cyclic group G and a generating element p in G.
2. GK picks a random natural number x and sends p^x to the recipient.
3. The recipient generates a random natural number y and computes p^y. It further computes the shared secret key SK = (p^x)^y mod q. It then concatenates the exponentials (p^y; p^x) (order is important), signs them using the private signature key of its MSA, and encrypts the result with SK (since the protocol is executed by the MSAs themselves, access to the private signature key is available). Finally, it sends the ciphertext, along with its own exponential p^y and its MSA’s certificate issued by the Trusted Authority, to GK.
4. GK computes the shared secret key SK = (p^y)^x mod q, then decrypts and verifies the recipient’s signature.
5. GK concatenates the exponentials (p^x; p^y) (order is important), signs them using its MSA’s private signature key, and encrypts the result with SK. Finally, it sends the ciphertext, concatenated with its MSA’s certificate issued by the Trusted Authority, to the recipient.
6. The recipient decrypts and verifies GK’s signature.

Figure 9 Diffie Modified STS protocol

The MSAs of the GK and the recipient are now mutually authenticated, and the GK and the recipient are trusted to be legitimate and share a secret key SK. Once SK is established, each MSA makes an entry in the SKList of its respective user (GK or recipient). Each SKList entry consists of the GKID, the RecipientID and the shared SK established between them.
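The message flow of the modified STS protocol can be illustrated with a toy Python sketch. Everything cryptographic here is a stand-in and an assumption of this example, not the scheme used by LENS: the D-H group is the toy pair q = 23, p = 5; MSA signature keys are textbook RSA built from known Mersenne primes; the certificate check is modelled as membership of the public key in a `trusted` set; and encryption with SK is an XOR keystream. All function names are hypothetical.

```python
import hashlib
import secrets

Q, P = 23, 5  # toy D-H group (assumption): 5 is a primitive root mod the prime 23

def make_toy_rsa(prime_a, prime_b, e=65537):
    """Textbook RSA (public, private) pair standing in for an MSA's certified key."""
    n = prime_a * prime_b
    d = pow(e, -1, (prime_a - 1) * (prime_b - 1))  # needs Python 3.8+
    return (n, e), (n, d)

def _digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(private, data):
    n, d = private
    return pow(_digest(data, n), d, n)

def verify(public, data, signature):
    n, e = public
    return pow(signature, e, n) == _digest(data, n)

def sk_cipher(sk, payload):
    """Toy cipher: XOR with a SHA-256 keystream derived from SK (self-inverse)."""
    stream, counter = b"", 0
    while len(stream) < len(payload):
        stream += hashlib.sha256(f"{sk}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(payload, stream))

def sealed_signature(private, sk, first, second):
    """Sign the ordered exponentials, then encrypt the signature with SK."""
    sig = sign(private, f"{first};{second}".encode())
    return sk_cipher(sk, sig.to_bytes(32, "big"))

def check_sealed(public, trusted, sk, first, second, ciphertext):
    """Check the peer MSA's certificate, then decrypt and verify its signature."""
    if public not in trusted:
        return False
    sig = int.from_bytes(sk_cipher(sk, ciphertext), "big")
    return verify(public, f"{first};{second}".encode(), sig)

def modified_sts(gk_keys, rcpt_keys, trusted):
    (gk_pub, gk_priv), (rc_pub, rc_priv) = gk_keys, rcpt_keys
    x = secrets.randbelow(Q - 2) + 1
    px = pow(P, x, Q)                                      # step 2: GK sends p^x
    y = secrets.randbelow(Q - 2) + 1
    py = pow(P, y, Q)                                      # step 3: recipient replies
    sk_rcpt = pow(px, y, Q)
    from_rcpt = sealed_signature(rc_priv, sk_rcpt, py, px)
    sk_gk = pow(py, x, Q)                                  # step 4: GK verifies
    if not check_sealed(rc_pub, trusted, sk_gk, py, px, from_rcpt):
        raise ValueError("recipient's MSA failed authentication")
    from_gk = sealed_signature(gk_priv, sk_gk, px, py)     # step 5: GK replies
    if not check_sealed(gk_pub, trusted, sk_rcpt, px, py, from_gk):  # step 6
        raise ValueError("GK's MSA failed authentication")
    return sk_gk

gk_keys = make_toy_rsa(2**61 - 1, 2**89 - 1)
rcpt_keys = make_toy_rsa(2**107 - 1, 2**127 - 1)
trusted = {gk_keys[0], rcpt_keys[0]}
sk = modified_sts(gk_keys, rcpt_keys, trusted)
```

Note that only the MSAs hold signature keys; the users contribute nothing beyond the D-H exponents, which mirrors the server-level authentication described above.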

3.3 Forgery of from addresses

Since addresses are not authenticated in SMTP, it would be very easy for a spammer to launch a spam attack with forged from addresses that appear to come from the recipient’s community. To solve this problem, LENS uses standard sender authentication techniques (for instance SPF [32] and DKIM [6]), which are already deployed in the existing email system.

3.4 Signature from GKs

Once the GK has the SK, its MSA issues a signature Sign[(UserID)SK, GKID] to every user of the GK’s community. These signatures are added to the SignList of each user and are used later for communication with the recipient.
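A minimal sketch of issuing and checking such a signature is given below. The report does not specify how (UserID)SK is computed, so this example assumes an HMAC-SHA256 tag over the UserID keyed with SK; the function names and the example addresses are hypothetical.

```python
import hashlib
import hmac

def issue_signature(sk: bytes, user_id: str, gk_id: str):
    """GK's MSA side: bind user_id to SK (HMAC is an assumption) and attach GKID."""
    tag = hmac.new(sk, user_id.encode(), hashlib.sha256).digest()
    return tag, gk_id

def verify_signature(sk_list: dict, sender_id: str, signature) -> bool:
    """Recipient's MSA side: look up the SK shared with the named GK and recompute."""
    tag, gk_id = signature
    sk = sk_list.get(gk_id)
    if sk is None:
        return False  # no SK was ever established with this GK
    expected = hmac.new(sk, sender_id.encode(), hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

# The recipient's SKList maps each GKID to the SK shared with that GK.
shared_sk = b"example shared secret"
recipient_sk_list = {"gk@b.example": shared_sk}
signature = issue_signature(shared_sk, "carol@b.example", "gk@b.example")
print(verify_signature(recipient_sk_list, "carol@b.example", signature))
```

Because the tag depends on the sender's UserID, a signature issued to one community member does not verify for a different sender.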

3.5 LENS incorporated and Email processing

Like any other spam filter, LENS works transparently to the users. Let us assume that the social communities are defined and the GKs selected. For each user, the CommList, SKList and SignList are stored at the local mail server (which is responsible for the functioning of LENS). Two types of messages are sent and received: messages within the social community and messages from outside it. The processing of both types is briefly explained as follows (also see Figure 10).

1. Message within community: If a message is sent to a recipient within the community, the message flows all the way to the receiver’s MSA. At the MSA, the sender is verified against the recipient’s CommList and the message is placed into the recipient’s mailbox.

Figure 10 Email processing model with LENS


2. Message outside community with GK: If a message is sent to a receiver outside the sender’s community, the sender’s MSA attaches to the message a signature issued by a GK authorized to communicate with the recipient. When the message arrives at the recipient’s MSA, the signature is verified using the SK of the GK listed in it. On successful verification, the message is placed in the recipient’s mailbox.

3. Message outside community without GK: If a message is intended for a recipient outside the sender’s community and no signature has been issued by any GK, the sender’s MSA holds the message and starts a GK selection procedure (stage 3). The sender is announced as a potential GK for its community and the GK authentication procedure is carried out as discussed earlier. On successful completion, the sender is selected as a GK for the receiver, and the MSA binds its signature to the withheld message and sends it out. When the message arrives at the receiver’s MSA, it is verified using the SK of the GK listed in the signature, and on successful verification it is placed in the receiver’s mailbox.
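The three cases above amount to a simple dispatch at the sender’s MSA, sketched below. The `SenderState` class, the choice to key the SignList by recipient, and the case labels are assumptions made for this illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class SenderState:
    """Per-user lists kept by the sender's MSA (layout is an assumption)."""
    comm_list: set = field(default_factory=set)    # UserIDs in the sender's community
    sign_list: dict = field(default_factory=dict)  # recipient ID -> (signature, GKID)

def route_outgoing(state: SenderState, recipient: str):
    """Return (case, signature_to_attach) for one outgoing message."""
    if recipient in state.comm_list:
        return "within-community", None                      # case 1: send as is
    if recipient in state.sign_list:
        return "outside-with-gk", state.sign_list[recipient]  # case 2: attach signature
    # Case 3: hold the message and start stage 3 GK selection for this recipient.
    return "outside-without-gk", None

state = SenderState(
    comm_list={"bob@a.example"},
    sign_list={"dave@c.example": (b"tag-bytes", "gk@a.example")},
)
for rcpt in ("bob@a.example", "dave@c.example", "erin@d.example"):
    print(rcpt, route_outgoing(state, rcpt)[0])
```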

3.6 Prevention of Spam Transmission

One of the main contributions of LENS is that it prevents the transmission of spam across the network in the first place, saving maintenance and infrastructure costs for network operators. At the beginning of this section we explained the flow of an email message from the sender’s client to the receiver’s inbox. To prevent spam transmission in the first place, LENS adds a minor verification step. Let us assume that the sender’s and receiver’s MSAs have already established a TCP connection. When the sender’s MSA sends the MAIL FROM command, it appends a signature, issued by a GK authorized to communicate with the recipient, if the recipient is not in the sender’s community. At the recipient’s end, the MSA checks whether the sender in the MAIL FROM command is a community member. If the sender is a member, the receiver’s MSA sends back an acknowledgement and the process continues. If the sender is not in the receiver’s community, the MSA checks for a valid GK signature. Failure to present a valid signature results in the receiver terminating the TCP connection, so the email (header and body) is never transmitted.
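The receiver-side check at the MAIL FROM stage can be sketched as a small decision function. This is an illustration under assumptions: the signature check is passed in as a callable, `toy_verify` is a placeholder rather than a real signature scheme, and the boolean return value stands in for "acknowledge" versus "terminate the TCP connection".

```python
def check_mail_from(sender, comm_list, sk_list, presented, verify):
    """Return True to acknowledge MAIL FROM, False to drop the connection.

    presented is None or a (tag, gk_id) pair appended by the sender's MSA;
    verify(sk, sender, tag) is whatever signature check the MSA uses.
    """
    if sender in comm_list:
        return True                 # community member: no signature needed
    if presented is None:
        return False                # outsider without a GK signature
    tag, gk_id = presented
    sk = sk_list.get(gk_id)
    if sk is None:
        return False                # signature names a GK we share no SK with
    return verify(sk, sender, tag)

def toy_verify(sk, sender, tag):
    """Placeholder check (assumption): the tag is just sender and SK joined."""
    return tag == f"{sender}|{sk}"

comm_list = {"alice@a.example"}
sk_list = {"gk@b.example": "sk-1"}
good = ("carol@b.example|sk-1", "gk@b.example")
print(check_mail_from("carol@b.example", comm_list, sk_list, good, toy_verify))
```

The point of the design is that rejection happens before the DATA phase, so a rejected message body never crosses the network.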

3.7 Concerns related to compromised user

LENS works transparently to the users, and it is the mail server (MSA) that executes the LENS protocol on behalf of the email users. This means that the MSA manages the lists (SKList and SignList) for the email users. If a user (who is also a GK) is compromised, the effect is temporary and local to the community. It lasts until the victimized user announces the incident using his other IDs (or through friends or word of mouth) or reclaims ownership of the account from the email service provider. If the victimized user is unable to reclaim ownership of the ID, the user can always ask its community to abandon the compromised ID, and the MSA will remove all data associated with the compromised ID from the SKList and SignList. Hence, the attacker would not be able to harm the system on a large scale, as the SKs and signatures are handled internally by the MSA.

4 Experiments and results

To verify the feasibility and scalability of LENS, we use two large-scale online social network datasets, Facebook and Flickr. Data samples from Facebook and Flickr are good choices for evaluating LENS, as they represent real, or close to real, social connections. We are only able to evaluate the GK selection procedure at stage 1 (GK selection in the adjacent communities) because of the limited dataset size: although both datasets contain millions of users, the average path lengths are only between 5 and 6 hops (see Table 2). For the results we are mainly interested in the following two things:

1. Number of GKs for receiving messages: As a result of the GK selection procedure, each recipient ends up authenticating a certain number of GKs outside its social network to vouch for legitimate users. The feasibility and scalability of LENS depend on the number of GKs selected for a particular recipient staying within a small range. The more GKs are required, the more SKs need to be established and maintained.

2. Reachability of recipient via GKs: The success of LENS also depends on how many legitimate users outside the community can reach a recipient with a given number of GKs. Ideally, we expect a maximum number of legitimate users with a minimum number of GKs.

In the remainder of this section we first briefly explain the high-level statistics of the datasets. We then present the results of the experiments on those datasets.

4.1 Social network data

Table 2 presents the high-level statistics of the Facebook and Flickr datasets gathered and used in [20, 30]. Facebook is the largest social network in the world and the number one photo sharing site on the internet. It is a “pure” social network, in the sense that its primary purpose is finding and connecting to other users. Our data sample of Facebook consists of 3.1 million users with over 23 million edges and an average of 15.2 friends per user. Flickr, on the other hand, is not a pure social network; it is intended primarily for publishing, organizing and locating content. Our dataset of Flickr consists of 1.7 million users with over 15 million edges and an average of 18.1 friends per user.

4.2 Experiments on Facebook samples

We randomly select around 3000 nodes within the graph of the Facebook sample and test them for GK selection in LENS. The nodes are selected randomly, with the constraints that the community size should be between 100 and 650 (a reasonable number for an average email user) and that the number of friends of any given node should be greater than 25. In the following we discuss the results for Facebook.


Experimental data set --- Facebook --- Flickr
Number of Users --- 3,097,165 --- 1,715,255
Number of Edges --- 23,667,394 --- 15,555,041
Average Friends --- 15.28 --- 18.13
Clustering Coefficient --- 0.175 --- 0.313
Avg Path Length --- 5.13 --- 5.67
Average Community Size --- 1,587.32 --- 4,398.44

Table 2: High-level statistics of our Facebook and Flickr datasets

Number of GKs for receiving messages

Figure 11 presents the number of GKs selected for a recipient to receive messages from outside its community. The number of required GKs is reasonable, ranging from 58 to 420, and most of the time it is less than half of the community size. The number of GKs shows a near-linear relationship with the number of boundary nodes. An increase in the number of boundary nodes generally comes with an increase in the number of GKs, but not always: several times we observe that a higher number of boundary nodes results in a smaller number of GKs. The number of GKs is lower if the GKs are selected from a region where the nodes have a high clustering coefficient, because many boundary nodes then suggest the same GK. Each SKList entry consists of three fields: the ID of the GK, the ID of the recipient and the SK itself. Currently, about 99% of email addresses average 22 characters in length (i.e. 22 bytes of storage). If we choose a key size of 256 bits, i.e. 32 bytes, a single entry in SKList costs only 76 bytes. Even in the worst case of 420 GKs, the SKList size is only about 31.17 Kbytes (420 × 76 = 31,920 bytes), which is small relative to current storage capacities.

Figure 11 Number of GK for sending and receiving messages and their reach for Facebook dataset

Reachability of recipient via GKs

Figure 11 also shows the number of users that can reach a particular recipient with the help of GKs. With the selected GKs, the reachability of the recipient ranges from 760K to 1.45 million users, i.e. 24 to 47% of the total network, and most of the time it remains above 35%. All of this is achieved by executing only stage 1 of the GK selection process. Based on these results, we expect that in practice messages would rarely be sent to the recipient by a sender not covered by the GKs. Moreover, recursive iterations of stage 2 and the use of stage 3 of the GK selection process will cover nearly all the remaining rare cases of a legitimate message not being handled.

4.3 Experiments on Flickr samples

In the Flickr case, we randomly select more than 500 nodes within the graph of the Flickr sample and test them for GK selection in LENS. The nodes have a community size between 100 and 500 and an average number of friends greater than 25 (same setting as for Facebook).

Number of GKs for receiving messages

Figure 12 shows the number of GKs for a recipient to receive messages from outside its community. The resulting number of GKs ranges from 23 to 153, and most of the time it is less than 30% of the community size. These numbers are reasonably small. A single entry in SKList costs only 76 bytes, and even in the worst case of 153 GKs the SKList size is only 11.35 Kbytes.


Figure 12 Number of GK for sending and receiving messages and their reach for Flickr dataset

Reachability of recipient via GKs

Figure 12 also shows the number of users that can reach a particular recipient with the help of GKs. With the GKs selected above, the reachability of the recipient ranges from 643K to 854K users, i.e. 38 to 50% of the total network, and it mostly remains above 45%. Flickr is not a pure social network and is intended primarily for publishing, organizing and locating content. It contains a large number of strongly connected cores of very high degree nodes. As a result, most of the boundary nodes end up suggesting the same node as GK, so a smaller number of GKs covers a large number of users. The case of Facebook is different, as it is a pure social network whose primary purpose is to find and connect to new users.

The MSA maintains a SignList for each user, containing the signatures of the GKs that allow the user to send emails to recipients outside its community. A single entry in SignList occupies only 44 bytes of space. Even if a SignList contains a million entries, it occupies only 42 Mbytes of space. Nowadays, webmail providers allow 20 Mbytes of attachments for a single email and allocate multiple GBs of space to a single user. Therefore, a SignList occupying a couple of Mbytes does not create any scalability issue.
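The storage figures quoted in this section follow from simple arithmetic, which the short sketch below reproduces. The constants come from the text (22-byte addresses, a 32-byte key, 44-byte SignList entries); the helper name `list_size_bytes` is hypothetical, and sizes are converted with 1 Kbyte = 1024 bytes as an assumption.

```python
EMAIL_BYTES = 22   # average email address length quoted in the text
KEY_BYTES = 32     # 256-bit shared secret key

# An SKList entry holds GKID, RecipientID and SK: 22 + 22 + 32 = 76 bytes.
SKLIST_ENTRY_BYTES = 2 * EMAIL_BYTES + KEY_BYTES
SIGNLIST_ENTRY_BYTES = 44  # per-entry size quoted in the text

def list_size_bytes(entries: int, entry_bytes: int) -> int:
    """Total size of a list with fixed-size entries."""
    return entries * entry_bytes

cases = [
    ("SKList, Facebook worst case", 420, SKLIST_ENTRY_BYTES),
    ("SKList, Flickr worst case", 153, SKLIST_ENTRY_BYTES),
    ("SignList, one million entries", 1_000_000, SIGNLIST_ENTRY_BYTES),
]
for label, entries, entry_bytes in cases:
    size = list_size_bytes(entries, entry_bytes)
    print(f"{label}: {size} bytes = {size / 1024:.2f} Kbytes")
```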

Based on the results presented in this section, we conclude that LENS is scalable in terms of both the number of required GKs and reachability. With the help of only hundreds of GKs, a recipient can be reached by millions of users, and the solution can be extended to users at even greater social distance through further GK selection. An increase in the size of a recipient’s community has a direct impact on its reachability. Users with a larger social community benefit more from LENS than isolated and less socially connected users.

5 Related work

Many techniques have been proposed in the past to counter the problem of spam emails. In this section we review and discuss various approaches. An existing anti-spam system may comprise one or several of these approaches.

Content-based filtering is the most popular spam protection technique and is widely available in most free and commercial implementations; SpamAssassin [2] and DSPAM [12] are two representative examples. Content-based filtering uses heuristics and machine learning methods, based on filters and keywords, for spam recognition. Unfortunately, content-based filtering exhibits several problems that limit its usage. These include the intrinsic cost of initialization and continuous adaptation of the filters [27, 31], as well as false positives and false negatives.

Sender authentication techniques are used as protection against sender or from addresses forged by the spammer. They are an intrinsic requirement for all white list approaches, since otherwise a spammer can simply put well-known email addresses in the from: field. Under the umbrella of sender authentication techniques, the anti-spam community has developed a number of domain-based authentication systems, such as SPF [32] and DKIM [6], to validate that a received email actually came from the domain it claims. Occam’s razor [8] proposes a solution in which the mail recipient uses a real-time challenge-based authentication protocol to validate the binding between individual domain names and the legitimate mail sources for those domains. Although solutions like SPF [32] and DKIM [6] are widely in use today, effective spam protection also requires further mechanisms (such as white lists or black lists). LENS uses sender authentication to prevent from address forgery.

The header-based approach examines the header of an email to detect spam. This approach can be categorized into white lists and black lists. White list schemes collect a list of all the email addresses that are trusted to be non-spammers. Any email sourced from an address in the white list is delivered directly to the inbox. White listing is highly vulnerable to from address forgery, therefore it must be used together with source authentication schemes. Black list schemes, in contrast, store the IP addresses of all the spammers (email addresses are easily forgeable and are ineffective 95% of the time [9]) and refuse to accept emails from them. Manually generated lists have proved to be highly efficient, but they put quite a burden on the email user who has to maintain them. The intra-community communication in LENS is one kind of white listing application.

In recent years, different approaches have been explored for spam detection using social networks and trust and reputation systems. Boykin et al. create a social network of friends in cyberspace based on the emails exchanged between them [7]. The emails are classified using the local clustering properties of the social network. For spammers the clustering coefficient is very low, as they do not exchange emails with each other. In contrast, the clustering coefficient of a legitimate user is rather high. Their method is able to classify 53% of all the emails as spam or non-spam with 100% accuracy. However, the method is limited to offline analysis, and the remaining 47% of the emails are left to other filtering techniques.
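To make the clustering-coefficient criterion concrete, the sketch below computes the standard local clustering coefficient of a node in an undirected graph. It only illustrates the general graph measure; it is not the classifier of [7], and the adjacency-set representation and the two toy graphs are assumptions of this example.

```python
from itertools import combinations

def local_clustering(graph, node):
    """Fraction of pairs of the node's neighbours that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))

# Toy undirected graphs as adjacency sets (hypothetical data).
# A user whose contacts also email each other forms triangles:
friends = {"u": {"a", "b"}, "a": {"u", "b"}, "b": {"u", "a"}}
# A spammer-like hub whose targets never email each other forms a star:
hub = {"s": {"v1", "v2", "v3"}, "v1": {"s"}, "v2": {"s"}, "v3": {"s"}}

print(local_clustering(friends, "u"), local_clustering(hub, "s"))
```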

Ostra [19] explores the use of trust relationships to thwart unwanted communication. It uses the number of trust relationships a user has to limit the amount of unwanted communication that user can produce. The system relies on existing trust networks to connect senders and receivers via chains of pair-wise trust relationships, and uses a pair-wise, link-based credit scheme to impose a cost on the originators of unwanted communication. The scalability of this system remains uncertain if it maintains a per-link credit scheme. Although it can be decentralized by introducing a central tracker component, it is not clear how scalable the system will be, as this part has not been evaluated. This method also would not work for functional IDs that exist only to receive data (e.g. [email protected] for student travel grants). If a user is bad at responding to or classifying emails (i.e. he does not care about sending either), then he is stuck waiting for a response.

Re: Reliable Email [26] uses a white list of friends and an automatic white list of FoF to improve the chances of communication between white-listed friends. Using this protocol, Re can accept almost 75% of received emails and prevent up to 88% of the false positives produced by existing spam filters. With the FoF protocol there is a 10% increase in accepted emails. Re does not directly address the inability to receive emails from unknown senders or from users other than friends or FoF; instead it uses existing schemes to deal with this problem. Furthermore, Re imposes a lot of structural overhead if used for spam protection alone. Each user needs to maintain a public/private key pair, which requires a system-wide PKI (public key infrastructure) and an authentication server for each domain. It would have been easier to use existing protocols such as digitally signed emails than to go through all this architectural complexity.

In trust and reputation systems, network users try to calculate the reliability and trustworthiness of other users based on their own experiences and those of others. Boykin et al. [7] proposed an automatic email ranking system based on trust and reputation algorithms. Reputation algorithms provide a rating for each email address, which can subsequently be used to sort the incoming emails. These ratings can be obtained in two ways: globally and personally/locally. In the global scheme, people share their personal ratings in such a way that a single global rating or reputation can be inferred for each email address. In the personalized scheme, the ratings (considered as trust) are different for each email user and depend on the individual’s personal social network. MailRank [9] is a spam detection system based on a trust and reputation scheme that classifies email addresses (in addition to ranking emails as done in [7]) into spammer addresses and non-spammer addresses. It additionally determines the relative rank of an email address with respect to other email addresses. Trust and reputation systems are inherently subject to attacks such as identity spoofing, false accusation and collusion. These attacks are independent of the particular trust and reputation metric and are primarily due to the lack of authentication and non-repudiation in standard trust and reputation solutions.

6 Conclusion

In this paper, we present LENS, a novel, easily adoptable and scalable spam protection system that can stop spam from being sent in the first place, instead of only filtering it from mailboxes. With LENS, network operators can save the operating cost caused by the large amount of spam (200 billion messages per day in 2009) crossing their networks. LENS uses social trust and leverages anti-social networking for GK selection to further enhance reliable delivery from outside the recipient’s social circle. We evaluate LENS using empirical online social network datasets and show that the system scales to millions of users and can be extended further.

Future work in this direction may involve refining the algorithm for selecting GKs beyond adjacent communities. We also want to implement LENS on open source SMTP servers such as Sendmail and evaluate the system’s performance (e.g. delays and overhead) on large-scale testbeds like PlanetLab [10]. We will further study spamming behavior based on real email traces, both those collected within our research institutes and those publicly available. Furthermore, there is a need to study the efficacy of human spammers under some potential attack scenarios, in which malicious users convince LENS that a legitimate user is a spammer. We will also evaluate the robustness of the system when a certain number of GKs have been compromised. This is a first work on using social networks and the gatekeeper concept to establish a trust infrastructure for reliable email delivery and to stop spam from traversing the network, and much follow-up work is possible.


7 List of abbreviations

LENS LEveraging anti-social Network against Spam

GK Gate Keeper

FoF Friends of a friend

SMTP Simple mail transfer protocol

MSA Mail server


8 Table of figures

Figure 1 Complementary Lens, Sender Authentication and Existing Spam Filters.
Figure 2 Community Structure (friends, FoF) of user 1.
Figure 3 Community Formation of user 1
Figure 4 Community structure of recipient node and GK coverage.
Figure 5 GK Authentication and certification
Figure 6 GK Authentication and certification
Figure 7 Email processing model
Figure 8 Diffie-Hellman key exchange protocol
Figure 9 Diffie Modified STS protocol
Figure 10 Email processing model with LENS
Figure 11 Number of GK for sending and receiving messages and their reach for Facebook dataset
Figure 12 Number of GK for sending and receiving messages and their reach for Flickr dataset

9 Index of tables

Table 1: Email sending limit by major Webmail and Internet service providers
Table 2: High-level statistics of our Facebook and Flickr datasets

10 References [1] Email address limit in webmail by providers. http://www.emailaddressmanager.com/tips/emailaddresslimit.html.

[2] Spamassassin. http://spamassassin.apache.org/.

[3] Symantec brightmail antispam. http://www.brightmail.com/.

[4] Messagelabs intelligence report: Spam intercepts timeline. http://www.messagelabs.co.uk/, July 2005.

[5] Distributed checksum clearinghouse, 2005 October.

[6] E. Allman, J. Callas, M. Delany, M. Libbey, J. Fenton, and M. Thomas. Domainkeys identified mail (dkim). Internet Engineering Task Force (IETF) Draft, July 2005.

[7] P. Oscar Boykin and Vwani Roychowdhury. Personal email networks: An effective anti-spam tool. IEEE COMPUTER, 38:61, 2004.

[8] C.Fleizach, GM.Voelker, and S.Savage. Slicing spam with occam’s razor. In Proc. of the Conference on Email and Anti-Spam (CEAS), 2007.

[9] Paul-Alexandru Chirita, J¨org Diederich, and Wolfgang Nejdl. Mailrank: using ranking for spam detection. In CIKM ’05: Pro-ceedings of the 14th ACM international conference on Information and knowledge management, pages 373–380, New York, NY, USA, 2005. ACM.

[10] Brent Chun, David Culler, Timothy Roscoe, Andy Bavier, Larry Peterson, MikeWawrzoniak, and Mic Bowman. PlanetLab: An Overlay Testbed for Broad-Coverage Services. ACM SIGCOMM Computer Communication Review, 33(3):00–00, July 2003.

[11] Diffie.W, van Oorschot.P.C, and M.J. Wiener. Authentication and authenticated key exchanges. Designs, Codes and Cryptog-raphy, 2:107–125, 1992.

[12] DSPAM. http://dspam.nuclearelephant.com.

[13] Ebel.Holger, Mielsch.Lutz-Ingo, and Bornholdt.Stefan. Scale-free topology of e-mail networks. Phys. Rev. E, 66(3):035103, Sep 2002.

[14] Garey.Michael.R and Johnson.David. Computers and intractability: A guide to the theory of npcompleteness. W. H. Freeman, ISBN 0-7167-1045-5, p. 190, problem GT2, 1979.

[15] D. Geer. Will new standards help curb spam? IEEE Computer, 37(2):14–16, Feb 2004.

[16] J. Golbeck and J. Hendler. Reputation network analysis for email filtering. In Conference on Email and Anti-Spam (CEAS), 2004.

[17] Joseph S. Kong, P. Oscar Boykin, Behnam A. Rezaei, Nima Sarshar, and Vwani P. Roychowdhury. Let your cyberalter ego share information and manage spam. IEEE COMPUTER, 39:67, 2006.

[18] L.G.Harbaugh. Spam-proof your inbox. PCWorld, May 2004.

[19] A Mislove, A Post, P Druschel, and KP Gummadi. Ostra: Leveraging trust to thwart unwanted communication. In Proceedings of the 5th Symposium on Networked Systems Design and Implementation (NSDI’08), San Francisco, CA, USA, April 2008.

[20] Alan Mislove, Massilmiliano Marcon, Krishna P. Gummadi, Peter Druschel, and Bobby Bhattacharjee. Measurement and ana-lysis of online social networks. In Proceedings of the 5th ACM/USENIX Internet Measurement Conference (IMC’07), San Diego, CA, October 2007.

[21] M.Walfish, J.D.Zamfirescu, H.Balakrishnan, D.Karger, and S.Shenker. Distributed quota enforcement for spam control. In 3rd Symposium on Networked System Design and Implementation (NSDI), San Jose, CA, May 2006.

[22] M. E. J. Newman. Assortative mixing in networks. Phys. Rev. Lett., 89(20):208701, Oct 2002.

[23] M. E. J. Newman and Juyong Park. Why social networks are different from other types of networks. Phys. Rev. E, 8(3):036122, Sep 2003.

[24] Royal Pingdom. Internet 2009 in numbers. http://royal.pingdom.com/2010/01/22/internet-2009-innumbers/,2009.

[25] V. Prakash. Razor. http://razor.sourceforge.net.

[26] S.Garriss, M.Kaminsky, M.J.Freedman, B.Karp, D.Mazi‘eres, and H.Yu. Re: Reliable email. In Proceedings of the 3rd Sympo-sium on Networked Systems Design and Implementation (NSDI’06), SanJose, CA, May 2006.

Page 17: LENS: LEveraging anti-social Network against Spam · LENS: LEveraging anti-social Network against Spam Page 3 Abstract Spam is still an open problem from the network operator’s

LENS: LEveraging anti-social Network against Spam

Page 17

[27] S.Hansell. Internet is losing ground in battle against spam. The New York Time, April 2003.

[28] J. Travers and S. Milgram. An experimental study of the small world problem. Sociometry, 32(4):425–443, 1969. [29] W.Diffie and M.E.Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT-22:644–654, 1976.

[30] ChristoWilson, Bryce Boe, Alessandra Sala, Krishna P.N. Puttaswamy, and Ben Y. Zhao. User interactions in social networks and their implications. In EuroSys ’09: Proceedings of the 4th ACM European conference on Computer systems, pages 205–218, New York, NY, USA, 2009. ACM.

[31] Gregory L. Wittel and S. Felix Wu. On attacking statistical spam filters. In Proc. of the Conference on Email and Anti-Spam (CEAS), 2004.

[32] M. W. Wong. Sender authentication: What to do. http://spf.pobox.com/whitepaper.pdf, July 2005.

Publisher:

Deutsche Telekom AG Laboratories
Ernst-Reuter-Platz 7
D-10587 Berlin
Telephone: +49 30 8353-58555
www.laboratories.telekom.com

Authors: [email protected], [email protected]

© 2010 Deutsche Telekom Laboratories

The information contained in this document represents the current view of the authors on the issues discussed as of the date of publication. This document should not be interpreted to be a commitment on the part of Deutsche Telekom Laboratories, and Deutsche Telekom Laboratories cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. Deutsche Telekom Laboratories makes no warranties - express, implied, or statutory - as to the information in this document.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording or otherwise), or for any purpose, without the express written permission of Deutsche Telekom Laboratories.

Deutsche Telekom Laboratories may have patents, patent applications, trademarks, copyrights or other intellectual property rights covering the subject matter in this document. Except as expressly provided in any written license agreement from Deutsche Telekom Laboratories, the furnishing of this document does not give you any license to these patents, trademarks, copyrights or other intellectual property.