Detecting and Characterizing Social Spam Campaigns

Hongyu Gao, Jun Hu, Christo Wilson, Zhichun Li, Yan Chen and Ben Y. Zhao

Northwestern University, US
Northwestern / Huazhong Univ. of Sci & Tech, China
University of California, Santa Barbara, US
NEC Laboratories America, Inc., US

Background

[Figure: groups of wall posts in which every post is benign (Benign post1, Benign post2, Benign post3, ……).]

[Figure: wall post reading “Secret admirer revealed. Go here to find out who …”]

Contributions

• Conduct the largest-scale experiment on Facebook to confirm spam campaigns.
– 3.5M user profiles, 187M wall posts.
• Uncover the attackers’ characteristics.
– Mainly use compromised accounts.
– Mostly conduct phishing attacks.
• Release the confirmed spam URLs, with posting times.
– http://list.cs.northwestern.edu/socialnetworksecurity
– http://current.cs.ucsb.edu/socialnets/

Roadmap

• Detection System Design
• Validation
• Malicious Activity Analysis
• Conclusions

System Overview

• Identify coordinated spam campaigns in Facebook.
– Templates are used for spam generation.

Build Post Similarity Graph

– A node: an individual wall post
– An edge: connects two “similar” wall posts

[Figure: example nodes “Check out funny.com” and “Go to evil.com!”]

Wall Post Similarity Metric

Spam wall post model:

Example post: hey see your love compatibility ! go here yourlovecalc . com (remove spaces)
– A textual description: hey see your love compatibility ! go here (remove spaces)
– A destination URL: yourlovecalc . com

Wall Post Similarity Metric

• Condition 1:
– Similar textual description.

Post A: Guess who your secret admirer is?? Go here nevasubevd . blogs pot . co m (take out spaces)
Post B: Guess who your secret admirer is??” Visit: yes-crush . com (remove spaces)

Post A, textual description: Guess who your secret admirer is?? Go here (take out spaces)
– Shingles: “Guess who ”, “uess who y”, “ess who yo”, “ss who you”, “s who your”, “ who your ”, “who your s”, “ho your se”, …
– Hashes: 14131193659701777830, 14741306959712195600, 10922172988510136713, 9812648544744602511, …
– Hashes in ascending order: 996649753058124798, 1893573314373873575, 4928375840175086076, 5186308048176380985, …

Post B: Guess who your secret admirer is??” Visit: yes-crush . com (remove spaces)
– Shingles: “Guess who ”, “uess who y”, “ess who yo”, “ss who you”, “s who your”, “who your s”, “ho your se”, “o your sec”, …
– Hashes: 14131193659701777830, 14741306959712195600, 10922172988510136713, 9812648544744602511, …
– Hashes in ascending order: 996649753058124798, 1893573314373873575, 4928375840175086076, 5186308048176380985, …

Establish an edge!
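The shingle-and-hash comparison on this slide can be sketched in Python. This is only a sketch under stated assumptions: the 10-character shingle length matches the slide's examples, but the hash function (MD5 truncated to 64 bits), the fingerprint size of 20, and the match threshold of 19 are illustrative choices, and all function names are hypothetical.

```python
import hashlib

SHINGLE_LEN = 10       # matches the 10-character shingles on the slide
FINGERPRINT_SIZE = 20  # assumption: keep the 20 smallest hashes
MIN_SHARED = 19        # assumption: hashes two fingerprints must share


def shingles(text, k=SHINGLE_LEN):
    """Return every k-character substring (shingle) of text."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def shingle_hash(shingle):
    """Hash a shingle to a 64-bit integer (MD5 is an assumed choice)."""
    digest = hashlib.md5(shingle.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(text, size=FINGERPRINT_SIZE):
    """Fingerprint: the `size` smallest distinct shingle hashes, ascending."""
    hashes = {shingle_hash(s) for s in shingles(text)}
    return sorted(hashes)[:size]


def similar(desc_a, desc_b, min_shared=MIN_SHARED):
    """True if the two descriptions' fingerprints share enough hashes."""
    shared = set(fingerprint(desc_a)) & set(fingerprint(desc_b))
    return len(shared) >= min_shared
```

With these assumed values, a text with fewer than 19 distinct shingles can never match anything, so both numbers would need tuning against real data.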

Wall Post Similarity Metric

• Condition 2:

– Same destination URL.

Post A: secret admirer revealed. goto yourlovecalc . com (remove the spaces)

Post B: hey see your love compatibility ! go here yourlovecalc . com (remove spaces)

Establish an edge!
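Condition 2 can be checked without comparing every pair of posts by indexing posts on their destination URL. The sketch below assumes each post is a (post_id, url) pair whose URL has already been de-obfuscated and normalized; the function name and the data layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def same_url_edges(posts):
    """Yield an edge (id_a, id_b) for each pair of posts sharing a URL.

    `posts` is an iterable of (post_id, normalized_url) pairs; both the
    layout and the prior URL normalization are assumptions.
    """
    by_url = defaultdict(list)
    for post_id, url in posts:
        by_url[url].append(post_id)
    for ids in by_url.values():
        # Every pair inside one URL bucket is connected.
        yield from combinations(ids, 2)


posts = [
    (1, "yourlovecalc.com"),
    (2, "yourlovecalc.com"),
    (3, "funny.com"),
]
edges = list(same_url_edges(posts))
```

Emitting every pair is quadratic in the bucket size. If only connected components are needed later, linking each post to the first post in its bucket would give the same components with far fewer edges.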

Extract Wall Post Campaigns

• Intuition:
– Reduce the problem of identifying potential campaigns to identifying connected subgraphs.

[Figure: example graph with nodes A, B, and C.]
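Finding connected subgraphs is a standard graph problem. One common way to do it is union-find, sketched below; the slides do not say which algorithm the authors used, so this is an illustrative choice, and the node labels are placeholders.

```python
def connected_components(nodes, edges):
    """Group nodes into connected components with union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


# A is similar to B and B to C, so all three land in one cluster.
clusters = connected_components(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
```

Each resulting cluster is a candidate campaign; posts with no similar neighbor end up as singleton clusters.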

Locate Spam Campaigns

• Distributed: campaigns have many senders.

• Bursty: campaigns send fast.

[Figure: decision flow. Wall post campaign → Distributed? NO: Benign. YES → Bursty? NO: Benign. YES: Malicious.]
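The two-step decision can be sketched as a small classifier. The default thresholds use the (n, t) = (5, 1.5hr) value reported in the backup slides. Two things are assumptions here: that "distributed" means at least n distinct senders, and that "bursty" means the median gap between consecutive posts is at most t. The cluster layout and names are hypothetical.

```python
from statistics import median

MIN_SENDERS = 5              # n in the (n, t) threshold
MAX_MEDIAN_GAP = 1.5 * 3600  # t, in seconds


def is_malicious(cluster, n=MIN_SENDERS, t=MAX_MEDIAN_GAP):
    """Classify one cluster of (sender_id, unix_timestamp) pairs."""
    senders = {sender for sender, _ in cluster}
    if len(senders) < n:
        return False  # not distributed, so benign
    times = sorted(ts for _, ts in cluster)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Assumption: burstiness is judged by the median inter-post gap.
    return bool(gaps) and median(gaps) <= t
```

A cluster of six posts from six senders, one minute apart, would be flagged; the same six posts from a single sender, or spread one day apart, would not.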

Validation

• Dataset:

– Leverage unauthenticated regional networks.

– Wall posts already crawled in a prior study.

– 187M wall posts in total, 3.5M recipients.

– ~2M wall posts with URLs.

• Detection result:

– ~200K malicious wall posts (~10%).

Validation

• Focused on detected URLs.

• Adopted multiple validation steps:

URL de-obfuscation → 3rd party tools → Redirection analysis → Keyword matching → URL grouping → Manual confirmation

Validation

• Step 1: Obfuscated URL

– URLs embedded with obfuscation are malicious.

– Reverse engineer URL obfuscation methods:

• Replace ‘.’ with “dot” : 1lovecrush dot com

• Insert white spaces : abbykywyty . blogs pot . co m
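The two obfuscation methods listed above can be reversed with a couple of regular expressions. This is a minimal sketch, assuming the URL candidate has already been isolated from the rest of the wall post; the function name is hypothetical and real obfuscations may take other forms.

```python
import re

DOT_WORD = re.compile(r"\s+dot\s+", re.IGNORECASE)
WHITESPACE = re.compile(r"\s+")


def deobfuscate(candidate):
    """Undo the two obfuscations named on the slide for one URL candidate.

    Returns (url, was_obfuscated). Locating the candidate substring inside
    the post is assumed to happen elsewhere.
    """
    text = candidate.strip()
    url = DOT_WORD.sub(".", text)   # "1lovecrush dot com" -> "1lovecrush.com"
    url = WHITESPACE.sub("", url)   # "blogs pot . co m"   -> "blogspot.com"
    return url.lower(), url != text
```

The second return value records whether any rewriting was needed, which is the signal Step 1 relies on.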

Validation

• Step 2: Third-party tools

– Use multiple tools, including:

• McAfee SiteAdvisor

• Google’s Safe Browsing API

• Spamhaus

• Wepawet (a drive-by-download analysis tool)

• …

Validation

• Step 3: Redirection analysis
– Commonly used by the attackers to hide the malicious URLs.

[Figure: redirection chain from URL1 to URLM.]
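One way to picture redirection analysis is to follow each URL's redirect chain and check every hop against a blacklist. The sketch below is offline: a dict stands in for the HTTP requests a real crawler would issue. The rule "flag the URL if any hop is blacklisted" is an assumption, and all names and example domains are hypothetical.

```python
def resolve_chain(url, redirects, max_hops=10):
    """Follow redirects from url and return the list of URLs visited.

    `redirects` maps a URL to the URL it redirects to; in practice this
    information would come from live HTTP responses.
    """
    chain = [url]
    while url in redirects and len(chain) <= max_hops:
        url = redirects[url]
        if url in chain:  # stop on a redirect loop
            break
        chain.append(url)
    return chain


def is_malicious_by_redirect(url, redirects, blacklist):
    """Flag url if any URL on its redirection chain is blacklisted."""
    return any(hop in blacklist for hop in resolve_chain(url, redirects))


redirects = {"a.example": "b.example", "b.example": "evil.example"}
flagged = is_malicious_by_redirect("a.example", redirects, {"evil.example"})
```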

Experimental Evaluation

[Figure: bar chart of the validation result. Bar labels: False Positives, True Positives (ALL), Manual confirmation, URL grouping, Keyword matching, Redirection Analysis, Blacklisted URL, Obfuscated URL. Values, not paired with labels in the transcript: 28.0%, 6.3%, 27.9%, 1.2%, 32.5%, 0.1%, 96.1%, 3.9%.]

Malicious Activity Analysis

• Spam URL Analysis

• Spam Campaign Analysis

• Malicious Account Analysis

• Temporal Properties of Malicious Activity

Spam Campaign Topic Analysis

• Identifying attackers’ social engineering tricks:

Campaign    Summarized wall post description    Post #
Crush       Someone likes you                   45088
Ringtone    Invitation for free ringtones       22897
Love-calc   Test the love compatibility         20623
…           …                                   …

Spam Campaign Goal Analysis

• Categorize the attacks by attackers’ goals.

[Figure: attack categories, including “Phishing #1: for money” and “Phishing #2: for info”.]

Malicious Account Analysis

• Sampled manual analysis:

[Figure: Using application 33.9%; Receiving wall post 84.5%; Either 89%; Neither 11%.]

• Account behavioral analysis:

[Figure: accounts by conversation type. Human conversation: 194; the remaining values, 5 and 1, belong to Unknown conversation and No conversation.]

Malicious Account Analysis

• Counting all wall posts, the curves for malicious and benign accounts converge.

Conclusions

• Conduct the largest-scale spam detection and analysis on Facebook.
– 3.5M user profiles, 187M wall posts.
• Make interesting discoveries, including:
– Over 70% of attacks are phishing attacks.
– Compromised accounts are prevalent.

Thank you!

Project webpage: http://list.cs.northwestern.edu/socialnetworksecurity

http://current.cs.ucsb.edu/socialnets/

Spam URL release: http://dod.cs.northwestern.edu/imc10/URL_data.tar.gz

[Figure: users Bob, Dave, and Chuck, with the posts on Bob’s Wall.]

Bob’s Wall
From Dave: That movie was fun!
From Chuck: Check out funny.com
From Chuck: Go to evil.com!

[Figure: groups of wall posts mixing benign posts (Benign post1, Benign post2, Benign post3) with malicious posts (Malicious p1, Malicious p2).]

Data Collection

• Based on “wall” messages crawled from Facebook (crawling period: Apr. 09 ~ Jun. 09 and Sept. 09).

• Leveraging unauthenticated regional networks, we recorded the crawled users’ profiles, friend lists, and interaction records going back to January 1, 2008.

• 187M wall posts with 3.5M recipients are used in this study.

Filter posts without URLs

• Assumption: All spam posts should contain some form of URL, since the attacker wants the recipient to go to some destination on the web.

• Example (without URL):

Kevin! Lol u look so good tonight!!! → Filter out

• Example (with URL):

Um maybe also this: http://community.livejournal.com/lemonadepoem/54654.html → Further process

Guess who your secret admirer is?? Go here nevasubevd . blogs pot . co m (take out spaces) → Further process
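A rough filter for "some form of URL" might look like the sketch below. It is a heuristic under stated assumptions: the TLD list is a short illustrative sample, collapsing white space can produce false positives on ordinary text, and the function name is hypothetical. The slides do not describe how the authors implemented this step.

```python
import re

# Assumption: a short illustrative TLD list, not an exhaustive one.
DOMAIN = re.compile(r"[\w-]+(?:\.|dot)(?:com|net|org|us|info)\b", re.IGNORECASE)
LINK = re.compile(r"https?://\S+", re.IGNORECASE)


def may_contain_url(post):
    """Heuristic: True if the post seems to carry a URL, even an obfuscated one."""
    if LINK.search(post):
        return True
    collapsed = re.sub(r"\s+", "", post)  # undo inserted white space
    return DOMAIN.search(collapsed) is not None
```

Posts for which this returns False would be filtered out; the rest go on to the similarity graph.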

Extract Wall Post Clusters

[Figure: a sample wall post similarity graph and the corresponding clustering result (for illustrative purposes only).]

Locate Malicious Clusters

• (5, 1.5hr) is found to be a good (n, t) value.
• Slightly modifying the value has only a minor impact on the detection result.
• A relaxed threshold of (4, 6hr) results in only a 4% increase in the clusters classified as malicious.

Experimental Validation

• Step 5: URL grouping
– Groups of URLs exhibit highly uniform features. Some have been confirmed as “malicious” previously. The rest are also considered “malicious”.
– Human assistance is involved in identifying such groups.
• Step 6: Manual analysis
– We leverage the Google search engine to confirm the malice of URLs that appear many times in our trace.

URL Analysis

• 3 different URL formats (with examples):
– Link: <a href=“...”>http://2url.org/?67592</a>
– Plain text: mynewcrsh.com
– Obfuscated: nevasubevu . blogs pot . co m

Type             # of URLs   # of Wall Posts   Avg # of Wall Posts per URL
Total #          15,484      199,782           N/A
Obfuscated       6.5%        25.3%             50.3
Plaintext        3.8%        6.7%              22.9
Hypertext link   89.7%       68.0%             9.8
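The last column of the table can be cross-checked against the two percentage columns and the totals. The short calculation below uses only numbers from the table; the variable and function names are illustrative.

```python
TOTAL_URLS = 15_484
TOTAL_POSTS = 199_782

# (share of URLs, share of wall posts, reported average) per URL type
ROWS = {
    "Obfuscated": (0.065, 0.253, 50.3),
    "Plaintext": (0.038, 0.067, 22.9),
    "Hypertext link": (0.897, 0.680, 9.8),
}


def posts_per_url(url_share, post_share):
    """Average wall posts per URL implied by the two percentage columns."""
    return (post_share * TOTAL_POSTS) / (url_share * TOTAL_URLS)


implied = {name: posts_per_url(u, p) for name, (u, p, _) in ROWS.items()}
```

The implied averages land within a few tenths of the reported ones; the small gaps are what one would expect from the percentages being rounded to one decimal place.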

URL Analysis

• 4 different domain types (with examples):
– Content sharing service: imageshack.us
– URL shortening service: tinyurl.org
– Blog service: blogspot.com
– Other: yes-crush.com

Type           # of URLs   # of Wall Posts
ContentShare   2.8%        4.8%
URL-short      0.7%        5.0%
Blogs          55.6%       15.8%
Other          40.9%       74.4%

Spam Campaign Temporal Analysis

Account Analysis

• The CDF of interaction ratio.
• Malicious accounts exhibit a higher interaction ratio than benign ones.

Wall Post Hourly Distribution

• The hourly distribution of benign posts is consistent with the diurnal pattern of human activity, while that of malicious posts is not.
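An hourly distribution of this kind can be computed by bucketing post timestamps by hour of day. The sketch below assumes Unix timestamps and takes hours in UTC; the slides do not say which time zone was used, and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_distribution(timestamps):
    """Return the fraction of posts falling in each hour of day (0-23).

    Timestamps are Unix seconds; using UTC hours is an assumption.
    """
    hours = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    total = sum(hours.values())
    return [hours[h] / total if total else 0.0 for h in range(24)]
```

Computing this list separately for benign and malicious posts gives the two curves to compare: a diurnal pattern shows up as a clear dip during night hours.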