Birds, Bots and Machines - fraud in Twitter and machine learning


Uploaded by: vicentediazkl

Posted on 28-Nov-2014

TRANSCRIPT

Page 1: Birds, Bots and Machines - fraud in Twitter and machine learning

Vicente Díaz
Senior Security Analyst, Global Research and Analysis Team

Birds, bots and machines: Detecting fraud in Twitter using Machine Learning

Page 2: Birds, Bots and Machines - fraud in Twitter and machine learning

Expectations vs reality

Page 3: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 4: Birds, Bots and Machines - fraud in Twitter and machine learning

Why Twitter?

Page 5: Birds, Bots and Machines - fraud in Twitter and machine learning

Spam - email

[Chart: time axis from Q1 2011 to January 2013 (quarterly through Q3 2012, then monthly from September 2012); value axis from 0.00 to 90.00]

Page 6: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 7: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 8: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 9: Birds, Bots and Machines - fraud in Twitter and machine learning

Using hacked accounts

Page 11: Birds, Bots and Machines - fraud in Twitter and machine learning

Anything else interesting?

#PalabrasNeciasMovistarSorda

Page 13: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 14: Birds, Bots and Machines - fraud in Twitter and machine learning

Getting profiles

Page 17: Birds, Bots and Machines - fraud in Twitter and machine learning

A random campaign

Page 18: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 19: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 20: Birds, Bots and Machines - fraud in Twitter and machine learning

Lifespan of bots

Page 21: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 22: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 23: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 24: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 25: Birds, Bots and Machines - fraud in Twitter and machine learning

Detour – A few words on privacy

Page 26: Birds, Bots and Machines - fraud in Twitter and machine learning

Tracking

Page 27: Birds, Bots and Machines - fraud in Twitter and machine learning

Advanced tracking

Identify the user:
• Passive data: headers, plugins, browser, OS
• JS: screen resolution, custom resource detection via Plugins API (e.g. printers via PDF, fonts via Flash)

Track ID:
• Cookies, Flash cookies (allow cross-domain references), HTML5 storage, Silverlight
• Java: own download cache, applets can read embedded resource streams

Future? Apps and games in social networks.
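
As a rough illustration of the "identify the user" idea above, the sketch below hashes a handful of passively observed attributes into a short identifier, so nothing needs to be stored on the client. The attribute names, the sample values and the `fingerprint` helper are all hypothetical; the slides do not describe a specific implementation.

```python
import hashlib

def fingerprint(attributes: dict) -> str:
    """Hash observed browser attributes into a short identifier (illustrative only)."""
    # Sort the keys so the same attributes always yield the same digest,
    # regardless of the order in which they were collected.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Made-up attribute values, not taken from the talk.
visit_a = {
    "user_agent": "ExampleBrowser/1.0",
    "os": "ExampleOS",
    "screen": "1920x1080",
    "fonts": "Arial,Verdana",
}
visit_b = dict(reversed(list(visit_a.items())))  # same data, different order
visit_c = {**visit_a, "screen": "1366x768"}      # one attribute differs

print(fingerprint(visit_a) == fingerprint(visit_b))
print(fingerprint(visit_a) == fingerprint(visit_c))
```

The more attributes are combined, the fewer users share the same digest, which is why the slide lists so many independent sources.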

Page 28: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 29: Birds, Bots and Machines - fraud in Twitter and machine learning

Let's play

Page 30: Birds, Bots and Machines - fraud in Twitter and machine learning

Experiment

• 3 months of tracking
• 36 malicious campaigns
• 13,490 profiles
• 195,801 tweets
• 6,519,247 relationships

Page 31: Birds, Bots and Machines - fraud in Twitter and machine learning

Machine Learning in 60 seconds
• Supervised learning
• Training – adaptive models
• Classification
• Key: choose the right attributes
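
To make the supervised-learning workflow above concrete, here is a minimal sketch: train on labelled examples, then classify an unseen one. It uses a nearest-centroid classifier purely for brevity; the slides do not say which algorithm was used, and the two attributes and all values are made up.

```python
from statistics import mean

def train(samples, labels):
    """Training step: compute one centroid (mean feature vector) per label."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [mean(column) for column in zip(*rows)]
    return centroids

def classify(centroids, sample):
    """Classification step: pick the label with the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, sample))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Toy training set: [fraction of tweets with a link, friend/follower ratio].
X = [[0.9, 8.0], [0.8, 12.0], [0.95, 10.0], [0.1, 1.1], [0.2, 0.9], [0.05, 1.5]]
y = ["bot", "bot", "bot", "human", "human", "human"]

model = train(X, y)
print(classify(model, [0.85, 9.0]))
print(classify(model, [0.10, 1.0]))
```

The quality of the result depends almost entirely on whether the chosen attributes separate the classes, which is the point of the last bullet.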

Page 33: Birds, Bots and Machines - fraud in Twitter and machine learning

Feature selection
• Curse of dimensionality
• No new knowledge is generated: choose the right features!

Twitter: username, profileImg, followingCount, followersCount, tweetsCount, fullName, following, followers, numberOfProfileTweets, protected, text, possiblySensitive, source, location, coordinates, description, lang, url, createdAt, timeZone, verified

Derived: meanTimeBetweenTweets, friendFollowerRatio, tweetsKnownRecv, tweetsUnknownRecv, percFollowingFollowers, percProfileTweetsWithLink, percProfileTweetsToSomeone, percProfileTweetsRT, numberOfViasUsed
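
The derived attributes named above can be computed from the raw profile fields. The sketch below shows one plausible way to do it. The exact definitions are assumptions (for example, `friendFollowerRatio` is taken here as following divided by followers, and a "tweet to someone" as a tweet starting with `@`), and the `profile` dictionary layout is hypothetical.

```python
from datetime import datetime

def derive_features(profile: dict) -> dict:
    """Compute a few derived attributes from raw profile data (assumed definitions)."""
    tweets = profile["tweets"]
    times = sorted(t["created_at"] for t in tweets)
    # Gaps in seconds between consecutive tweets.
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
    n = len(tweets)

    def fraction(predicate):
        return sum(predicate(t["text"]) for t in tweets) / n if n else 0.0

    return {
        "meanTimeBetweenTweets": sum(gaps) / len(gaps) if gaps else None,
        # max(..., 1) avoids dividing by zero for profiles with no followers.
        "friendFollowerRatio": profile["followingCount"] / max(profile["followersCount"], 1),
        "percProfileTweetsWithLink": fraction(lambda text: "http" in text),
        "percProfileTweetsToSomeone": fraction(lambda text: text.startswith("@")),
        "percProfileTweetsRT": fraction(lambda text: text.startswith("RT ")),
        "numberOfViasUsed": len({t["source"] for t in tweets}),
    }

# Made-up profile for illustration.
profile = {
    "followingCount": 200,
    "followersCount": 10,
    "tweets": [
        {"created_at": datetime(2013, 1, 1, 12, 0, 0), "text": "@alice check http://example.com", "source": "web"},
        {"created_at": datetime(2013, 1, 1, 12, 0, 30), "text": "RT @bob: hello", "source": "web"},
        {"created_at": datetime(2013, 1, 1, 12, 1, 30), "text": "@carol http://example.com/x", "source": "api-client"},
    ],
}
features = derive_features(profile)
print(features)
```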

Page 34: Birds, Bots and Machines - fraud in Twitter and machine learning

Mean time between tweets

Page 35: Birds, Bots and Machines - fraud in Twitter and machine learning

Tweets to someone

Page 36: Birds, Bots and Machines - fraud in Twitter and machine learning

Tweets to someone

After some testing and feature-selection algorithms:

• numberOfVias
• tweetsToSomeone
• tweetsWithLink
• followingFollowers
• friendFollowerRatio
• tweetsKnownReceiver
• tweetsUnknownReceiver
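
The slides do not say which feature-selection algorithms produced this shortlist. As a stand-in, the sketch below ranks features with a deliberately simple score: the gap between the two class means, divided by the overall spread of the feature. The data and the `rank_features` helper are made up for illustration.

```python
from statistics import mean, pstdev

def rank_features(rows, labels, names):
    """Rank features by class-mean separation relative to overall spread."""
    scores = {}
    for index, name in enumerate(names):
        column = [row[index] for row in rows]
        positives = [v for v, label in zip(column, labels) if label == 1]
        negatives = [v for v, label in zip(column, labels) if label == 0]
        spread = pstdev(column) or 1.0  # constant columns get a neutral divisor
        scores[name] = abs(mean(positives) - mean(negatives)) / spread
    return sorted(scores, key=scores.get, reverse=True), scores

names = ["tweetsWithLink", "friendFollowerRatio", "tweetsCount"]
rows = [[0.9, 10.0, 120], [0.8, 9.0, 300], [0.1, 1.0, 310], [0.2, 1.5, 110]]
labels = [1, 1, 0, 0]  # 1 = malicious, 0 = legitimate (toy labels)

ranking, scores = rank_features(rows, labels, names)
print(ranking)
```

In this toy data `tweetsCount` has the same mean in both classes, so it scores zero and would be the first candidate to drop.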

Page 37: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 38: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 39: Birds, Bots and Machines - fraud in Twitter and machine learning

Avoiding detection

You are doing it wrong!

Page 40: Birds, Bots and Machines - fraud in Twitter and machine learning

Avoiding semantic analysis
• if its do you me your my do it my be find is but on are its rt that was
• I a me at get out your they on rt if I get rt can a
• u you rt find in I that that your my my find one you so is is my you this but get all a one its it
• they with its your get me of I
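
The sample tweets above consist almost entirely of filler words, which gives a keyword-based filter nothing to match. One simple way to quantify that, sketched below, is the fraction of tokens that are filler words. The `STOP_WORDS` set is a small hypothetical list built from the words on the slide, not a standard stop-word list, and this measure is not something the talk claims to have used.

```python
# Small, hand-picked filler-word list (an assumption for this sketch).
STOP_WORDS = {
    "a", "i", "u", "me", "my", "you", "your", "it", "its", "is", "are", "was",
    "be", "do", "if", "but", "on", "at", "in", "of", "with", "that", "this",
    "they", "so", "all", "one", "can", "get", "out", "find", "rt",
}

def stop_word_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens that are filler words."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in STOP_WORDS for token in tokens) / len(tokens)

bot_like = "they with its your get me of I"
ordinary = "Mmmm hot chocolate with cream"

print(stop_word_ratio(bot_like))
print(stop_word_ratio(ordinary))
```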

Page 41: Birds, Bots and Machines - fraud in Twitter and machine learning

Avoiding relationship checks

Page 42: Birds, Bots and Machines - fraud in Twitter and machine learning

Avoiding relationship checks

Or just overflow with fake profiles …

Page 43: Birds, Bots and Machines - fraud in Twitter and machine learning

DIY

Page 44: Birds, Bots and Machines - fraud in Twitter and machine learning

Finding malicious profiles
• Not so hard …

Page 45: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 46: Birds, Bots and Machines - fraud in Twitter and machine learning

AdrianaDickson7

MyrtleTerry11

PatricaFitzpat6

RobertP97792514

RochelleBeasle8

ShannonMunoz13

1 week later…

5200 profiles in this campaign

Around 250 new profiles created every day
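
The six screen names above share a visible shape: a capitalised first name, a capitalised (sometimes truncated) surname and trailing digits, all within Twitter's 15-character screen-name limit. A minimal sketch of a pattern check for that shape follows. The regular expression is an assumption inferred from these six examples only; plenty of legitimate accounts would match it too, so it can only shortlist candidates.

```python
import re

# Capitalised word, second capitalised word (possibly cut to one letter), digits.
GENERATED_NAME = re.compile(r"^[A-Z][a-z]+[A-Z][a-z]*\d+$")

def looks_generated(screen_name: str) -> bool:
    """Return True if the screen name has the Name + Surname + digits shape."""
    return len(screen_name) <= 15 and bool(GENERATED_NAME.match(screen_name))

campaign = [
    "AdrianaDickson7",
    "MyrtleTerry11",
    "PatricaFitzpat6",
    "RobertP97792514",
    "RochelleBeasle8",
    "ShannonMunoz13",
]

print([name for name in campaign if looks_generated(name)])
print(looks_generated("trompi"))
```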

Page 47: Birds, Bots and Machines - fraud in Twitter and machine learning

[Chart: Following; one axis from 0 to 300, the other from 0 to 180]

Page 48: Birds, Bots and Machines - fraud in Twitter and machine learning

[Chart: Followers; one axis from 0 to 300, the other from 0 to 180]

Page 49: Birds, Bots and Machines - fraud in Twitter and machine learning

Top tweets sent
• Mmmm hot chocolate with cream
• Beyonce looks so hot in her new ad
• So Hot
• Spain !! Too hot
• hot summer
• a hot bubble bath is much needed
• Tea water supposed to hot ya now
• Air conditioner-laying on the bed-naked-relax-heaven! So hot tonight!
• playing piano and guitar r the only things i can do right in life does this make me hot enough for a boyfriend yet
• Austin mahone is just like another justin beiber..he is hot tho!

1800 different tweets
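
Every tweet in the list above contains the same word. A short sketch of how such a shared keyword could be surfaced automatically: count, for each word, how many tweets contain it, and look at the top entry. The `document_frequency` helper is hypothetical and the input is just the ten tweets shown on the slide.

```python
import re
from collections import Counter

def document_frequency(tweets):
    """Count in how many tweets each lowercase word appears at least once."""
    counts = Counter()
    for tweet in tweets:
        counts.update(set(re.findall(r"[a-z]+", tweet.lower())))
    return counts

tweets = [
    "Mmmm hot chocolate with cream",
    "Beyonce looks so hot in her new ad",
    "So Hot",
    "Spain !! Too hot",
    "hot summer",
    "a hot bubble bath is much needed",
    "Tea water supposed to hot ya now",
    "Air conditioner-laying on the bed-naked-relax-heaven! So hot tonight!",
    "playing piano and guitar r the only things i can do right in life "
    "does this make me hot enough for a boyfriend yet",
    "Austin mahone is just like another justin beiber..he is hot tho!",
]

top_word, top_count = document_frequency(tweets).most_common(1)[0]
print(top_word, top_count)
```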

Page 51: Birds, Bots and Machines - fraud in Twitter and machine learning
Page 52: Birds, Bots and Machines - fraud in Twitter and machine learning

Not only limited to Twitter

Page 55: Birds, Bots and Machines - fraud in Twitter and machine learning

Conclusions

It is relatively easy to find anomalies

Bots are there for different reasons, mostly fraud-related

Machine learning: lots of resources!

Page 59: Birds, Bots and Machines - fraud in Twitter and machine learning

Thank you

Questions?

Vicente Díaz @trompi

Senior Security Analyst, Global Research and Analysis Team