early detection of twitter trends milan stanojevic [email protected] university of belgrade...

22
EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC [email protected] UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

Upload: spencer-gibson

Post on 23-Dec-2015

215 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

EARLY DETECTION OF TWITTER TRENDS

MILAN STANOJEVIC

[email protected]

UNIVERSITY OF BELGRADE

SCHOOL OF ELECTRICAL ENGINEERING

Page 2: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CONTENTS

Introduction

Trending topics

Parametric model

Data-Driven approach

Experiment results

Conclusion

2/22

Milan Stanojevic

Page 3: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

INTRODUCTION

Events occur in large datasets

We need: detection

classification

prediction

Parametric models are popular but overly simplistic

Nonparametric approach is proposed for time series inference

Observed signal is compared to two sets of reference signals – positive and negative examples

Is there enough information for earlier prediction?

(spoiler alert: YES)

3/22

Milan Stanojevic

Page 4: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

TRENDING TOPICS

Twitter: a global communication network

Tweet: a short, public message

Topic: a phrase in a tweet

Trending topic (trend): a topic that becomes popular

4/22

Milan Stanojevic

Page 5: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

PARAMETRIC MODEL

Expect certain type of pattern usually constant + jumps

Fit parameter in data e.g. size of a jump

5/22

Milan Stanojevic

Page 6: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

DATA-DRIVEN APPROACH

All the information needed is in the data

Assumptions: tweets are written by people

people are simple:

in how they spread information

in how they connect to each other

there is a small number of distinct ways in which a topic becomes trending

6/22

Milan Stanojevic

Page 7: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

DATA-DRIVEN APPROACH

7/22

Milan Stanojevic

Page 8: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

DATA DRIVEN APPROACH

8/22

Milan Stanojevic

Page 9: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

9/22

Milan Stanojevic

Page 10: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

10/22

Milan Stanojevic

Page 11: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

11/22

Milan Stanojevic

Page 12: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

12/22

Milan Stanojevic

Page 13: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

13/22

Milan Stanojevic

Page 14: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

14/22

Milan Stanojevic

Page 15: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

15/22

Milan Stanojevic

Page 16: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

16/22

Milan Stanojevic

Page 17: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

17/22

Milan Stanojevic

Page 18: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CLASSIFICATION BY EXPERTS

Properties

simple: computation of distances

scalable: computation is easily parallelized

nonparametric: model “parameters” scale along with the data

18/22

Milan Stanojevic

Page 19: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

EXPERIMENT

SETUP

Dataset: 500 trends

500 non-trends

Do trend detection of 50% holdout set of topics

Online signal classification

RESULTS

Early detection 79% rate of early detection, 1.43hrs average

Low rate of error 95% true positive rate, 4% false positive rate

19/22

Milan Stanojevic

Page 20: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

EXPERIMENT

FPR / TPR Tradeoff

Early / Late Tradeoff

20/22

Milan Stanojevic

Page 21: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

CONCLUSION

New approach to detecting Twitter trends

Generalized time series analysis method: Classification

Prediction

Anomaly detection

Possible applications: Movie ticket sales

Stock prices

etc.

21/22

Milan Stanojevic

Page 22: EARLY DETECTION OF TWITTER TRENDS MILAN STANOJEVIC sm123317m@student.etf.rs UNIVERSITY OF BELGRADE SCHOOL OF ELECTRICAL ENGINEERING

BIBLIOGRAPHY

Trend or No Trend: A Novel Nonparametric Method for Classifying Time Series

Stanislav Nikolov

Master thesis

Massachusetts Institute of Technology (2011)

22/22

Milan [email protected]