jure leskovec phd: machine learning department, cmu now: computer science department, stanford...

25
Dynamics of large networks Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

Upload: charles-price

Post on 30-Dec-2015

233 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

Dynamics of large networks

Jure LeskovecPhD: Machine Learning Department, CMUNow: Computer Science Department, Stanford University

Page 2: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

2

Web today – Diverse applications

Diverse on-line

computing applications

Page 3: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

3

Web today – Millions of users

Large scaleMillions of

users

Page 4: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

4

Web today – Rich content

Rich user generated

content

Page 5: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

5

Web today – Highly dynamic

Content is constantly

being updated and changed

Page 6: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

6

Web today – Traces of activity

Massive traces of human social

activity are collected

Page 7: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

7

Web today – Rich interactions

Rich interactions

between users and content

Page 8: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

8

Web today – Interaction networks

Modeled as interaction network

Page 9: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

9

Web today – Immense posibilities

Large on-line applications

with hundreds of millions of

users

Web is like a “laboratory” for studying millions of

humans

Page 10: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

10

Testing the Small-world hypothesis

[w/ Horvitz WWW ‘08]

MSN Messenger

Average path length is 6.6

90% of nodes is reachable <8 steps

Network of who talks to whom on MSN Messenger:

240M nodes, 1.3 billion edges

Page 11: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

11

Why study web and networks?

Large on-line applications

with hundreds of millions of

users

Build understanding and theory: How users create content and interact with it and

among themselves?

Build better on-line applications: How to design better services and algorithms?

Page 12: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

12

Thesis: Network dynamics

Network evolution How network structure changes as the

network grows and evolves?

Diffusion and cascading behavior How does information and diseases spread

over networks?

Page 13: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

13

Thesis: Network evolution (1)

Observations: measuring Q: How do does network structure evolve? A: Networks densify and diameter shrinks

Densification

a=1.6

N(t)

E(t)

time

diam

eter

Diameter

Network diameter shrinks over time

[w/ Kleinberg and Faloutsos KDD ‘05]

Page 14: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

14

Thesis: Network evolution (2)

Models: understanding Q: What is a good model/explanation? A: Forest Fire model

uw

v

[w/ Kleinberg and Faloutsos KDD ‘05]

Page 15: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

15

Thesis: Network evolution (3)

Algorithms: doing things better Q: How to generate realistic graphs? A: Kronecker graphs

[w/ Faloutsos ICML ‘07]

Page 16: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

16

Thesis: Diffusion and cascades (1)

Observations: measuring Q: How does information propagate over the web? A: Meme-tracking

[w/ Backstrom and Kleinberg KDD ‘09]

Page 17: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

17

Thesis: Diffusion and cascades (2)

Models: understanding Q: How to information propagation A: Zero-crossing model

[w/ Goetz, McGlohon and Faloutsos ICWSM ‘09]

Page 18: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

18

Thesis: Diffusion and cascades (3)

Algorithms: doing things better Q: How to identify influential nodes and epidemics? A: CELF (cost-effective lazy forward-selection)

[w/ Krause, Guestrin, Glance and Faloutsos KDD ‘07]

Page 19: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

19

Thesis: Size matters

Massive data: MSN Messenger network [w/ Horvitz, WWW ’08]

▪ 240M people, 255B messages Product recommendations [w/ Adamic, Huberman EC ‘06]

▪ 4M people, 16M recommendations Blogosphere [w/ Backstrom, Kleinberg KDD ‘09]

▪ 164M posts, 127M links

Benefit: Properties become “visible” E.g.: In large networks only small clusters exist

[w/ Dasgupta, Lang and Mahoney WWW ’08]

Page 20: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

20

Thesis: Reflections

Why are networks the way they are? Only recently have basic properties been

observed on a large scale Confirms social science intuitions; calls others

into question

What are good tractable network models? Builds intuition and understanding

Benefits of working with large data Observe structures not visible at smaller scales

Page 21: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

21

Challenges

Why are networks the way they are? Richer networks, richer more detailed data

New findings and observations

More accurate models Predictive modeling

Large scale Will find phenomena and emergent patterns not

visible at small scales

Page 22: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

22

Future directions

Global predictive models Online massively multi-player games

Information diffusion When, where and what post will create a cascade?

Where to tap the network to get right effects? Social Media Marketing

Steering the evolution of the network Cultivating the social network

Page 23: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

23

What’s next?

Observations: Data analysis

Models: Predictions

Algorithms: Applications

Actively influencing the network

Page 24: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

24

Christos FaloutsosJon KleinbergEric Horvitz

Carlos GuestrinAndreas KrauseSusan Dumais

Andrew TomkinsRavi Kumar

Lada AdamicBernardo Huberman

Kevin LangMichael Manohey

Zoubin GharamaniAnirban Dasgupta

Tom MitchellJohn Lafferty

Larry WassermanAvrim Blum

Sam MaddenMichalis Faloutsos

Ajit SinghJohn Shawe-Taylor

Lars BackstromMarko Grobelnik

Natasa Milic-FraylingDunja Mladenic

Jeff HammerbacherMatt Hurst

Deepay ChakrabartiMary McGlohonNatalie Glance

Jeanne VanBriesen

Microsoft ResearchYahoo Research

FacebookEve online

Spinn3rLinkedIn

Nielsen BuzzMetricsMicrosoft Live Labs

Google

Acks

Page 25: Jure Leskovec PhD: Machine Learning Department, CMU Now: Computer Science Department, Stanford University

Questions:[email protected]

Thesis:Google: Jure thesis

Data + Code:http://snap.stanford.edu

THANKS!