

Fast Algorithms for Querying and Mining Large Graphs

Hanghang Tong
Machine Learning Department
Carnegie Mellon University
htong@cs.cmu.edu
http://www.cs.cmu.edu/~htong

Graphs are everywhere!

Figure panels: Internet Map [Koren 2009]; Food Web [2007]; Protein Network [Salthe 2004]; Social Network [Newman 2005]; Web Graph; Terrorist Network [Krebs 2002].

Why Do We Care?

Research Theme

How can we help users understand and utilize large graph-related data?

A1: Social Networks

• Facebook (300m users, $10bn value, $500mn revenue)
• MSN (240m users, 4.5pb); Myspace (110m users)
• LinkedIn (50m users, $1bn value); Twitter (18m users)

How to help users explore such networks? (e.g., find strange persons, communities, locate common friends, etc.)

A2: Network Forensics [Sun+ 2007]

How to detect abnormal traffic?

Figure: a traffic graph (e.g., between ibm.com and cmu.edu) and its adjacency matrix, with panels for port scanning, DDoS, and normal traffic.

Footnote: Rows are IP sources; columns are IP destinations.

A3: Business Intelligence

What is IBM's rank in the global service business over the years?

Figure: one graph per year (2005, 2006, 2007, ...) connecting business reviews (NY Times, Forbes, Reuters) and keywords (Hardware, Service, IBM), plus a plot of the rank of IBM in Global Service (higher is better) against year.

Footnote: nodes are business reviews and keywords; edges mean 'reporting'.

A4: Financial Fraud Detection [Tong+ 2007]

• 7.5% of U.S. adults lost money to financial fraud
• 50%+ of US corporations lost >= $500,000 [Albrecht+ 2001], e.g., Enron ($70bn)
• Total cost of financial fraud: $1 trillion [Ansari 2006]

How to detect abnormal transaction patterns? (e.g., a money-laundering ring)

Figure legend: anonymous accounts; anonymous banks.

A5: Immunization

How to select the k 'best' nodes for immunization?

Figure: example graph with nodes numbered 1 to 34.

Footnote: SARS cost 700+ lives and $40+ bn.

This Talk

• Querying [Goal: query complex relationships]
  – Q.1. Find complex user-specific patterns;
  – Q.2. Proximity tracking;
  – Q.3. Answer all the above questions quickly.

• Mining [Goal: find interesting patterns]
  – M.1. Immunization;
  – M.2. Spot anomalies.

Tasks vs. Applications

Table: tasks (rows: Q1, Q2, Q3, M1, M2) vs. applications (columns: A1, A2, A3, A4, A5).

Applications:
• A1: Social Networks
• A2: Network Forensics
• A3: Business Intelligence
• A4: Financial Fraud
• A5: Immunization

Tasks:
• Q1: Complex User-Specific Patterns
• Q2: Proximity Tracking
• Q3: Fast Proximity Computing
• M1: Immunization
• M2: Anomaly Detection

Overview

Figure: roadmap of the talk, with blocks for Q1, Q2, Q3, M1, and M2.

Overview

• Q1: CePS, iPoG (KDD06, ICDM08, CIKM09)
• Q2: pTrack/cTrack (SDM08, SAM08)
• Q3: FastProx (ICDM06, KAIS07, KDD07 b, ICDM08); FastProx (SDM08, SAM08)
• M1: NetShield
• M2: Colibri-S (KDD08); Colibri-D (KDD08)

Proximity Measurement (Background)

Q: How close is A to B?

a.k.a. relevance, closeness, 'similarity', ...

Figure: example graph on nodes A, B, D, E, F, G, H, I, J with unit edge weights.

Random Walk with Restart [Tong+ ICDM 2006] (Background)

Figure: 12-node example graph with query node 4. More red, more relevant; nearby nodes get higher scores.

Ranking vector r_4 (query: node 4):

Node      Score
Node 1    0.13
Node 2    0.10
Node 3    0.13
Node 4    0.22
Node 5    0.13
Node 6    0.05
Node 7    0.05
Node 8    0.08
Node 9    0.04
Node 10   0.03
Node 11   0.04
Node 12   0.02

Intuitions: Why Is RWR a Good Score? (Background)

Figure: example graph on nodes 1 to 15 and 20, with source node 1, target node 20, and a highlighted red path from source to target.

Score (Red Path) = (1-c) c^6 x W(1,3) x W(3,4) x ... x W(14,20)

• c^6: penalty for the length of the path
• W(1,3) x W(3,4) x ... x W(14,20): probability of traversing the path

Footnote: (1-c) is the restart probability in RWR; W is the normalized adjacency matrix of the graph.

Intuitions: Why Is RWR a Good Score? (Background)

Prox (1, 20) = Score (Red Path) + Score (Green Path) + Score (Yellow Path) + Score (Purple Path) + ...

A high proximity means many short, highly weighted paths.

Figure: the same example graph, with several colored paths from source node 1 to target node 20.
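The path-sum reading of proximity can be checked numerically. The sketch below is an illustration under assumptions, not code from the talk: the 5-node graph is hypothetical, W is obtained by column-normalizing the adjacency matrix, and the function names are made up. It compares the direct solution of r = c W r + (1-c) e with the sum of (1-c) c^k W^k e over walk lengths k, where the k-th term collects the scores of all walks of length k.

```python
import numpy as np


def column_normalize(A):
    """Scale each column to sum to 1, giving the normalized matrix W."""
    return A / A.sum(axis=0, keepdims=True)


def rwr_closed_form(W, source, c):
    """Solve r = c W r + (1 - c) e directly for one source node."""
    n = W.shape[0]
    e = np.zeros(n)
    e[source] = 1.0
    return (1 - c) * np.linalg.solve(np.eye(n) - c * W, e)


def rwr_walk_sum(W, source, c, max_len):
    """Add up (1 - c) c^k W^k e over all walk lengths k = 0..max_len."""
    term = np.zeros(W.shape[0])
    term[source] = 1.0              # the single walk of length 0
    total = np.zeros_like(term)
    for k in range(max_len + 1):
        total += (1 - c) * (c ** k) * term
        term = W @ term             # extend every walk by one more edge
    return total


# Hypothetical toy graph: undirected, unit edge weights.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
W = column_normalize(A)
exact = rwr_closed_form(W, source=0, c=0.9)
by_walks = rwr_walk_sum(W, source=0, c=0.9, max_len=400)
print(np.round(exact, 3))
print(np.max(np.abs(exact - by_walks)))  # gap between the two computations
```

Longer walks are damped by c^k, so truncating the sum at a few hundred steps already reproduces the direct solution closely for c = 0.9.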


Q1: Find Complex User-Specific Patterns

• Q1.1. Center-Piece Subgraph Discovery
  – e.g., who is the mastermind criminal, given some suspects X, Y and Z?

• Q1.2. Interactive Querying (e.g., Negation)
  – e.g., find the conferences most similar to KDD, but not like ICML?

Footnote: Our algorithms for both Q1.1 and Q1.2 are to be deployed in a real system (Cyano) at IBM.


Q1.1 Center-Piece Subgraph Discovery [Tong+ KDD 06]

Q: How to find the hub for the black nodes?

Figure: input (original graph with black query nodes A, B, C) and output (CePS, with the center-piece node shown in red).

Red: Max (Prox(A, Red) x Prox(B, Red) x Prox(C, Red))
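A minimal sketch of the AND-query scoring rule, under stated assumptions: Prox is taken to be the RWR score from each query node, the 5-node graph is hypothetical, and the names `rwr` and `center_piece` are made up for illustration. The full CePS method in [Tong+ KDD 06] also extracts a connecting subgraph, which is not attempted here.

```python
import numpy as np


def rwr(W, source, c=0.9):
    """RWR scores from one source: solve r = c W r + (1 - c) e."""
    n = W.shape[0]
    e = np.zeros(n)
    e[source] = 1.0
    return (1 - c) * np.linalg.solve(np.eye(n) - c * W, e)


def center_piece(A, queries, c=0.9):
    """Return the non-query node that maximizes the product of proximities."""
    W = A / A.sum(axis=0, keepdims=True)    # column-normalized adjacency
    score = np.ones(A.shape[0])
    for q in queries:
        score *= rwr(W, q, c)               # AND: close to every query node
    score[list(queries)] = -np.inf          # never return a query node itself
    return int(np.argmax(score)), score


# Hypothetical graph: queries 0, 1, 2 all touch node 3; node 4 hangs off node 0.
A = np.zeros((5, 5))
for i, j in [(0, 3), (1, 3), (2, 3), (0, 4)]:
    A[i, j] = A[j, i] = 1.0

best, score = center_piece(A, queries=[0, 1, 2])
print(best)
```

Multiplying the per-query scores means a node close to only one query (node 4 here) is penalized relative to a node close to all of them.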

CePS: Example (AND Query)

DBLP co-authorship network: 400,000 authors, 2,000,000 edges

Figure: resulting subgraph with nodes R. Agrawal, Jiawei Han, V. Vapnik, M. Jordan, H.V. Jagadish, Laks V.S. Lakshmanan, Heikki Mannila, Christos Faloutsos, Padhraic Smyth, Corinna Cortes, and Daryl Pregibon; edges carry numeric labels.

K_SoftAND: Relaxation of AND

Asking an AND query? No answer!

• Disconnected communities
• Noise

CePS: 2 SoftAND

Figure: 2 SoftAND result with two groups, labeled Stat. and DB. Nodes: R. Agrawal, Jiawei Han, V. Vapnik, M. Jordan, H.V. Jagadish, Laks V.S. Lakshmanan, Umeshwar Dayal, Bernhard Scholkopf, Peter L. Bartlett, and Alex J. Smola; edges carry numeric labels.


Q1.2: Interactive Querying

Figure: query results refined over repeated rounds of user feedback.

Q1.2 iPoG for Interactive Querying [Tong+ ICDM 08, CIKM 09]

What are the conferences most related to KDD? (DBLP author-conference bipartite graph)

• Initial results: ICDM, ICML, SDM, VLDB, ICDE, SIGMOD, NIPS, PKDD, IJCAI, PAKDD
• After 'No' to ICML: ICDM, SDM, PKDD, ICDE, VLDB, SIGMOD, PAKDD, CIKM, SIGIR, WWW
• After 'Yes' to SIGIR: SIGIR, TREC, CIKM, ECIR, CLEF, ICDM, JCDL, VLDB, ACL, ICDE

Observations:
• Two main sub-communities in KDD: DBs (green) vs. ML/AI (red).
• Negative feedback on ICML excludes other ML/AI conferences (NIPS, IJCAI).
• Positive feedback on SIGIR brings in more IR (brown) conferences.


Q2.2 pTrack: Challenge [Tong+ SDM 08]

• Observations (CePS, iPoG, ...)
  – All for static graphs
  – Proximity: main tool

• Graphs are evolving over time!
  – New nodes/edges show up;
  – Existing nodes/edges die out;
  – Edge weights change...

Q: How close is Philip Yu to DBs over the years?
A: Track proximity, incrementally!

Author-Keyword Bipartite Graphs (NIPS)

Figure: one author-keyword bipartite graph per year (NIPS 1993, NIPS 1994, NIPS 1995, ...), each connecting authors (Sejnowski, Jordan) and keywords (Neural Network, ICA, Bayes).

pTrack: Trend analysis on graph level

Figure: rank of influence vs. year for M. Jordan, G. Hinton, C. Koch, and T. Sejnowski.

pTrack: Problem Definitions

• [Given]
  – a large, skewed, time-evolving bipartite graph,
  – the query nodes of interest

• [Track]
  – (1) the top-k most related nodes for each query node at each time step t;
  – (2) the proximity score (or rank of proximity) between any two query nodes at each time step t.

pTrack: Philip S. Yu's Top-5 conferences up to each year

• 1992: ICDE, ICDCS, SIGMETRICS, PDIS, VLDB
• 1997: CIKM, ICDCS, ICDE, SIGMETRICS, ICMCS
• 2002: KDD, SIGMOD, ICDM, CIKM, ICDCS
• 2007: ICDM, KDD, ICDE, SDM, VLDB

Earlier years: Databases, Performance, Distributed Sys. Later years: Databases, Data Mining.

DBLP (Au. x Conf.): 400k authors, 3.5k conferences, 20 years

KDD's Rank wrt. VLDB over the years

Data Mining and Databases are getting closer and closer.

Figure: proximity rank vs. year, next to an example author-conference bipartite graph (authors John, Tom, Bob, Carl, Van, Roy; conferences KDD, RECOMB, ICML, VLDB).

Q2: pTrack on Bipartite Graphs

• Computational Challenges (assuming )
  – Iterative method: O(m)
  – Straight-forward update
• Example
  – NetFlix (2.6m users x 18k movies, 100m ratings)
  – Both need >1hr

Q2: pTrack on Bipartite Graphs

• Observation #1
  – n1 authors; n2 conferences;
  – n1 >> n2 (e.g., 400k authors, 3.5k conf.s in DBLP)
• Observation #2
  – m edges changed (n1 authors, n2 conf.s)
  – rank of update: at most min(m, n2), i.e., a low-rank update
• Proposed algorithm: Fast-Update

Theorem: (Tong+ 2008) (1) Fast-Update has no quality loss (2) Fast-Update is
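Observation #1 (n1 >> n2) suggests that proximity on a bipartite graph only needs an n2 x n2 matrix inversion. The sketch below illustrates that with standard block elimination; it is not the Fast-Update algorithm itself, whose details are not on the slides. The function names are hypothetical, and the small author-conference matrix is borrowed from the LRA example later in the talk purely as test data.

```python
import numpy as np


def bipartite_rwr_matrix(M, c=0.9):
    """(I - cW)^-1 for a bipartite graph, inverting only an n2 x n2 matrix.

    M is the n1 x n2 biadjacency matrix (authors x conferences); every row
    and every column is assumed to contain at least one nonzero entry.
    """
    n1, n2 = M.shape
    Mc = M / M.sum(axis=0, keepdims=True)        # conference -> author steps
    Mr = (M / M.sum(axis=1, keepdims=True)).T    # author -> conference steps
    core = np.linalg.inv(np.eye(n2) - c * c * (Mr @ Mc))   # n2 x n2 only
    return np.block([
        [np.eye(n1) + c * c * (Mc @ core @ Mr), c * (Mc @ core)],
        [c * (core @ Mr), core],
    ])


def full_rwr_matrix(M, c=0.9):
    """Reference: invert the full (n1 + n2) x (n1 + n2) system directly."""
    n1, n2 = M.shape
    A = np.block([[np.zeros((n1, n1)), M], [M.T, np.zeros((n2, n2))]])
    W = A / A.sum(axis=0, keepdims=True)         # column-normalized adjacency
    return np.linalg.inv(np.eye(n1 + n2) - c * W)


M = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
Q_small = bipartite_rwr_matrix(M)
Q_full = full_rwr_matrix(M)
print(np.max(np.abs(Q_small - Q_full)))  # gap between the two computations
```

With 400k authors and 3.5k conferences, the inverted matrix would be 3.5k x 3.5k rather than roughly 400k x 400k, which is the kind of saving the observation points to.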

Q2: Speed Comparison

Figure: log(Time) (seconds) across data sets; our method achieves 40x and 176x speedups.


RWR: Think of it as Wine Spill (Background)

1. Spill a drop of wine on cloth
2. Spread/diffuse to the neighborhood

RWR: Wine Spill on a Graph (Background)

Figure: a wine spill on cloth, side by side with RWR on a graph starting from the query node.

Random Walk with Restart (Background)

Figure: the 12-node example graph.

Computing RWR

$\vec{r}_i = c\,W\,\vec{r}_i + (1-c)\,\vec{e}_i$

• $\vec{r}_i$: ranking vector (n x 1)
• W: (normalized) adjacency matrix (n x n)
• $\vec{e}_i$: starting vector (n x 1)
• 1-c: restart probability

Example (12-node graph, query node 4, c = 0.9):

r_4 = 0.9 x W x r_4 + 0.1 x e_4

r_4 = (0.13, 0.10, 0.13, 0.22, 0.13, 0.05, 0.05, 0.08, 0.04, 0.03, 0.04, 0.02)^T
e_4 = (0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)^T

W =
  0    1/3  1/3  1/3  0    0    0    0    0    0    0    0
  1/3  0    1/3  0    0    0    0    1/4  0    0    0    0
  1/3  1/3  0    1/3  0    0    0    0    0    0    0    0
  1/3  0    1/3  0    1/4  0    0    0    0    0    0    0
  0    0    0    1/3  0    1/2  1/2  1/4  0    0    0    0
  0    0    0    0    1/4  0    1/2  0    0    0    0    0
  0    0    0    0    1/4  1/2  0    0    0    0    0    0
  0    1/3  0    0    1/4  0    0    0    1/2  0    1/3  0
  0    0    0    0    0    0    0    1/4  0    1/3  0    0
  0    0    0    0    0    0    0    0    1/2  0    1/3  1/2
  0    0    0    0    0    0    0    1/4  0    1/3  0    1/2
  0    0    0    0    0    0    0    0    0    1/3  1/3  0

Footnote: Maxwell Equation for Web [Chakrabarti]
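The fixed-point equation can be iterated directly. The sketch below is illustrative only: the edge list is my reading of the nonzero pattern of the slide's matrix W (an assumption, since the slide is damaged in extraction), and the function names are made up. It repeats r <- c W r + (1-c) e until the vector stops changing.

```python
import numpy as np

# Undirected edges of the 12-node example, 1-indexed (assumed from the slide).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 8), (3, 4), (4, 5), (5, 6),
         (5, 7), (5, 8), (6, 7), (8, 9), (8, 11), (9, 10), (10, 11),
         (10, 12), (11, 12)]


def build_W(edges, n):
    """Column-normalized adjacency matrix of an undirected graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1.0
    return A / A.sum(axis=0, keepdims=True)


def rwr_iterate(W, query, c=0.9, tol=1e-12, max_iter=1000):
    """Iterate r <- c W r + (1 - c) e until the L1 change drops below tol."""
    e = np.zeros(W.shape[0])
    e[query] = 1.0
    r = e.copy()
    for _ in range(max_iter):
        r_next = c * (W @ r) + (1 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


W = build_W(EDGES, 12)
r4 = rwr_iterate(W, query=3)          # query node 4 is index 3
print(np.round(r4, 2))                # compare with the slide's rounded vector
```

Each iteration costs one sparse matrix-vector product in practice, which is where the O(m x Iter) on-line cost of the on-the-fly approach comes from.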

Computing RWR

$Q = (I - cW)^{-1}$

Solving $\vec{r}_i = cW\vec{r}_i + (1-c)\vec{e}_i$ gives $\vec{r}_i = (1-c)\,Q\,\vec{e}_i$, so the ranking vector $\vec{r}_4$ of query node 4 is (1-c) times the 4th column of Q.

How to get (elements) of Q?

Figure: the 12-node example graph.

Footnote: 1-c is the restart probability; W is the normalized adjacency matrix.

Computing RWR

• OnTheFly
  – No pre-computation
  – Light storage cost (W)
  – Slow on-line response: O(m x Iter)

• Pre-Compute
  – Fast on-line response
  – Prohibitive pre-compute cost: O(n^3)
  – Prohibitive storage cost: O(n^2)

Q: How to Balance?

On-line vs. off-line cost

Goal: Efficiently get (elements) of Q

B_Lin: Basic Idea [Tong+ ICDM 2006]

Figure: the 12-node example graph processed in three steps: find communities, fix the remaining (cross-community) links, and combine the results into the ranking vector.

B_Lin: Basic Idea [Tong+ ICDM 2006]

• Pre-Compute Stage
  – Find communities
  – Pre-compute within-community scores

• On-Line Stage
  – Fix the influence of the bridges (cross-community links)

B_Lin: details

$W = W_1 + W_2$

• $W_1$: within-community links
• $W_2$: cross-community links

$W \approx W_1 + USV$ (low-rank approximation of the cross-community part)

B_Lin: details

$(I - cW)^{-1} \approx (I - cW_1 - cUSV)^{-1}$

• $I - cW_1$: easy to invert
• $cUSV$: LRA difference

Sherman–Morrison Lemma!

B_Lin: Pre-Compute Stage

• Q: How to efficiently compute and store Q?
• A: A few small, instead of ONE BIG, matrix inversions

Footnote: $Q_1 = (I - cW_1)^{-1}$

B_Lin: On-Line Stage

• Q: How to efficiently recover one column of Q?
• A: A few, instead of MANY, matrix-vector multiplications
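The two stages can be sketched with the Sherman–Morrison–Woodbury identity, $(X - UCV)^{-1} = X^{-1} + X^{-1}U(C^{-1} - VX^{-1}U)^{-1}VX^{-1}$, applied with $X = I - cW_1$ and $C = cS$. This is a hedged illustration, not the authors' implementation: the community assignment is supplied by hand instead of being found by a partitioning algorithm, the low-rank factors come from a plain SVD, the 12-node edge list is my reading of the earlier example, and all function names are hypothetical.

```python
import numpy as np

EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 8), (3, 4), (4, 5), (5, 6),
         (5, 7), (5, 8), (6, 7), (8, 9), (8, 11), (9, 10), (10, 11),
         (10, 12), (11, 12)]


def build_W(edges, n):
    """Column-normalized adjacency matrix of an undirected graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i - 1, j - 1] = A[j - 1, i - 1] = 1.0
    return A / A.sum(axis=0, keepdims=True)


def b_lin_precompute(W, communities, c=0.9, rank=None):
    """Off-line stage. `communities` must put every node in exactly one list."""
    W1 = np.zeros_like(W)
    Q1 = np.zeros_like(W)
    for nodes in communities:
        idx = np.ix_(nodes, nodes)
        W1[idx] = W[idx]                          # keep within-community links
        # A few small inversions: one block of (I - c W1)^-1 per community.
        Q1[idx] = np.linalg.inv(np.eye(len(nodes)) - c * W1[idx])
    U, s, Vt = np.linalg.svd(W - W1)              # cross-community links
    r = int(np.sum(s > 1e-12)) if rank is None else rank
    U, S_inv, V = U[:, :r], np.diag(1.0 / s[:r]), Vt[:r, :]
    Lam = np.linalg.inv(S_inv - c * (V @ Q1 @ U))
    return Q1, U, Lam, V


def b_lin_query(pre, query, c=0.9):
    """On-line stage: a handful of matrix-vector products per query."""
    Q1, U, Lam, V = pre
    e = np.zeros(Q1.shape[0])
    e[query] = 1.0
    q = Q1 @ e
    return (1 - c) * (q + c * (Q1 @ (U @ (Lam @ (V @ q)))))


W = build_W(EDGES, 12)
communities = [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9, 10, 11]]   # hand-picked
pre = b_lin_precompute(W, communities)
r4 = b_lin_query(pre, query=3)
print(np.round(r4, 2))
```

With `rank=None` every nonzero singular value of the cross-community part is kept, so the result matches the exact solution; passing a smaller `rank` trades accuracy for cheaper storage and queries.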

Query Time vs. Pre-Compute Time

Figure: log query time vs. log pre-compute time, with our results highlighted.

• Quality: 90%+
• On-line: up to 150x speedup
• Pre-computation: two orders of magnitude saving

More on Scalability Issues for Querying (the spectrum of ``FastProx'')

• B_Lin: one large linear system
  – [Tong+ ICDM06, KAIS08]
• BB_Lin: the intrinsic complexity is small
  – [Tong+ KAIS08]
• FastUpdate: time-evolving linear system
  – [Tong+ SDM08, SAM08]
• FastAllDAP: multiple linear systems
  – [Tong+ KDD07 a]
• Fast-iPoG: dealing w/ on-line feedback
  – [Tong+ ICDM 2008, Tong+ CIKM09]


A5: Immunization

How to select the k 'best' nodes for immunization?

Figure: example graph with nodes numbered 1 to 34.

M1: SIS Virus Model [Chakrabarti+ 2008] (Background)

• 'Flu'-like: Susceptible-Infectious-Susceptible
• If the virus 'strength' s < 1/λ1,A, an epidemic cannot happen (λ1,A is the leading eigenvalue of the adjacency matrix A)

Footnote: Think of s as the number of sneezes before healing.
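The threshold condition is easy to evaluate on a small graph. The sketch below assumes an undirected graph, so that A is symmetric; the complete graph on 4 nodes and the function names are hypothetical choices for illustration.

```python
import numpy as np


def leading_eigenvalue(A):
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return float(np.max(np.linalg.eigvalsh(A)))


def below_epidemic_threshold(A, s):
    """True when the virus strength s is below 1 / lambda_1(A)."""
    return s < 1.0 / leading_eigenvalue(A)


# Hypothetical example: complete graph on 4 nodes, whose leading eigenvalue is 3.
A = np.ones((4, 4)) - np.eye(4)
lam = leading_eigenvalue(A)
print(lam, below_epidemic_threshold(A, 0.3), below_epidemic_threshold(A, 0.4))
```

Lowering the leading eigenvalue raises the threshold 1/λ1,A, which is why immunization is framed in terms of eigenvalue drop.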

M1: Optimal Method

• Select k nodes whose absence creates the largest drop in λ1,A

Figure: the original graph (leading eigenvalue λ1,A) next to the graph without nodes {2, 6} (leading eigenvalue λ1,Ã).

M1: Optimal Method

• Select k nodes whose absence creates the largest drop in λ1,A

• But this needs the leading eigenvalue w/o subset of nodes S for every size-k subset S, i.e., $\binom{n}{k}$ eigenvalue computations
  – Example:
    • 1,000 nodes, with 10,000 edges
    • It takes 0.01 seconds to compute λ
    • It takes 2,615 years to find the best-5 nodes!
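The cost of the exhaustive search can be reproduced with a few lines. The sketch below is an illustration under assumptions: it treats the search as one leading-eigenvalue computation per size-k subset, uses a hypothetical 5-node star graph for the brute-force part, and uses made-up function names. The year estimate simply multiplies the number of subsets by the 0.01 seconds quoted on the slide.

```python
from itertools import combinations
from math import comb

import numpy as np


def leading_eigenvalue(A):
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return float(np.max(np.linalg.eigvalsh(A)))


def best_k_brute_force(A, k):
    """Try every size-k subset; keep the one leaving the smallest eigenvalue."""
    n = A.shape[0]
    best_S, best_lam = None, np.inf
    for S in combinations(range(n), k):
        keep = [i for i in range(n) if i not in S]
        lam = leading_eigenvalue(A[np.ix_(keep, keep)])
        if lam < best_lam:
            best_S, best_lam = S, lam
    return best_S, best_lam


# Hypothetical star graph: node 0 is connected to nodes 1..4.
A = np.zeros((5, 5))
A[0, 1:] = A[1:, 0] = 1.0
best_S, best_lam = best_k_brute_force(A, k=1)
print(best_S, best_lam)

# Back-of-the-envelope cost for n = 1,000 and k = 5 at 0.01 s per eigenvalue.
subsets = comb(1000, 5)
years = subsets * 0.01 / (365 * 24 * 3600)
print(subsets, round(years))
```

The estimate lands at roughly 2,600 years, in line with the figure on the slide; the exact value depends on how a year is counted.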

M1: Netshield to Rescue

Theorem: (Tong+ 2009) (1)

$A \times \vec{u} = \lambda_{1,A} \times \vec{u}$

u(i): eigen-score of node i. Think of u(i) as PageRank or in-degree.

Figure: example graph with nodes numbered 1 to 16 and edge weights of 10 or 1, next to its eigenvector u.

M1: Netshield to Rescue (Intuition)

• Find a set of nodes S in which
  – (1) each node has a high eigen-score
  – (2) the nodes are diverse among themselves

Figure: three copies of the 16-node example graph (edge weights of 10 or 1).

Theorem: (Tong+ 2009) (1)

M1: Netshield to Rescue

• Example:
  – 1,000 nodes, with 10,000 edges
  – Netshield takes < 0.1 seconds to find the best-5 nodes!
  – ... as opposed to 2,615 years

Theorem: (Tong+ 2009)
(1)
(2) Br(S) is sub-modular
(3) Netshield is near-optimal (wrt max Br(S))
(4) Netshield is O(nk^2 + m)

Footnote: near-optimal means Br(S_Netshield) >= (1-1/e) Br(S_Opt)
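A greedy selection in this spirit can be sketched as follows. The definition of Br(S) did not survive on the slides, so the score used here is an assumption: Br(S) = sum over i in S of 2 λ u(i)^2, minus the sum over i, j in S of A(i,j) u(i) u(j), which rewards high eigen-scores and penalizes picking nodes that are connected to each other. The function names and the star graph are hypothetical, and this is not presented as the authors' code.

```python
import numpy as np


def leading_eigenpair(A):
    """Leading eigenvalue and (non-negative) eigenvector of a symmetric A."""
    vals, vecs = np.linalg.eigh(A)
    return float(vals[-1]), np.abs(vecs[:, -1])


def greedy_shield(A, k):
    """Pick k nodes one at a time by the largest marginal gain in the score."""
    lam, u = leading_eigenpair(A)
    n = A.shape[0]
    base = (2 * lam - np.diag(A)) * u ** 2       # gain of a node on its own
    S = []
    for _ in range(k):
        # Interaction with already chosen nodes (zero in the first round).
        penalty = A[:, S] @ u[S] if S else np.zeros(n)
        score = base - 2 * penalty * u
        if S:
            score[S] = -np.inf                   # never pick a node twice
        S.append(int(np.argmax(score)))
    return S


# Hypothetical star graph: node 0 is connected to nodes 1..4.
A = np.zeros((5, 5))
A[0, 1:] = A[1:, 0] = 1.0
first = greedy_shield(A, k=1)
three = greedy_shield(A, k=3)
print(first, three)
```

Only one eigen-decomposition is needed up front; each greedy round then costs a few vector operations, which is consistent in spirit with the O(nk^2 + m) bound quoted above.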

Why is Netshield Near-Optimal?

Sub-modular (i.e., diminishing returns): A >= B

Figure: two panels (A and B) of a 10-node example graph. Blue bar: marginal benefit of deleting the blue nodes. Green bar: benefit of deleting the green nodes.

Theorem: a k-step greedy algorithm to maximize a sub-modular function guarantees a (1-1/e)-optimal solution [Nemhauser+ 78]

M1: Why is Br(S) sub-modular? (details)

Marginal Benefit = Blue - Purple

• Blue: pure benefit from the blue nodes
• Purple: interaction between the blue and green nodes

Only the purple term depends on {1, 2}!

More green means more purple and less red, so:

Marginal Benefit of Left >= Marginal Benefit of Right

Figure: two copies of a 10-node example graph (left and right).

Footnote: green nodes {1, 2} are already deleted; blue nodes {5, 6} are the nodes to be deleted.

M1: Quality of Netshield

Figure: eig-drop (higher is better) vs. k for Netshield, Optimal, and (1-1/e) x Optimal.

M1: Speed of Netshield

Figure: time vs. k on the NIPS co-authorship network. Annotations: > 10 days; 0.1 seconds (Netshield).

Scalability of Netshield

Figure: time vs. # of edges (x 10^8).


Motivation [Tong+ KDD 08 b]

• Q: How to find patterns from a large graph?
  – e.g., communities, anomalies, etc.
• A: Low-Rank Approximation (LRA) for the adjacency matrix of the graph.

$A \approx L \times M \times R$

Figure: example author-conference bipartite graph (authors John, Tom, Bob, Carl, Van, Roy; conferences KDD, ICDM, RECOMB, ISMB).

LRA for Graph Mining

Figure: author-conference bipartite graph (authors John, Tom, Bob, Carl, Van, Roy; conferences KDD, ICDM, RECOMB, ISMB).

Adjacency matrix A (rows: authors; columns: conferences):

1 1 0 0
1 1 0 0
1 1 0 0
0 1 1 1
0 0 1 1
0 0 1 1

LRA for Graph Mining: Communities

$A \approx L \times M \times R$

• A: adjacency matrix (author x conf.)
• L: author group
• M: group-group interaction
• R: conf. group

LRA for Graph Mining: Anomalies

Compare the adjacency matrix A with the reconstructed matrix Ã.

Recon. error is high: 'Carl' is abnormal
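The anomaly argument can be reproduced on the 6 x 4 example matrix. The sketch below uses a truncated SVD as the low-rank approximation purely for illustration (the talk's own method is Colibri), assumes the matrix rows follow the author order John, Tom, Bob, Carl, Van, Roy, and picks rank 2 by hand.

```python
import numpy as np

AUTHORS = ["John", "Tom", "Bob", "Carl", "Van", "Roy"]   # assumed row order
A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)


def low_rank_reconstruct(A, rank):
    """Rebuild A from its top `rank` singular triplets."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]


A_hat = low_rank_reconstruct(A, rank=2)
row_error = np.linalg.norm(A - A_hat, axis=1)    # one error per author
most_abnormal = AUTHORS[int(np.argmax(row_error))]
print(np.round(row_error, 3), most_abnormal)
```

Two components capture the two author groups, so the author who straddles both groups is the one the rank-2 model reconstructs worst.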

Challenges: How to Get (L, M, R)?

• Efficiently: both time and space
• Intuitively: easy to interpret
• Dynamically: track patterns over time

None of the existing methods fully meets our wish list!

Why Not SVD and CUR/CMD?

• SVD (optimal in L2 and LF)
  – Efficiency
    • Time: $O(\min(n^2 m, n m^2))$
    • Space: (L, R) are dense
  – Interpretation
    • Linear combination of many columns
  – Dynamic: not easy

• CUR/CMD (example-based)
  – Efficiency
    • Better than SVD
    • Redundancy in L
  – Interpretation
    • Actual columns from A
  – Dynamic: not easy

Solutions: Colibri [Tong+ KDD 08 b]

• Colibri-S: for static graphs
  – Basic idea: remove linear redundancy
• Colibri-D: for dynamic graphs
  – Basic idea: leverage smoothness over time

Theorem: (Tong+ 2008)
(1) Colibri = CUR/CMD in accuracy
(2) Colibri <= CUR/CMD in time
(3) Colibri <= CUR/CMD in space
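The idea of removing linear redundancy can be illustrated with a small sketch. Everything here is an assumption made for illustration: the sampled column indices are chosen by hand, redundant columns are detected with a least-squares residual test, and the factors are taken as L (the independent columns), M = (L^T L)^{-1}, and R = L^T A, i.e., a projection of A onto the span of the sampled columns. It is not presented as the Colibri-S implementation.

```python
import numpy as np


def independent_columns(C, tol=1e-10):
    """Indices of columns of C not already in the span of earlier kept ones."""
    keep = []
    for j in range(C.shape[1]):
        col = C[:, j]
        if keep:
            L = C[:, keep]
            coef, *_ = np.linalg.lstsq(L, col, rcond=None)
            resid = col - L @ coef
        else:
            resid = col
        if np.linalg.norm(resid) > tol:
            keep.append(j)
    return keep


def redundancy_free_lra(A, sampled):
    """Return (L, M, R) with A approximated by L @ M @ R."""
    C = A[:, sampled]
    L = C[:, independent_columns(C)]
    M = np.linalg.inv(L.T @ L)
    R = L.T @ A
    return L, M, R


A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
sampled = [0, 1, 1, 2, 3]            # hand-picked; contains redundant columns
L, M, R = redundancy_free_lra(A, sampled)
approx_small = L @ M @ R

C = A[:, sampled]
approx_full = C @ np.linalg.pinv(C) @ A   # same projection, redundant basis
print(L.shape, np.max(np.abs(approx_small - approx_full)))
```

Both versions project onto the same column space, so the approximation is unchanged while L stores fewer columns, which matches the accuracy and space claims in the theorem above.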

Comparison: SVD, CUR vs. Colibri

Table: wish list (rows: Efficiency, Interpretation, Dynamics) vs. methods (columns: SVD [Golub+ 1989], CUR [Drineas+ 2005], Colibri [Tong+ 2008]).

Performance of Colibri-S

Figure: time and space for SVD, CUR, CMD, and ours.

• Accuracy: same, 91%+
• Time: 12x faster than CMD; 28x faster than CUR
• Space: ~1/3 of CMD; ~10% of CUR

Performance of Colibri-D

Figure: time vs. # of changed cols for CMD (prior best method), Colibri-S, and Colibri-D.

Colibri-D achieves up to 112x speedup.

• Data: network traffic (21,837 nodes; 1,220 hours; 22,800 edges/hr)
• Accuracy: same, 93%+

Page 82

Overview

• Q1: CePS, iPoG (KDD06, ICDM08, CIKM09)
• Q2: pTrack/cTrack (SDM08, SAM08)
• Q3: FastProx (ICDM06, KAIS07, KDD07 b, ICDM08); FastProx (SDM08, SAM08)
• M1: NetShield
• M2: Colibri-S (KDD08); Colibri-D (KDD08)

Page 83

Some of my other work

• #1: FastDAP (in KDD07 a)
  – Predict Link Direction
• #2: Graph X-Ray (in KDD 07 b)
  – Best Effort Pattern Match in Attributed Graphs
• #3: GhostEdge (in KDD 08 a)
  – Classification in Sparsely Labeled Network
• #4: TANGENT (in KDD09)
  – "surprise-me" recommendation
• #5: GMine (in VLDB 06)
  – Interactive Graph Visualization and Mining
• #6: Graphite (in ICDM 08)
  – Visual Query System for Attributed Graphs
• #7: T3/MT3 (in CIKM 08)
  – Mine Complex Time-stamped Events
• #8: BlurDetect (in ICME 04)
  – Determine whether or not, and how, an image is blurred
• #9: MRBIR (in MM 04, TIP06)
  – Manifold-Ranking based Image Retrieval
• #10: GBMML (in CVPR05, ACM/Multimedia 05)
  – Graph-based Multiple Modality Learning

Page 84

Overview (this talk + others)

Tasks    | Static Graphs | Dynamic Graphs | Images
Querying | CePS, iPoG, Basset, DAP, G-Ray, Graphite, TANGENT, FastRWR (KDD06, ICDM06, KDD07a, KDD07b, ICDM08, KAIS08, CIKM09, KDD09) | pTrack, cTrack, Fast-Update (SDM08, SAM08) | MRBIR, UOLIR (MM04, CVPR05)
Mining   | Netshield, Colibri-S, GhostEdge, Gmine, Pack, Shiftr (VLDB06, KDD08a, KDD08b, SDM-LinkAnalysis 09) | T3/MT3, Colibri-D (KDD08a, CIKM08) | BlurDetect, GBMML, iQuality, iExpertise (ICDE04, ICIP04, MMM05, PCM05, MM05)

Page 85

What is Next?

Goals (rows) vs. Plans (columns):

Goals           | Step 1 (this talk) | Step 2 (medium term) | Step 3 (long term)
G1: Querying    | CePS, iPoG, pTrack | Recommendation; Interpretable Q | Querying rich data
G2: Mining      | Netshield, Colibri | Immunization; Interpretable M | Mining rich data
G3: Scalability | All above O(m) or better (single machine) | Scalable by parallelism | Scalable on rich data

Research Theme: Help users to understand and utilize large graph-related data

Page 86

Current Recommendation (Focus on Relevance)

[Figure: movie graph with genre regions: sci. fiction, comedy, horror, adventure]

Red nodes: recommended by (most of) existing algorithms

Footnote: Nodes are movies; edges are similarities between movies.

Page 87

"Broad Spectrum Recommendation" (focus on completeness = relevance + diversity + novelty)

[Figure: movie graph with genre regions: adventure, sci. fiction, comedy, horror]

Footnote: Nodes are movies; edges are similarities between movies.

Page 89

Interpretable Recommendation

Current Recommendation
• Amazon.com recommends
  • (based on items you purchased or told us you own)

Page 90

Interpretable Recommendation

Current Recommendation
• Amazon.com recommends
  • (based on items you purchased or told us you own)

Interpretable Recommendation
• Amazing.com recommends
  • Because it has the topics
    • You are interested in: graph mining, linear algebra
    • You might be interested in: Hadoop, submodularity

Page 92

Immunization

• This Talk: SIS (e.g., flu)
• In the Future
  – Immunize for SIR (e.g., chicken pox)
  – Immunize in Dynamic Settings
    • Dynamics of Graphs, e.g., edges/nodes are changing
    • Dynamics of Virus, e.g., the infection/healing rates are changing

Footnote: SIR stands for susceptible-infectious-recovered.
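For readers unfamiliar with the SIS model, one synchronous step of a discrete-time version can be sketched as follows. This is a generic textbook-style simulation under stated assumptions, not the immunization algorithm from the talk: `sis_step`, `beta` (per-edge infection probability), and `delta` (healing probability) are names chosen here for illustration.

```python
import numpy as np

def sis_step(adj, infected, beta, delta, rng):
    """One synchronous step of a discrete-time SIS model.

    adj:      (n, n) 0/1 adjacency matrix
    infected: boolean vector of length n
    beta:     probability that an infected neighbor transmits the virus
    delta:    probability that an infected node heals
    """
    n_infected_neighbors = adj @ infected.astype(int)
    # A susceptible node is infected if at least one infected neighbor transmits.
    p_infect = 1.0 - (1.0 - beta) ** n_infected_neighbors
    newly_infected = (~infected) & (rng.random(infected.size) < p_infect)
    healed = infected & (rng.random(infected.size) < delta)
    # In SIS, healed nodes become susceptible again (in SIR they would be removed).
    return (infected & ~healed) | newly_infected

# Path graph 0 - 1 - 2 - 3 with node 0 initially infected.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
state = np.array([True, False, False, False])
rng = np.random.default_rng(0)
for t in range(5):
    state = sis_step(adj, state, beta=0.5, delta=0.2, rng=rng)
    print(t, state.astype(int))
```

Changing `beta` and `delta` between steps, or editing `adj`, corresponds to the two kinds of dynamics listed above.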

Page 94

Interpretable Mining

• Find communities
• Find a few nodes/edges to describe
  – each community
  – the relationship between 2 communities

Footnote: Nodes are actors; edges indicate co-play in a movie.

Page 96

Querying Rich Graphs (e.g., geo-coded, attributed)

What is the difference between North America and Asia?

[Figure: attributed graph; labels: Teenager, Adult, Phone, MSN]

Page 97

Mining Rich Graphs (e.g., geo-coded, attributed)

How to find patterns? (e.g., communities, anomalies)

[Figure: attributed graph; labels: Teenager, Adult, Phone, MSN, telemarketer]

Page 99

Scalability

• Two orthogonal efforts
  – E1: O(m) or better on a single machine
  – E2: Parallelism (e.g., Hadoop)
    • (implementation, decouple, analysis)
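To illustrate what "O(m) on a single machine" means in practice, the sketch below computes a matrix-vector product directly from an edge list, touching each of the m edges once, and uses it inside power iteration. It is a generic example under stated assumptions, not code from the talk: `edge_list_matvec` and `leading_eigenvalue` are hypothetical names, and the graph is stored as parallel NumPy arrays of sources, destinations, and weights.

```python
import numpy as np

def edge_list_matvec(src, dst, weights, x):
    """Compute y = A x for a graph stored as an edge list, in O(m) time."""
    y = np.zeros_like(x, dtype=float)
    # y[src] += w * x[dst] for every edge; np.add.at handles repeated indices.
    np.add.at(y, src, weights * x[dst])
    return y

def leading_eigenvalue(src, dst, weights, n, iters=100):
    """Power iteration: each iteration costs one O(m) matrix-vector product."""
    x = np.ones(n) / np.sqrt(n)
    lam = 0.0
    for _ in range(iters):
        y = edge_list_matvec(src, dst, weights, x)
        lam = float(x @ y)              # Rayleigh quotient, since x has unit norm
        norm = np.linalg.norm(y)
        if norm == 0:
            break
        x = y / norm
    return lam

# Undirected triangle, stored as directed edges in both directions.
src = np.array([0, 1, 1, 2, 2, 0])
dst = np.array([1, 0, 2, 1, 0, 2])
weights = np.ones(6)
print(edge_list_matvec(src, dst, weights, np.array([1.0, 2.0, 3.0])))
print(leading_eigenvalue(src, dst, weights, n=3))
```

The dense n x n matrix is never formed, so memory and time per iteration scale with the number of edges rather than with n².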


Page 100

Research Theme: Help users to understand and utilize large graph-related data

[Diagram: Real Data, User, Scalability]

Page 101

My Collaboration Graph (During Ph.D. Study)

[Figure: collaboration graph with nodes colored by project area. Region labels: Mining, Q1, Q2, Q3, M1, M2, M3. Project labels: CePS, iPoG, Basset, pTrack, cTrack, BLin, BBLin, NBLin, FastUpdate, Fast-iPoG, G-Ray, DAP, Colibri, NetShield, GhostEdge, Graphite, Pack, TANGENT, GMine, T3, MT3]

Legend: Green: Querying; Yellow: Mining; Purple: Others

Page 102

Q & A

Thank you!