Robust Local Community Detection: On Free Rider Effect and Its Elimination

Yubao Wu¹, Ruoming Jin², Jing Li¹, Xiang Zhang¹

¹ Case Western Reserve University
² Kent State University

Generic Local Community Detection Problem

Input:
a) Graph $G(V, E)$
b) A set of query nodes $Q \subseteq V$
c) A goodness metric $f(S)$

Output: Subgraph $G[S]$ such that:
1) $S$ contains $Q$ ($Q \subseteq S$)
2) $f(S)$ is maximized

Community Goodness Metrics

[1] B. Saha, et al. RECOMB’10.
[2] C. Tsourakakis, et al. SIGMOD’14.
[3] M. Sozio, et al. KDD’10.
[4] W. Cui, et al. SIGMOD’14.
[5] F. Luo, et al. WIAS’08.
[6] K. J. Lang, CIKM’07.
[7] R. Andersen, et al. FOCS’06.
[8] A. Clauset, PRE’05.

Intuitions                                Goodness metrics      Ref.   Formulas
Internal denseness                        Classic density       [1]
                                          Edge-surplus          [2]    concave
                                          Minimum degree        [3,4]
Internal denseness & external sparseness  Subgraph modularity   [5]
                                          Density-isolation     [6]
                                          External conductance  [7]
Boundary sharpness                        Local modularity      [8]
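
To make three of these metrics concrete, here is a minimal Python sketch. It uses common textbook definitions, which are an assumption here: classic density as internal edges divided by number of nodes, minimum degree as the smallest internal degree, and external conductance as boundary edges divided by the smaller of the two volumes. The formulas on the slide may differ in detail. The graph type, function names, and toy graph are all illustrative.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph stored as adjacency sets


def internal_edges(g: Graph, s: Set[str]) -> int:
    """Number of edges with both endpoints in S."""
    return sum(len(g[u] & s) for u in s) // 2


def boundary_edges(g: Graph, s: Set[str]) -> int:
    """Number of edges with exactly one endpoint in S."""
    return sum(len(g[u] - s) for u in s)


def classic_density(g: Graph, s: Set[str]) -> float:
    """Assumed definition: internal edges per node."""
    return internal_edges(g, s) / len(s)


def minimum_degree(g: Graph, s: Set[str]) -> int:
    """Smallest degree inside the induced subgraph G[S]."""
    return min(len(g[u] & s) for u in s)


def external_conductance(g: Graph, s: Set[str]) -> float:
    """Assumed definition: boundary edges over the smaller volume (lower is better)."""
    vol_in = sum(len(g[u]) for u in s)
    vol_out = sum(len(g[u]) for u in g if u not in s)
    return boundary_edges(g, s) / min(vol_in, vol_out)


# Toy graph: a 4-clique {a, b, c, d} with a short tail d - e - f.
toy: Graph = {
    "a": {"b", "c", "d"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"}, "e": {"d", "f"}, "f": {"e"},
}
clique = {"a", "b", "c", "d"}

print(classic_density(toy, clique))       # 6 edges / 4 nodes = 1.5
print(minimum_degree(toy, clique))        # 3
print(external_conductance(toy, clique))  # 1 boundary edge / min(13, 3)
```

The first two metrics look only inside the subgraph, while conductance also accounts for the edges leaving it, matching the "internal denseness" and "internal denseness & external sparseness" rows.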

Free Rider Effect

Goodness metrics      $A$     $A \cup B$   $A \cup C$
Classic density       2.50    2.95         2.83
Edge-surplus          15.3    26.5         22.8
Minimum degree        4       4            4
Subgraph modularity   2.0     3.6          4.6
Density-isolation     -2.6    3.8          1.5
Ext. conductance      0.25    0.14         0.11
Local modularity      0.63    0.70         0.78
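
The effect in the table can be reproduced on a much smaller graph. The sketch below is an illustration, not the slide's example. It assumes classic density means internal edges divided by number of nodes, and uses a made-up graph in which the query node `q` sits in a triangle A that is linked by a single edge to an unrelated 5-clique B.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build undirected adjacency sets from an edge list."""
    g: Dict[str, Set[str]] = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g


def classic_density(g: Dict[str, Set[str]], s: Set[str]) -> float:
    """Assumed definition: edges inside S divided by |S|."""
    return (sum(len(g[u] & s) for u in s) // 2) / len(s)


A = {"q", "a1", "a2"}              # local community of the query node q
B = {"b0", "b1", "b2", "b3", "b4"}  # dense but irrelevant clique
edges = (list(combinations(sorted(A), 2))
         + list(combinations(sorted(B), 2))
         + [("a2", "b0")])          # single bridge between A and B
g = build_graph(edges)

density_A = classic_density(g, A)       # 3 edges / 3 nodes = 1.0
density_AB = classic_density(g, A | B)  # 14 edges / 8 nodes = 1.75
print(density_A, density_AB)
```

Because the union scores higher than A alone, a method that maximizes classic density subject to containing `q` would return A together with B, so B rides along for free.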

Free Rider Effect in Real Networks

(a) Co-author network (b) Biological network

Barna Saha, et al. Dense subgraphs with restrictions and applications to gene annotation graphs. RECOMB, 2010.

One existing method: classic density

Query Biased Node Weighting

$r(u)$: proximity value of node $u$ w.r.t. the query

Node weight: $\pi(u) = 1 / r(u)$

Query biased density: $\rho(S) = \frac{e(S)}{\pi(S)}$

$\pi(S) = \sum_{u \in S} \pi(u)$: sum of node weights

Subgraph A becomes the query biased densest subgraph
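
A small numeric sketch of the weighting idea follows. It assumes the node weight is the reciprocal of the proximity value, so that the query biased density is the number of internal edges divided by the sum of node weights. The proximity values below are hypothetical numbers chosen by hand for illustration; how proximity is actually computed is not shown on the slide. The toy graph (a triangle A containing the query node `q`, bridged to a 5-clique B) is also made up.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build undirected adjacency sets from an edge list."""
    g: Graph = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g


def query_biased_density(g: Graph, r: Dict[str, float], s: Set[str]) -> float:
    """rho(S) = e(S) / pi(S), assuming pi(u) = 1 / r(u)."""
    e_s = sum(len(g[u] & s) for u in s) // 2  # edges inside S
    pi_s = sum(1.0 / r[u] for u in s)         # total node weight of S
    return e_s / pi_s


A = {"q", "a1", "a2"}
B = {"b0", "b1", "b2", "b3", "b4"}
g = build_graph(list(combinations(sorted(A), 2))
                + list(combinations(sorted(B), 2))
                + [("a2", "b0")])

# Hypothetical proximity values w.r.t. query node "q" (illustrative only):
# nodes near q get high proximity and therefore low weight.
r = {"q": 0.4, "a1": 0.2, "a2": 0.2,
     "b0": 0.05, "b1": 0.04, "b2": 0.04, "b3": 0.04, "b4": 0.04}

rho_A = query_biased_density(g, r, A)       # 3 / 12.5
rho_AB = query_biased_density(g, r, A | B)  # 14 / 132.5
print(rho_A, rho_AB)
```

With these values the heavy weights of the far-away clique outweigh the edges it contributes, so A alone scores higher than A together with B, which is the opposite of what the unweighted density prefers.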

Yubao Wu, Ruoming Jin, Jing Li, and Xiang Zhang. Robust local community detection: on free rider effect and its elimination. PVLDB, 8(7):798-809, 2015.

QDC Problem

Query biased densest connected subgraph (QDC) problem:

Input:
a) Graph $G(V, E)$
b) A set of query nodes $Q \subseteq V$

Output: Subgraph $G[S]$ such that:
1) $S$ contains $Q$ ($Q \subseteq S$)
2) Query biased density $\rho(S)$ is maximized
3) $G[S]$ is connected

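QDC Problem and Two Related Problems

             QDC                        QDC’                       QDC’’
Input        1) graph $G$               1) graph $G$               1) graph $G$
             2) query $Q$               2) query $Q$               2) query $Q$
Output $S$   1) contains $Q$            1) contains $Q$            $\rho(S)$ is maximized
             2) $\rho(S)$ is maximized  2) $\rho(S)$ is maximized
             3) $G[S]$ is connected
Complexity   NP-hard                    Polynomial                 Polynomial

If the QDC’’ solution contains $Q$, it is also optimal for QDC’.

If the QDC’ solution is connected, it is also optimal for QDC.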
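
The relationship between the three problems can be checked by brute force on a tiny graph. The sketch below enumerates every node subset, so it only works for a handful of nodes and is not the polynomial-time method the table refers to. The toy graph, the node weights `pi`, and the function names are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    g: Graph = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g


def rho(g: Graph, pi: Dict[str, float], s: Set[str]) -> float:
    """Query biased density: edges inside S over total node weight of S."""
    return (sum(len(g[u] & s) for u in s) // 2) / sum(pi[u] for u in s)


def is_connected(g: Graph, s: Set[str]) -> bool:
    """Depth-first search restricted to the non-empty node set S."""
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in g[u] & s:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == s


def solve(g: Graph, pi: Dict[str, float], query: Set[str] = frozenset(),
          connected: bool = False) -> Tuple[Set[str], float]:
    """Exhaustive search: best subset containing `query`, optionally connected."""
    best, best_rho = set(), -1.0
    nodes = sorted(g)
    for k in range(1, len(nodes) + 1):
        for combo in combinations(nodes, k):
            s = set(combo)
            if not query <= s or (connected and not is_connected(g, s)):
                continue
            val = rho(g, pi, s)
            if val > best_rho:
                best, best_rho = s, val
    return best, best_rho


A = {"q", "a1", "a2"}
B = {"b0", "b1", "b2", "b3", "b4"}
g = build_graph(list(combinations(sorted(A), 2))
                + list(combinations(sorted(B), 2)) + [("a2", "b0")])
# Illustrative node weights: small near the query node "q", large far away.
pi = {"q": 2.5, "a1": 5.0, "a2": 5.0, "b0": 20.0,
      "b1": 25.0, "b2": 25.0, "b3": 25.0, "b4": 25.0}

qdc2 = solve(g, pi)                          # QDC'': no constraints
qdc1 = solve(g, pi, {"q"})                   # QDC': must contain the query
qdc = solve(g, pi, {"q"}, connected=True)    # QDC: also connected
print(qdc2, qdc1, qdc)
```

In this example the unconstrained optimum already contains the query node and is connected, so all three searches return the same triangle, consistent with the two "if ... then optimal" statements.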

Finding the QDC’’

1. Removing Low Degree Nodes
• Reduce the search space
• Retain the densest subgraph

2. Detect the Densest Subgraph
• On the reduced search space

Finding the QDC’

Subgraph contraction
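
The two steps for QDC’’ can be sketched as follows. The pruning rule used here is an assumption, since the slide does not state one: if `lower_bound` is the density of any known subgraph, a node whose current degree is below `lower_bound * pi[u]` cannot be part of a densest subgraph, because deleting it from an optimal set would raise the density. The second step uses plain enumeration as a stand-in for an exact solver, so it is only practical on tiny inputs. The graph, weights, and names are illustrative.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    g: Graph = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g


def rho(g: Graph, pi: Dict[str, float], s: Set[str]) -> float:
    """Query biased density: edges inside S over total node weight of S."""
    return (sum(len(g[u] & s) for u in s) // 2) / sum(pi[u] for u in s)


def prune_low_degree(g: Graph, pi: Dict[str, float],
                     lower_bound: float) -> Set[str]:
    """Step 1 (assumed rule): repeatedly drop nodes with degree < lower_bound * pi[u]."""
    alive = set(g)
    changed = True
    while changed:
        changed = False
        for u in list(alive):
            if len(g[u] & alive) < lower_bound * pi[u]:
                alive.remove(u)
                changed = True
    return alive


def densest_by_enumeration(g: Graph, pi: Dict[str, float],
                           candidates: Set[str]) -> Tuple[Set[str], float]:
    """Step 2 (stand-in): exhaustive search over the reduced node set."""
    best, best_rho = set(), -1.0
    nodes = sorted(candidates)
    for k in range(1, len(nodes) + 1):
        for combo in combinations(nodes, k):
            val = rho(g, pi, set(combo))
            if val > best_rho:
                best, best_rho = set(combo), val
    return best, best_rho


A = {"q", "a1", "a2"}
B = {"b0", "b1", "b2", "b3", "b4"}
g = build_graph(list(combinations(sorted(A), 2))
                + list(combinations(sorted(B), 2)) + [("a2", "b0")])
# Illustrative node weights: small near the query node "q", large far away.
pi = {"q": 2.5, "a1": 5.0, "a2": 5.0, "b0": 20.0,
      "b1": 25.0, "b2": 25.0, "b3": 25.0, "b4": 25.0}

bound = rho(g, pi, A)                    # density of a known subgraph
reduced = prune_low_degree(g, pi, bound)  # search space after pruning
best, best_rho = densest_by_enumeration(g, pi, reduced)
print(reduced, best, best_rho)
```

Here the pruning removes the whole clique B, so the exact search only has to consider the three remaining nodes instead of all eight.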

Finding the QDC

Greedy Node Deletion
1) Delete low degree nodes
2) Maintain the connectivity

Local Expansion
1) Connect the query nodes with a Steiner tree
2) Greedy local expansion
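
A minimal sketch of the greedy node deletion idea is given below. The details are assumptions, not the slide's exact procedure: at each step it deletes the non-query node with the smallest ratio of degree to node weight whose removal keeps the subgraph connected, and it returns the densest intermediate subgraph seen. It assumes the input graph is connected, and the graph, weights, and names are illustrative.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    g: Graph = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g


def rho(g: Graph, pi: Dict[str, float], s: Set[str]) -> float:
    """Query biased density: edges inside S over total node weight of S."""
    return (sum(len(g[u] & s) for u in s) // 2) / sum(pi[u] for u in s)


def is_connected(g: Graph, s: Set[str]) -> bool:
    if not s:
        return False
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in g[u] & s:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == s


def greedy_node_deletion(g: Graph, pi: Dict[str, float],
                         query: Set[str]) -> Tuple[Set[str], float]:
    s = set(g)  # start from the whole (assumed connected) graph
    best, best_rho = set(s), rho(g, pi, s)
    while True:
        # Non-query nodes, lowest weighted degree first (name breaks ties).
        order = sorted((u for u in s if u not in query),
                       key=lambda u: (len(g[u] & s) / pi[u], u))
        for u in order:
            if is_connected(g, s - {u}):  # never disconnect the subgraph
                s.remove(u)
                break
        else:
            return best, best_rho         # nothing left that can be deleted
        val = rho(g, pi, s)
        if val > best_rho:
            best, best_rho = set(s), val


A = {"q", "a1", "a2"}
B = {"b0", "b1", "b2", "b3", "b4"}
g = build_graph(list(combinations(sorted(A), 2))
                + list(combinations(sorted(B), 2)) + [("a2", "b0")])
# Illustrative node weights: small near the query node "q", large far away.
pi = {"q": 2.5, "a1": 5.0, "a2": 5.0, "b0": 20.0,
      "b1": 25.0, "b2": 25.0, "b3": 25.0, "b4": 25.0}

community, density = greedy_node_deletion(g, pi, {"q"})
print(community, density)
```

On this toy input the clique nodes are peeled away one by one, and the best subgraph encountered along the way is the triangle around `q`. The connectivity check is a full traversal per candidate, which is simple but slow; it is meant only to show the two listed steps.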

Experiments: Datasets

Dataset       # Nodes      # Edges         # Communities
Amazon        334,863      925,872         151,037
DBLP          317,080      1,049,866       13,477
Youtube       1,134,890    2,987,624       8,385
Orkut         3,072,441    117,185,083     6,288,363
LiveJournal   3,997,962    34,681,189      287,512
Friendster    65,608,366   1,806,067,135   957,154

[1] J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM, 2012.
[2] snap.stanford.edu

Experiments: State-of-the-Art Methods

Classes                                   Abbr.  Ref.  Key Idea
Internal denseness                        DS     [1]   Densest subgraph with query constraint
                                          OQC    [2]   Optimal quasi-clique; edge-surplus
                                          MDG    [3]   Minimum degree
Internal denseness & external sparseness  PRN    [4]   External conductance
                                          LS     [5]   Local spectral
                                          EMC    [6]   More internal edges than external edges
                                          SM     [7]   Subgraph modularity
Boundary sharpness                        LM     [8]   Local modularity

[1] B. Saha, et al. RECOMB’10.
[2] C. Tsourakakis, et al. SIGMOD’14.
[3] M. Sozio, et al. KDD’10.
[4] R. Andersen, et al. FOCS’06.
[5] M. W. Mahoney, et al. JMLR’12.
[6] G. W. Flake, KDD’00.
[7] F. Luo, et al. WIAS’08.
[8] A. Clauset, PRE’05.

Experiments: Effectiveness Evaluation Metrics

Metrics
• F-score
• Community goodness metrics: Density, Cohesiveness, Separability
• Consistency

[1] J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM, pages 745-754, 2012.
[2] L. Ma, et al. GMAC: A seed-insensitive approach to local community detection. In DaWaK, pages 297-308, 2013.
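
As an illustration of the first metric, the sketch below computes an F-score between a detected community and a ground-truth community, assuming the usual definition as the harmonic mean of precision and recall over node sets. The function name and the example sets are hypothetical, and the exact variant used in the experiments is not shown on the slide.

```python
from typing import Set, Tuple


def f_score(detected: Set[int], truth: Set[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) of a detected community."""
    overlap = len(detected & truth)
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(detected)  # fraction of detected nodes that are correct
    recall = overlap / len(truth)        # fraction of true nodes that were found
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: 2 of the 4 detected nodes lie in a 6-node true community.
precision, recall, f = f_score({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
print(precision, recall, f)
```

A community inflated by free riders keeps its recall but loses precision, which is why the F-score is sensitive to the effect.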

Effectiveness Evaluation: F-Score

F-score         QDC   DS    OQC   MDG   PRN   LS    EMC   SM    LM
Amazon          0.83  0.52  0.54  0.46  0.69  0.66  0.61  0.60  0.58
DBLP            0.46  0.31  0.33  0.32  0.48  0.42  0.34  0.36  0.37
Youtube         0.43  0.23  0.22  0.17  0.26  0.24  0.21  0.21  0.22
Orkut           0.47  0.15  0.16  0.13  0.21  0.17  0.19  0.16  0.18
LiveJournal     0.64  0.48  0.47  0.40  0.52  0.51  0.47  0.48  0.49
Friendster      0.32  --    0.14  0.12  0.17  0.16  --    0.14  0.13
Avg. F-score    0.53  0.3   0.31  0.27  0.39  0.36  0.33  0.33  0.33
Avg. Precision  0.65  0.46  0.45  0.29  0.51  0.41  0.34  0.38  0.48
Avg. Recall     0.78  0.61  0.58  0.69  0.67  0.64  0.66  0.63  0.59

Effectiveness Evaluation: Goodness Metrics

Community goodness metrics on LiveJournal graph

Effectiveness Evaluation: Consistency

Consistency   QDC   DS    OQC   MDG   PRN   LS    EMC   SM    LM
Amazon        0.94  0.77  0.76  0.58  0.79  0.69  0.74  0.67  0.61
DBLP          0.88  0.62  0.64  0.37  0.65  0.53  0.56  0.43  0.56
Youtube       0.85  0.61  0.54  0.46  0.71  0.41  0.57  0.37  0.36
Orkut         0.83  0.56  0.52  0.32  0.68  0.43  0.51  0.54  0.47
LiveJournal   0.93  0.74  0.67  0.43  0.84  0.64  0.73  0.58  0.52
Friendster    0.78  --    0.56  0.45  0.65  0.49  --    0.32  0.39
Average       0.87  0.64  0.62  0.44  0.72  0.53  0.61  0.49  0.49

Conclusions

1) The free rider effect is a serious problem;

2) The query biased node weighting scheme can effectively eliminate the free rider effect and thus improve accuracy.