7270 community detection

Upload: kunal-singh

Post on 07-Jul-2018


TRANSCRIPT

  • 8/18/2019 7270 Community Detection

    1/28

    Community Detection and

    Graph-based Clustering

     Chapter 3

    Of Lei Tang and Huan Liu's Book

    Slides prepared by

    Qiang Yang, UST, Hong Kong

    Chapter 3, Community Detection and Mining in Social Media. Lei Tang and Huan Liu, Morgan & Claypool, September 2010.


    Community

    • Community: formed by individuals such that those within a group interact with each other more frequently than with those outside the group
       – a.k.a. group, cluster, cohesive subgroup, or module in different contexts

    • Community detection: discovering groups in a network where individuals' group memberships are not explicitly given

    • Why communities in social media?
       – Human beings are social
       – Easy-to-use social media allows people to extend their social life in unprecedented ways
       – It is difficult to meet friends in the physical world, but much easier to find friends online with similar interests
       – Interactions between nodes can help determine communities


    Communities in Social Media

    • Two types of groups in social media
       – Explicit groups: formed by user subscriptions
       – Implicit groups: implicitly formed by social interactions

    • Some social media sites allow people to join groups. Is it necessary to extract groups based on network topology?


    COMMUNITY DETECTION

    Subjectivity of Community Definition

    Each component is a community

    A densely-knit community

    Definition of a community can be subjective (unsupervised learning).


    Taxonomy of Community Criteria

    • Criteria vary depending on the tasks

    • Roughly, community detection methods can be divided into 4 categories (not exclusive):

    • Partition the whole network into several disjoint sets

    • Hierarchy-Centric Community
       – Construct a hierarchical structure of communities


    Complete Mutuality: Cliques

    • Clique: a maximal complete subgraph in which all nodes are adjacent to each other

    • NP-hard to find the maximum clique in a network

    • Straightforward implementation to find cliques is very expensive in time complexity


    Finding the Maximum Clique

    • In a clique of size k, each node maintains degree >= k-1


    Maximum Clique Example

    • Suppose we sample a sub-network with nodes {1-9} and find a clique {1, 2, 3} of size 3

    • In order to find a clique of size > 3, remove all nodes with degree <= 3-1 = 2
       – Remove nodes 2 and 9
       – Remove nodes 1 and 3
       – Remove node 4
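The pruning steps above can be sketched in a few lines of Python. This is only an illustration: the figure is not part of this transcript, so `EDGES` is an assumed 9-node toy graph chosen to be consistent with the cliques and removal order quoted in the slides, and the helper names are hypothetical.

```python
from collections import defaultdict

# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def prune_for_larger_clique(edges, k):
    """Repeatedly drop nodes with degree <= k-1.

    Such nodes cannot belong to a clique larger than k. Returns the nodes
    removed in each pass and the surviving node set.
    """
    adj = build_adjacency(edges)
    rounds = []
    while True:
        low = sorted(n for n, nbrs in adj.items() if len(nbrs) <= k - 1)
        if not low:
            break
        rounds.append(low)
        for n in low:
            for nbr in adj.pop(n):
                if nbr in adj:
                    adj[nbr].discard(n)
    return rounds, set(adj)


rounds, survivors = prune_for_larger_clique(EDGES, k=3)
print(rounds)     # [[2, 9], [1, 3], [4]]
print(survivors)  # {5, 6, 7, 8}
```

On this assumed graph the surviving nodes happen to form a clique of size 4; in general the survivors are only candidates and still have to be checked.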


    Clique Percolation Method (CPM)

    • Clique is a very strict definition, unstable

    • CPM is a method to find overlapping communities
       – Input
          • A parameter k, and a network
       – Procedure
          • Find out all cliques of size k in a given network
          • Construct a clique graph. Two cliques are adjacent if they share k-1 nodes
          • Each connected component in the clique graph forms a community

    CPM Example

    Cliques of size 3: {1, 2, 3}, {1, 3, 4}, {4, 5, 6}, {5, 6, 7}, {5, 6, 8}, {5, 7, 8}, {6, 7, 8}

    Communities: {1, 2, 3, 4}, {4, 5, 6, 7, 8}
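A minimal sketch of the CPM procedure, under stated assumptions: `EDGES` is an assumed toy graph consistent with the cliques listed above (the original figure is not in the transcript), cliques are enumerated by brute force, and `clique_percolation` is a hypothetical helper name rather than a library call.

```python
from itertools import combinations

# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def clique_percolation(edges, k):
    """Return (k-cliques, communities) following the three CPM steps."""
    edge_set = {frozenset(e) for e in edges}
    nodes = sorted({n for e in edges for n in e})

    # Step 1: brute-force enumeration of all cliques of size k.
    # Fine for a toy graph, but the number of candidates grows combinatorially.
    cliques = [set(c) for c in combinations(nodes, k)
               if all(frozenset(p) in edge_set for p in combinations(c, 2))]

    # Step 2: clique graph, stored implicitly in a union-find structure.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Step 3: each connected component of the clique graph is a community.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return cliques, sorted(sorted(g) for g in groups.values())


cliques, communities = clique_percolation(EDGES, k=3)
print(len(cliques))   # 7
print(communities)    # [[1, 2, 3, 4], [4, 5, 6, 7, 8]]
```

Node 4 appears in both communities, which is the overlap CPM is meant to allow.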


    Reachability: k-clique, k-club

    • Any node in a group should be reachable in k hops

    • k-clique: a maximal subgraph in which the largest geodesic distance between any two nodes <= k

    • k-club: a substructure of diameter <= k

    • A k-clique might have diameter larger than k in the subgraph
       – E.g. {1, 2, 3, 4, 5}

    • Commonly used in traditional social network analysis
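The difference between distance in the whole network and diameter inside the group can be checked with breadth-first search. The sketch below uses a hypothetical 6-node graph picked for illustration; it is not taken from the slide, whose figure is missing from this transcript.

```python
from collections import deque
from itertools import combinations

# Hypothetical graph chosen for illustration only.
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 5), (4, 6), (5, 6)]


def adjacency(edges, allowed=None):
    """Adjacency map; if `allowed` is given, keep only edges inside that node set."""
    adj = {}
    for u, v in edges:
        if allowed is not None and (u not in allowed or v not in allowed):
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def max_pairwise_distance(adj, group):
    """Largest geodesic distance between members of `group`, measured in `adj`."""
    worst = 0
    for u, v in combinations(sorted(group), 2):
        worst = max(worst, bfs_distances(adj, u).get(v, float("inf")))
    return worst


group = {1, 2, 3, 4, 5}
in_whole_graph = max_pairwise_distance(adjacency(EDGES), group)
in_subgraph = max_pairwise_distance(adjacency(EDGES, allowed=group), group)
print(in_whole_graph, in_subgraph)  # 2 3
```

In this graph the group satisfies the 2-clique distance bound because nodes 4 and 5 are linked through node 6, which lies outside the group. Restricted to the group itself, the diameter is 3, so the group is not a 2-club.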


    Group-Centric Community Detection: Density-Based Groups

    • The group-centric criterion requires the whole group to satisfy a certain condition
       – E.g., the group density >= a given threshold

    • A subgraph $G_s(V_s, E_s)$ is a $\gamma$-dense quasi-clique if

      $$\frac{2|E_s|}{|V_s|(|V_s|-1)} \ge \gamma$$

      where the denominator is the maximum number of degrees.

    • A similar strategy to that of cliques can be used
       – Sample a subgraph, and find a maximal $\gamma$-dense quasi-clique (say, of size $|V_s|$)
       – Remove nodes with degree less than the average degree
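As a small illustration of the density criterion, the sketch below takes density to be twice the number of internal edges divided by |V_s|(|V_s|-1), the usual definition and an assumption here. `EDGES` is an assumed toy graph (the slide's figure is not in the transcript) and the function names are hypothetical.

```python
# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def density(edges, group):
    """2 * (internal edges) / (n * (n - 1)) for the node set `group`."""
    group = set(group)
    n = len(group)
    if n < 2:
        raise ValueError("density needs at least two nodes")
    internal = sum(1 for u, v in edges if u in group and v in group)
    return 2 * internal / (n * (n - 1))


def is_quasi_clique(edges, group, gamma):
    """True if the group's density reaches the threshold gamma."""
    return density(edges, group) >= gamma


print(density(EDGES, {1, 2, 3, 4}))               # 5 of 6 possible edges
print(density(EDGES, {5, 6, 7, 8}))               # 1.0, a full clique
print(is_quasi_clique(EDGES, {1, 2, 3, 4}, 0.8))  # True
```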


    Clustering based on Vertex Similarity

    • Apply k-means or similarity-based clustering to nodes

    • Vertex similarity is defined in terms of the similarity of their neighborhood

    • Structural equivalence: two nodes are structurally equivalent iff they are connecting to the same set of actors

    • Structural equivalence is too strict for practical use.


    Vertex Similarity

    • Jaccard Similarity

    • Cosine similarity
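The formulas for these two measures did not survive in this transcript. The sketch below assumes the common neighborhood-based forms, Jaccard = |N_i ∩ N_j| / |N_i ∪ N_j| and cosine = |N_i ∩ N_j| / sqrt(|N_i| · |N_j|), where N_i is the neighbor set of node i. `EDGES` is an assumed toy graph and the helper names are hypothetical.

```python
from math import sqrt

# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def neighbors(edges):
    """Map each node to its set of neighbors."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs


def jaccard(nbrs, i, j):
    """Shared neighbors divided by the size of the combined neighborhood."""
    return len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])


def cosine(nbrs, i, j):
    """Shared neighbors divided by the geometric mean of the two degrees."""
    return len(nbrs[i] & nbrs[j]) / sqrt(len(nbrs[i]) * len(nbrs[j]))


N = neighbors(EDGES)
# Nodes 4 and 6 share only neighbor 5 out of 7 distinct neighbors in total.
print(jaccard(N, 4, 6))  # 1/7
print(cosine(N, 4, 6))   # 0.25
```

Either score can then be fed to k-means or another similarity-based clustering routine.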


    (1) Clustering based on vertex similarity


    Cut

    • Most interactions are within group whereas interactions between groups are few

    • community detection → minimum cut problem

    • Cut: A partition of vertices of a graph into two disjoint sets

    • Minimum cut problem: find a graph partition such that the number of edges between the two sets is minimized
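Counting the edges that cross a given two-way partition takes one line. A minimal sketch, assuming the toy graph `EDGES` below (the slide's figure is not in the transcript) and a hypothetical helper `cut_size`:

```python
# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def cut_size(edges, part_a):
    """Number of edges with exactly one endpoint in `part_a`."""
    part_a = set(part_a)
    return sum(1 for u, v in edges if (u in part_a) != (v in part_a))


print(cut_size(EDGES, {1, 2, 3, 4}))  # 2: edges (4, 5) and (4, 6)
print(cut_size(EDGES, {9}))           # 1: edge (7, 9)
```

On this assumed graph the smallest cut simply isolates node 9, which shows that minimizing the raw edge count alone can favor lopsided partitions.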

    (4) Spectral clustering


    Ratio Cut &

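    Modularity Maximization

    • Modularity measures the strength of a community partition by taking into account the degree distribution

    • Given a network with m edges, the expected number of edges between two nodes with degrees $d_i$ and $d_j$ is $d_i d_j / 2m$

    • Strength of a community $C$: $\sum_{i \in C,\, j \in C} \left( A_{ij} - d_i d_j / 2m \right)$, where $A_{ij}$ is 1 if nodes i and j are connected and 0 otherwise

    • Modularity: $Q = \frac{1}{2m} \sum_{C} \sum_{i \in C,\, j \in C} \left( A_{ij} - d_i d_j / 2m \right)$

    The expected number of edges between nodes 1 and 2 is 3×2/(2×14) = 3/14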
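The worked number above can be reproduced with a short sketch. It assumes the standard modularity form, Q = (1/2m) Σ_C Σ_{i,j∈C} (A_ij − d_i d_j / 2m), and an assumed 14-edge toy graph `EDGES` in which node 1 has degree 3 and node 2 has degree 2, matching the example. The figure itself is not in this transcript, and the function names are hypothetical.

```python
from itertools import product

# Assumed toy graph with m = 14 edges (the slide's figure is missing).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def degrees(edges):
    """Map each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def expected_edges(edges, i, j):
    """Expected number of edges between i and j: d_i * d_j / 2m."""
    deg = degrees(edges)
    return deg[i] * deg[j] / (2 * len(edges))


def modularity(edges, communities):
    """Q for a partition given as an iterable of node sets."""
    m = len(edges)
    deg = degrees(edges)
    adjacent = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    total = 0.0
    for community in communities:
        # Ordered pairs (i, j), including i == j, as in the double sum.
        for i, j in product(community, repeat=2):
            a_ij = 1 if (i, j) in adjacent else 0
            total += a_ij - deg[i] * deg[j] / (2 * m)
    return total / (2 * m)


print(expected_edges(EDGES, 1, 2))                        # 3/14
print(modularity(EDGES, [{1, 2, 3, 4}, {5, 6, 7, 8, 9}]))  # 17/49
```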

    (5) Modularity maximization

    Given the degree distribution


    Hierarchy-Centric Community Detection

    • Goal: build a hierarchical structure of communities based on network topology

    • Allow the analysis of a network at different resolutions

    • Representative approaches:
       – Divisive Hierarchical Clustering (top-down)
       – Agglomerative Hierarchical Clustering (bottom-up)


    Divisive Hierarchical Clustering

    • Divisive clustering
       – Partition nodes into several sets
       – Each set is further divided into smaller ones


    Edge Betweenness

    • The strength of a tie can be measured by edge betweenness

    • Edge betweenness: the number of shortest paths that pass along the edge

    • The edge with higher betweenness tends to be the bridge between two communities.

    The edge betweenness of e(1, 2) is 4 (= 6/2 + 1), as all the shortest paths from 2 to {4, 5, 6, 7, 8, 9} have to pass either e(1, 2) or e(2, 3), and e(1, 2) is the shortest path between 1 and 2.
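The value quoted for e(1, 2) can be checked with a breadth-first path-counting routine in the style of Brandes' algorithm. This is a sketch under assumptions: `EDGES` is an assumed toy graph consistent with the example (the figure is not in the transcript), the graph is unweighted and undirected, and when a pair has several shortest paths each path gets an equal fraction of that pair's credit.

```python
from collections import deque

# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def edge_betweenness(edges):
    """Return {(u, v): betweenness} with u < v, one unit of credit per node pair."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    score = {tuple(sorted(e)): 0.0 for e in edges}

    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0
                    preds[v] = []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)

        # Walk back from the farthest nodes, splitting credit among predecessors.
        credit = {v: 0.0 for v in order}
        for v in reversed(order):
            for u in preds[v]:
                share = sigma[u] / sigma[v] * (1.0 + credit[v])
                score[tuple(sorted((u, v)))] += share
                credit[u] += share

    # Every unordered pair was visited from both of its endpoints.
    return {e: value / 2 for e, value in score.items()}


scores = edge_betweenness(EDGES)
print(scores[(1, 2)])  # 4.0
```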


    Divisive clustering based on edge betweenness

    Idea: progressively removing edges with the highest betweenness

    Initial betweenness value

    After removing e(4, 5), the betweenness of e(4, 6) becomes 20, which is the highest.

    After removing e(4, 6), the edge e(7, 9) has the highest betweenness value 4, and should be removed.
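The removal sequence described above can be replayed with a short loop that recomputes betweenness after every deletion. This is a sketch, not the slide author's code: `EDGES` is an assumed toy graph consistent with the quoted values (the figure is not in the transcript), and breaking ties toward the smallest (u, v) pair is an assumption made so the first removal matches the text.

```python
from collections import deque

# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def edge_betweenness(edges):
    """Shortest-path edge betweenness for an unweighted, undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    score = {tuple(sorted(e)): 0.0 for e in edges}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
        order, queue = [], deque([s])
        while queue:  # BFS that counts shortest paths
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        credit = {v: 0.0 for v in order}
        for v in reversed(order):  # push credit back toward the source
            for u in preds[v]:
                share = sigma[u] / sigma[v] * (1.0 + credit[v])
                score[tuple(sorted((u, v)))] += share
                credit[u] += share
    return {e: value / 2 for e, value in score.items()}


def remove_top_edges(edges, steps):
    """Remove the highest-betweenness edge `steps` times, recomputing each time."""
    remaining = [tuple(sorted(e)) for e in edges]
    removed = []
    for _ in range(steps):
        scores = edge_betweenness(remaining)
        # Assumed tie-break: among equal scores, take the smallest (u, v) pair.
        top = min(remaining, key=lambda e: (-round(scores[e], 9), e))
        removed.append((top, scores[top]))
        remaining.remove(top)
    return removed, remaining


removed, remaining = remove_top_edges(EDGES, steps=3)
for edge, value in removed:
    print(edge, value)
```

Continuing until no edges remain yields the full top-down hierarchy. Stopping after three removals, as here, leaves the groups {1, 2, 3, 4} and {5, 6, 7, 8}, with node 9 isolated.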


    Agglomerative Hierarchical Clustering

    • Initialize each node as a community

    • Merge communities successively into larger communities following a certain criterion
       – E.g., based on modularity increase
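A minimal sketch of the bottom-up procedure with modularity increase as the merge criterion. Several things here are assumptions: the toy graph `EDGES` (the figure is not in the transcript), the brute-force search over all community pairs, and the choice to stop once no merge raises modularity instead of merging all the way to a single root.

```python
# Assumed toy graph (the slide's figure is missing from the transcript).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def modularity(edges, communities):
    """Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2.

    This is the usual per-community rewriting of the modularity double sum.
    """
    m = len(edges)
    q = 0.0
    for c in communities:
        internal = sum(1 for u, v in edges if u in c and v in c)
        degree_sum = sum((u in c) + (v in c) for u, v in edges)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


def greedy_agglomerative(edges):
    """Start from singletons; repeatedly apply the merge with the largest gain."""
    nodes = sorted({n for e in edges for n in e})
    communities = [frozenset([n]) for n in nodes]
    merges = []  # merge history, i.e. the dendrogram read bottom-up
    while len(communities) > 1:
        current = modularity(edges, communities)
        best_gain, best_pair = 0.0, None
        for i in range(len(communities)):
            for j in range(i + 1, len(communities)):
                trial = [c for k, c in enumerate(communities) if k not in (i, j)]
                trial.append(communities[i] | communities[j])
                gain = modularity(edges, trial) - current
                if gain > best_gain + 1e-12:
                    best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            break  # no merge increases modularity
        i, j = best_pair
        merged = communities[i] | communities[j]
        merges.append((set(communities[i]), set(communities[j])))
        communities = [c for k, c in enumerate(communities) if k not in (i, j)]
        communities.append(merged)
    return communities, merges


communities, merges = greedy_agglomerative(EDGES)
print([sorted(c) for c in communities])
print(round(modularity(EDGES, communities), 4))
```

Recomputing modularity for every candidate pair is far too slow for large networks; it is used here only to keep the criterion explicit.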

    Dendrogram according to Agglomerative Clustering based on Modularity

    Summary of Community Detection