

  • Outline Link Prediction Supervised Rank Aggregation based Link Prediction Experiment Conclusion

    Supervised Rank Aggregation Approach for Link Prediction in Complex Networks

    Manisha Pujari & Rushed Kanawati

    LIPN - UMR CNRS 7030, Université Paris Nord

    99 Av. J.B. Clement 93430, Villetaneuse, FRANCE

    [email protected]

    16 April, 2012

    Mining Social Network Dynamics Workshop

    WWW-2012, Lyon, France

    M.Pujari & R.Kanawati Supervised Rank Aggregation Approach for Link Prediction 1/22

  • Outline

    1 Link Prediction

    2 Supervised Rank Aggregation based Link Prediction

    3 Experiment

    4 Conclusion

  • Problem

    Link Prediction

    Predicting new links between nodes of a graph.

    Applications

    Recommender systems

    Academic/Professional collaborations

    Identification of structures of criminal networks

    Biological networks

  • Link Prediction Approaches

    Dyadic: Computation of link score for unlinked vertices

    Structural: Mining rules for evolution of sub-graphs

    Topology based: Attributes computed for graph

    Node-feature based: Attributes computed for nodes

    Hybrid: Combination of topology-based and node-feature based attributes

    Temporal: Consider dynamics of the networks

    Static: Do not consider the dynamics of a network

  • Dyadic Topological Approaches

    Work of [Liben-Nowell & al., 2007]

    Prediction on a co-authorship network.

    For each unlinked node pair (u, v), compute a set of topological attributes [A1, A2, ..., An].

    Rank all (u, v) based on attribute values.

    Considering only the top k ranked edges as predicted edges, the performance of each attribute is measured.

    Attributes:

    Neighborhood-based attributes: Jaccard’s coefficient, Common neighbors, Adamic/Adar [Adamic & al., 2003], Preferential attachment, etc.

    Distance-based attributes: Shortest path distance, Katz [Katz, 1953], Maximum forest algorithm, etc.

    Centrality-based attributes: PageRank, Degree centrality, Clustering coefficient, etc.
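
    As an illustration of the procedure above, here is a minimal Python sketch that scores every unlinked pair with four neighborhood-based attributes and keeps the top k pairs for one attribute. The toy graph, the function names `neighborhood_scores` and `top_k`, and the alphabetical tie-breaking are assumptions made for this example, not part of the original work.

```python
import math
from itertools import combinations


def neighborhood_scores(adj):
    """Score every unlinked node pair with four neighborhood-based attributes."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, so not a candidate pair
        common = adj[u] & adj[v]
        union = adj[u] | adj[v]
        scores[(u, v)] = {
            "common_neighbors": len(common),
            "jaccard": len(common) / len(union) if union else 0.0,
            # Adamic/Adar: shared neighbors with few links count more.
            # A shared neighbor has degree >= 2, so log() is never zero.
            "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
            "preferential_attachment": len(adj[u]) * len(adj[v]),
        }
    return scores


def top_k(scores, attribute, k):
    """Rank candidate pairs by one attribute (descending) and keep the top k."""
    ranked = sorted(scores, key=lambda pair: (-scores[pair][attribute], pair))
    return ranked[:k]


# Hypothetical toy co-authorship graph stored as undirected adjacency sets.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
scores = neighborhood_scores(graph)
print(top_k(scores, "common_neighbors", 2))  # [('a', 'd'), ('b', 'e')]
```

    Each attribute produces its own ranked list of candidate pairs, which is what makes the later rank aggregation step applicable.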

  • Dyadic Topological Approaches

    Combining the effect of different topological measures: Application of supervised machine learning algorithms [Benchettara & al., 2010], [Hasan & al., 2006]

    Examples:

    (Node_x, Node_y) → [a0, a1, a2, ..., an]

    Can we apply rank aggregation methods?

  • Rank Aggregation (Social choice theory)

    ⇒ To find an aggregated list with minimum possible disagreement

    ⇒ Equal weight to all experts

    Expert1 ⇒ L1 = [A, B, C, D]
    Expert2 ⇒ L2 = [B, D, A, C]
    Expert3 ⇒ L3 = [C, D, A, B]
    ...
    Expertn ⇒ Ln = [D, C, A, B]
    -------------------------------
    Laggregate = [?, ?, ?, ?]

    Types of input lists

    Full/Complete lists

    Partial lists

    Disjoint lists

  • Distance Measure

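    Spearman footrule distance: $F(L_1, L_2) = \sum_{i} |L_1(i) - L_2(i)|$, where $L(i)$ is the position of element $i$ in list $L$.

    Kendall tau distance: $K(L_1, L_2) = |\{(i, j) : L_1(i) < L_1(j) \text{ and } L_2(i) > L_2(j)\}|$, the number of element pairs that the two lists order differently.

    Example:

    L1 = [A, B, C, D] and L2 = [B, D, C, A]

    F(L1, L2) = |L1(A) − L2(A)| + |L1(B) − L2(B)| + |L1(C) − L2(C)| + |L1(D) − L2(D)| = |1 − 4| + |2 − 1| + |3 − 3| + |4 − 2| = 6

    K(L1, L2) = 4 (discordant pairs: (A, B), (A, C), (A, D), (C, D))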
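
    The two distances can be checked with a short Python sketch. It assumes full lists over the same elements and 1-based positions; the helper names are chosen for this example only.

```python
from itertools import combinations


def positions(ranking):
    """Map each element to its 1-based position in the list."""
    return {item: idx for idx, item in enumerate(ranking, start=1)}


def footrule(l1, l2):
    """Spearman footrule: sum of absolute position differences."""
    p1, p2 = positions(l1), positions(l2)
    return sum(abs(p1[x] - p2[x]) for x in p1)


def kendall_tau(l1, l2):
    """Kendall tau: number of pairs ordered differently by the two lists."""
    p1, p2 = positions(l1), positions(l2)
    return sum(
        1
        for x, y in combinations(l1, 2)
        if (p1[x] - p1[y]) * (p2[x] - p2[y]) < 0
    )


L1 = ["A", "B", "C", "D"]
L2 = ["B", "D", "C", "A"]
print(footrule(L1, L2), kendall_tau(L1, L2))  # 6 4
```

    For full lists the footrule distance is always even and satisfies K ≤ F ≤ 2K, which gives a quick sanity check on hand-computed values.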

  • Borda’s Method [Borda, 1781]

    Based on absolute positioning of elements

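    $B_{L_t}(i) = |\{ j \in L_t : L_t(j) < L_t(i) \}|$, the number of elements ranked ahead of $i$ in list $L_t$

    $B(i) = \sum_{t=1}^{k} B_{L_t}(i)$, summed over the $k$ input lists (1)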
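
    A minimal Python sketch of this scoring, assuming full lists over the same elements. Because the score counts the elements placed ahead of an element, a lower total means a better aggregate rank. The alphabetical tie-break is an arbitrary choice made for this example.

```python
def borda_scores(lists):
    """B(i): total number of elements placed ahead of i, summed over all lists."""
    scores = {}
    for ranking in lists:
        for position, item in enumerate(ranking):
            # 'position' (0-based) equals the count of elements ahead of item.
            scores[item] = scores.get(item, 0) + position
    return scores


def borda_aggregate(lists):
    """Order elements by increasing Borda score; ties broken alphabetically."""
    scores = borda_scores(lists)
    return sorted(scores, key=lambda item: (scores[item], item))


experts = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
]
print(borda_scores(experts))     # {'A': 4, 'B': 4, 'C': 5, 'D': 5}
print(borda_aggregate(experts))  # ['A', 'B', 'C', 'D']
```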

  • Kemeny Optimal Aggregation [Dwork & al., 2001]

    Based on relative ranking of elements

    $SK(\pi, L_1, L_2, L_3, \ldots, L_n) = \sum_{i \in [1, n]} K(\pi, L_i)$ (2)
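
    A Kemeny optimal aggregation is a ranking π that minimizes SK. The sketch below finds it by exhaustive search over all permutations, which is only practical for a handful of elements since the search space grows factorially. It is an illustration under that assumption, not the method used in the experiments.

```python
from itertools import combinations, permutations


def kendall_tau(l1, l2):
    """Number of element pairs that l1 and l2 order differently."""
    p1 = {x: i for i, x in enumerate(l1)}
    p2 = {x: i for i, x in enumerate(l2)}
    return sum(
        1
        for x, y in combinations(l1, 2)
        if (p1[x] - p1[y]) * (p2[x] - p2[y]) < 0
    )


def kemeny_cost(candidate, lists):
    """SK: sum of Kendall tau distances from the candidate to every list."""
    return sum(kendall_tau(candidate, ranking) for ranking in lists)


def kemeny_optimal(lists):
    """Brute-force search for a permutation with minimal SK (toy sizes only)."""
    items = sorted(lists[0])
    best = min(permutations(items), key=lambda pi: kemeny_cost(pi, lists))
    return list(best), kemeny_cost(best, lists)


experts = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
]
ranking, cost = kemeny_optimal(experts)
print(ranking, cost)  # ['A', 'B', 'C', 'D'] 7
```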

  • Supervised Rank Aggregation

    Combining different rankings to get an aggregation, giving different weights to the experts

    ⇒ Proposed approaches

    Supervised Borda

    Supervised local Kemeny

    w1 ← Expert1 ⇒ L1 → [k elements]
    w2 ← Expert2 ⇒ L2 → [k elements]
    w3 ← Expert3 ⇒ L3 → [k elements]
    ...
    wn ← Expertn ⇒ Ln → [k elements]

  • Supervised Borda Method

    Borda score

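    $B(i) = \sum_{t=1}^{n} w_t \cdot B_{L_t}(i)$ (3)

    where $w_t$ is the weight of expert $t$ and $n$ is the number of experts, each providing a list of $k$ elements.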
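
    A minimal Python sketch of the weighted score, assuming full lists, one weight per expert, and a per-list score equal to the number of elements placed ahead of the element (so lower totals rank first). The weights shown are hypothetical.

```python
def supervised_borda(lists, weights):
    """Weighted Borda: each expert's per-element score is scaled by its weight."""
    scores = {}
    for ranking, weight in zip(lists, weights):
        for position, item in enumerate(ranking):
            # 'position' is the number of elements this expert places ahead.
            scores[item] = scores.get(item, 0.0) + weight * position
    ranking = sorted(scores, key=lambda item: (scores[item], item))
    return ranking, scores


experts = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
]
weights = [1.0, 1.0, 3.0]  # hypothetical: the third expert is trusted most
ranking, scores = supervised_borda(experts, weights)
print(ranking)  # ['C', 'D', 'A', 'B']
print(scores)   # {'A': 8.0, 'B': 10.0, 'C': 5.0, 'D': 7.0}
```

    With these weights the aggregate follows the third expert's list, whereas equal weights would give a different order for the same input.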

  • Supervised Local Kemeny Aggregation

    Steps:

    1 Input: lists L = [L1, L2, ..., Ln], weights [w1, w2, ..., wn], and the set U of m elements

    2 Initialize an m × m matrix M with M(x, y) = 0

    3 ∀(x, y) ∈ U, compute $score(x, y) = \sum_{i=1}^{n} w_i \cdot (x \succ y)$, where $(x \succ y) = 0$ if $L_i(x) < L_i(y)$ and $1$ if $L_i(x) > L_i(y)$

    4 If $score(x, y) > 0.5 \cdot \sum_{i=1}^{n} w_i$, set M(x, y) = true and M(y, x) = false

    5 Initial aggregation R = L1

    6 For x, y ∈ R, Swap(x, y) if M(x, y) = false

    7 R is the final aggregation.
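
    The steps can be sketched in Python as follows. This is one possible reading, not the authors' implementation: it assumes full lists, treats an earlier list position as "preferred", and interprets the preference matrix as "experts holding more than half of the total weight rank x before y". Starting from the first list, each element is moved up by adjacent swaps while the weighted majority prefers it to its predecessor.

```python
def supervised_local_kemeny(lists, weights):
    """Weighted local Kemenization starting from the first list (a sketch)."""
    pos = [{x: i for i, x in enumerate(ranking)} for ranking in lists]
    half = 0.5 * sum(weights)

    def majority_prefers(x, y):
        """True when experts with more than half the weight rank x before y."""
        support = sum(w for p, w in zip(pos, weights) if p[x] < p[y])
        return support > half

    result = []
    for x in lists[0]:  # initial aggregation R = L1
        result.append(x)
        i = len(result) - 1
        # Swap with the predecessor while the weighted majority disagrees
        # with the current relative order.
        while i > 0 and majority_prefers(result[i], result[i - 1]):
            result[i], result[i - 1] = result[i - 1], result[i]
            i -= 1
    return result


experts = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "A", "B"],
]
print(supervised_local_kemeny(experts, [1.0, 1.0, 1.0]))  # ['A', 'B', 'C', 'D']
print(supervised_local_kemeny(experts, [1.0, 1.0, 3.0]))  # ['C', 'D', 'A', 'B']
```

    When a pair is exactly tied at half of the total weight, this sketch leaves the two elements in their current order.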

  • Link Prediction based on Supervised Rank Aggregation

    Examples: (Node_x, Node_y) → [a0, a1, a2, ..., an]

    Steps:

    1 Rank learning examples by attribute values

    2 Consider only top k examples and compute attribute weight w_ai

    3 Rank test examples by attribute to get n ranked lists

    4 Apply supervised rank aggregation

    5 Consider only top k examples of the aggregate list and compute performance.

  • Link Prediction Based on Supervised Rank Aggregation

    Computation of attribute weights:

    Maximization of positive precision:

    $W_{a_i} = n \cdot \mathrm{Precision}_{a_i}$ (4)

    Minimization of false positive rate:

    $W_{a_i} = \frac{n}{\mathrm{FPR}_{a_i}}$ (5)
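
    A minimal Python sketch of both weighting schemes. It assumes each attribute's learning examples are already ranked and represented by their labels (1 for a true link, 0 otherwise), that precision and false positive rate are measured on the top k examples, and that n is the number of attributes. The slides do not define n or say what happens when the false positive rate is zero, so both choices here are assumptions.

```python
def precision_at_k(ranked_labels, k):
    """Fraction of the top-k ranked examples that are true links (label 1)."""
    top = ranked_labels[:k]
    return sum(top) / len(top)


def false_positive_rate_at_k(ranked_labels, k):
    """Share of all negative examples that appear among the top k."""
    negatives = ranked_labels.count(0)
    false_positives = ranked_labels[:k].count(0)
    return false_positives / negatives


def attribute_weights(rankings, k, n):
    """Compute both candidate weights for every attribute."""
    weights = {}
    for name, labels in rankings.items():
        fpr = false_positive_rate_at_k(labels, k)
        weights[name] = {
            "precision_weight": n * precision_at_k(labels, k),   # equation (4)
            # Assumption: an attribute with FPR = 0 gets an infinite weight.
            "fpr_weight": n / fpr if fpr > 0 else float("inf"),  # equation (5)
        }
    return weights


# Hypothetical labels of learning examples, ordered by each attribute's ranking.
rankings = {
    "common_neighbors": [1, 1, 0, 0, 1, 0, 0, 0],
    "jaccard": [0, 1, 0, 0, 1, 0, 1, 0],
}
weights = attribute_weights(rankings, k=4, n=len(rankings))
print(weights)
```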

  • Experiment

    DBLP database

    Datasets    Training      Validation    Training examples    Test examples
                Time          Time          Positive   Total     Positive   Total
    Dataset 1   [1970-1975]   [1971-1976]   30         1693      41         3471
    Dataset 2   [1972-1977]   [1973-1978]   87         19332     82         18757
    Dataset 3   [1974-1979]   [1975-1980]   102        35190     164        60046

    Table: DBLP Datasets

    Performance measure:

    $F = \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (6)
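
    A minimal Python sketch of the evaluation on the top k predictions. It computes the measure exactly as written in equation (6) and, for comparison, the widely used F1 score, which is twice that value. The function name, the example pairs, and the true links are hypothetical.

```python
def evaluate_top_k(ranked_pairs, true_links, k):
    """Precision, recall and F for the top-k predicted links."""
    predicted = set(ranked_pairs[:k])
    true_links = set(true_links)
    true_positives = len(predicted & true_links)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(true_links) if true_links else 0.0
    denominator = precision + recall
    # Equation (6) as transcribed, without the factor 2 of the usual F1.
    f_slide = precision * recall / denominator if denominator else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_slide": f_slide,
        "f1": 2 * f_slide,
    }


# Hypothetical aggregate ranking and ground truth.
ranked_pairs = [("a", "d"), ("b", "e"), ("c", "e"), ("a", "e")]
true_links = {("a", "d"), ("c", "e")}
result = evaluate_top_k(ranked_pairs, true_links, k=2)
print(result)
# {'precision': 0.5, 'recall': 0.5, 'f_slide': 0.25, 'f1': 0.5}
```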

  • Results

    Experiment-1: Performance based on learning on the complete training dataset

  • Results

    Experiment-2: Performance in terms of precision based on learning on samples of the training dataset

    P is the number of positive examples and N is the number of negative examples. In any sample, N = m ∗ P where m is any positive integer.
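
    One way to build such a sample is sketched below in Python. It assumes that every positive example is kept and that the N = m ∗ P negatives are drawn uniformly at random without replacement; the slides do not state how the samples were actually drawn. The example data and the `seed` parameter are hypothetical.

```python
import random


def sample_training_set(examples, m, seed=0):
    """Keep all positives and draw N = m * P negatives at random."""
    rng = random.Random(seed)
    positives = [e for e in examples if e[1] == 1]
    negatives = [e for e in examples if e[1] == 0]
    n_negatives = m * len(positives)
    if n_negatives > len(negatives):
        raise ValueError("not enough negative examples for this value of m")
    return positives + rng.sample(negatives, n_negatives)


# Hypothetical labelled examples: (node pair, label).
examples = [((i, i + 1), 1 if i < 3 else 0) for i in range(20)]
sample = sample_training_set(examples, m=5)
print(len(sample))  # 18: 3 positives plus 15 negatives
```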

  • Results

    Experiment-3: Performance based on learning on the complete training dataset and validation on a test sample (N = 5 ∗ P).

    Datasets    Training examples    Test examples
                Positive   Total     Positive   Total
    Dataset 1   30         1693      41         246
    Dataset 2   87         19332     82         492
    Dataset 3   102        35190     164        984

  • Conclusion

    A new definition for supervised rank aggregation.

    Application of supervised rank aggregation to link prediction.

    Future work:

    ⇒ Validation on other types of networks like e-commerce networks.

    ⇒ Application for tag recommendation in folksonomy [Pujari & al., 2011], [Pujari & al., 2012]

    ⇒ Application of our approach with community detection methods.

  • References I

    [Benchettara & al., 2010] N. Benchettara, R. Kanawati, C. Rouveirol. Supervised machine learning applied to link prediction in bipartite social networks. In International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, 2010.

    [Dwork & al., 2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation methods for the Web. WWW '01: Proceedings of the 10th international conference on World Wide Web, pages 613-622, 2001.

    [Kumar & al., 2001] C. Dwork, R. Kumar, M. Naor, D. Sivakumar. Rank aggregation revisited. Manuscript, 2001.

    [Katz, 1953] L. Katz. A new status index derived from sociometric analysis. Psychometrika, Vol. 18, pages 39-43, 1953.

    [Borda, 1781] J. C. Borda. Mémoire sur les élections au scrutin. Histoire de l’Académie Royale des Sciences, 1781.

    [Fagin & al., 2003] R. Fagin, R. Kumar, D. Sivakumar. Efficient similarity search and classification via rank aggregation. Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 301-312, New York, 2003.

    [Sculley, 2007] D. Sculley. Rank Aggregation for Similar Items. Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis, Minnesota, USA.

    [Adamic & al., 2003] L. A. Adamic, O. Buyukkokten, E. Adar. A social network caught in the web. First Monday, Vol. 8, No. 6, June 2003.

    [Bisson & al., 2008] G. Bisson, F. Hussain. χ-Sim: A new similarity measure for the co-clustering task. In Seventh International Conference on Machine Learning and Applications (ICMLA), IEEE Computer Society, pages 211-217, 2008.

    [Mrosek & al., 2009] J. Mrosek, S. Bussmann, H. Albers, K. Posdziech, B. Hengefeld, N. Opperman, S. Robert, G. Spirar. Content- and graph-based tag recommendation: Two variations. ECML PKDD Discovery Challenge 2009 (DC09), 497, pages 189-199. Bled, Slovenia, CEUR Workshop Proceedings, September 2009.

  • References II

    [Lipczak, 2008] M. Lipczak. Tag recommendation for folksonomies oriented towards individual users. In Proceedings of ECML PKDD Discovery Challenge (RSDC08), pages 84-95, 2008.

    [Liben-Nowell & al., 2007] David Liben-Nowell and Jon Kleinberg. The link prediction problem for social networks. In Proceedings of the 16th international conference on World Wide Web, pages 481-490, New York, USA, 2007.

    [Hasan & al., 2006] Mohammad Al Hasan, Vineet Chaoji, Saeed Salem, and Mohammed Zaki. Link prediction using supervised learning. SIAM Workshop on Link Analysis, Counterterrorism and Security with SIAM Data Mining Conference, 2006.

    [Acar & al., 2009] Evrim Acar, Daniel M. Dunlavy, and Tamara G. Kolda. Link Prediction on Evolving Data Using Matrix and Tensor Factorizations. ICDM Workshops, pages 262-269, 2009.

    [Lahiri & al., 2007] Mayank Lahiri and Tanya Y. Berger-Wolf. Structure Prediction in Temporal Networks using Frequent Subgraphs. CIDM, pages 35-42, 2007.

    [Pujari & al., 2011] Manisha Pujari and Rushed Kanawati. Supervised machine learning link prediction approach for tag recommendation. 4th International Conference on Online Communities and Social Computing @ HCI International, Orlando, Florida, 9-14 July 2011.

    [Pujari & al., 2012] Manisha Pujari and Rushed Kanawati. Tag recommendation by link prediction based on supervised machine learning. Sixth International AAAI Conference on Weblogs and Social Media (ICWSM 2012), 5-8 June 2012, Dublin.
