supervised rank aggregation approach for link prediction in...

Outline Link Prediction Supervised Rank Aggregation based Link Prediction Experiment Conclusion

Supervised Rank Aggregation Approach for LinkPrediction in Complex Networks

Manisha Pujari & Rushed Kanawati

LIPN - UMR CNRS 7030Université Paris Nord

99 Av. J.B. Clement 93430, Villetaneuse, [email protected]

16 April, 2012

Mining Social Network Dynamics Workshop

WWW-2012, Lyon,France

M.Pujari & R.Kanawati Supervised Rank Aggregation Approach for Link Prediction 1/22


1 Link Prediction

2 Supervised Rank Aggregation based Link Prediction

3 Experiment

4 Conclusion



Problem

Link Prediction

Predicting new links between nodes of a graph.

Applications

Recommender systems

Academic/Professionalcollaborations

Identification of structures ofcriminal networks

Biological networks



Link Prediction Approaches

Dyadic: Computation of link score for unlinked vertices

Structural: Mining rules for evolution of sub-graphs

Topology based: Attributes computed for graph

Node-feature based: Attributes computed for nodes

Hybrid : Combination of the two

Temporal: Consider dynamics of the networks

Static: Do not consider the dynamics of a network



Dyadic Topological Approaches

Work of [Liben-Nowell & al.,2007]

Prediction on a co-authorship network.

For each unlinked node pair (u, v) , compute a set of topologicalattributes [A1,A2, ...,An].

Rank all (u, v) based on attribute values.

Considering only top k ranked edges as predicted edges,performance of each attribute is found.

Attributes :

Neighborhood-based attributes: Jaccard’s coefficient,Commonneighbors,Adamic/Adar [Adamic & al.2003], Preferentialattachment etc.

Distance-based attributes: Shortest path distance,Katz [Katz,1953],Maximum forest algorithm etc.

Centrality-based attributes: PageRank, Degree centrality, Clusteringcoefficient etc.



Dyadic Topological Approaches

Combining the effect of different topological measures: Application ofsupervised machine learning algorithms[Benchettara & al.,2010],[Hasan & al., 2006]

Examples:

(Nodex ,Nodey ) −→[a0, a1, a2, ...., an]

Can we apply rank aggregation methods ?



Rank Aggregation (Social choice theory)

⇒ To find an aggregated list with minimum possible disagreement⇒ Equal weight to all experts

Expert1 =⇒ L1 = [A,B,C ,D]Expert2 =⇒ L2 = [B,D,A,C ]Expert3 =⇒ L3 = [C ,D,A,B]

...

...

...Expertn =⇒ Ln = [D,C ,A,B]

———————————————Laggregate = [?, ?, ?, ?]

Types of input lists

Full/Complete lists

Partial lists

Disjoint lists



Borda’s Method [Borda, 1781]

Based on absolute positioning of elements

BLk (i) = {count(j)|Lk (j) < Lk (i)&j ∈ Lk} ; B(i) =k∑

t=1

BLt (i) (1)



Kemeny Optimal Aggregation [Dwork & al.,2001]

Based on relative ranking of elements

SK (π, L1, L2, L3, ....., Ln) =∑

i∈[1,n]

K (π, Li ) (2)



Supervised Rank Aggregation

Combining different rankings to get an aggregation giving differentweights to the experts

⇒ Proposed approachesSupervised Borda

Supervised local Kemeny

w1 ← Expert1 =⇒ L1 → [k elements]w2 ← Expert2 =⇒ L2 → [k elements]w3 ← Expert3 =⇒ L3 → [k elements]

...

...

...wn ← Expertn =⇒ Ln → [k elements]



Supervised Borda Method

Borda score

B(i) =n∑

t=1

wi ∗ BLt (i) ; where t ∈ [1, k] (3)



Supervised Local Kemeny Aggregation

Steps:

1 L = [L1, L2, . . . , Ln], [w1,w2, . . . ,wn] , m elements(U)

2 Initialize m ×m matrix M with M(x , y) = 03 ∀(x , y) ∈ U, Compute

score(x , y) =∑n

i=1(wi ∗ (x � y)) where

x � y =

{0 if Li (x) < Li (y)

1 if Li (x) > Li (y)

4 If score(x , y) > 0.5 ∗∑n

i=1 wi , Insert M(x , y) = true andM(y , x) = false

5 Initial aggregation R = L16 For x , y ∈ R, Swap(x , y) if M(x , y) = false7 R is the final aggregation.



Supervised Local Kemeny Aggregation



Link Prediction based on Supervised Rank Aggregation

Examples: (Nodex ,Nodey ) −→ [a0, a1, a2, ...., an]

Steps:

1 Rank learning examples by attribute values

2 Consider only top k examples and compute attribute weight wai3 Rank test examples by attribute to get n ranked lists

4 Apply supervised rank aggregation

5 Consider only top k examples of the aggregate list and computeperformance.



Link Prediction Based on Supervised Rank Aggregation

Computation of attribute weights:

Maximization of positive precision:

Wai = n ∗ Precisionai (4)

Minimization of false positive rate:

Wai =n

FPRai(5)



Experiment

DBLP database

DatasetsTraining Validation Training examples Test examples

Time Time Positive Total Positive TotalDataset 1 [1970-1975] [1971-1976] 30 1693 41 3471Dataset 2 [1972-1977] [1973-1978] 87 19332 82 18757Dataset 3 [1974-1979] [1975-1980] 102 35190 164 60046

Table: DBLP Datasets

Performance measure:

F =Precision ∗ RecallPrecision + Recall

(6)



Results

Experiment-1 : Performance based on learning on complete trainingdataset



Results

Experiment-2 : Performance in terms of precision based on learning onsamples of training dataset

P is the number of positive examples and N is the number of negative

examples. In any sample, N = m ∗ P where m is any positive integer.M.Pujari & R.Kanawati Supervised Rank Aggregation Approach for Link Prediction 19/22


Results

Experiment-3 : Performance based on learning on complete trainingdataset and validation on test sample (N = 5 ∗ P).

DatasetsTraining examples Test examplesPositive Total Positive Total

Dataset 1 30 1693 41 246Dataset 2 87 19332 82 492Dataset 3 102 35190 164 984



Conclusion

A new definition for supervised rank aggregation.

Application of supervised rank aggregation to link prediction.

Future work:

⇒ Validation on other types of networks like e-commerce networks.⇒ Application for tag recommendation in folksonomy

[Pujari & al., 2011],[Pujari & al., 2012]⇒ Application of our approach with community detection methods.



References I

[Benchettara & al.,2010] N. Benchettara, R. Kanawati, C. Rouveirol. Supervised machine learning applied to link prediction in

bipartite social networks . In International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, 2010.

[Dwork & al.,2001] C. Dwork, R. Kumar, M. Naor, D.Sivakumar. Rank Aggregation method for Web. WWW 01: Proceedings of

10th international conference on World Wide Web, pages 613-622 (2001).

[Kumar & al., 2001] C. Dwork, R. Kumar, M. Naor, D.Sivakumar. Rank aggregation revisited . Manuscript, 1953.

[Katz,1953] L.Katz. A new status index derived from socimetric analysis. (article) . Vol. 18, pages 39-43, 2001.

[Borda, 1781] J.C.Borda FAG03 Mémoire sur les élections au Scrutin . Histoire de l’Académie Royale des Sciences,1781 .

[Fagin & al., 2003] R.Fagin, R.Kumar,D. Sivakumar. Efficient similarity search and classification via rank aggregation.

Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 301-312, 2003, New York.

[Sculley,2007]D.Sculley. Rank Aggregation for Similar Items. Proceedings of the Seventh SIAM International Conference on Data

Mining, April 26-28, 2007, Minneapolis, Minnesota, USA (2007)).

[Adamic & al.2003]L. A. Adamic, O. Buyukkokten, and E. Adar. A social network caught in the web. First Monday, Vol. 8, No.

6. (June 2003).

[Bisson & al., 2008] G. Bisson and F. Hussain. χ-Sim: A new similarity measure for the co- clustering task. In Seventh

International Conference on Machine Learning and Application(ICMLA), IEEE Computer Society (2008) , pages 211-217.

[Mrosek & al.,2009]J. Mrosek, S. Bussmann, H. Albers, K. Posdziech, B. Hengefeld, N. Opperman, S. Robert and G. Spirar.

Content- and graph-based tag recommendation: Two variatons. ECML PKDD Discovery Challenge 2009 DC09, 497, pages189-199. Bled, Slovenia, CEUR Workshop Proceedings, September 2009.



References II

[Lipczak, 2008]M.Lipczak. Tag recommendation for folksonomies oriented towards individual users . In Proceedings of ECML

PKDD Discovery Challenge (RSDC08) (2008), pages. 84-95.

[Liben-Nowell & al.,2007] David Liben-Nowell, and Jon Kleinberg The link prediction problem for social networks. In Proceedings

of the 16th international conference on World Wide Web), pages. 481-490,New York,USA,2007.

[Hasan & al., 2006] Mohammad Al Hasan, Vineet Chaoji, Saeed Salem, and Mohammed Zaki. Link prediction using supervised

learning. SIAM Workshop on Link Analysis, Counterterrorism and Security with SIAM Data Mining Conference, 2006.

[Acar & al.,2009]Evrim Acar, Daniel M. Dunlavy and Tamara G Kolda. Link Prediction on Evolving Data Using Matrix and

Tensor Factorizations.. ICDM Workshops, pages.262-269, 2009.

[Lahiri & al., 2007]Lahiri, Mayank and Berger-Wolf, Tanya Y. Structure Prediction in Temporal Networks using Frequent

Subgraphs. CIDM, pages 35-42,2007.

[Pujari & al., 2011]Manisha Pujari and Rushed Kanawati. Supervised machine learning link prediction approach for tag

recommandation. 4th International Conference on Online Communities and Social Computing @ HCI International,Orlando,Florida9-14 July 2011.

[Pujari & al., 2012]Manisha Pujari and Rushed Kanawati. Tag recommendation by link prediction based on supervised machine

learning. Sixth International AAAI Conference on Weblogs and Social Media (ICWSM 2012). 5-8 June 2012, Dublin..


Link PredictionSupervised Rank Aggregation based Link PredictionExperimentConclusion

supervised rank aggregation approach for link prediction in...

Documents