top-n recommendations from implicit feedback leveraging linked open data

36
RecSys 2013 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China Top-N Recommendations from Implicit Feedback leveraging Linked Open Data Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, Roberto Mirizzi [email protected], [email protected], [email protected], [email protected] Polytechnic University of Bari - Bari (ITALY)

Upload: vito-ostuni

Post on 11-May-2015

1.451 views

Category:

Health & Medicine


2 download

TRANSCRIPT

Page 1: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Top-N Recommendations from Implicit Feedback

leveraging Linked Open Data

Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, Roberto Mirizzi

[email protected], [email protected], [email protected], [email protected]

Polytechnic University of Bari - Bari (ITALY)

Page 2: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Outline

Introduction and motivation

SPrank: Semantic Path-based ranking Data model and Problem formulation

Path-based features

Learning the ranking function

Experimental Evaluation

Contributions and Conclusion

Page 3: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Linked Open Data

• Initiative for publishing and connecting data on the Web using Semantic Web technologies;

• >30 billion of RDF triples from hundreds of data sources;

• Semantic Web done right [ http://www.w3.org/2008/Talks/0617-lod-tbl/#(3) ]

Page 4: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Linked Open Data

• Initiative for publishing and connecting data on the Web using Semantic Web technologies;

• >30 billion of RDF triples from hundreds of data sources;

• Semantic Web done right [ http://www.w3.org/2008/Talks/0617-lod-tbl/#(3) ]

Page 5: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Hong Kong in DBpedia

db:Hong_Kong

db:thumbnail

8134 triples

subject predicate

object

Page 6: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Hong Kong in DBpedia

db:Hong_Kong

db:thumbnail

select * where { ?s dbpedia-owl:location <http://dbpedia.org/resource/Hong_Kong>. ?s dcterms:subject category:Skyscrapers_over_350_meters. }

Skyscrapers over 350 meters in Hong Kong?

Page 7: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Hong Kong in DBpedia

db:category:Skyscrapers_over_350_meters)

db:International_Commerce_centre

db:thumbnail

db:Central_Plaza_(Hong_Kong)

db:location

db:Hong_Kong

db:thumbnail

dcterms:subject

db:thumbnail

Page 8: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Motivation

Traditional Ontological/Semantic Recommender Systems:

• make use of limited domain ontologies;

• rely on explicit feedback data;

• address the rating prediction task.

Page 9: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Motivation

Traditional Ontological/Semantic Recommender Systems:

• make use of limited domain ontologies;

• rely on explicit feedback data;

• address the rating prediction task.

But…

• a lot of structured semantic data on the Web;

• Implicit feedback are easier to collect;

• Top-N Recommendations is a more realistic task.

Page 10: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Motivation

Traditional Ontological/Semantic Recommender Systems:

• make use of limited domain ontologies;

• rely on explicit feedback data;

• address the rating prediction task.

But…

• a lot of structured semantic data on the Web;

• Implicit feedback are easier to collect;

• Top-N Recommendations is a more realistic task.

Challenge:

• compute Top-N Item Recommendations from implicit feedback exploiting the Web of Data.

Page 11: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Our approach

• Usage of structured semantic data freely available on the Web (Linked Open Data) to describe items

DBpedia ontology

Page 12: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Our approach

• Analysis of complex relations between the user preferences and the target item (extraction of path-based features)

Page 13: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Our approach

• Analysis of complex relations between the user preferences and the target item (extraction of path-based features)

• Formalization of the Top-N Item recommendation problem from implicit feedback in a Learning To Rank setting

Page 14: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Data model

I1 i2 i3 i4

u1 1 1 0 0

u2 1 0 1 0

u3 0 1 1 0

u4 0 1 0 1

Implicit Feedback Matrix Knowledge Graph ^

S

Page 15: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Data model

Implicit Feedback Matrix Knowledge Graph ^

S

I1 i2 i3 i4

u1 1 1 0 0

u2 1 0 1 0

u3 0 1 1 0

u4 0 1 0 1

Page 16: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Data model

Implicit Feedback Matrix Knowledge Graph ^

S

I1 i2 i3 i4

u1 1 1 0 0

u2 1 0 1 0

u3 0 1 1 0

u4 0 1 0 1

Page 17: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Problem formulation

^

^

*

^*

{ | 1}

{ | 0}

, ( )

uiu

uiu

u u

D

ui

uiui u u

u

I i I s

I i I s

I I

x

TR x s i I I

Feature vector

Set of irrelevant items for u

Set of relevant items for u

Training Set

Sample of irrelevant items for u

Page 18: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Path-based features

1

# ( )( )

# ( )

uiui D

ui

d

path jx j

path d

path acyclic sequence of relations ( s , .. rl , .. rL )

Frequency of pathj in the sub-graph related to u ad i

• The more the paths, the more the item is relevant. • Different paths have different meaning. • Not all types of paths are relevant.

u3 s i2 p2 e1 p1 i1 (s, p2 , p1)

Page 19: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

u1

i1

u2

u3

i2

i3

e1

e3

e4

e2

e5

u4

i4

Path-based features

xu3i1 ?

Page 20: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

u1

i1

u2

u3

i2

i3

e1

e3

e4

e2

e5

u4

i4

Path-based features

path1 (s, s, s) : 1

Page 21: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

u1

i1

u2

u3

i2

i3

e1

e3

e4

e2

e5

u4

i4

Path-based features

path1 (s, s, s) : 2

Page 22: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

u1

i1

u2

u3

i2

i3

e1

e3

e4

e2

e5

u4

i4

Path-based features

path1 (s, s, s) : 2 path2 (s, p2, p1) : 1

Page 23: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

u1

i1

u2

u3

i2

i3

e1

e3

e4

e2

e5

u4

i4

Path-based features

path1 (s, s, s) : 2 path2 (s, p2, p1) : 2

Page 24: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

u1

i1

u2

u3

i2

i3

e1

e3

e4

e2

e5

u4

i4

Path-based features

path1 (s, s, s) : 2 path2 (s, p2, p1) : 2 path3 (s, p2, p3, p1) : 1

Page 25: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Path-based features

path1 (s, s, s) : 2 path2 (s, p2, p1) : 2 path3 (s, p2, p3, p1) : 1

3 1

3 1

3 1

2(1)

5

2 (2)

5

1 (3)

5

u i

u i

u i

x

x

x

u1

i1

u2

u3

i2

i3

e1

e3

e4

e2

e5

u4

i4

Page 26: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Learning the ranking function

Point-wise Learning To Rank

• Simplest LTR technique

• Very effective in practice (Yahoo! Learning to Rank Challenge best

solution was extremely randomized trees in a standard regression setting)

: Df Learn a prediction function ^

( ) uiuif x ss.t.

Assumption: if f is accurate, then the ranking induced by f should be close to the desired ranking

Page 27: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

BagBoo

BagBoo: a scalable hybrid bagging-the-boosting model [D. Pavlov, A. Gorodilov, C. Brunk CIKM2010]

• Combination of Random Forest (Bagging) and Gradient Boosted Regression Trees (Boosting)

• Combines the high accuracy of gradient boosting with the resistance to overfitting of random forests

For b=1 to B:

fb learn GBRT from Tb

bT TR

1

1 B

b

b

f fB

Page 28: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Evaluation Methodology

• Top-N Item recommendation task

• Evaluation methodology similar to: [Cremonesi, Koren and Turrin, RecSys 2010]

• Evaluation with different user profile size: User

profile Test Set

10

5

User profile Test Set 10

User profile Test Set

……

given 5

given 10

given All

Page 29: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Datasets

Subset of Movielens mapped to DBpedia 3,792 users 2,795 movies 104,351 entities

Subset of Last.fm mapped to DBpedia

852 users 6,256 artists 150,925 entities

Mappings http://sisinflab.poliba.it/mappingdatasets2dbpedia.zip

Page 30: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Evaluation of different ranking functions

0

0,1

0,2

0,3

0,4

0,5

0,6

given 5 given 10 given 20 given 30 given 50 given All

reca

ll@5

user profile size

Movielens

BagBoo

GBRT

Sum

Page 31: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Evaluation of different ranking functions

0

0,1

0,2

0,3

0,4

0,5

0,6

given 5 given 10 given 20 given All

reca

ll@5

user profile size

Last.fm

BagBoo

GBRT

Sum

Page 32: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Comparative approaches

• BPRMF, Bayesian Personalized Ranking for Matrix Factorization

• BPRLin, Linear Model optimized for BPR (Hybrid alg.)

• SLIM, Sparse Linear Methods for Top-N Recommender Systems

• SMRMF, Soft Margin Ranking Matrix Factorization

MyMediaLite

Page 33: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Comparison with other approaches

0

0,1

0,2

0,3

0,4

0,5

0,6

given 5 given 10 given 20 given 30 given 50 given All

reca

ll@5

user profile size

Movielens

SPrank

BPRMF

SLIM

BPRLin

SMRMF

Page 34: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Comparison with other approaches

0

0,1

0,2

0,3

0,4

0,5

0,6

given 5 given 10 given 20 given All

reca

ll@5

user profile size

Last.fm

SPrank

BPRMF

SLIM

BPRLin

SMRMF

Page 35: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Contributions

SPrank: Semantic Path-based ranking

Combination of semantic item descriptions from the Web of Data and implicit feedback

Mining of the semantic graph using path-based features

Learning To Rank setting

Future Work:

Deeper analysis of the path-based features

Usage of other Learning To Rank approaches

Page 36: Top-N Recommendations from Implicit Feedback leveraging Linked Open Data

RecSys 2013 – 7th ACM Conference on Recommender Systems October 12-16, 2013 Hong Kong, China

Q & A

A Little Semantics Goes a Long Way.

Hendler Hypothesis