Factorization Machines Leveraging Lightweight Linked Open Data-enabled Features for Top-N Recommendations

Guangyuan Piao, John G. Breslin
Insight Centre for Data Analytics, National University of Ireland Galway

The 18th International Conference on Web Information Systems Engineering (WISE 2017)
Moscow, Russia, 7-10th October


Page 2

Background

Linked Open Data (LOD) provides domain knowledge and rich information about items

content-based recommender systems

[source]: http://lod-cloud.net

Page 3

Background

Linked Open Data (LOD) provides domain knowledge and rich information about items

[source]: http://lod-cloud.net

DBpedia knowledge base
•  1st class citizen in LOD cloud
•  Structured information from Wikipedia
•  4.58 million things
    •  1,445,000 persons, 87,000 films etc.

Page 4

Background Knowledge from DBpedia

[Figure: excerpt of the DBpedia graph, with category labels Chase_films, Auto_racing_films, …]

Page 5

Background Knowledge from DBpedia

•  Knowledge is represented as SPO triples
    •  SPO: Subject → Property → Object
•  Knowledge is freely accessible via a public SPARQL Endpoint

[Figure: an example triple with the property musicComposer, its parts labelled (Subject), (Property), (Object)]
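
The SPO representation and the two kinds of endpoint lookup can be sketched as follows. This is a minimal illustration, not the authors' code: the `Triple` type, the function names, and the endpoint URL constant are assumptions, and the example triple is one reading of the dbr:The_Godfather figure shown on later slides. No network request is made; the functions only build SPARQL query text.

```python
from typing import NamedTuple

# Assumed address of DBpedia's public SPARQL endpoint; it is not queried here.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


class Triple(NamedTuple):
    """One Subject -> Property -> Object statement."""
    subject: str
    prop: str
    obj: str


def outgoing_query(resource_uri: str) -> str:
    """SPARQL text selecting every (property, object) pair of a subject."""
    return f"SELECT ?p ?o WHERE {{ <{resource_uri}> ?p ?o }}"


def incoming_query(resource_uri: str) -> str:
    """SPARQL text selecting every (subject, property) pair pointing at an object."""
    return f"SELECT ?s ?p WHERE {{ ?s ?p <{resource_uri}> }}"


# Illustrative triple (assumed reading of the slide figure).
example = Triple("dbr:The_Godfather", "dbo:director", "dbr:Francis_Ford_Coppola")
godfather_uri = "http://dbpedia.org/resource/The_Godfather"
po_query = outgoing_query(godfather_uri)
sp_query = incoming_query(godfather_uri)
```

Sending either query string to the endpoint would return the rows that the later slides call the Property-Object and Subject-Property lists of an item.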

Page 6

(Some) Related Work

•  Semantic Similarity/Distance Measures
    •  [Passant et al. ISWC’10, AAAI’10]
    •  [Piao et al. SAC’16]
•  Graph-based algorithms such as PageRank
    •  [Musto et al. UMAP’16]
    •  [Nguyen et al. WWW’15]
•  Machine learning approaches
    •  [Noia et al. RecSys’12], VSM + SVM classifier
    •  [Noia et al. TIST’16], semantic paths + learning-to-rank (SPRank)

Page 7

(Some) Related Work

[Diagram: user-item interactions + item background knowledge (SPARQL Endpoint) → build a graph → extract features → feed to algorithms]

Page 8

Combined Graph

[Figure: graph combining user-item interactions with item background knowledge, with category nodes such as Chase_films, …]

[Diagram: user-item interactions + item background knowledge (SPARQL Endpoint) → build a graph → extract features → feed to algorithms]

Page 9

Proposed Approach: Features

•  Using lightweight LOD features from DBpedia
    •  lightweight: directly obtained via SPARQL Endpoint
•  Lightweight LOD features
    •  Property-Object list (PO)

[Figure: DBpedia neighbourhood of dbr:The_Godfather. Nodes: dbr:Carlo_Savina, dbr:Francis_Ford_Coppola, dbr:The_Godfather_Returns, dbc:Gangster_films. Edge labels: dbo:knownFor, dbo:series, dbo:director, dc:subject]

[Diagram: user-item interactions + item background knowledge (SPARQL Endpoint) → feed to algorithms]

Page 10

Proposed Approach: Features

•  Using lightweight LOD features from DBpedia
    •  lightweight: directly obtained via SPARQL Endpoint
•  LOD features
    •  Property-Object list (PO)
    •  Subject-Property list (SP)

[Figure: DBpedia neighbourhood of dbr:The_Godfather. Nodes: dbr:Carlo_Savina, dbr:Francis_Ford_Coppola, dbr:The_Godfather_Returns, dbc:Gangster_films. Edge labels: dbo:knownFor, dbo:series, dbo:director, dc:subject]

[Diagram: user-item interactions + item background knowledge (SPARQL Endpoint) → feed to algorithms]

Page 11

Proposed Approach: Features

•  Using lightweight LOD features from DBpedia
    •  lightweight: directly obtained via SPARQL Endpoint
•  LOD features
    •  Property-Object list (PO)
    •  Subject-Property list (SP)
    •  PageRank score (PR)

[Figure: DBpedia neighbourhood of dbr:The_Godfather. Nodes: dbr:Carlo_Savina, dbr:Francis_Ford_Coppola, dbr:The_Godfather_Returns, dbc:Gangster_films. Edge labels: dbo:knownFor, dbo:series, dbo:director, dc:subject]

[Diagram: user-item interactions + item background knowledge (SPARQL Endpoint) → feed to algorithms]
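
A small in-memory sketch of how the PO and SP lists of an item could be read off a set of triples. The triple directions below are an assumed reading of the figure (outgoing edges for dbo:director and dc:subject, incoming edges for dbo:knownFor and dbo:series), and the function names are hypothetical. The PageRank score (PR) is a single number per item and is not computed here.

```python
# Assumed triples for the dbr:The_Godfather example: (subject, property, object).
TRIPLES = [
    ("dbr:The_Godfather", "dbo:director", "dbr:Francis_Ford_Coppola"),
    ("dbr:The_Godfather", "dc:subject", "dbc:Gangster_films"),
    ("dbr:Carlo_Savina", "dbo:knownFor", "dbr:The_Godfather"),
    ("dbr:The_Godfather_Returns", "dbo:series", "dbr:The_Godfather"),
]


def po_list(item, triples):
    """Property-Object pairs of triples in which the item is the subject."""
    return [(p, o) for s, p, o in triples if s == item]


def sp_list(item, triples):
    """Subject-Property pairs of triples in which the item is the object."""
    return [(s, p) for s, p, o in triples if o == item]


godfather_po = po_list("dbr:The_Godfather", TRIPLES)
godfather_sp = sp_list("dbr:The_Godfather", TRIPLES)
```

Both lists need only one lookup per item, which is the sense in which the slides call these features lightweight: no combined graph has to be built or traversed.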

Page 12

Proposed Approach: Algorithms

•  Factorization Machines (FMs)
•  Optimization: Bayesian Personalized Ranking (BPR)
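
The formulas on this slide did not survive in the transcript. As a rough sketch of the two named ingredients, the code below uses the standard second-order factorization machine score and the usual BPR pairwise loss, -ln σ(ŷ(x⁺) − ŷ(x⁻)), without regularization. Function and parameter names are illustrative assumptions, and this is not the authors' implementation; `m` denotes the dimensionality of the latent vectors.

```python
import math


def fm_score(x, w0, w, V):
    """Second-order FM score for a sparse input.

    x  : dict mapping feature index -> value (non-zero entries only)
    w0 : global bias
    w  : list of per-feature weights
    V  : list of per-feature latent vectors, each of length m
    """
    linear = sum(w[i] * xi for i, xi in x.items())
    m = len(V[0])
    pairwise = 0.0
    for f in range(m):
        # Linear-time identity: sum_{i<j} v_if v_jf x_i x_j
        #   = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def bpr_loss(score_pos, score_neg):
    """-ln(sigmoid(score_pos - score_neg)), written to avoid overflow."""
    d = score_pos - score_neg
    if d >= 0:
        return math.log1p(math.exp(-d))
    return -d + math.log1p(math.exp(d))


# Toy check: two active features (indices 0 and 2), m = 2.
toy_score = fm_score({0: 1.0, 2: 1.0}, 0.1, [0.2, 0.0, 0.3],
                     [[1.0, 0.0], [0.0, 0.0], [0.5, 2.0]])
```

Minimizing the BPR loss pushes the score of an item the user interacted with above the score of a sampled item the user did not interact with, which matches the top-N ranking goal better than fitting the 0/1 target directly.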

Page 13

Proposed Approach

•  Overall features for Factorization Machines

      Feature vector x                                       Target y
      user     item     PO           SP         PR
x1    1 0 …    1 0 …    0.2 0.2 …    0.1 0 …    0.1          1
x2    0 1 …    0 1 …    0.3 0.5 …    0 0.3 …    0.2          0
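
One way to assemble such a feature vector is sketched below. It one-hot encodes the user and the item, marks which PO and SP entries the item has, and appends the PageRank score. How the authors scale the PO and SP values is not stated in the slides; dividing each block so that it sums to 1 is an assumption, as are all function names and the toy vocabularies.

```python
def one_hot(index, size):
    """Return a list of `size` zeros with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def normalized_block(present, vocabulary):
    """Mark the vocabulary entries the item has, scaled to sum to 1 (assumption)."""
    hits = [1.0 if entry in present else 0.0 for entry in vocabulary]
    total = sum(hits)
    return [h / total for h in hits] if total else hits


def build_feature_vector(user_idx, n_users, item_idx, n_items,
                         po, po_vocab, sp, sp_vocab, pagerank):
    """Concatenate the user, item, PO, SP and PR blocks into one vector."""
    return (one_hot(user_idx, n_users)
            + one_hot(item_idx, n_items)
            + normalized_block(po, po_vocab)
            + normalized_block(sp, sp_vocab)
            + [pagerank])


# Toy vocabularies (hypothetical): 2 users, 2 items, 2 PO entries, 1 SP entry.
po_vocab = [("dbo:director", "dbr:Francis_Ford_Coppola"),
            ("dc:subject", "dbc:Gangster_films")]
sp_vocab = [("dbr:Carlo_Savina", "dbo:knownFor")]
x = build_feature_vector(0, 2, 0, 2, set(po_vocab), po_vocab,
                         set(sp_vocab), sp_vocab, 0.1)
```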

Page 14

Experimental Setup: Dataset

•  MovieLens dataset for LOD-enabled recommender systems
•  80% for training set, and 20% for test set
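
A minimal sketch of an 80/20 split over (user, item) interactions. The slides do not say whether the split is global or per user, nor which seed was used, so the global random shuffle, the seed, and the function name are assumptions.

```python
import random


def train_test_split(interactions, test_ratio=0.2, seed=42):
    """Shuffle the interactions and cut them into train and test parts."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    cut = int(round(len(shuffled) * (1.0 - test_ratio)))
    return shuffled[:cut], shuffled[cut:]


# Toy data: ten hypothetical (user, item) pairs.
toy_interactions = [(f"u{i % 3}", f"i{i}") for i in range(10)]
train, test = train_test_split(toy_interactions)
```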

Page 15

Experimental Setup: Evaluation Metrics

•  P@N: the precision at rank N
•  R@N: the recall at rank N
•  nDCG@N: normalized Discounted Cumulative Gain
•  MRR: Mean Reciprocal Rank
•  MAP: Mean Average Precision
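
The per-user versions of these metrics can be sketched as below, assuming binary relevance (an item is either in the user's test set or not). MRR and MAP are then the means of `reciprocal_rank` and `average_precision` over all users. Function names are illustrative.

```python
import math


def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n recommendations that are relevant."""
    return sum(1 for item in ranked[:n] if item in relevant) / n


def recall_at_n(ranked, relevant, n):
    """Fraction of the relevant items that appear in the top n."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:n] if item in relevant) / len(relevant)


def ndcg_at_n(ranked, relevant, n):
    """DCG of the top n divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:n]) if item in relevant)
    ideal = sum(1.0 / math.log2(pos + 2)
                for pos in range(min(n, len(relevant))))
    return dcg / ideal if ideal else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is ranked."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank holding a relevant item."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Toy ranking for one user (hypothetical item ids).
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
```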

Page 16

Experimental Setup: Compared Methods

•  PopRank: baseline approach
•  kNN-item: item-based k-nearest neighbors algorithm
•  BPRMF: matrix factorization with the BPR optimization
•  SPRank: learning-to-rank using semantic paths based on LOD
•  LODFM: our proposed approach
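
For orientation, a popularity baseline of the PopRank kind can be sketched in a few lines, assuming it simply recommends the items with the most training interactions. The function name, the optional `seen` filter, and the toy data are assumptions rather than details taken from the slides.

```python
from collections import Counter


def pop_rank(interactions, n, seen=frozenset()):
    """Return the n most frequently interacted-with items, skipping `seen`."""
    counts = Counter(item for _, item in interactions)
    ranked = [item for item, _ in counts.most_common() if item not in seen]
    return ranked[:n]


# Toy (user, item) training interactions.
toy_train = [("u1", "i1"), ("u2", "i1"), ("u3", "i1"),
             ("u1", "i2"), ("u2", "i2"), ("u3", "i3")]
top2 = pop_rank(toy_train, 2)
```

Because the ranking ignores who the user is, it serves as a non-personalized reference point for the other four methods.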

Page 17

Results

best tuned parameters: m=200, PO+PR

Page 18

Model Analysis: Features (m=10)

Page 19

Model Analysis: Dimensionality

Page 20

Model Analysis: Dimensionality

Page 21

Conclusions

•  LODFM provides state-of-the-art performance
•  Using FMs with lightweight LOD-enabled features
    •  directly obtained via a public SPARQL Endpoint of DBpedia
    •  without maintaining a graph and extracting features from it
•  Useful features: Property-Object list & PageRank
•  Future work
    •  investigate other lightweight LOD-enabled features
    •  evaluate on datasets from other domains

Page 22

Guangyuan Piao
e-mail: [email protected]
twitter: https://twitter.com/parklize
slideshare: http://www.slideshare.net/parklize