revisiting the multi-criteria recommender system of a learning portal
DESCRIPTION
Presentation of paper for Recommender Systems in Technology Enhanced Learning (RecSysTEL) workshop, ECTEL'12, Saarbruecken, Germany
TRANSCRIPT
Revisiting the Multi-Criteria Recommender System of a
Learning Portal
Nikos Manouselis (1), Giorgos Kyrgiazos (2), Giannis Stoitsis (1)
(1) Agro-Know Technologies, (2) CTI
@RecSysTEL’12, Saarbruecken, 19/9/12
our nice portal
collected data
[Figure: Organic.Edunet social data schema. Entities and attributes: item (URL); ratings (Dimension, Value); reviews (Value, Date); user (id, Name*, Email*); tags (Value, Date)]
current service
• recommendation of potentially interesting learning resources to users
– not very “loud”
• one recommendation algorithm based on collaborative filtering
– rating history
– neighborhood-based
– multi-attribute over 3 criteria [Subject Relevance, Educational Usefulness, Metadata]
– parameters defined & hard-coded
issues
• lots of parameters could be different
– selected recommendation methods
– neighborhood size
– similarity measures
• parameterization took place using a similar dataset [but not the same]
– EUN’s Learning Resource Exchange (MELT) multi-attribute ratings dump
• Organic.Edunet’s user/content base continuously evolves
in the year 2007…
problem outline
• How do we know that the selected algorithm is still(?) good for the given portal?
– specific rating dimensions (criteria)
– selected parameterization
– alternative algorithms
– specific dataset & its expected evolution
experiment
approach
• carry out same experiment: simulation of how multi-attribute collaborative filtering algorithms perform
– real data from Organic.Edunet users
– simulated/synthetic data from expected future scenario (when more ratings will be provided)
– base algorithms from 2007 vs. additional/alternative algorithms
real data from Organic.Edunet
• 477 ratings
– 99 users (only 0.02% of registered ones)
– 345 items (only 0.03% of indexed resources)
simulated/synthetic data
• used Monte Carlo simulator to generate more ratings of the same users
– 1,280 ratings
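The slides do not say how the Monte Carlo simulator works, so the following is only a rough sketch of one way such a generator could look. It assumes that new ratings go to randomly chosen user-item pairs that are still unrated, and that each criterion value is drawn from the values already observed for that criterion. The function name, the data layout, and the sampling strategy are all illustrative assumptions, not the simulator used in the experiment.

```python
import random

def generate_synthetic_ratings(real, n_new, seed=0):
    """Draw n_new extra multi-criteria ratings for the users already in `real`.

    real: dict mapping (user, item) -> tuple of criterion values.
    Assumption: values are sampled from the empirical per-criterion pools.
    """
    rng = random.Random(seed)
    users = sorted({user for user, _ in real})
    items = sorted({item for _, item in real})
    n_criteria = len(next(iter(real.values())))
    # One pool of observed values per criterion.
    pools = [[values[g] for values in real.values()] for g in range(n_criteria)]
    # Only user-item pairs without a real rating are candidates.
    free_pairs = [(u, i) for u in users for i in items if (u, i) not in real]
    if n_new > len(free_pairs):
        raise ValueError("not enough unrated user-item pairs")
    synthetic = {}
    for pair in rng.sample(free_pairs, n_new):
        synthetic[pair] = tuple(rng.choice(pool) for pool in pools)
    return synthetic

# Hypothetical toy input: three real ratings over three criteria.
REAL = {
    ("ann", "r1"): (5, 4, 3),
    ("ann", "r2"): (3, 2, 4),
    ("bob", "r3"): (4, 4, 2),
}
EXTRA = generate_synthetic_ratings(REAL, n_new=2)
```

Fixing the seed keeps a simulated dataset reproducible, which matters when many algorithm variations are compared over the same data.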
2007 base algorithms
• Manouselis & Costopoulou (2006; 2007)
• classic neighborhood-based collaborative filtering
– extended for multi-criteria ratings
– prediction per criterion (PG)
– many parameters open for tweaking/experimentation
• different algorithm variations
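A minimal sketch of the per-criterion prediction idea, under several assumptions that are not stated on the slides: cosine similarity over co-rated items, a deviation-from-mean prediction, a fixed maximum number of neighbors, and a total prediction taken as the plain average of the per-criterion predictions. All names and the toy data are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: RATINGS[user][item] = [criterion 0, criterion 1, criterion 2]
RATINGS = {
    "ann": {"r1": [5, 4, 3], "r2": [3, 2, 4]},
    "bob": {"r1": [4, 4, 2], "r2": [2, 2, 5], "r3": [5, 3, 4]},
    "cat": {"r1": [1, 2, 5], "r2": [5, 4, 1], "r3": [2, 5, 1]},
}

def cosine_similarity(a, b, criterion, ratings):
    """Cosine similarity of two users on one criterion, over co-rated items."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if not common:
        return None  # no commonly evaluated items
    xs = [ratings[a][i][criterion] for i in common]
    ys = [ratings[b][i][criterion] for i in common]
    norm = sqrt(sum(x * x for x in xs)) * sqrt(sum(y * y for y in ys))
    return sum(x * y for x, y in zip(xs, ys)) / norm if norm else None

def user_mean(user, criterion, ratings):
    values = [r[criterion] for r in ratings[user].values()]
    return sum(values) / len(values)

def predict_criterion(active, item, criterion, ratings, max_neighbors=2):
    """Deviation-from-mean prediction on one criterion; None if impossible."""
    sims = []
    for other in ratings:
        if other == active:
            continue
        sim = cosine_similarity(active, other, criterion, ratings)
        if sim is not None:
            sims.append((sim, other))
    # Neighborhood: the most similar users, then only those who rated the item.
    neighborhood = sorted(sims, reverse=True)[:max_neighbors]
    raters = [(sim, d) for sim, d in neighborhood if item in ratings[d]]
    weight = sum(abs(sim) for sim, _ in raters)
    if weight == 0:
        return None
    deviation = sum(
        sim * (ratings[d][item][criterion] - user_mean(d, criterion, ratings))
        for sim, d in raters
    )
    return user_mean(active, criterion, ratings) + deviation / weight

def predict_total(active, item, ratings, n_criteria=3):
    """Average of the per-criterion predictions (an assumed aggregation)."""
    partial = [predict_criterion(active, item, g, ratings) for g in range(n_criteria)]
    if any(p is None for p in partial):
        return None
    return sum(partial) / n_criteria
```

Calling `predict_total("ann", "r3", RATINGS)` yields a value on the rating scale, while an item nobody in the neighborhood has rated yields `None`, which corresponds to the "impossible prediction" outcome.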
[Flowchart: multi-criteria neighborhood-based prediction for an active user α and a target learning object x. For each criterion gi, the usefulness values of the active user and of every other user c are loaded, the y learning objects evaluated by both are detected and, if y>0, a similarity factor between the active user and user c is calculated, weighted and processed. A neighborhood D is then selected; if D is empty, or no neighbor d has evaluated x, the result is “Impossible prediction”. Otherwise the partial usefulness ui d(x) of each neighbor that evaluated x is normalized, the prediction ui α(x) is calculated, and the result is the prediction Uα(x).]
additional/alternative algorithms
• Adomavicius & Kwon (2007)
• similar approach, neighborhood-based collaborative filtering extended for multi-criteria ratings
– prediction weighted by the average (AS) or minimum (WS) similarity per criterion
– same parameters open for tweaking/experimentation
• different algorithm variations
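The difference between the two variants can be sketched as follows: the per-criterion similarities between two users are collapsed into a single similarity, either by averaging (AS) or by taking the minimum (WS), and that single value weights the neighbor's rating. This is an illustrative sketch with hypothetical function names and numbers, and the similarity-weighted mean used for the prediction is an assumption.

```python
def aggregate_similarity(per_criterion_sims, mode="average"):
    """Combine per-criterion similarities into one user-user similarity."""
    if not per_criterion_sims:
        raise ValueError("need at least one per-criterion similarity")
    if mode == "average":
        return sum(per_criterion_sims) / len(per_criterion_sims)
    if mode == "minimum":
        return min(per_criterion_sims)
    raise ValueError(f"unknown mode: {mode}")

def weighted_prediction(neighbor_ratings):
    """Similarity-weighted mean of neighbors' ratings; None if no usable neighbor.

    neighbor_ratings: list of (aggregated similarity, neighbor's rating) pairs.
    """
    total_weight = sum(sim for sim, _ in neighbor_ratings)
    if total_weight <= 0:
        return None
    return sum(sim * rating for sim, rating in neighbor_ratings) / total_weight

# Hypothetical per-criterion similarities of two neighbors to the active user.
SIMS_BOB = [0.9, 0.8, 0.4]
SIMS_CAT = [0.6, 0.6, 0.6]

avg_pred = weighted_prediction([
    (aggregate_similarity(SIMS_BOB, "average"), 4),
    (aggregate_similarity(SIMS_CAT, "average"), 2),
])
min_pred = weighted_prediction([
    (aggregate_similarity(SIMS_BOB, "minimum"), 4),
    (aggregate_similarity(SIMS_CAT, "minimum"), 2),
])
```

With the minimum, a neighbor who disagrees strongly on a single criterion loses influence, so the two modes can rank the same neighbors differently.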
overall experiment setting
• 18 variations of each examined algorithm (PG, AS, WS)
– plus some base non-personalised ones
• various values for parameters defining the neighborhood size
-> over 1,080 algorithmic variations executed and compared over each dataset
results: real dataset
results: synthetic dataset
best over both
Algorithm  Similarity  Normalization method  AVG Coverage  AVG MAE
MNN variations
PG         Cosine      Deviation-from-Mean   61.33%        0.8855
PG         Euclidean   Simple Mean           61.33%        0.8626
CWT variations
PG         Cosine      Deviation-from-Mean   57.91%        0.8908
PG         Cosine      Simple Mean           57.91%        0.8673
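The two metrics in the table can be computed roughly as below. This sketch uses the usual definitions, which are assumed here since the slides do not spell out the evaluation protocol: coverage is the share of test ratings for which a prediction could be produced, and MAE is the mean absolute error over those covered ratings. The function name and sample values are hypothetical.

```python
def mae_and_coverage(predictions, actuals):
    """Return (MAE, coverage) for a set of held-out ratings.

    predictions: dict key -> predicted value, or None when no prediction was possible.
    actuals: dict key -> true rating for the same keys.
    """
    covered = [
        (predictions[key], actuals[key])
        for key in actuals
        if predictions.get(key) is not None
    ]
    coverage = len(covered) / len(actuals) if actuals else 0.0
    mae = sum(abs(p - a) for p, a in covered) / len(covered) if covered else None
    return mae, coverage

# Hypothetical held-out ratings and predictions.
ACTUALS = {("ann", "r3"): 5, ("bob", "r4"): 3, ("cat", "r5"): 2, ("cat", "r6"): 3}
PREDICTIONS = {("ann", "r3"): 4.0, ("bob", "r4"): None, ("cat", "r5"): 2.5, ("cat", "r6"): 3.0}

mae, coverage = mae_and_coverage(PREDICTIONS, ACTUALS)
```

The two numbers have to be read together: a variation can reach a low MAE simply by declining to predict the hard cases, which shows up as low coverage.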
2007:
implementation implications
• based on existing dataset and the foreseen future scenario
– keep same algorithm (PG) for recommendation service
– adapt selection of options and their parameterization
– “actual” performance (vs. 2007) is probably worse
conclusions
lessons learnt
• after 2 years of service operation
– tried to repeat an offline experimental simulation
– candidate multi-criteria recommendation algorithms
– data from real usage vs. synthetic data
• feeling better about algorithm choice
– some insight into expected performance
– no real impact on the actual service
to explore
• would be interesting to experiment with more future scenarios
– make various estimations/projections about dataset size and sparseness
– execute algorithms over synthetic datasets simulating these projections
• would be interesting to make a service that is really used
– get more ratings, on more items
– provide visible recommendations
– measure impact on search/discovery behaviour
up & beyond
experiments beyond a single dataset
• combining data from various sources to boost the way recommenders work
• design algorithms that could provide cross-border recommendations
• provide many parallel/cascading/competing options for recommendation algorithms
• not really having to care about data size & storage
a social data infrastructure for learning
[Figure: architecture diagram. Several portals each expose their metadata and social data through an API; metadata per URI and anonymised social data are brought together (aggregation of metadata, social and usage data), supported by resolution services, and feed federated recommendation services.]
www.opendiscoveryspace.eu
challenges
• define common metadata schema(s)
• aggregate (e.g. harvest/crawl) social data
• transform each social data schema
• URI resolution
• scalability
• anonymised approach
• …
thank you!
[email protected]
http://wiki.agroknow.gr http://www.organic-edunet.eu