Lecture 4: Social Web Personalization (2012)
DESCRIPTION
This is the fourth lecture in the Social Web course at the VU University Amsterdam. Visit the website for more information: http://semanticweb.cs.vu.nl/socialweb2012/ Thanks to Fabian Abel for letting me adopt slides from his lectures.
TRANSCRIPT
Social Web, Lecture IV: Personalization on the Social Web
(some slides were adopted from Fabian Abel)
Lora Aroyo, The Network Institute
VU University Amsterdam
Monday, February 27, 2012
Personalization & Social Web
• Applications on the Social Web use web data [last week] & are ‘social’
• To design ‘social’ functionality we need to understand how, out of the data, the application can provide relevant information (what users perceive as relevant)
• Therefore we need to understand
  • how good personalization (recommenders) is
  • how good the user models they are based on are
• In this lecture we consider theory & techniques for how to design and evaluate recommenders and user models (for use in SW applications)

Total transparency: is it desired?
User Modeling
http://farm5.staticflickr.com/4038/4553496383_5b6a5f1485_o.jpg
How to infer & represent user information that supports a given application or context?

User Modeling Challenge
• Application has to obtain, understand & exploit information about the user
• Information (need & context) about user
• Inferring information about user & representing it so that it can be consumed by the application
• Data relevant for inferring information about user
User & Usage Data is everywhere
• People leave traces on the Web and on their computers:
  • Usage data, e.g. query logs, click-through data
  • Social data, e.g. tags, (micro-)blog posts, comments, bookmarks, friend connections
  • Documents, e.g. pictures, videos
  • Personal data, e.g. affiliations, locations
  • Products, applications, services: bought, used, installed
• Not only a user’s behavior, but also interactions of other users
  • “people can make statements about me”, “people who are similar to me can reveal information about me” --> “social learning”, collaborative recommender systems
UM: Basic Concepts
• User Profile: a data structure that represents a characterization of a user at a particular moment in time; it represents what, from a given (system) perspective, there is to know about a user. The data in the profile can be explicitly given by the user or derived by the system
• User Model: contains the definitions & rules for the interpretation of observations about the user and for the translation of that interpretation into the characteristics in a user profile; the user model is the recipe for obtaining and interpreting user profiles
• User Modeling: the process of representing the user
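The distinction between a user profile (data) and a user model (recipe) can be sketched in a few lines of Python. This is only an illustration, not something from the slides: the class names, the rule format and the observed actions are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Characterization of one user at one moment in time."""
    user_id: str
    explicit: dict = field(default_factory=dict)  # given by the user
    derived: dict = field(default_factory=dict)   # inferred by the system


class UserModel:
    """The 'recipe': rules that translate observations into profile characteristics."""

    def __init__(self, rules):
        # rules (hypothetical format): observed action -> (characteristic, increment)
        self.rules = rules

    def build_profile(self, user_id, explicit, observations):
        profile = UserProfile(user_id, explicit=dict(explicit))
        for action in observations:
            if action in self.rules:  # observations without a rule are ignored
                characteristic, increment = self.rules[action]
                profile.derived[characteristic] = (
                    profile.derived.get(characteristic, 0) + increment
                )
        return profile


# Hypothetical rules and observations, for illustration only.
model = UserModel({
    "clicked:tennis-article": ("interest:tennis", 1),
    "bookmarked:jazz-page": ("interest:jazz", 2),
})
profile = model.build_profile(
    "u1",
    explicit={"location": "Amsterdam"},
    observations=["clicked:tennis-article", "clicked:tennis-article",
                  "bookmarked:jazz-page", "scrolled"],
)
print(profile.derived)
```

Swapping the rules changes how the same observations are interpreted, which is the sense in which the model, not the profile, carries the interpretation.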
User Modeling Approaches
• Overlay User Modeling: describe user characteristics, e.g. “knowledge of a user”, “interests of a user”, with respect to “ideal” characteristics
• Customizing: user explicitly provides & adjusts elements of the user profile
• User model elicitation: ask & observe the user; learn & improve the user profile successively, “interactive user modeling”
• Stereotyping: stereotypical characteristics to describe a user
• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user
Related scientific conference: http://umap2011.org Related journal: http://umuai.org

Which approach best suits the conditions of applications?
http://farm7.staticflickr.com/6240/6346803873_e756dd9bae_b.jpg
Overlay User Models
• among the oldest user models
• used for modeling student knowledge
• the user is typically characterized in terms of domain concepts & hypotheses of the user’s knowledge about these concepts in relation to an (ideal) expert’s knowledge
• concept-value pairs
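A minimal sketch of an overlay model as concept-value pairs, assuming knowledge values between 0.0 and 1.0 and an expert who knows every concept fully. The concepts and numbers are invented for illustration.

```python
# Hypothetical domain concepts; the expert is assumed to know each one fully (1.0).
expert = {"HTML": 1.0, "RDF": 1.0, "SPARQL": 1.0}

# Overlay: the system's current hypotheses about the student's knowledge.
student = {"HTML": 0.9, "RDF": 0.4}


def knowledge_gaps(user, ideal):
    """Return (concept, gap) pairs, largest gap to the ideal value first."""
    gaps = {c: round(ideal[c] - user.get(c, 0.0), 2) for c in ideal}
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


gaps = knowledge_gaps(student, expert)
print(gaps)
```

A tutoring application could use the largest gaps to decide what to teach next.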
User Model Elicitation
• Ask the user explicitly --> learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user --> learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences

http://hunch.com
Stereotyping
• set of characteristics (e.g. attribute-value pairs) that describe a group of users
• user is not assigned to a single stereotype: a user profile can feature characteristics of several different stereotypes
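One way to picture stereotypes as attribute-value pairs is the sketch below. The stereotype names, trigger conditions and attribute values are hypothetical; the point is only that one user can match several stereotypes at once.

```python
# Hypothetical stereotypes: each has a trigger condition and attribute-value pairs.
STEREOTYPES = {
    "student": {
        "trigger": lambda u: u.get("age", 0) < 26,
        "attributes": {"budget": "low", "interest:textbooks": 0.8},
    },
    "sports_fan": {
        "trigger": lambda u: "tennis" in u.get("tags", []),
        "attributes": {"interest:sports_news": 0.9},
    },
    "retiree": {
        "trigger": lambda u: u.get("age", 0) >= 65,
        "attributes": {"budget": "medium", "interest:travel": 0.7},
    },
}


def apply_stereotypes(user_facts):
    """Build a profile from every stereotype whose trigger matches the user."""
    profile, matched = {}, []
    for name, stereotype in STEREOTYPES.items():
        if stereotype["trigger"](user_facts):
            matched.append(name)
            profile.update(stereotype["attributes"])
    return matched, profile


matched, profile = apply_stereotypes({"age": 22, "tags": ["tennis", "jazz"]})
print(matched, profile)
```

Stereotypes give a usable profile from very few facts about a new user.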
Why are stereotypes useful?
http://farm1.staticflickr.com/155/413650229_31ef379b0b_b.jpg

Can we use Social Web data for user modeling?

Can we infer a Twitter-based user profile?
[Figure: User Modeling (4 building blocks) with Semantic Enrichment, Linkage and Alignment produces a Profile for a Personalized News Recommender; user: “I want my personalized news recommendations”]
Example from Abel et al. (2011)
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
Question 1: Which tweets of the user should be analyzed?
(a) time period: start, end (timeline: June 27, July 4, July 11)
(b) temporal patterns: weekends; Morning, Afternoon, Night
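The two kinds of temporal constraints could be sketched as simple filters over timestamped tweets. The tweets are invented, and the year 2010 is an assumption (the slide's timeline only gives day and month).

```python
from datetime import datetime

# Hypothetical tweets as (timestamp, text) pairs.
tweets = [
    (datetime(2010, 6, 27, 9, 30), "Watching the news"),    # Sunday morning
    (datetime(2010, 7, 1, 15, 0), "Busy day at work"),      # Thursday afternoon
    (datetime(2010, 7, 4, 22, 15), "Great tennis final!"),  # Sunday night
    (datetime(2010, 7, 12, 10, 0), "Back to the office"),   # Monday morning
]


def in_period(tweets, start, end):
    """(a) Time period: keep tweets with start <= timestamp < end."""
    return [t for t in tweets if start <= t[0] < end]


def on_weekend(tweets):
    """(b) Temporal pattern: keep tweets posted on Saturday or Sunday."""
    return [t for t in tweets if t[0].weekday() >= 5]


period = in_period(tweets, datetime(2010, 6, 27), datetime(2010, 7, 11))
weekend = on_weekend(period)
print(len(period), [text for _, text in weekend])
```

The two filters compose, so a "weekend profile for the last two weeks" is just one applied after the other.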
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
Question 2: What type of concepts should represent “interests”?
Example tweet: “Francesca Schiavone won French Open #fo2010”
• entity-based: Francesca Schiavone, French Open
• topic-based: Sport
• hashtag-based: #fo2010
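The three profile types can be illustrated on the slide's example tweet. The lookup tables below are hypothetical stand-ins for a real entity and topic extraction service, which the slides do not name.

```python
import re

# Hypothetical lookup tables standing in for a real extraction service.
KNOWN_ENTITIES = ["Francesca Schiavone", "French Open"]
ENTITY_TOPICS = {"Francesca Schiavone": "Sport", "French Open": "Sport"}


def hashtag_based(tweet):
    """Concepts are the hashtags that literally occur in the tweet."""
    return re.findall(r"#\w+", tweet)


def entity_based(tweet):
    """Concepts are known entities mentioned in the tweet."""
    return [e for e in KNOWN_ENTITIES if e in tweet]


def topic_based(tweet):
    """Concepts are the topics of the mentioned entities (deduplicated, order kept)."""
    return list(dict.fromkeys(ENTITY_TOPICS[e] for e in entity_based(tweet)))


tweet = "Francesca Schiavone won French Open #fo2010"
print(hashtag_based(tweet), entity_based(tweet), topic_based(tweet))
```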
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment
Question 3: Further enrich the semantics of tweets?
Example tweet: “Francesca Schiavone won http://bit.ly/2f4t7a”
(a) tweet-based: Francesca Schiavone
(b) further enrichment: the linked article (“Francesca wins French Open. Thirty in women’s tennis is primordially old, an age when agility and desire recedes as the …”) adds French Open, Tennis
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment
4. Weighting Scheme
Question 4: How to weight the concepts?
Options: concept frequency (TF), TFxIDF, time-sensitive
Example concepts over the timeline (June 27, July 4, July 11): weight(Francesca Schiavone), weight(French Open), weight(Tennis); example values shown on the slide: 4, 3, 6
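The three weighting options named on the slide could look roughly like this. The mentions and profile counts are invented, and the exponential decay used for the time-sensitive variant is an assumed choice for illustration, not necessarily the scheme used by Abel et al.

```python
import math
from collections import Counter
from datetime import date

# Hypothetical concept mentions of one user: (date, concept). Year assumed.
mentions = [
    (date(2010, 6, 27), "Tennis"),
    (date(2010, 6, 27), "French Open"),
    (date(2010, 7, 4), "Tennis"),
    (date(2010, 7, 11), "Francesca Schiavone"),
]


def tf_weights(mentions):
    """Concept frequency: count how often each concept occurs."""
    return Counter(concept for _, concept in mentions)


def tfidf_weights(mentions, profiles_with_concept, n_profiles):
    """TF x IDF: damp concepts that appear in many users' profiles."""
    tf = tf_weights(mentions)
    return {c: f * math.log(n_profiles / profiles_with_concept[c])
            for c, f in tf.items()}


def time_sensitive_weights(mentions, today, half_life_days=7.0):
    """Each mention counts less the older it is (exponential decay, assumed)."""
    weights = Counter()
    for day, concept in mentions:
        age = (today - day).days
        weights[concept] += 0.5 ** (age / half_life_days)
    return dict(weights)


tf = tf_weights(mentions)
tfidf = tfidf_weights(
    mentions,
    profiles_with_concept={"Tennis": 50, "French Open": 10, "Francesca Schiavone": 1},
    n_profiles=100,
)
recent = time_sensitive_weights(mentions, today=date(2010, 7, 11))
print(tf, tfidf, recent)
```

With plain TF, Tennis dominates; with TFxIDF the rare concept wins; with time decay the most recent mention wins.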
Observations
• Profile characteristics
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance

User Modeling
It is not about putting everything in a user profile; it is about making the right choices

User Adaptation
Knowing the user: this knowledge can be applied to adapt a system or interface to the user, to improve the system functionality and user experience
User-Adaptive Systems
[Diagram: observations, data and information about user --> user modeling --> user profile --> profile analysis --> adaptation decisions]
A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003

Last.fm adapts to your music taste
[Diagram: history of songs (like, ban, pause, skip) --> user modeling (infer current musical taste) --> user profile: interests in genres, artists, tags --> compare profile with possible next songs to play --> next song to be played]

Google Adaptive Search
http://www.google.com/goodtoknow
Issues in User-Adaptive Systems
• Overfitting, “bubble effects”, loss of serendipity problem:
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• “Lost in Hyperspace” problem:
  • when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion
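One simple way to trade off novelty against relevance is a weighted sum of the two scores. This is only a sketch of the idea: the linear combination, the `novelty_weight` parameter and the example scores are assumptions, not a method from the slides.

```python
def rerank(candidates, novelty_weight=0.3):
    """candidates: list of (item, relevance, novelty), both scores in [0, 1]."""
    def score(candidate):
        _, relevance, novelty = candidate
        return (1 - novelty_weight) * relevance + novelty_weight * novelty

    return [item for item, _, _ in sorted(candidates, key=score, reverse=True)]


# Hypothetical next-song candidates for an adaptive radio station.
songs = [
    ("same song again", 0.95, 0.0),
    ("similar artist", 0.8, 0.5),
    ("new genre", 0.3, 1.0),
]
pure_relevance = rerank(songs, novelty_weight=0.0)
balanced = rerank(songs, novelty_weight=0.3)
print(pure_relevance, balanced)
```

With `novelty_weight=0.0` the station keeps replaying the most relevant song; a moderate weight moves a somewhat novel song to the top without jumping to something irrelevant.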
What is good user modeling & personalization?
http://www.flickr.com/photos/bellarosebyliz/4729613108

Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)
Evaluation Strategies
• User studies: clean-room study, ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by “leave-one-out”
• Evaluation of user modeling:
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
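Leave-one-out evaluation on log data can be sketched as follows: hide one known interaction, run the recommender on the rest, and check whether the hidden item comes back. The click log is invented and the popularity-based recommender is only a stand-in for whatever strategy is being evaluated.

```python
from collections import Counter

# Hypothetical click log: user -> set of clicked items.
log = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"a", "c", "d"},
}


def recommend_popular(train, user, k):
    """Stand-in recommender: globally most clicked items the user has not seen."""
    counts = Counter(i for items in train.values() for i in items)
    unseen = [i for i in counts if i not in train[user]]
    return sorted(unseen, key=lambda i: (-counts[i], i))[:k]


def leave_one_out_hit_rate(log, k=1):
    """Hide each (user, item) pair in turn; count how often it is recommended back."""
    hits = trials = 0
    for user, items in log.items():
        for held_out in sorted(items):
            train = {u: set(its) for u, its in log.items()}
            train[user].discard(held_out)
            hits += held_out in recommend_popular(train, user, k)
            trials += 1
    return hits / trials


rate = leave_one_out_hit_rate(log, k=1)
print(rate)
```

Replacing `recommend_popular` with a personalized strategy and comparing the two hit rates gives a first, offline answer to "did we do a good job?".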
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Chart: performance per run, strategy X vs. baseline] Is strategy X better than the baseline?
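The metrics listed on this slide are short enough to write out directly. A minimal sketch with made-up inputs; MRR and Success@k as reported in papers are the averages of the per-user values computed here.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item; averaging over users gives MRR."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def success_at_k(ranking, relevant, k):
    """1.0 if a relevant item occurs within the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"])
rr = reciprocal_rank(["x", "a", "c"], {"a", "c"})
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])
print(p, r, f, rr, mae)
```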
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
2. Use the input data and calculate the predictions for the different strategies
3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO’10]
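Steps 3 and 4 can be sketched with the standard library alone. The scores are invented, and the paired form of the t-test (per-user differences between strategy and baseline) is an assumption; the slide only says Student's t-test. The resulting t value would still have to be compared against a t-distribution with n-1 degrees of freedom.

```python
import math
import statistics


def precision_at_k(ranking, relevant, k=5):
    """P@k: fraction of the top k recommended tags that are relevant."""
    return sum(1 for tag in ranking[:k] if tag in relevant) / k


def average_precision(ranking, relevant):
    """AP for one user; the mean over all users is MAP."""
    hits, total = 0, 0.0
    for rank, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-user score differences (paired t-test)."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


p5 = precision_at_k(["a", "b", "c", "d", "e"], {"a", "c", "z"})
ap = average_precision(["a", "b", "c"], {"a", "c"})
# Hypothetical per-user P@5 scores for strategy X and for the baseline.
t = paired_t_statistic([0.6, 0.4, 0.8, 0.6], [0.4, 0.2, 0.4, 0.6])
print(p5, ap, t)
```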
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the strategy of the ‘most popular’ tags: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al., Social Media Recommendation based on People and Tags, SIGIR’10]
• user interactions (level & type) instead of general social context: better for recommendations
• does hybrid always work worse?

Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
It’s often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Diagram: users u1 and u2 connected to items by ‘likes’ edges; inferred: u1 likes Pulp Fiction]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
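A small sketch of the memory-based variant and of the conditional probability P(u likes B | u likes A). The ratings are invented (only Pulp Fiction appears on the slide), the similarity-weighted sum is one of several possible scoring rules, and the probability is a plain frequency estimate rather than a full Bayesian network.

```python
import math

# Hypothetical user-item ratings (1 to 5); missing entries mean "not rated".
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Kill Bill": 1, "Titanic": 5},
}


def cosine(r1, r2):
    """Cosine similarity between two users' rating vectors."""
    dot = sum(r1[i] * r2[i] for i in set(r1) & set(r2))
    norm = (math.sqrt(sum(v * v for v in r1.values()))
            * math.sqrt(sum(v * v for v in r2.values())))
    return dot / norm if norm else 0.0


def recommend(user, ratings):
    """Memory-based CF: rank unseen items by similarity-weighted ratings of others."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        for item, rating in their.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


def p_likes_b_given_a(ratings, a, b, threshold=4):
    """Frequency estimate of P(likes b | likes a); 'likes' means rating >= threshold."""
    likes_a = [u for u, r in ratings.items() if r.get(a, 0) >= threshold]
    if not likes_a:
        return 0.0
    return sum(1 for u in likes_a if ratings[u].get(b, 0) >= threshold) / len(likes_a)


recommended = recommend("u1", ratings)
prob = p_likes_b_given_a(ratings, "Kill Bill", "Pulp Fiction")
print(recommended, prob)
```

Because u1's ratings resemble u2's far more than u3's, u2's unseen item is ranked first.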
Memory- vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well (“tricks” are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
Social Networks & Interest Similarity
• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people nor exclude ‘strange’ ones
• therefore: interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users’ interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?
[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT’10]

Conclusions
• pairs unilaterally connected have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with the increase of distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
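The kind of comparison behind these conclusions could be set up as below: measure the overlap of two users' item collections and group the pairs by connection type. Jaccard overlap is an assumed similarity measure, and the users, bookmarks and links are invented; the sketch does not reproduce the study's data or results.

```python
def jaccard(a, b):
    """Overlap of two item collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical bookmark collections and directed "watches" links.
bookmarks = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "cat": {"p9"},
}
watches = {("ann", "bob"), ("bob", "ann"), ("ann", "cat")}


def relation(u, v):
    """Classify a pair as reciprocal, unidirectional or not connected."""
    if (u, v) in watches and (v, u) in watches:
        return "reciprocal"
    if (u, v) in watches or (v, u) in watches:
        return "unidirectional"
    return "none"


pairs = [("ann", "bob"), ("ann", "cat"), ("bob", "cat")]
report = {pair: (relation(*pair), jaccard(bookmarks[pair[0]], bookmarks[pair[1]]))
          for pair in pairs}
print(report)
```

Averaging the similarity per relation type over a real dataset is what would allow statements such as "reciprocal pairs are more similar than unidirectional ones".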
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application-generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user’s interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users’ interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user
Content Features
Example item a: “Government stops renovation of tower bridge”, Oct 13th 2011
Tower Bridge today: under construction. Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england. Related Twiper news: bob: “Why do they stop to… [more]”; mary: “London stops reno… [more]”
a =
  db:Politics       0.2
  db:Sports         0
  db:Education      0
  db:London         0.2
  db:Tower_Bridge   0.4
  db:Government     0.1
  db:UK             0.1
Weighting strategy: occurrence frequency; normalize vectors (1-norm: sum of vector equals 1)
User Model
User’s Twitter history:
• “RT Government stops renovation of tower bridge” (Oct 13th 2011)
• “I am in London at the moment” (Oct 13th 2011)
• “I am doing sports” (Oct 12th 2011)
u =
  db:Politics       0
  db:Sports         0.1
  db:Education      0
  db:London         0.5
  db:Tower_Bridge   0.2
  db:Government     0.2
  db:UK             0
Weighting strategy: occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important); normalize vectors (1-norm: sum of vector equals 1)
Recommendations
Concept           u (user)   a     b     c
db:Politics       0          0.2   0     0
db:Sports         0.1        0     0     0.5
db:Education      0          0     0     0.2
db:London         0.5        0.2   0.8   0
db:Tower_Bridge   0.2        0.4   0.2   0
db:Government     0.2        0.1   0     0
db:UK             0          0.1   0     0.3
(a, b, c: candidate items)
Cosine similarities with u: a = 0.67, b = 0.92, c = 0.14
Ranking of recommended items: 1. b, 2. a, 3. c
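The ranking in this example can be recomputed from the concept vectors of the user u and the candidate items a, b, c. The sketch assumes the vector components are read in the order Politics, Sports, Education, London, Tower_Bridge, Government, UK; with that reading, the cosine similarities match the values on the slide.

```python
import math

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

# User profile vector and candidate item vectors (1-norm normalized).
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the 2-norms."""
    dot = sum(p * q for p, q in zip(x, y))
    return dot / (math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y)))


similarities = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
print(similarities)  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)       # ['b', 'a', 'c']
```

Item b wins because almost all of its weight sits on db:London, the user's strongest concept, while c overlaps with the user only on db:Sports.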
RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior

What is true personalization?
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
Image source: http://www.flickr.com/photos/bionicteaching/1375254387
Monday February 27 12
Personalization amp Social Web
bull Applications on the Social Web use web data [last week] amp are lsquosocialrsquo
bull To design lsquosocialrsquo functionality we need to understand how out of the data the application can provide relevant information (what users perceive as relevant)
bull Therefore we need to understand
bull how good personalization (recommenders) are
bull how good the user models are they are based on
bull In this lecture we consider theory amp techniques for how to design and evaluate recommenders and user models (for use in SW applications)
Monday February 27 12
total transparency - is it desired
Monday February 27 12
User Modeling
httpfarm5staticflickrcom40384553496383_5b6a5f1485_ojpg
How to infer amp represent user information that supports a given application or context
Monday February 27 12
User Modeling Challengebull Application has to obtain
understand amp exploit information about the user
bull Information (need amp context) about user
bull Inferring information about user amp representing it so that it can be consumed by the application
bull Data relevant for inferring information about user
Monday February 27 12
User amp Usage Data is everywhere
bull People leave traces on the Web and on their computers
bull Usage data eg query logs click-through-data
bull Social data eg tags (micro-)blog posts comments bookmarks friend connections
bull Documents eg pictures videos
bull Personal data eg affiliations locations
bull Products applications services - bought used installed
bull Not only a userrsquos behavior but also interactions of other users
bull ldquopeople can make statements about merdquo ldquopeople who are similar to me can reveal information about merdquo --gt ldquosocial learningrdquo collaborative recommender systems
Monday February 27 12
UM Basic Concepts
bull User Profile a data structure that represents a characterization of a user at a particular moment of time represents what from a given (system) perspective there is to know about a user The data in the profile can be explicitly given by the user or derived by the system
bull User Model contains the definitions amp rules for the interpretation of observations about the user and about the translation of that interpretation into the characteristics in a user profile user model is the recipe for obtaining and interpreting user profiles
bull User Modeling the process of representing the user
Monday February 27 12
User Modeling Approaches
bull Overlay User Modeling describe user characteristics eg ldquoknowledge of a userrdquo ldquointerests of a userrdquo with respect to ldquoidealrdquo characteristics
bull Customizing user explicitly provides amp adjusts elements of the user profile
bull User model elicitation ask amp observe the user learn amp improve user profile successively ldquointeractive user modelingrdquo
bull Stereotyping stereotypical characteristics to describe a user
bull User Relevance Modeling learninfer probabilities that a given item or concept is relevant for a user
Related scientific conference httpumap2011org Related journal httpumuaiorg
Monday February 27 12
Which approach suits
best the conditions of applications
httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg
Monday February 27 12
Overlay User Models
bull among the oldest user models
bull used for modeling student knowledge
bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge
bull concept-value pairs
Monday February 27 12
User Model Elicitation
bull Ask the user explicitly learn
bull NLP intelligent dialogues
bull Bayesian networks Hidden Markov models
bull Observe the user learn
bull Logs machine learning
bull Clustering classification data mining
bull Interactive user modeling mixture of direct inputs of a user observations and inferences
Monday February 27 12
httphunchcomMonday February 27 12
Monday February 27 12
Stereotyping
bull set of characteristics (eg attribute-value pairs) that describe a group of users
bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes
Monday February 27 12
Why are stereotypes
useful
httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg
Monday February 27 12
Can we use Social Web
data for user
modeling
Monday February 27 12
Can we infer a Twitter-based user profile
User Modeling (4 building blocks)
Semantic Enrichment Linkage and Alignment
Personalized News Recommender
Profile
I want my
personalized news recommendations
Example from Abel et al (2011)Monday February 27 12
User Modeling Building Blocks
Profile concept weight
time
1 Which tweets of the user should be
analyzed
Morning Afternoon Night
1 Temporal Constraints
June 27 July 4 July 11
(b) temporal patterns
weekends start end
(a) time period
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won French Open fo2010
Francesca Schiavone
French Open
Francesca Schiavone French Open entity-based
Sport T
T topic-based
2 What type of concepts should represent ldquointerestsrdquo
fo2010
fo2010 hashtag-based
1 Temporal Constraints
time
June 27 July 4 July 11
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won httpbitly2f4t7a
Francesca Schiavone
3 Further enrich the semantics of tweets
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip
French Open
Tennis
French Open
Tennis
(b) further enrichment
(a) tweet-based
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
4 How to weight the concepts
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
French Open
Tennis
4 Weighting Scheme
time
June 27 July 4 July 11
weight(Francesca Schiavone)
Concept frequency (TF)
4
3 6
TFxIDF Time-sensitive
weight(French Open)
weight(Tennis)
Monday February 27 12
Observationsbull Profile characteristics
bull Semantic enrichment solves sparsity problems
bull Profiles change over time fresh profiles reflect better current user demands
bull Temporal patterns weekend profiles differ significantly from weekday profiles
bull Impact on news recommendations
bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based
bull Semantic enrichment improves recommendation quality
bull Time-sensitivity (adapting to trends) improves performance
Monday February 27 12
User Modelingit is not about putting everything in a user profile
it is about making the right choices
Monday February 27 12
User AdaptationKnowing the user - this knowledge - can be applied to adapt
a system or interface to the user to improve the system functionality and user experience
Monday February 27 12
Monday February 27 12
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people
whether you did a good job
bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo
bull Evaluation of user modeling
bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles
bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system
Monday February 27 12
Possible Metricsbull The usual IR metrics
bull Precision fraction of retrieved items that are relevant
bull Recall fraction of relevant items that have been retrieved
bull F-Measure (harmonic) mean of precision and recall
bull Metrics for evaluating recommendation (rankings)
bull Mean Reciprocal Rank (MRR) of first relevant item
bull Successk probability that a relevant item occurs within the top k
bull If a true ranking is given rank correlations
bull Precisionk Recallk amp F-Measurek
bull Metrics for evaluating prediction of user preferences
bull MAE = Mean Absolute Error
bull TrueFalse PositivesNegatives
runs
performance strategy X baseline
Is strategy X better than the baseline
Monday February 27 12
Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a
proposal for improving (tag) recommendations (using social networks)
bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation
bull Steps
1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data
2 Use the input data and calculate for the different strategies the predictions
3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)
4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)
[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]
Monday February 27 12
Example Evaluationbull [Guy et al] shows another example of a similar evaluation
approach
bull Here the different strategies differ in the way people and tags are used in the strategies with these tag-based systems there are complex relationships between users tags and items and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
bull Here their baseline is the strategy of the lsquomost popularrsquo tags this is a strategy often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al Social Media Recommendation based on People and Tags SIGIRrsquo10]]
Monday February 27 12
user interactions (level amp type) instead of general social context - better for recommendations
does hybrid always work worseMonday February 27 12
Recommendation Systems
Predict items that are relevantusefulinteresting (and to what extent)
for given user (in a given context)
itrsquos often a ranking task
Monday February 27 12
Monday February 27 12
Monday February 27 12
Monday February 27 12
httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12
Collaborative Filtering
u1 likes u2
likes likes u1 likes Pulp Fiction
bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users
bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes
bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster
bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences
bull Others rule-based other data mining techniques
bull
Monday February 27 12
Memory vs Model-based
bull complete input data is required
bull pre-computation not possible
bull does not scale well (ldquotricksrdquo are needed)
bull high quality of recommendations
bull abstraction (model) of input data
bull pre-computation (partially) possible (model has to be re-built from time to time)
bull scales better
bull abstraction may reduce recommendation quality
Monday February 27 12
Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. users cannot add 'trusted' people nor exclude 'strange' ones
• therefore interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging: CiteULike)
• does a social connection indicate user interest similarity?
• how much does users' interest similarity depend on the strength of their connection?
• is it feasible to use a social network as a source of personalized recommendations?
[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike. HT'10]
Conclusions
• unilaterally connected pairs have more information items, metadata, and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Social recommendations on conflicting groups
are all interests of our friends relevant? Is it application generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
• Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
• IR methods: represent items & users as term vectors => compute similarity between the user profile vector and items
• Utility-based methods: a utility function takes an item as input; the parameters of the utility function are customized via the preferences of a user
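As a rough illustration of the utility-based technique, the sketch below uses a weighted sum as the utility function, with the user's preferences as its parameters. The item characteristics, the weights, and the weighted-sum form itself are assumptions chosen for the example; other utility functions are equally possible.

```python
# Hypothetical item characteristics and user preference weights (illustrative only).
items = {
    "article_1": {"politics": 1.0, "london": 1.0, "sports": 0.0},
    "article_2": {"politics": 0.0, "london": 0.0, "sports": 1.0},
}
preferences = {"politics": 0.25, "london": 0.5, "sports": 0.25}

def utility(item_features, preferences):
    """Weighted sum of item characteristics; the weights are the user's preferences."""
    return sum(preferences.get(f, 0.0) * v for f, v in item_features.items())

# Rank items by descending utility for this user
ranked = sorted(items, key=lambda name: utility(items[name], preferences), reverse=True)
print(ranked)
```

Changing only the `preferences` dict re-ranks the same items for a different user, which is the sense in which the function is "customized".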
Content Features
Item a: "Government stops renovation of tower bridge" (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
Category: politics, england
Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]

concept          a
dbPolitics       0.2
dbSports         0
dbEducation      0
dbLondon         0.2
dbTower_Bridge   0.4
dbGovernment     0.1
dbUK             0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
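The weighting strategy on this slide (occurrence frequency followed by 1-norm normalization) can be sketched as follows. The raw counts are hypothetical: the slide only shows the normalized weights, so the counts here were simply picked to reproduce them, and `normalize_l1` is a name introduced for the example.

```python
# Hypothetical concept occurrence counts for the news item; the slide shows
# only the normalized weights, not the raw counts.
counts = {
    "dbPolitics": 2, "dbSports": 0, "dbEducation": 0, "dbLondon": 2,
    "dbTower_Bridge": 4, "dbGovernment": 1, "dbUK": 1,
}

def normalize_l1(counts):
    """Divide each count by the total so that the weights sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {concept: 0.0 for concept in counts}
    return {concept: n / total for concept, n in counts.items()}

a = normalize_l1(counts)
print(a)
```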
User Model
User's Twitter history:
• RT Government stops renovation of tower bridge (Oct 13th 2011)
• I am in London at the moment (Oct 13th 2011)
• I am doing sports (Oct 12th 2011)

concept          u
dbPolitics       0
dbSports         0.1
dbEducation      0
dbLondon         0.5
dbTower_Bridge   0.2
dbGovernment     0.2
dbUK             0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
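One possible way to make recent concepts count more is to decay each occurrence by its age before normalizing. The sketch below assumes exponential decay with a hypothetical `half_life_days` parameter and hypothetical concept mentions; the slide does not say which smoothing function was actually used, and the resulting weights are not meant to match the vector u above.

```python
from datetime import date

# Hypothetical concept mentions extracted from a user's tweets: (concept, day).
mentions = [
    ("dbLondon", date(2011, 10, 13)),
    ("dbTower_Bridge", date(2011, 10, 13)),
    ("dbSports", date(2011, 10, 12)),
]

def time_weighted_profile(mentions, today, half_life_days=7.0):
    """Sum exponentially decayed occurrences per concept, then 1-norm normalize."""
    raw = {}
    for concept, day in mentions:
        age = (today - day).days
        # An occurrence loses half of its weight every half_life_days (assumed scheme)
        raw[concept] = raw.get(concept, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(raw.values())
    if total == 0:
        return {}
    return {concept: w / total for concept, w in raw.items()}

u = time_weighted_profile(mentions, today=date(2011, 10, 13))
print(u)
```

A shorter half-life makes the profile follow trends more closely, at the cost of forgetting longer-term interests sooner.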
Recommendations

                 user   candidate items
concept          u      a      b      c
dbPolitics       0      0.2    0      0
dbSports         0.1    0      0      0.5
dbEducation      0      0      0      0.2
dbLondon         0.5    0.2    0.8    0
dbTower_Bridge   0.2    0.4    0.2    0
dbGovernment     0.2    0.1    0      0
dbUK             0      0.1    0      0.3

cosine similarities
       a      b      c
u      0.67   0.92   0.14

Ranking of recommended items: 1. b, 2. a, 3. c
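The cosine similarities and the ranking on this slide can be checked with a short script. The vectors are the slide's values, with components assigned to concepts in the order dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK; that assignment is read off the slide layout, and the helper name `cosine` is introduced for the example.

```python
from math import sqrt

# Component order: dbPolitics, dbSports, dbEducation, dbLondon,
#                  dbTower_Bridge, dbGovernment, dbUK
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]          # user profile
candidates = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

def cosine(x, y):
    """Cosine similarity between two equally long, non-zero vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    return dot / (sqrt(sum(p * p for p in x)) * sqrt(sum(q * q for q in y)))

sims = {name: cosine(u, vec) for name, vec in candidates.items()}
ranking = sorted(sims, key=sims.get, reverse=True)

print({name: round(s, 2) for name, s in sims.items()})  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)                                          # ['b', 'a', 'c']
```

Item b wins because nearly all of its weight sits on dbLondon, the user's strongest concept, while c shares only the small dbSports component with the profile.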
RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
What is true personalization?
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387
user interactions (level amp type) instead of general social context - better for recommendations
does hybrid always work worseMonday February 27 12
Recommendation Systems
Predict items that are relevantusefulinteresting (and to what extent)
for given user (in a given context)
itrsquos often a ranking task
Monday February 27 12
Monday February 27 12
Monday February 27 12
Monday February 27 12
httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12
Collaborative Filtering
u1 likes u2
likes likes u1 likes Pulp Fiction
bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users
bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes
bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster
bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences
bull Others rule-based other data mining techniques
bull
Monday February 27 12
Memory vs Model-based
bull complete input data is required
bull pre-computation not possible
bull does not scale well (ldquotricksrdquo are needed)
bull high quality of recommendations
bull abstraction (model) of input data
bull pre-computation (partially) possible (model has to be re-built from time to time)
bull scales better
bull abstraction may reduce recommendation quality
Monday February 27 12
Social Networks amp Interest Similarity
bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood
bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones
bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)
bull does a social connection indicate user interest similarity
bull how much users interest similarity depends on the strength of their connection
bull is it feasible to use a social network as a personalized recommendation
[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12
Conclusionsbull pairs unilaterally connected have more common information
items metadata and tags than non-connected pairs
bull the similarity was largest for direct connections and decreased with the increase of distance between users in the social networks
bull users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
bull traditional item-level similarity may be less reliable way to find similar users in social bookmarking systems
bull items collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Monday February 27 12
Social recommendations on conflicting groups
are all interests of our friends relevant Is it application generic
Monday February 27 12
Content-based Recommendations
bull Input characteristics of items amp interests of a user into characteristics of items =gt Recommend items that feature characteristics which meet the userrsquos interests
bull Techniques
bull Data mining methods Cluster items based on their characteristics =gt Infer usersrsquo interests into clusters
bull IR methods Represent items amp users as term vectors =gt Compute similarity between user profile vector and items
bull Utility-based methods Utility function that gets an item as input the parameters of the utility function are customized via preferences of a user
Monday February 27 12
Government stops renovation of tower bridge Oct 13th 2011
Tower Bridge today Under construction
Tower Bridge is a combined bascule and suspension bridge in London England over the River Thames
Category politics england Related Twiper news bob Why do they stop tohellip [more] mary London stops renohellip [more]
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
02 0 0
02 04 01 01
= a
Weighting strategy - occurrence frequency - normalize vectors (1-norm sum of vector equals 1)
Content Features
Monday February 27 12
RT Government stops renovation of tower bridge Oct 13th 2011
Userrsquos Twitter history
I am in London at the moment Oct 13th 2011
I am doing sports Oct 12th 2011
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
0 01 0
05 02 02 0
= u
Weighting strategy - occurrence frequency (eg smoothened by occurrence time recent concepts are more important - normalize vectors (1-norm sum of vector equals 1)
User Model
Monday February 27 12
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
u 0
01 0
05 02 02 0
candidate items user a
02 0 0
02 04 01 01
b 0 0 0
08 02 0 0
c 0
05 02 0 0 0
03
cosine similarities
a b c
u 067 092 014
Ranking of recommended items 1 b 2 a 3 c
Recommendations
Monday February 27 12
RecSys Problemsbull Cold-start problem New User problem no or little data is available to infer
the preferences of new users
bull Changing User Preferences user interests may change over time
bull Sparsity problem New Item problem item descriptions are sparse eg not many user rated or tagged an item
bull Lack of Diversity Overfitting for many applications good recommendations should be relevant and new to the user (exceptions predicting re-visiting behavior etc) When adapting too strongly to the preferences of a user the user might see again and again samesimilar recommendations
bull Use the right context users do lots of things which might not be relevant for their user model eg try out things do stuff for other people
bull Research challenge find right balance between serendipity amp personalization
bull Research challenge find right way to use the influence of the recommendations on the userrsquos behavior
Monday February 27 12
What is true personalization
Monday February 27 12
Monday February 27 12
Hands-on Teaser
bull Build your own recommender system 101
bull Recommend pages on delicious
bull Recommend pages to your Facebook friends
image source httpwwwflickrcomphotosbionicteaching1375254387
Monday February 27 12
User Modeling
httpfarm5staticflickrcom40384553496383_5b6a5f1485_ojpg
How to infer amp represent user information that supports a given application or context
Monday February 27 12
User Modeling Challengebull Application has to obtain
understand amp exploit information about the user
bull Information (need amp context) about user
bull Inferring information about user amp representing it so that it can be consumed by the application
bull Data relevant for inferring information about user
Monday February 27 12
User amp Usage Data is everywhere
bull People leave traces on the Web and on their computers
bull Usage data eg query logs click-through-data
bull Social data eg tags (micro-)blog posts comments bookmarks friend connections
bull Documents eg pictures videos
bull Personal data eg affiliations locations
bull Products applications services - bought used installed
bull Not only a userrsquos behavior but also interactions of other users
bull ldquopeople can make statements about merdquo ldquopeople who are similar to me can reveal information about merdquo --gt ldquosocial learningrdquo collaborative recommender systems
Monday February 27 12
UM Basic Concepts
bull User Profile a data structure that represents a characterization of a user at a particular moment of time represents what from a given (system) perspective there is to know about a user The data in the profile can be explicitly given by the user or derived by the system
bull User Model contains the definitions amp rules for the interpretation of observations about the user and about the translation of that interpretation into the characteristics in a user profile user model is the recipe for obtaining and interpreting user profiles
bull User Modeling the process of representing the user
Monday February 27 12
User Modeling Approaches
bull Overlay User Modeling describe user characteristics eg ldquoknowledge of a userrdquo ldquointerests of a userrdquo with respect to ldquoidealrdquo characteristics
bull Customizing user explicitly provides amp adjusts elements of the user profile
bull User model elicitation ask amp observe the user learn amp improve user profile successively ldquointeractive user modelingrdquo
bull Stereotyping stereotypical characteristics to describe a user
bull User Relevance Modeling learninfer probabilities that a given item or concept is relevant for a user
Related scientific conference httpumap2011org Related journal httpumuaiorg
Monday February 27 12
Which approach suits
best the conditions of applications
httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg
Monday February 27 12
Overlay User Models
bull among the oldest user models
bull used for modeling student knowledge
bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge
bull concept-value pairs
Monday February 27 12
User Model Elicitation
bull Ask the user explicitly learn
bull NLP intelligent dialogues
bull Bayesian networks Hidden Markov models
bull Observe the user learn
bull Logs machine learning
bull Clustering classification data mining
bull Interactive user modeling mixture of direct inputs of a user observations and inferences
Monday February 27 12
httphunchcomMonday February 27 12
Monday February 27 12
Stereotyping
bull set of characteristics (eg attribute-value pairs) that describe a group of users
bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes
Monday February 27 12
Why are stereotypes
useful
httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg
Monday February 27 12
Can we use Social Web
data for user
modeling
Monday February 27 12
Can we infer a Twitter-based user profile
User Modeling (4 building blocks)
Semantic Enrichment Linkage and Alignment
Personalized News Recommender
Profile
I want my
personalized news recommendations
Example from Abel et al (2011)Monday February 27 12
User Modeling Building Blocks
Profile concept weight
time
1 Which tweets of the user should be
analyzed
Morning Afternoon Night
1 Temporal Constraints
June 27 July 4 July 11
(b) temporal patterns
weekends start end
(a) time period
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won French Open fo2010
Francesca Schiavone
French Open
Francesca Schiavone French Open entity-based
Sport T
T topic-based
2 What type of concepts should represent ldquointerestsrdquo
fo2010
fo2010 hashtag-based
1 Temporal Constraints
time
June 27 July 4 July 11
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won httpbitly2f4t7a
Francesca Schiavone
3 Further enrich the semantics of tweets
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip
French Open
Tennis
French Open
Tennis
(b) further enrichment
(a) tweet-based
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
4 How to weight the concepts
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
French Open
Tennis
4 Weighting Scheme
time
June 27 July 4 July 11
weight(Francesca Schiavone)
Concept frequency (TF)
4
3 6
TFxIDF Time-sensitive
weight(French Open)
weight(Tennis)
Monday February 27 12
Observationsbull Profile characteristics
bull Semantic enrichment solves sparsity problems
bull Profiles change over time fresh profiles reflect better current user demands
bull Temporal patterns weekend profiles differ significantly from weekday profiles
bull Impact on news recommendations
bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based
bull Semantic enrichment improves recommendation quality
bull Time-sensitivity (adapting to trends) improves performance
Monday February 27 12
User Modelingit is not about putting everything in a user profile
it is about making the right choices
Monday February 27 12
User AdaptationKnowing the user - this knowledge - can be applied to adapt
a system or interface to the user to improve the system functionality and user experience
Monday February 27 12
Monday February 27 12
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people
whether you did a good job
bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo
bull Evaluation of user modeling
bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles
bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system
Monday February 27 12
Possible Metricsbull The usual IR metrics
bull Precision fraction of retrieved items that are relevant
bull Recall fraction of relevant items that have been retrieved
bull F-Measure (harmonic) mean of precision and recall
bull Metrics for evaluating recommendation (rankings)
bull Mean Reciprocal Rank (MRR) of first relevant item
bull Successk probability that a relevant item occurs within the top k
bull If a true ranking is given rank correlations
bull Precisionk Recallk amp F-Measurek
bull Metrics for evaluating prediction of user preferences
bull MAE = Mean Absolute Error
bull TrueFalse PositivesNegatives
runs
performance strategy X baseline
Is strategy X better than the baseline
Monday February 27 12
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data; part of it is used as input, and the recommendations are tested on the remaining tag data
  2. Use the input data to calculate the predictions for each of the strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. an existing approach, a competitive approach)
[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO'10]
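Steps 1 to 3 above can be sketched as a small evaluation loop. This is not the setup of Rae et al.; the co-occurrence table, the toy strategy and the dataset are hypothetical and only illustrate the hold-out idea together with P@k and MAP.

```python
def average_precision(ranking, relevant):
    hits, total = 0, 0.0
    for pos, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / pos          # precision at each relevant position
    return total / len(relevant) if relevant else 0.0

def evaluate(strategy, dataset, k=5):
    """dataset: list of (input_tags, held_out_tags) pairs (steps 1 and 2).
    Returns (P@k, MAP) averaged over all test cases (step 3)."""
    p_at_k, ap = [], []
    for input_tags, held_out in dataset:
        ranking = strategy(input_tags)
        p_at_k.append(sum(t in held_out for t in ranking[:k]) / k)
        ap.append(average_precision(ranking, held_out))
    n = len(dataset)
    return sum(p_at_k) / n, sum(ap) / n

# Hypothetical tag co-occurrence table used by the toy strategy.
CO_TAGS = {"paris": ["france", "eiffel", "travel"], "cat": ["pet", "cute", "animal"]}

def cooccurrence_strategy(input_tags):
    ranking = []
    for t in input_tags:
        ranking.extend(x for x in CO_TAGS.get(t, []) if x not in ranking)
    return ranking

dataset = [(["paris"], {"france", "travel"}), (["cat"], {"cute"})]
print(evaluate(cooccurrence_strategy, dataset, k=2))
```

Running `evaluate` once per strategy on the same dataset gives the per-strategy scores that step 4 then compares against the baseline.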
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• Their baseline is the 'most popular' tags strategy: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR'10]
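A minimal sketch of the 'most popular' baseline next to a very simple personalized strategy. The triple format and `personal_tags` are assumptions for illustration and do not reproduce the strategies of Guy et al.

```python
from collections import Counter

def top_tags(counts, k):
    # Sort by descending count, break ties alphabetically for determinism.
    return [t for t, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]

def most_popular_tags(tag_assignments, k=3):
    """Baseline: the globally most frequent tags, identical for every user.
    tag_assignments: list of (user, item, tag) triples (toy format)."""
    return top_tags(Counter(tag for _, _, tag in tag_assignments), k)

def personal_tags(tag_assignments, user, k=3):
    """Hypothetical personalization strategy: the user's own most used tags."""
    return top_tags(Counter(tag for u, _, tag in tag_assignments if u == user), k)

assignments = [
    ("u1", "i1", "web"), ("u1", "i2", "python"), ("u1", "i3", "python"),
    ("u2", "i1", "web"), ("u3", "i4", "web"), ("u3", "i5", "news"),
]
print(most_popular_tags(assignments, k=2))
print(personal_tags(assignments, "u1", k=1))
```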
user interactions (level & type) instead of general social context - better for recommendations
does hybrid always work worse?
Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
it's often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Figure: 'likes' edges between users u1 and u2 and items, e.g. "u1 likes Pulp Fiction"]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
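The memory-based (user-item matrix) variant can be sketched as follows. Cosine similarity and similarity-weighted rating sums are one possible choice among many; all ratings and all movie titles other than Pulp Fiction are made up for the example.

```python
import math

def cosine(r1, r2):
    """Cosine similarity between two users' rating dicts (item -> rating)."""
    common = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def recommend(ratings, user, k=2):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

ratings = {  # hypothetical user-item matrix
    "u1": {"Kill Bill": 5, "Jackie Brown": 4},
    "u2": {"Kill Bill": 5, "Jackie Brown": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Kill Bill": 1},
}
print(recommend(ratings, "u1"))
```

Because u2 rates like u1, the item that only u2 has rated (Pulp Fiction) is ranked above the item of the dissimilar user u3.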
Memory vs Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
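To illustrate the pre-computation point, the sketch below builds an item-item similarity model once and then answers requests from the model alone, without touching the raw ratings. The data and function names are hypothetical, and the sketch assumes strictly positive ratings.

```python
import math
from itertools import combinations

def build_item_model(ratings):
    """Pre-compute cosine similarities between item rating columns (the 'model')."""
    columns = {}
    for user, user_ratings in ratings.items():
        for item, r in user_ratings.items():
            columns.setdefault(item, {})[user] = r
    model = {}
    for a, b in combinations(sorted(columns), 2):
        common = set(columns[a]) & set(columns[b])
        dot = sum(columns[a][u] * columns[b][u] for u in common)
        na = math.sqrt(sum(v * v for v in columns[a].values()))
        nb = math.sqrt(sum(v * v for v in columns[b].values()))
        model[(a, b)] = model[(b, a)] = dot / (na * nb)
    return model

def recommend_from_model(model, liked_items, k=1):
    """Request time: only the pre-computed model is needed."""
    scores = {}
    for (a, b), sim in model.items():
        if a in liked_items and b not in liked_items:
            scores[b] = scores.get(b, 0.0) + sim
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

ratings = {"u1": {"A": 5, "B": 4}, "u2": {"A": 4, "B": 5, "C": 1}, "u3": {"C": 5, "D": 4}}
model = build_item_model(ratings)      # has to be re-built from time to time
print(recommend_from_model(model, {"A"}))
```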
Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a basis for personalized recommendation?
[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT'10]
Conclusions
• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
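A comparison of connected and non-connected pairs can be sketched with a set-overlap measure. Jaccard similarity is an assumption here (the slides do not name the measure that was used), and the bookmark collections are invented.

```python
def jaccard(a, b):
    """Overlap of two item collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def mean_similarity(pairs, libraries):
    sims = [jaccard(libraries[u], libraries[v]) for u, v in pairs]
    return sum(sims) / len(sims) if sims else 0.0

libraries = {  # hypothetical bookmark collections per user
    "ann": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "cat": {"p7", "p8"},
    "dan": {"p3", "p9"},
}
connected = [("ann", "bob")]
not_connected = [("ann", "cat"), ("bob", "dan")]
print(mean_similarity(connected, libraries), mean_similarity(not_connected, libraries))
```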
Social recommendations on conflicting groups
are all interests of our friends relevant? Is it application generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user
Content Features
Example news item: Government stops renovation of tower bridge (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
Category: politics, england. Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]
Concept vector a of the item:
db:Politics 0.2, db:Sports 0, db:Education 0, db:London 0.2, db:Tower_Bridge 0.4, db:Government 0.1, db:UK 0.1
Weighting strategy: occurrence frequency; normalize vectors (1-norm: sum of vector equals 1)
User Model
User's Twitter history:
• RT Government stops renovation of tower bridge (Oct 13th 2011)
• I am in London at the moment (Oct 13th 2011)
• I am doing sports (Oct 12th 2011)
Concept vector u of the user:
db:Politics 0, db:Sports 0.1, db:Education 0, db:London 0.5, db:Tower_Bridge 0.2, db:Government 0.2, db:UK 0
Weighting strategy: occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important); normalize vectors (1-norm: sum of vector equals 1)
Recommendations
Concept order: db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK
user:
u = (0, 0.1, 0, 0.5, 0.2, 0.2, 0)
candidate items:
a = (0.2, 0, 0, 0.2, 0.4, 0.1, 0.1)
b = (0, 0, 0, 0.8, 0.2, 0, 0)
c = (0, 0.5, 0.2, 0, 0, 0, 0.3)
cosine similarities:
sim(u, a) = 0.67, sim(u, b) = 0.92, sim(u, c) = 0.14
Ranking of recommended items: 1. b, 2. a, 3. c
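The cosine similarities and the ranking of this example can be recomputed with a short script. The vectors are read from the slide in the concept order Politics, Sports, Education, London, Tower_Bridge, Government, UK; the variable names are illustrative.

```python
import math

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# User profile vector and candidate item vectors from the slide.
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}
sims = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(sims, key=sims.get, reverse=True)
print(sims)     # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)  # ['b', 'a', 'c']
```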
RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
What is true personalization?
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387
User Modeling Challenge
• Application has to obtain, understand & exploit information about the user
• Information (need & context) about user
• Inferring information about user & representing it so that it can be consumed by the application
• Data relevant for inferring information about user
User & Usage Data is everywhere
• People leave traces on the Web and on their computers:
  • Usage data, e.g. query logs, click-through data
  • Social data, e.g. tags, (micro-)blog posts, comments, bookmarks, friend connections
  • Documents, e.g. pictures, videos
  • Personal data, e.g. affiliations, locations
  • Products, applications, services - bought, used, installed
• Not only a user's behavior, but also interactions of other users
  • "people can make statements about me", "people who are similar to me can reveal information about me" --> "social learning", collaborative recommender systems
UM Basic Concepts
• User Profile: a data structure that represents a characterization of a user at a particular moment in time; it represents what, from a given (system) perspective, there is to know about a user. The data in the profile can be explicitly given by the user or derived by the system
• User Model: contains the definitions & rules for the interpretation of observations about the user and for the translation of that interpretation into the characteristics in a user profile; the user model is the recipe for obtaining and interpreting user profiles
• User Modeling: the process of representing the user
User Modeling Approaches
• Overlay User Modeling: describe user characteristics, e.g. "knowledge of a user", "interests of a user", with respect to "ideal" characteristics
• Customizing: user explicitly provides & adjusts elements of the user profile
• User model elicitation: ask & observe the user; learn & improve the user profile successively ("interactive user modeling")
• Stereotyping: stereotypical characteristics to describe a user
• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user
Related scientific conference: http://umap2011.org Related journal: http://umuai.org
Which approach best suits the conditions of applications?
http://farm7.static.flickr.com/6240/6346803873_e756dd9bae_b.jpg
Overlay User Models
• among the oldest user models
• used for modeling student knowledge
• the user is typically characterized in terms of domain concepts & hypotheses about the user's knowledge of these concepts in relation to an (ideal) expert's knowledge
• concept-value pairs
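An overlay model can be sketched as two sets of concept-value pairs, one for the (ideal) expert and one for the user, with the gap between them driving adaptation. The concepts, values and the "teach the largest gap next" rule are assumptions for illustration.

```python
def overlay_gap(user_knowledge, expert_knowledge):
    """Per concept: how far the user's estimated knowledge is below the expert's."""
    return {
        concept: round(expert_value - user_knowledge.get(concept, 0.0), 2)
        for concept, expert_value in expert_knowledge.items()
    }

expert = {"HTML": 1.0, "RDF": 1.0, "SPARQL": 1.0}  # (ideal) expert, hypothetical values
student = {"HTML": 0.9, "RDF": 0.4}                # concept-value pairs of the user
gaps = overlay_gap(student, expert)
next_topic = max(gaps, key=gaps.get)               # concept with the largest gap
print(gaps, next_topic)
```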
User Model Elicitation
• Ask the user explicitly => learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user => learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences
http://hunch.com
Stereotyping
• set of characteristics (e.g. attribute-value pairs) that describe a group of users
• a user is not assigned to a single stereotype - a user profile can feature characteristics of several different stereotypes
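A minimal sketch of stereotype-based profiles: each stereotype is a set of attribute-value pairs, and a user profile merges the characteristics of all stereotypes assigned to that user. The stereotype names, the attributes and the conflict rule are hypothetical.

```python
STEREOTYPES = {  # hypothetical stereotypes as attribute-value pairs
    "student": {"budget": "low", "interest:textbooks": 0.8},
    "sports_fan": {"interest:football": 0.9, "interest:tennis": 0.6},
}

def build_profile(assigned, stereotypes=STEREOTYPES):
    """Merge the characteristics of all stereotypes assigned to a user."""
    profile = {}
    for name in assigned:
        for attribute, value in stereotypes[name].items():
            # On conflicts the later stereotype wins; other policies are possible.
            profile[attribute] = value
    return profile

profile = build_profile(["student", "sports_fan"])
print(profile)
```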
Why are stereotypes useful?
http://farm1.static.flickr.com/155/413650229_31ef379b0b_b.jpg
Can we use Social Web data for user modeling?
Can we infer a Twitter-based user profile?
[Figure: a user says "I want my personalized news recommendations"; User Modeling (4 building blocks) with Semantic Enrichment, Linkage and Alignment produces a Profile that feeds a Personalized News Recommender]
Example from Abel et al. (2011)
User Modeling Building Blocks
1. Temporal Constraints: Which tweets of the user should be analyzed?
  (a) time period: start - end (timeline: June 27, July 4, July 11)
  (b) temporal patterns: weekends; Morning / Afternoon / Night
[Figure: the resulting Profile is a list of (concept, weight) pairs]
User Modeling Building Blocks
2. Profile Type: What type of concepts should represent "interests"?
Example tweet: Francesca Schiavone won French Open #fo2010
  • entity-based: Francesca Schiavone, French Open
  • topic-based: Sport
  • hashtag-based: #fo2010
[Figure: building block 2 is added on top of 1. Temporal Constraints (timeline: June 27, July 4, July 11)]
User Modeling Building Blocks
3. Semantic Enrichment: Further enrich the semantics of tweets?
Example tweet: Francesca Schiavone won http://bit.ly/2f4t7a
Linked article: "Francesca wins French Open. Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …"
  (a) tweet-based: Francesca Schiavone
  (b) further enrichment: Francesca Schiavone, French Open, Tennis
[Figure: building block 3 is added on top of 1. Temporal Constraints and 2. Profile Type]
User Modeling Building Blocks
4. Weighting Scheme: How to weight the concepts?
  • Concept frequency (TF)
  • TFxIDF
  • Time-sensitive
[Figure: timeline (June 27, July 4, July 11) with example values 4, 3 and 6 for weight(Francesca Schiavone), weight(French Open) and weight(Tennis); building block 4 is added on top of 1. Temporal Constraints, 2. Profile Type and 3. Semantic Enrichment]
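Two of the weighting schemes can be sketched as follows: plain concept frequency, and a time-sensitive variant in which recent occurrences count more. The exponential decay with a half-life, the year 2010 and the observation list are assumptions for illustration; the slides do not specify the exact time-sensitive function.

```python
from collections import Counter
from datetime import date

def tf_weights(observations):
    """Concept frequency (TF), normalized so that the weights sum to 1."""
    counts = Counter(concept for concept, _ in observations)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def time_sensitive_weights(observations, now, half_life_days=7.0):
    """Each occurrence counts 0.5 ** (age / half_life): recent concepts weigh more."""
    raw = Counter()
    for concept, day in observations:
        raw[concept] += 0.5 ** ((now - day).days / half_life_days)
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()}

# Hypothetical (concept, date of tweet) observations.
obs = [("Tennis", date(2010, 6, 27)), ("Tennis", date(2010, 6, 27)),
       ("French Open", date(2010, 7, 11))]
tf = tf_weights(obs)
ts = time_sensitive_weights(obs, now=date(2010, 7, 11))
print(tf, ts)
```

With plain TF the older concept (Tennis) dominates; with the time-sensitive variant the single recent mention of French Open outweighs the two older Tennis mentions.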
Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance
User Modeling
it is not about putting everything in a user profile, it is about making the right choices
User Adaptation
Knowledge about the user can be applied to adapt a system or interface to the user, to improve the system functionality and user experience
User-Adaptive Systems
[Figure: observations, data and information about user -> user modeling -> user profile -> profile analysis -> adaptation decisions]
A. Jameson: Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003
Last.fm adapts to your music taste
[Figure: history of songs (like, ban, pause, skip) -> user modeling (infer current musical taste) -> user profile (interests in genres, artists, tags) -> compare profile with possible next songs to play -> next song to be played]
Google Adaptive Search
http://www.google.com/goodtoknow
Issues in User-Adaptive Systems
• Overfitting, "bubble effects", loss of serendipity problem:
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem:
  • when adapting the navigation – i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion
http://www.flickr.com/photos/bellarosebyliz/4729613108
What is good user modeling & personalization?
Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)
Evaluation Strategies
• User studies: clean-room study: ask/observe (selected) people whether you did a good job
User amp Usage Data is everywhere
bull People leave traces on the Web and on their computers
bull Usage data eg query logs click-through-data
bull Social data eg tags (micro-)blog posts comments bookmarks friend connections
bull Documents eg pictures videos
bull Personal data eg affiliations locations
bull Products applications services - bought used installed
bull Not only a userrsquos behavior but also interactions of other users
bull ldquopeople can make statements about merdquo ldquopeople who are similar to me can reveal information about merdquo --gt ldquosocial learningrdquo collaborative recommender systems
Monday February 27 12
UM Basic Concepts
bull User Profile a data structure that represents a characterization of a user at a particular moment of time represents what from a given (system) perspective there is to know about a user The data in the profile can be explicitly given by the user or derived by the system
bull User Model contains the definitions amp rules for the interpretation of observations about the user and about the translation of that interpretation into the characteristics in a user profile user model is the recipe for obtaining and interpreting user profiles
bull User Modeling the process of representing the user
Monday February 27 12
User Modeling Approaches
bull Overlay User Modeling describe user characteristics eg ldquoknowledge of a userrdquo ldquointerests of a userrdquo with respect to ldquoidealrdquo characteristics
bull Customizing user explicitly provides amp adjusts elements of the user profile
bull User model elicitation ask amp observe the user learn amp improve user profile successively ldquointeractive user modelingrdquo
bull Stereotyping stereotypical characteristics to describe a user
bull User Relevance Modeling learninfer probabilities that a given item or concept is relevant for a user
Related scientific conference httpumap2011org Related journal httpumuaiorg
Monday February 27 12
Which approach suits
best the conditions of applications
httpfarm7staticflickrcom62406346803873_e756dd9bae_bjpg
Monday February 27 12
Overlay User Models
bull among the oldest user models
bull used for modeling student knowledge
bull the user is typically characterized in terms of domain concepts amp hypotheses of the userrsquos knowledge about these concepts in relation to an (ideal) expertrsquos knowledge
bull concept-value pairs
Monday February 27 12
User Model Elicitation
bull Ask the user explicitly learn
bull NLP intelligent dialogues
bull Bayesian networks Hidden Markov models
bull Observe the user learn
bull Logs machine learning
bull Clustering classification data mining
bull Interactive user modeling mixture of direct inputs of a user observations and inferences
Monday February 27 12
httphunchcomMonday February 27 12
Monday February 27 12
Stereotyping
bull set of characteristics (eg attribute-value pairs) that describe a group of users
bull user is not assigned to a single stereotype - user profile can feature characteristics of several different stereotypes
Monday February 27 12
Why are stereotypes
useful
httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg
Monday February 27 12
Can we use Social Web
data for user
modeling
Monday February 27 12
Can we infer a Twitter-based user profile
User Modeling (4 building blocks)
Semantic Enrichment Linkage and Alignment
Personalized News Recommender
Profile
I want my
personalized news recommendations
Example from Abel et al (2011)Monday February 27 12
User Modeling Building Blocks
Profile concept weight
time
1 Which tweets of the user should be
analyzed
Morning Afternoon Night
1 Temporal Constraints
June 27 July 4 July 11
(b) temporal patterns
weekends start end
(a) time period
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won French Open fo2010
Francesca Schiavone
French Open
Francesca Schiavone French Open entity-based
Sport T
T topic-based
2 What type of concepts should represent ldquointerestsrdquo
fo2010
fo2010 hashtag-based
1 Temporal Constraints
time
June 27 July 4 July 11
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won httpbitly2f4t7a
Francesca Schiavone
3 Further enrich the semantics of tweets
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip
French Open
Tennis
French Open
Tennis
(b) further enrichment
(a) tweet-based
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
4 How to weight the concepts
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
French Open
Tennis
4 Weighting Scheme
time
June 27 July 4 July 11
weight(Francesca Schiavone)
Concept frequency (TF)
4
3 6
TFxIDF Time-sensitive
weight(French Open)
weight(Tennis)
Monday February 27 12
Observationsbull Profile characteristics
bull Semantic enrichment solves sparsity problems
bull Profiles change over time fresh profiles reflect better current user demands
bull Temporal patterns weekend profiles differ significantly from weekday profiles
bull Impact on news recommendations
bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based
bull Semantic enrichment improves recommendation quality
bull Time-sensitivity (adapting to trends) improves performance
Monday February 27 12
User Modelingit is not about putting everything in a user profile
it is about making the right choices
Monday February 27 12
User AdaptationKnowing the user - this knowledge - can be applied to adapt
a system or interface to the user to improve the system functionality and user experience
Monday February 27 12
Monday February 27 12
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategies
• User studies: clean-room study; ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling:
  • measure the quality of profiles directly, e.g. measure the overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure the quality of the application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
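The "leave-one-out" log analysis mentioned above could be sketched as follows. This is only an illustration under assumptions: the function names `leave_one_out_hit_rate` and `most_popular`, the toy data, and the choice of a popularity-based recommender are hypothetical and not taken from the lecture.

```python
from collections import Counter


def most_popular(training, user, k):
    """Hypothetical baseline: recommend the k globally most popular unseen items."""
    counts = Counter(item for items in training.values() for item in items)
    candidates = [item for item in counts if item not in training[user]]
    # Sort by descending popularity; break ties alphabetically for determinism.
    return sorted(candidates, key=lambda item: (-counts[item], item))[:k]


def leave_one_out_hit_rate(user_items, recommend, k=5):
    """Hide one known item at a time and check whether it is recommended back."""
    hits = 0
    trials = 0
    for user, items in user_items.items():
        for held_out in sorted(items):
            # Copy the data and remove the held-out item from this user's history.
            training = {u: set(i) for u, i in user_items.items()}
            training[user].discard(held_out)
            top_k = recommend(training, user, k)
            hits += held_out in top_k
            trials += 1
    return hits / trials if trials else 0.0


# Toy interaction log (illustrative data only).
logs = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"a", "c"}}
rate = leave_one_out_hit_rate(logs, most_popular, k=1)
```

The returned value is the fraction of hidden items that reappear in the top k, which corresponds to the Success@k idea for this toy setting.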
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating the prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Plot: performance over runs, strategy X vs. baseline] Is strategy X better than the baseline?
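The metrics listed above can be written down in a few lines. The following is a minimal sketch; the function names and the convention of dividing Precision@k by k are assumptions made for illustration, not definitions given in the lecture.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant items that have been retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def f_measure(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if p + r else 0.0


def precision_at_k(ranking, relevant, k):
    """Relevant items among the top k, divided by k (assumed convention)."""
    return sum(1 for item in ranking[:k] if item in relevant) / k


def success_at_k(ranking, relevant, k):
    """1.0 if at least one relevant item occurs within the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0


def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item (0.0 if there is none)."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over several (ranking, relevant set) pairs."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Averaging `success_at_k` over many users gives an estimate of the probability that a relevant item occurs within the top k.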
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]
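Step 4 above can be illustrated with a small sketch. It assumes a paired setting, in which strategy and baseline are scored on the same users or runs; whether Rae et al. used a paired or unpaired test is not stated here. The function name `paired_t_statistic` and the scores are hypothetical.

```python
import math
import statistics


def paired_t_statistic(strategy_scores, baseline_scores):
    """Return (t, degrees of freedom) for paired per-run scores.

    Assumes both lists are aligned (same users/runs) and that the
    differences are not all identical (otherwise the std. dev. is zero).
    """
    diffs = [s - b for s, b in zip(strategy_scores, baseline_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Illustrative per-run scores (made-up numbers, e.g. P@5 scaled to 0..10).
t, df = paired_t_statistic([5, 6, 7, 8], [4, 4, 6, 5])
```

The resulting t value is then compared against the critical value of Student's t distribution for `df` degrees of freedom, for instance from a statistics table, to decide whether strategy X is significantly better than the baseline.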
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• The baseline is the 'most popular' tags strategy, which is often used: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and can outperform the easily available baseline
[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]
User interactions (level & type), instead of general social context, are better for recommendations.
Does a hybrid always work worse?
Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context).
It's often a ranking task.
http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Diagram: users u1 and u2 connected to items by "likes" edges; inferred: u1 likes Pulp Fiction]
• Memory-based: User-Item matrix of ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix of similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
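The memory-based variant above (user-item ratings, similarity between users, items of similar users) might look like the sketch below. Cosine similarity and the similarity-weighted sum are one common choice and are assumptions here; the function names, the ratings and all film titles other than Pulp Fiction are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap in taste
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Toy user-item matrix (illustrative ratings on a 1..5 scale).
ratings = {
    "u1": {"Kill Bill": 5, "Jackie Brown": 4},
    "u2": {"Kill Bill": 5, "Jackie Brown": 4, "Pulp Fiction": 5},
    "u3": {"Titanic": 5},
}
suggestions = recommend(ratings, "u1")
```

Because u1 and u2 rate the same films alike and u3 shares nothing with u1, only u2's additional item ends up in `suggestions`. Note that the whole rating matrix is consulted at recommendation time, which is what makes this approach memory-based.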
Memory vs Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore, interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network for personalized recommendation?
[Lee & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]
Conclusions
• unilaterally connected pairs have more information items, metadata and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the items
  • Utility-based methods: a utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user
Content Features
[Example news item: "Government stops renovation of tower bridge", Oct 13th 2011. Image caption: Tower Bridge today: under construction. Text: Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames. Category: politics, england. Related Twiper news: bob: "Why do they stop to… [more]", mary: "London stops reno… [more]"]
Item vector a:
db:Politics 0.2, db:Sports 0, db:Education 0, db:London 0.2, db:Tower_Bridge 0.4, db:Government 0.1, db:UK 0.1
Weighting strategy:
• occurrence frequency
• normalize vectors (1-norm: sum of vector equals 1)
User Model
User's Twitter history:
• "RT Government stops renovation of tower bridge" (Oct 13th 2011)
• "I am in London at the moment" (Oct 13th 2011)
• "I am doing sports" (Oct 12th 2011)
User vector u:
db:Politics 0, db:Sports 0.1, db:Education 0, db:London 0.5, db:Tower_Bridge 0.2, db:Government 0.2, db:UK 0
Weighting strategy:
• occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
• normalize vectors (1-norm: sum of vector equals 1)
Recommendations
User vector u and candidate item vectors a, b, c:
Concept           u     a     b     c
db:Politics       0     0.2   0     0
db:Sports         0.1   0     0     0.5
db:Education      0     0     0     0.2
db:London         0.5   0.2   0.8   0
db:Tower_Bridge   0.2   0.4   0.2   0
db:Government     0.2   0.1   0     0
db:UK             0     0.1   0     0.3
Cosine similarities:
      a     b     c
u     0.67  0.92  0.14
Ranking of recommended items: 1. b, 2. a, 3. c
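The cosine similarities on this slide can be reproduced with a short sketch. The vectors are the ones shown on the slides, in the concept order Politics, Sports, Education, London, Tower_Bridge, Government, UK; the function and variable names are illustrative assumptions.

```python
import math

# Concept order: Politics, Sports, Education, London, Tower_Bridge, Government, UK
user = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine(x, y):
    """Cosine similarity: dot product divided by the product of 2-norms."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y)


similarities = {name: round(cosine(user, vec), 2) for name, vec in items.items()}
ranking = sorted(items, key=lambda name: cosine(user, items[name]), reverse=True)

print(similarities)  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)       # ['b', 'a', 'c']
```

For item a, for example, the dot product is 0.5·0.2 + 0.2·0.4 + 0.2·0.1 = 0.20, and dividing by the norms √0.34 and √0.26 gives about 0.67.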
RecSys Problems
• Cold-start problem / new-user problem: no or little data is available to infer the preferences of new users
• Changing user preferences: user interests may change over time
• Sparsity problem / new-item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of diversity / overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
What is true personalization?
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on Delicious
• Recommend pages to your Facebook friends
Image source: http://www.flickr.com/photos/bionicteaching/1375254387
UM Basic Concepts
• User Profile: a data structure that represents a characterization of a user at a particular moment in time; it represents what, from a given (system) perspective, there is to know about a user. The data in the profile can be explicitly given by the user or derived by the system
• User Model: contains the definitions & rules for the interpretation of observations about the user, and for the translation of that interpretation into the characteristics in a user profile; the user model is the recipe for obtaining and interpreting user profiles
• User Modeling: the process of representing the user
User Modeling Approaches
• Overlay User Modeling: describe user characteristics, e.g. "knowledge of a user", "interests of a user", with respect to "ideal" characteristics
• Customizing: the user explicitly provides & adjusts elements of the user profile
• User model elicitation: ask & observe the user; learn & improve the user profile successively ("interactive user modeling")
• Stereotyping: stereotypical characteristics to describe a user
• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user
Related scientific conference: http://umap2011.org. Related journal: http://umuai.org
Which approach best suits the conditions of applications?
Image source: http://farm7.staticflickr.com/6240/6346803873_e756dd9bae_b.jpg
Overlay User Models
• among the oldest user models
• used for modeling student knowledge
• the user is typically characterized in terms of domain concepts & hypotheses of the user's knowledge about these concepts, in relation to an (ideal) expert's knowledge
• concept-value pairs
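As a rough illustration of the concept-value pairs above, an overlay model can be held in a dictionary and compared against an expert's values. The concepts, the 0..1 knowledge scale and the function name `knowledge_gaps` are assumptions made for this sketch only.

```python
# Hypothetical (ideal) expert knowledge per domain concept, on a 0..1 scale.
expert = {"recursion": 1.0, "sorting": 1.0, "graphs": 1.0}

# Overlay for one student: hypotheses about the student's knowledge.
student = {"recursion": 0.8, "sorting": 0.4}


def knowledge_gaps(user_overlay, expert_model):
    """List (concept, gap) pairs, largest gap first; unknown concepts count as 0."""
    gaps = {
        concept: round(value - user_overlay.get(concept, 0.0), 2)
        for concept, value in expert_model.items()
    }
    return sorted(
        ((concept, gap) for concept, gap in gaps.items() if gap > 0),
        key=lambda pair: (-pair[1], pair[0]),
    )


gaps = knowledge_gaps(student, expert)
```

An adaptive tutoring application could, for instance, use the concept with the largest gap to decide what to present next.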
User Model Elicitation
• Ask the user explicitly → learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user → learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences
http://hunch.com
Stereotyping
• set of characteristics (e.g. attribute-value pairs) that describe a group of users
• a user is not assigned to a single stereotype: a user profile can feature characteristics of several different stereotypes
Why are stereotypes useful?
Image source: http://farm1.staticflickr.com/155/413650229_31ef379b0b_b.jpg
Can we use Social Web data for user modeling?
Can we infer a Twitter-based user profile?
[Diagram: user ("I want my personalized news recommendations") -> User Modeling (4 building blocks), with Semantic Enrichment, Linkage and Alignment -> Profile -> Personalized News Recommender]
Example from Abel et al. (2011)
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints: which tweets of the user should be analyzed?
  (a) time period: start, end (timeline: June 27, July 4, July 11)
  (b) temporal patterns: weekends; morning, afternoon, night
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type: what type of concepts should represent "interests"?
Example tweet: "Francesca Schiavone won French Open #fo2010"
  • entity-based: Francesca Schiavone, French Open
  • topic-based: Sport
  • hashtag-based: #fo2010
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment: further enrich the semantics of tweets
Example tweet: "Francesca Schiavone won http://bit.ly/2f4t7a"
  (a) tweet-based: Francesca Schiavone
  (b) further enrichment via the linked article ("Francesca wins French Open: Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …"): Francesca Schiavone, French Open, Tennis
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment
4. Weighting Scheme: how to weight the concepts?
  • Concept frequency (TF)
  • TFxIDF
  • Time-sensitive
[Diagram: weight(Francesca Schiavone), weight(French Open) and weight(Tennis) derived from concept occurrences on a timeline (June 27, July 4, July 11); example values shown: 4, 3, 6]
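Two of the weighting schemes named above can be sketched as follows: plain concept frequency, and a time-sensitive variant in which recent occurrences count more. The exponential decay with a half-life, the function names, the year 2010 and the observations are all assumptions made for illustration; the lecture does not specify the decay function. Both profiles are normalized so that the weights sum to 1 (1-norm).

```python
from collections import Counter
from datetime import date


def tf_profile(observations):
    """Concept frequency, normalized so that all weights sum to 1."""
    counts = Counter(concept for concept, _ in observations)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


def time_sensitive_profile(observations, now, half_life_days=7.0):
    """Like tf_profile, but each occurrence is discounted by its age.

    Assumed decay: an occurrence loses half its weight every half_life_days.
    """
    weights = {}
    for concept, day in observations:
        age_days = (now - day).days
        decay = 0.5 ** (age_days / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decay
    total = sum(weights.values())
    return {concept: w / total for concept, w in weights.items()}


# Illustrative (concept, date of tweet) observations.
observations = [
    ("Tennis", date(2010, 7, 11)),
    ("Tennis", date(2010, 7, 4)),
    ("French Open", date(2010, 6, 27)),
]
tf = tf_profile(observations)
recent = time_sensitive_profile(observations, now=date(2010, 7, 11))
```

In this toy data the older "French Open" mention weighs 1/3 under plain frequency but only 1/7 in the time-sensitive profile, which is the intended effect of adapting to recent interests.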
Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance
User Modelingit is not about putting everything in a user profile
it is about making the right choices
Monday February 27 12
User AdaptationKnowing the user - this knowledge - can be applied to adapt
a system or interface to the user to improve the system functionality and user experience
Monday February 27 12
Monday February 27 12
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12

Evaluation Strategies
• User studies: clean-room study; ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g., cross-validation by “leave-one-out”
• Evaluation of user modeling:
  • measure the quality of profiles directly, e.g., measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure the quality of an application that exploits the user profile, e.g., apply user modeling strategies in a recommender system
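
The “leave-one-out” log analysis mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the log data, the function names and the “most popular” strategy used as the system under test are all hypothetical, and each user is assumed to have each item at most once in the log.

```python
from collections import Counter

# Hypothetical click/bookmark logs: user -> list of distinct items (illustrative data only).
LOGS = {
    "alice": ["a", "b", "c"],
    "bob": ["a", "b"],
    "carol": ["a", "c", "d"],
}

def most_popular(train_logs, user, k):
    """Example strategy: the k globally most frequent items the user has not seen yet."""
    counts = Counter(item for items in train_logs.values() for item in items)
    seen = set(train_logs.get(user, []))
    ranked = sorted(counts, key=lambda item: (-counts[item], item))  # ties broken by name
    return [item for item in ranked if item not in seen][:k]

def leave_one_out_hit_rate(logs, strategy, k):
    """Hide one logged item at a time and check whether the strategy recommends it back."""
    hits = trials = 0
    for user, items in logs.items():
        for held_out in items:
            train = dict(logs)
            train[user] = [i for i in items if i != held_out]  # remove the held-out item
            hits += held_out in strategy(train, user, k)
            trials += 1
    return hits / trials

hit_rate = leave_one_out_hit_rate(LOGS, most_popular, k=1)
```

Any other strategy with the same `(train_logs, user, k)` signature could be passed in and compared on the same logs.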

Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Chart: performance of strategy X vs. baseline over several runs] Is strategy X better than the baseline?
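
As a rough illustration of the metrics listed above, the sketch below computes them for a single ranked list; averaging the per-user reciprocal rank and Success@k over all test users would give MRR and Success@k. Function names and the example data are assumptions made for this sketch.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(item in relevant for item in ranking[:k]) / k

def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item, or 0 if none is ranked."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def success_at_k(ranking, relevant, k):
    """1 if at least one relevant item occurs within the top k, else 0."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)

# Hypothetical example: a ranking of four items, three items are truly relevant.
ranking = ["x", "a", "y", "b"]
relevant = {"a", "b", "c"}
p, r, f1 = precision_recall_f1(ranking, relevant)
```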

Example Evaluation
• [Rae et al.] show a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which is used as input; test the recommendations on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g., existing approach, competitive approach)
[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO’10]
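
Steps 3 and 4 could look roughly like the sketch below. It is not the procedure of the cited paper: the scores are invented, and a paired form of the t-test is assumed here because both strategies are evaluated on the same test users. The resulting t statistic would still have to be compared with a t-distribution table (or a statistics package) for the chosen significance level.

```python
import math
import statistics

def average_precision(ranking, relevant):
    """Mean of the precision values at each position where a relevant item occurs."""
    hits, total = 0, 0.0
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic and degrees of freedom for per-user score differences.

    Raises ZeroDivisionError if all differences are identical.
    """
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical per-user scores (e.g., average precision) for strategy X and the baseline.
scores_x = [0.6, 0.8, 0.7, 0.9]
scores_baseline = [0.5, 0.6, 0.6, 0.6]

ap = average_precision(["t1", "t2", "t3", "t4"], {"t1", "t3"})
t, dof = paired_t_statistic(scores_x, scores_baseline)
```

MAP is then simply the mean of `average_precision` over all test users.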

Example Evaluation
• [Guy et al.] show another example of a similar evaluation approach
• Here, the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the strategy of the ‘most popular’ tags: the globally most popular tags are compared with the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR’10]

user interactions (level & type) instead of general social context: better for recommendations
does hybrid always work worse?

Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
it’s often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering
[Diagram: users u1 and u2 connected to items by “likes” edges, leading to “u1 likes Pulp Fiction”]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g., based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
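
A minimal sketch of the memory-based (user-item matrix) variant, under assumptions: the ratings are invented (only “Pulp Fiction” comes from the slide), cosine similarity is one of several possible user-similarity measures, and candidate items are scored by the similarity-weighted sum of the neighbors’ ratings.

```python
import math

# Hypothetical ratings on a 1-5 scale; users and most titles are illustrative.
RATINGS = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Kill Bill": 1, "Titanic": 5},
}

def cosine(r1, r2):
    """Cosine similarity between two sparse rating vectors (dicts item -> rating)."""
    common = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

def recommend(user, ratings, k=3):
    """Rank items the user has not rated by similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

recommendations = recommend("u1", RATINGS)
```

Because u2 rates the shared movies much like u1 does, u2’s other favorite ends up ranked first for u1.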

Memory vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well (“tricks” are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
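
To illustrate the pre-computation aspect of model-based approaches, the sketch below builds a small item-to-item model offline and uses only that model at recommendation time. As a simplification it estimates P(likes B | likes A) from plain co-occurrence counts, which is not a full Bayesian network; data and names are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical "likes" data: user -> set of liked items.
LIKES = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "C"},
    "u4": {"B"},
}

def build_model(likes):
    """Offline step: estimate P(likes b | likes a) for every co-occurring item pair."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in likes.values():
        for item in items:
            item_count[item] += 1
        for a, b in permutations(items, 2):
            pair_count[(a, b)] += 1
    return {(a, b): n / item_count[a] for (a, b), n in pair_count.items()}

def recommend(model, liked, k=3):
    """Online step: score unseen items from the precomputed model only."""
    scores = defaultdict(float)
    for (a, b), probability in model.items():
        if a in liked and b not in liked:
            scores[b] = max(scores[b], probability)
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

model = build_model(LIKES)
suggestions = recommend(model, {"B"})
```

The model has to be rebuilt from time to time as new likes arrive, which is the trade-off named in the list above.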

Social Networks & Interest Similarity
• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e., one cannot add ‘trusted’ people nor exclude ‘strange’ ones
• therefore: interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g., social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users’ interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?
[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT’10]

Conclusions
• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• the item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups
are all interests of our friends relevant? Is it application generic?

Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user’s interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users’ interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user

Content Features
Example item a (news article): “Government stops renovation of tower bridge”, Oct 13th 2011
  Tower Bridge today: Under construction
  Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
  Category: politics, england
  Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]
Concept vector a:
  db:Politics      0.2
  db:Sports        0
  db:Education     0
  db:London        0.2
  db:Tower_Bridge  0.4
  db:Government    0.1
  db:UK            0.1
Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
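
The weighting strategy above (occurrence frequency, then 1-norm normalization) can be sketched as follows. The occurrence counts are hypothetical and were chosen only so that the result reproduces vector a; the concept names are written with a `db:` prefix on the assumption that the slide used DBpedia-style identifiers.

```python
from collections import Counter

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

def concept_vector(mentions, concepts=CONCEPTS):
    """Count concept occurrences and normalize so that the weights sum to 1 (1-norm)."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in concepts)
    return [counts[c] / total if total else 0.0 for c in concepts]

# Hypothetical concept occurrences extracted from the article (10 mentions in total).
mentions_a = (["db:Tower_Bridge"] * 4 + ["db:Politics"] * 2 + ["db:London"] * 2
              + ["db:Government"] + ["db:UK"])
a = concept_vector(mentions_a)
```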

User Model
User’s Twitter history:
  RT: Government stops renovation of tower bridge (Oct 13th 2011)
  I am in London at the moment (Oct 13th 2011)
  I am doing sports (Oct 12th 2011)
Concept vector u:
  db:Politics      0
  db:Sports        0.1
  db:Education     0
  db:London        0.5
  db:Tower_Bridge  0.2
  db:Government    0.2
  db:UK            0
Weighting strategy:
- occurrence frequency (e.g., smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
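
One possible way to make recent concepts count more is an exponential decay on the age of each observation. The slide does not say which smoothing was used, so the decay function, the half-life parameter and the example observations below are all assumptions for illustration.

```python
from datetime import date

def time_decayed_profile(observations, today, half_life_days=7.0):
    """Build a 1-norm-normalized profile from (concept, date) observations.

    Each observation contributes 0.5 ** (age_in_days / half_life_days),
    so a mention loses half of its weight every `half_life_days` days.
    """
    weights = {}
    for concept, seen_on in observations:
        age = (today - seen_on).days
        weights[concept] = weights.get(concept, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical observations loosely based on the two dated tweets in the example.
observations = [
    ("db:London", date(2011, 10, 13)),
    ("db:Sports", date(2011, 10, 12)),
]
profile = time_decayed_profile(observations, today=date(2011, 10, 13), half_life_days=1.0)
```

With a half-life of one day, yesterday’s concept receives half the raw weight of today’s concept before normalization.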

Recommendations
                  user   candidate items
concept           u      a      b      c
db:Politics       0      0.2    0      0
db:Sports         0.1    0      0      0.5
db:Education      0      0      0      0.2
db:London         0.5    0.2    0.8    0
db:Tower_Bridge   0.2    0.4    0.2    0
db:Government     0.2    0.1    0      0
db:UK             0      0.1    0      0.3

cosine similarities:
      a      b      c
u     0.67   0.92   0.14
Ranking of recommended items: 1. b, 2. a, 3. c
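
The cosine similarities in this example can be reproduced with a few lines of code, where cos(u, i) is the dot product of u and i divided by the product of their Euclidean lengths. The vectors below are read off the slide, assuming the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK.

```python
import math

# User profile u and candidate items a, b, c (concept order as assumed above).
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

def cosine(x, y):
    """Cosine similarity of two equally long weight vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    return dot / norm if norm else 0.0

# Similarity of the user to every candidate item, rounded as on the slide.
similarities = {name: round(cosine(u, vector), 2) for name, vector in items.items()}
# Items ordered from most to least similar.
ranking = sorted(similarities, key=similarities.get, reverse=True)
```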

RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g., not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g., try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior

What is true personalization?

Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387

Which approach best suits the conditions of applications?
http://farm7.static.flickr.com/6240/6346803873_e756dd9bae_b.jpg

Overlay User Models
• among the oldest user models
• used for modeling student knowledge
• the user is typically characterized in terms of domain concepts & hypotheses about the user’s knowledge of these concepts in relation to an (ideal) expert’s knowledge
• concept-value pairs
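
An overlay model stored as concept-value pairs might be sketched like this. The domain concepts, the update rule and the gap threshold are hypothetical choices made for this illustration, not part of any particular system.

```python
# Ideal expert knowledge per domain concept (hypothetical domain).
EXPERT = {"fractions": 1.0, "decimals": 1.0, "percentages": 1.0}

def update(student, concept, correct, rate=0.3):
    """Move the knowledge estimate toward 1.0 after a correct answer, toward 0.0 otherwise."""
    target = 1.0 if correct else 0.0
    current = student.get(concept, 0.0)
    student[concept] = current + rate * (target - current)
    return student

def knowledge_gaps(student, expert=EXPERT, threshold=0.5):
    """Concepts where the student's estimate lies far below the expert's level."""
    return sorted(c for c, level in expert.items()
                  if level - student.get(c, 0.0) > threshold)

# The student model is an overlay: the same concepts as the expert, with estimated values.
student = {"fractions": 0.8}
update(student, "decimals", correct=True)
gaps = knowledge_gaps(student)
```

An adaptive tutoring application could use `gaps` to decide which concept to present next.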

User Model Elicitation
• Ask the user explicitly => learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user => learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences
http://hunch.com

Stereotyping
• set of characteristics (e.g., attribute-value pairs) that describe a group of users
• user is not assigned to a single stereotype: a user profile can feature characteristics of several different stereotypes
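
A small sketch of how a profile could draw on several stereotypes at once. The stereotypes, the matching rule (share of matching attribute-value pairs) and the threshold are invented for illustration.

```python
# Hypothetical stereotypes described by attribute-value pairs.
STEREOTYPES = {
    "student": {"budget": "low", "music": "indie"},
    "parent": {"schedule": "daytime", "shopping": "toys"},
}

def stereotype_membership(user_facts, stereotypes=STEREOTYPES):
    """Degree of match per stereotype: share of its attribute-value pairs the user exhibits."""
    return {
        name: sum(user_facts.get(attr) == value for attr, value in traits.items()) / len(traits)
        for name, traits in stereotypes.items()
    }

def build_profile(user_facts, stereotypes=STEREOTYPES, min_match=0.5):
    """Keep observed facts and fill the gaps from every sufficiently matching stereotype."""
    profile = dict(user_facts)
    for name, degree in stereotype_membership(user_facts, stereotypes).items():
        if degree >= min_match:
            for attr, value in stereotypes[name].items():
                profile.setdefault(attr, value)  # observed facts are never overwritten
    return profile

facts = {"budget": "low", "schedule": "daytime"}
membership = stereotype_membership(facts)
profile = build_profile(facts)
```

Here the user partly matches both stereotypes, so the resulting profile inherits default assumptions from each of them.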

Why are stereotypes useful?
http://farm1.static.flickr.com/155/413650229_31ef379b0b_b.jpg

Can we use Social Web data for user modeling?

Can we infer a Twitter-based user profile?
[Diagram: Semantic Enrichment, Linkage and Alignment -> User Modeling (4 building blocks) -> Profile -> Personalized News Recommender; user: “I want my personalized news recommendations”]
Example from Abel et al. (2011)

User Modeling Building Blocks
Profile: concept | weight
1. Temporal Constraints: Which tweets of the user should be analyzed?
   (a) time period: start, end (timeline: June 27, July 4, July 11)
   (b) temporal patterns: weekends; Morning, Afternoon, Night
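
Selecting the tweets to analyze by time period and temporal pattern might be sketched as below. The function names, the boundaries between morning, afternoon and night, and the example tweets (including the year of the timestamps) are assumptions.

```python
from datetime import datetime

def part_of_day(ts):
    """Assumed boundaries: 06-12 morning, 12-18 afternoon, otherwise night."""
    if 6 <= ts.hour < 12:
        return "morning"
    if 12 <= ts.hour < 18:
        return "afternoon"
    return "night"

def select_tweets(tweets, start=None, end=None, weekends_only=False, day_part=None):
    """Filter (timestamp, text) pairs by (a) time period and (b) temporal pattern."""
    selected = []
    for ts, text in tweets:
        if start is not None and ts < start:
            continue
        if end is not None and ts > end:
            continue
        if weekends_only and ts.weekday() < 5:  # Monday is 0, Saturday is 5, Sunday is 6
            continue
        if day_part is not None and part_of_day(ts) != day_part:
            continue
        selected.append(text)
    return selected

# Hypothetical tweets; the dates are illustrative only.
TWEETS = [
    (datetime(2010, 6, 30, 20, 0), "wednesday night tweet"),
    (datetime(2010, 7, 4, 9, 30), "sunday morning tweet"),
    (datetime(2010, 7, 10, 14, 0), "saturday afternoon tweet"),
]
weekend_tweets = select_tweets(TWEETS, weekends_only=True)
```

A weekend profile and a weekday profile could then be built from the two disjoint selections and compared.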

User Modeling Building Blocks
Profile: concept | weight
1. Temporal Constraints
2. Profile Type: What type of concepts should represent “interests”?
   Example tweet: “Francesca Schiavone won French Open #fo2010”
   • entity-based: Francesca Schiavone, French Open
   • topic-based: Sport
   • hashtag-based: #fo2010

User Modeling Building Blocks
Profile: concept | weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment: Further enrich the semantics of tweets?
   Example tweet: “Francesca Schiavone won http://bit.ly/2f4t7a”
   (a) tweet-based: Francesca Schiavone
   (b) further enrichment, using the linked article (“Francesca wins French Open: Thirty in women’s tennis is primordially old, an age when agility and desire recedes as the …”): Francesca Schiavone, French Open, Tennis

User Modeling Building Blocks
Profile: concept | weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment
4. Weighting Scheme: How to weight the concepts?
   • Concept frequency (TF)
   • TFxIDF
   • Time-sensitive
   Example: weight(Francesca Schiavone), weight(French Open), weight(Tennis) over the timeline June 27, July 4, July 11 (values shown on the slide: 4, 3, 6)
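
The first two weighting schemes could be sketched as follows. The exact formulas used in the cited work are not given on the slide, so this uses a common TFxIDF variant in which the inverse frequency is computed over users rather than documents. The concept counts reuse the numbers 4, 3 and 6 from the slide, but their assignment to concepts, and the other users’ data, are assumptions.

```python
import math
from collections import Counter

def tf_weights(user_concepts):
    """Concept frequency: how often each concept occurs in the user's tweets."""
    return dict(Counter(user_concepts))

def tf_idf_weights(user_concepts, all_users_concepts):
    """TF times log(number of users / number of users mentioning the concept).

    `all_users_concepts` holds one concept list per user and must include the target user.
    """
    n_users = len(all_users_concepts)
    user_frequency = Counter(c for concepts in all_users_concepts for c in set(concepts))
    return {c: tf * math.log(n_users / user_frequency[c])
            for c, tf in Counter(user_concepts).items()}

# Hypothetical data: the target user plus three other users.
target = ["Francesca Schiavone"] * 4 + ["French Open"] * 3 + ["Tennis"] * 6
others = [["Tennis", "Football"], ["Tennis"], ["French Open"]]

tf = tf_weights(target)
tf_idf = tf_idf_weights(target, [target] + others)
```

In this toy data the most frequent concept is also the most common one across users, so TFxIDF ranks it lowest while plain TF ranks it highest.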

Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance

User Modeling
it is not about putting everything in a user profile
it is about making the right choices
User AdaptationKnowing the user - this knowledge - can be applied to adapt
a system or interface to the user to improve the system functionality and user experience
Monday February 27 12
Monday February 27 12
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people
whether you did a good job
bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo
bull Evaluation of user modeling
bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles
bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system
Monday February 27 12
Possible Metricsbull The usual IR metrics
bull Precision fraction of retrieved items that are relevant
bull Recall fraction of relevant items that have been retrieved
bull F-Measure (harmonic) mean of precision and recall
bull Metrics for evaluating recommendation (rankings)
bull Mean Reciprocal Rank (MRR) of first relevant item
bull Successk probability that a relevant item occurs within the top k
bull If a true ranking is given rank correlations
bull Precisionk Recallk amp F-Measurek
bull Metrics for evaluating prediction of user preferences
bull MAE = Mean Absolute Error
bull TrueFalse PositivesNegatives
runs
performance strategy X baseline
Is strategy X better than the baseline
Monday February 27 12
Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a
proposal for improving (tag) recommendations (using social networks)
bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation
bull Steps
1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data
2 Use the input data and calculate for the different strategies the predictions
3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)
4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)
[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]
Monday February 27 12
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• Their baseline is the 'most popular' tags strategy. Comparing the globally most popular tags to the tags predicted by a particular personalization strategy is a common way to investigate whether the personalization is worth the effort and able to outperform the easily available baseline
[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR'10]
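A 'most popular' baseline is cheap to build, which is why it makes a useful yardstick. A minimal sketch, assuming tag data is available as (user, tag) pairs; the data and the function name are hypothetical.

```python
from collections import Counter


def most_popular_tags(assignments, n=3):
    """Non-personalized baseline: the n globally most frequent tags."""
    counts = Counter(tag for _user, tag in assignments)
    return [tag for tag, _count in counts.most_common(n)]


# Hypothetical (user, tag) assignments.
assignments = [
    ("ann", "python"), ("ann", "web"), ("bob", "python"),
    ("bob", "music"), ("cat", "python"), ("cat", "web"),
]
baseline_tags = most_popular_tags(assignments, n=2)  # same list for every user
```

Every user receives the same list, so any personalized strategy has to beat this list to justify its extra cost.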
user interactions (level & type) instead of general social context - better for recommendations
does hybrid always work worse?
Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
it's often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Figure: 'likes' relations between users u1, u2 and items, e.g. u1 likes Pulp Fiction]
• Memory-based: User-Item matrix of ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix of similarities (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
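The memory-based variant can be sketched directly on a small user-item matrix. The ratings, the cosine similarity and the similarity-weighted average are assumptions chosen for illustration; other similarity measures and aggregation rules are equally common.

```python
import math

# Hypothetical user-item rating matrix (missing entry = not rated).
ratings = {
    "u1": {"Pulp Fiction": 5, "Fargo": 4, "Titanic": 1},
    "u2": {"Fargo": 5, "Titanic": 1},
    "u3": {"Pulp Fiction": 1, "Titanic": 5, "Notting Hill": 4},
}


def cosine(r1, r2):
    """Cosine similarity of two sparse rating vectors."""
    dot = sum(r1[i] * r2[i] for i in r1.keys() & r2.keys())
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def recommend(user, ratings, n=1):
    """Score unseen items by similarity-weighted ratings of the other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: scores[i] / weights[i] for i in scores if weights[i] > 0}
    return sorted(predicted, key=predicted.get, reverse=True)[:n]


top_for_u2 = recommend("u2", ratings)
```

Here u2 rates like u1, so u1's favourite unseen item (Pulp Fiction) is ranked first for u2. Note that the whole rating matrix is scanned at recommendation time, which is the defining trait of the memory-based approach.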
Memory vs Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
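The pre-computation that the model-based approach allows can be illustrated with a simple conditional-probability model. This sketch estimates P(likes B | likes A) from co-occurrence counts; it is not a full Bayesian network, and the 'likes' data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical 'likes' data: user -> set of liked items.
likes = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "C"},
    "u4": {"B"},
}


def build_model(likes):
    """Offline step: estimate P(likes B | likes A) from co-occurrence counts."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in likes.values():
        for item in items:
            item_count[item] += 1
        for a, b in permutations(items, 2):
            pair_count[(a, b)] += 1
    return {(a, b): n / item_count[a] for (a, b), n in pair_count.items()}


def recommend(liked, model, n=1):
    """Online step: only the small model is consulted, not the raw data."""
    scores = defaultdict(float)
    for (a, b), p in model.items():
        if a in liked and b not in liked:
            scores[b] = max(scores[b], p)
    return sorted(scores, key=scores.get, reverse=True)[:n]


model = build_model(likes)            # rebuilt from time to time
suggestion = recommend({"C"}, model)  # what to show a user who likes C
```

The split into `build_model` and `recommend` mirrors the trade-off on the slide: recommendations are fast because they use the abstraction, but they are only as fresh and as detailed as the last rebuilt model.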
Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore, interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?
[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT'10]
Conclusions
• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
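The kind of comparison behind these conclusions can be sketched by grouping user pairs by connection type and averaging their item overlap. The study's actual measures are not given on the slide: Jaccard overlap, the data and the names below are assumptions, and the toy data is constructed to mirror the reported pattern, so it is not evidence for it.

```python
from itertools import combinations
from statistics import mean

# Hypothetical bookmark collections and directed social connections.
items = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p1", "p2", "p3", "p4"},
    "cat": {"p3", "p4", "p5"},
    "dan": {"p9"},
}
connections = {("ann", "bob"), ("bob", "ann"), ("bob", "cat")}


def jaccard(a, b):
    """Overlap of two item sets: size of intersection / size of union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def relation(u, v):
    """Classify a user pair as reciprocal, unidirectional or unconnected."""
    forward, backward = (u, v) in connections, (v, u) in connections
    if forward and backward:
        return "reciprocal"
    return "unidirectional" if forward or backward else "unconnected"


by_relation = {}
for u, v in combinations(sorted(items), 2):
    by_relation.setdefault(relation(u, v), []).append(jaccard(items[u], items[v]))
avg_similarity = {rel: mean(sims) for rel, sims in by_relation.items()}
```

On real data, a significance test over the per-pair similarities would be needed before claiming that one group is more similar than another.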
Social recommendations on conflicting groups
are all interests of our friends relevant? Is it application generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user
Content Features
News item a:
  Government stops renovation of tower bridge (Oct 13th 2011)
  Tower Bridge today: Under construction
  Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
  Category: politics, england
  Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]
Concept vector a:
  db:Politics      0.2
  db:Sports        0
  db:Education     0
  db:London        0.2
  db:Tower_Bridge  0.4
  db:Government    0.1
  db:UK            0.1
Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
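The weighting strategy (occurrence frequency, then 1-norm normalization) can be sketched as follows. The concept counts are hypothetical, since the slide does not give the raw counts; they are chosen so that the result reproduces the weights 0.4, 0.2, 0.2, 0.1 and 0.1 of vector a.

```python
from collections import Counter


def concept_vector(concept_mentions):
    """Occurrence frequencies, normalized so that the weights sum to 1."""
    counts = Counter(concept_mentions)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


# Hypothetical concept mentions extracted from the news item.
mentions = (
    ["Tower_Bridge"] * 4 + ["Politics"] * 2 + ["London"] * 2
    + ["Government"] + ["UK"]
)
a = concept_vector(mentions)
```

Concepts that never occur (here Sports and Education) are simply absent from the dictionary, which corresponds to a weight of 0.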
User Model
User's Twitter history:
  RT Government stops renovation of tower bridge (Oct 13th 2011)
  I am in London at the moment (Oct 13th 2011)
  I am doing sports (Oct 12th 2011)
Concept vector u:
  db:Politics      0
  db:Sports        0.1
  db:Education     0
  db:London        0.5
  db:Tower_Bridge  0.2
  db:Government    0.2
  db:UK            0
Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
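One way to make recent concepts count more is to decay each occurrence by its age before normalizing. The slide does not say which smoothing is used; the exponential decay, the half-life parameter and the data below are assumptions.

```python
from datetime import date


def time_sensitive_profile(mentions, today, half_life_days=7.0):
    """Sum decayed occurrences per concept, then 1-norm normalize."""
    weights = {}
    for concept, day in mentions:
        age = (today - day).days
        weights[concept] = weights.get(concept, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {concept: w / total for concept, w in weights.items()}


# Hypothetical (concept, date) mentions extracted from a user's tweets.
mentions = [
    ("London", date(2011, 10, 13)),
    ("Tower_Bridge", date(2011, 10, 13)),
    ("Sports", date(2011, 10, 6)),
]
u = time_sensitive_profile(mentions, today=date(2011, 10, 13))
```

With a half-life of seven days, the week-old Sports mention counts half as much as a mention from today, so the profile leans towards current interests.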
Recommendations
Concept          user u   candidate items
                          a     b     c
db:Politics      0        0.2   0     0
db:Sports        0.1      0     0     0.5
db:Education     0        0     0     0.2
db:London        0.5      0.2   0.8   0
db:Tower_Bridge  0.2      0.4   0.2   0
db:Government    0.2      0.1   0     0
db:UK            0        0.1   0     0.3
Cosine similarities:
     a     b     c
u    0.67  0.92  0.14
Ranking of recommended items: 1. b  2. a  3. c
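The similarities on this slide can be recomputed. The vectors below are read off the slide, assuming the concept order Politics, Sports, Education, London, Tower_Bridge, Government, UK; the code itself is an illustrative sketch.

```python
import math

CONCEPTS = ["Politics", "Sports", "Education", "London",
            "Tower_Bridge", "Government", "UK"]
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
candidates = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine(x, y):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    return dot / norm if norm else 0.0


similarities = {name: round(cosine(u, vec), 2) for name, vec in candidates.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
# similarities -> {'a': 0.67, 'b': 0.92, 'c': 0.14}; ranking -> ['b', 'a', 'c']
```

Item b wins because almost all of its weight sits on London, the user's strongest concept, while item c only shares the weakly weighted Sports concept with the user.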
RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
What is true personalization?
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387
User Model Elicitation
• Ask the user explicitly => learn
  • NLP, intelligent dialogues
  • Bayesian networks, Hidden Markov models
• Observe the user => learn
  • Logs, machine learning
  • Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences
http://hunch.com
Stereotyping
• set of characteristics (e.g. attribute-value pairs) that describe a group of users
• a user is not assigned to a single stereotype: a user profile can feature characteristics of several different stereotypes
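As a data structure, a stereotype can be a dictionary of attribute-value pairs, and a profile can collect the characteristics of every stereotype assigned to the user. A minimal sketch; the stereotypes, attributes and the merge rule are hypothetical.

```python
# Hypothetical stereotypes: each is a set of attribute-value pairs.
STEREOTYPES = {
    "student": {"budget": "low", "interest": "study"},
    "sports_fan": {"interest": "football", "weekend": "stadium"},
}


def build_profile(assigned):
    """Collect, per attribute, the values contributed by each assigned stereotype."""
    profile = {}
    for name in assigned:
        for attribute, value in STEREOTYPES[name].items():
            profile.setdefault(attribute, set()).add(value)
    return profile


profile = build_profile(["student", "sports_fan"])
```

Keeping a set of values per attribute avoids having to pick a winner when two stereotypes disagree, as they do here on `interest`.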
Why are stereotypes useful?
http://farm1.static.flickr.com/155/413650229_31ef379b0b_b.jpg
Can we use Social Web data for user modeling?
Can we infer a Twitter-based user profile?
[Figure: components: User Modeling (4 building blocks); Semantic Enrichment, Linkage and Alignment; Profile; Personalized News Recommender. User: "I want my personalized news recommendations"]
Example from Abel et al. (2011)
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints: which tweets of the user should be analyzed?
   (a) time period (start, end), shown on a timeline: June 27, July 4, July 11
   (b) temporal patterns (e.g. weekends; Morning, Afternoon, Night)
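Both kinds of temporal constraint amount to filtering tweets by timestamp before the profile is built. A minimal sketch; the tweet structure, the function names and the year 2010 are assumptions.

```python
from datetime import datetime


def in_period(tweets, start, end):
    """(a) Time period: keep tweets with start <= timestamp < end."""
    return [t for t in tweets if start <= t["time"] < end]


def on_weekends(tweets):
    """(b) Temporal pattern: keep tweets posted on Saturday or Sunday."""
    return [t for t in tweets if t["time"].weekday() >= 5]


# Hypothetical tweets; only the timestamps matter here.
tweets = [
    {"text": "tweet 1", "time": datetime(2010, 6, 28, 9, 0)},   # Monday
    {"text": "tweet 2", "time": datetime(2010, 7, 3, 21, 30)},  # Saturday
    {"text": "tweet 3", "time": datetime(2010, 7, 12, 14, 0)},  # Monday
]
selected = in_period(tweets, datetime(2010, 6, 27), datetime(2010, 7, 11))
weekend_only = on_weekends(selected)
```

The two filters compose, so a weekend profile for a given fortnight is just one filter applied after the other.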
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints (timeline: June 27, July 4, July 11)
2. Profile Type: what type of concepts should represent "interests"?
   Example tweet: Francesca Schiavone won French Open #fo2010
   • entity-based: Francesca Schiavone, French Open
   • topic-based: Sport
   • hashtag-based: #fo2010
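Of the three profile types, the hashtag-based one is the simplest to extract, because hashtags can be matched with a regular expression, whereas entities and topics need an external extraction service. A minimal sketch, assuming hashtags are marked with `#`; the second tweet is made up.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def hashtag_profile(tweets):
    """Hashtag-based profile: count each hashtag (lower-cased) over all tweets."""
    return Counter(tag.lower() for text in tweets for tag in HASHTAG.findall(text))


# The first tweet is the slide's example; the second one is made up.
tweets = [
    "Francesca Schiavone won French Open #fo2010",
    "Great final today #FO2010 #tennis",
]
profile = hashtag_profile(tweets)
```

Lower-casing merges `#FO2010` and `#fo2010` into one concept; without it the already sparse hashtag profile would fragment further.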
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment: further enrich the semantics of tweets
   Example tweet: Francesca Schiavone won http://bit.ly/2f4t7a
   Linked article: Francesca wins French Open: Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …
   (a) tweet-based: Francesca Schiavone
   (b) further enrichment: Francesca Schiavone, French Open, Tennis
User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment
4. Weighting Scheme: how to weight the concepts?
   • Concept frequency (TF)
   • TFxIDF
   • Time-sensitive
   Weights to compute: weight(Francesca Schiavone), weight(French Open), weight(Tennis)
[Figure: concept occurrences on a timeline (June 27, July 4, July 11) with example concept frequencies 4, 3 and 6]
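The first two weighting schemes can be sketched as follows. The slide does not give the exact formulas: the IDF variant log(N / df) computed over user profiles, the data and the function names are assumptions.

```python
import math
from collections import Counter


def tf(concepts):
    """Concept frequency: how often each concept occurs in one user's tweets."""
    return Counter(concepts)


def tf_idf(user_concepts, all_users_concepts):
    """TF x IDF: down-weight concepts that occur in many users' profiles."""
    n_users = len(all_users_concepts)
    weights = {}
    for concept, freq in tf(user_concepts).items():
        df = sum(1 for concepts in all_users_concepts if concept in concepts)
        weights[concept] = freq * math.log(n_users / df)
    return weights


# Hypothetical concept mentions per user.
users = {
    "u1": ["Tennis", "Tennis", "French Open", "Francesca Schiavone"],
    "u2": ["Tennis", "Football"],
    "u3": ["Football", "Politics"],
}
w = tf_idf(users["u1"], list(users.values()))
```

In this toy data Tennis has the highest plain frequency for u1, yet French Open ends up with the larger TFxIDF weight because no other user mentions it.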
Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance
User Modeling
it is not about putting everything in a user profile
it is about making the right choices
User Adaptation
Knowing the user: this knowledge can be applied to adapt a system or interface to the user, to improve the system functionality and user experience
User-Adaptive Systems
[Figure: observations, data and information about the user => user modeling => user profile => profile analysis => adaptation decisions]
A. Jameson: Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003
Last.fm adapts to your music taste
[Figure: history of songs (like, ban, pause, skip) => user modeling (infer current musical taste) => user profile (interests in genres, artists, tags) => compare profile with possible next songs to play => next song to be played]
Google Adaptive Search
http://www.google.com/goodtoknow
Issues in User-Adaptive Systems
• Overfitting, "bubble effects", loss of serendipity problem:
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem:
  • when adapting the navigation – i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion
http://www.flickr.com/photos/bellarosebyliz/4729613108
What is good user modeling & personalization?
Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)
Evaluation Strategies
• User studies: clean-room study: ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "Leave-one-out"
• Evaluation of user modeling:
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of the application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
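Leave-one-out evaluation on log data can be sketched as follows: hide one logged item at a time, recommend from the remaining data, and check whether the hidden item comes back. The popularity-based recommender is only a hypothetical stand-in for the strategy under test, and the click logs are made up.

```python
from collections import Counter


def recommend_popular(train_histories, exclude, n):
    """Hypothetical stand-in recommender: most popular items not yet seen."""
    counts = Counter(i for items in train_histories for i in items if i not in exclude)
    return [item for item, _ in counts.most_common(n)]


def leave_one_out_hit_rate(histories, n=2):
    """Hide one item per user at a time and check if it is recommended back."""
    hits = trials = 0
    for user, items in histories.items():
        for hidden in items:
            visible = [i for i in items if i != hidden]
            others = [h for u, h in histories.items() if u != user]
            recs = recommend_popular(others + [visible], exclude=set(visible), n=n)
            hits += hidden in recs
            trials += 1
    return hits / trials


# Hypothetical click logs: user -> clicked items.
histories = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["a", "d"],
}
hit_rate = leave_one_out_hit_rate(histories)
```

The resulting hit rate plays the role of Success@n from the metrics slide; swapping `recommend_popular` for a personalized strategy gives the comparison against the baseline.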
Possible Metricsbull The usual IR metrics
bull Precision fraction of retrieved items that are relevant
bull Recall fraction of relevant items that have been retrieved
bull F-Measure (harmonic) mean of precision and recall
bull Metrics for evaluating recommendation (rankings)
bull Mean Reciprocal Rank (MRR) of first relevant item
bull Successk probability that a relevant item occurs within the top k
bull If a true ranking is given rank correlations
bull Precisionk Recallk amp F-Measurek
bull Metrics for evaluating prediction of user preferences
bull MAE = Mean Absolute Error
bull TrueFalse PositivesNegatives
runs
performance strategy X baseline
Is strategy X better than the baseline
Monday February 27 12
Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a
proposal for improving (tag) recommendations (using social networks)
bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation
bull Steps
1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data
2 Use the input data and calculate for the different strategies the predictions
3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)
4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)
[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]
Monday February 27 12
Example Evaluationbull [Guy et al] shows another example of a similar evaluation
approach
bull Here the different strategies differ in the way people and tags are used in the strategies with these tag-based systems there are complex relationships between users tags and items and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
bull Here their baseline is the strategy of the lsquomost popularrsquo tags this is a strategy often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al Social Media Recommendation based on People and Tags SIGIRrsquo10]]
Monday February 27 12
user interactions (level amp type) instead of general social context - better for recommendations
does hybrid always work worseMonday February 27 12
Recommendation Systems
Predict items that are relevantusefulinteresting (and to what extent)
for given user (in a given context)
itrsquos often a ranking task
Monday February 27 12
Monday February 27 12
Monday February 27 12
Monday February 27 12
httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12
Collaborative Filtering
u1 likes u2
likes likes u1 likes Pulp Fiction
bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users
bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes
bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster
bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences
bull Others rule-based other data mining techniques
bull
Monday February 27 12
Memory vs Model-based
bull complete input data is required
bull pre-computation not possible
bull does not scale well (ldquotricksrdquo are needed)
bull high quality of recommendations
bull abstraction (model) of input data
bull pre-computation (partially) possible (model has to be re-built from time to time)
bull scales better
bull abstraction may reduce recommendation quality
Monday February 27 12
Social Networks amp Interest Similarity
bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood
bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones
bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)
bull does a social connection indicate user interest similarity
bull how much users interest similarity depends on the strength of their connection
bull is it feasible to use a social network as a personalized recommendation
[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12
Conclusions
• unilaterally connected pairs have more information items, metadata, and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application generic?
Content-based Recommendations
• Input: characteristics of items & a user's interests in characteristics of items => Recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the item vectors
  • Utility-based methods: a utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user
Content Features
News item: "Government stops renovation of tower bridge" (Oct 13th 2011)
[Image caption: Tower Bridge today: Under construction]
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england
Related Twiper news: bob: "Why do they stop to… [more]", mary: "London stops reno… [more]"
Item vector a (concept: weight):
db:Politics 0.2, db:Sports 0, db:Education 0, db:London 0.2, db:Tower_Bridge 0.4, db:Government 0.1, db:UK 0.1
Weighting strategy: occurrence frequency; normalize vectors (1-norm: sum of vector equals 1)
User Model
User's Twitter history:
• "RT: Government stops renovation of tower bridge" (Oct 13th 2011)
• "I am in London at the moment" (Oct 13th 2011)
• "I am doing sports" (Oct 12th 2011)
User vector u (concept: weight):
db:Politics 0, db:Sports 0.1, db:Education 0, db:London 0.5, db:Tower_Bridge 0.2, db:Government 0.2, db:UK 0
Weighting strategy: occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important); normalize vectors (1-norm: sum of vector equals 1)
Recommendations
concept            user u   item a   item b   item c
db:Politics        0        0.2      0        0
db:Sports          0.1      0        0        0.5
db:Education       0        0        0        0.2
db:London          0.5      0.2      0.8      0
db:Tower_Bridge    0.2      0.4      0.2      0
db:Government      0.2      0.1      0        0
db:UK              0        0.1      0        0.3
Cosine similarities between u and the candidate items: a = 0.67, b = 0.92, c = 0.14
Ranking of recommended items: 1. b, 2. a, 3. c
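The cosine similarities on this slide can be recomputed with a few lines of Python. This is a minimal sketch that assumes the slide's weights are read as decimals (0.2, 0.4, ...) in the concept order Politics, Sports, Education, London, Tower_Bridge, Government, UK; the function and variable names are illustrative only.

```python
import math


def cosine(x, y):
    """Cosine similarity of two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an empty profile is similar to nothing
    return dot / (norm_x * norm_y)


# Concept order: Politics, Sports, Education, London,
# Tower_Bridge, Government, UK (assumed reading of the slide).
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

scores = {name: cosine(u, vec) for name, vec in items.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(name, round(scores[name], 2))
```

Under this reading the script reproduces the slide's values (b: 0.92, a: 0.67, c: 0.14) and the ranking b, a, c.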
RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When a system adapts too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
What is true personalization?
Hands-on Teaser
• Build your own recommender system 101:
  • Recommend pages on delicious
  • Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387
Stereotyping
• a stereotype is a set of characteristics (e.g. attribute-value pairs) that describe a group of users
• a user is not assigned to a single stereotype: a user profile can feature characteristics of several different stereotypes
Why are stereotypes useful?
image source: http://farm1.static.flickr.com/155/413650229_31ef379b0b_b.jpg
Can we use Social Web data for user modeling?
Can we infer a Twitter-based user profile?
[Diagram labels: Semantic Enrichment, Linkage and Alignment; User Modeling (4 building blocks); Profile; Personalized News Recommender; user: "I want my personalized news recommendations"]
Example from Abel et al. (2011)
User Modeling Building Blocks
Profile: (concept, weight)
1. Temporal Constraints: Which tweets of the user should be analyzed?
  (a) time period (start, end; timeline: June 27, July 4, July 11)
  (b) temporal patterns (e.g. weekends; Morning, Afternoon, Night)
User Modeling Building Blocks
Profile: (concept, weight)
1. Temporal Constraints (timeline: June 27, July 4, July 11)
2. Profile Type: What type of concepts should represent "interests"?
Example tweet: "Francesca Schiavone won French Open #fo2010"
  • entity-based: Francesca Schiavone, French Open
  • topic-based: Sport
  • hashtag-based: #fo2010
User Modeling Building Blocks
Profile: (concept, weight)
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment: Further enrich the semantics of tweets
Example tweet: "Francesca Schiavone won http://bit.ly/2f4t7a"
  (a) tweet-based: Francesca Schiavone
  (b) further enrichment via the linked article ("Francesca wins French Open: Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …"): Francesca Schiavone, French Open, Tennis
User Modeling Building Blocks
Profile: (concept, weight)
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment
4. Weighting Scheme: How to weight the concepts?
  • Concept frequency (TF)
  • TFxIDF
  • Time-sensitive
Example (timeline: June 27, July 4, July 11): weight(Francesca Schiavone), weight(French Open), weight(Tennis); values shown on the slide: 4, 3, 6
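Two of the weighting schemes named above, concept frequency and time-sensitive weighting, can be sketched as follows. This is an illustration under stated assumptions, not the method of Abel et al.: the exponential decay with a 7-day half-life, the function names, the year of the dates, and the sample observations are all hypothetical.

```python
from collections import Counter
from datetime import date


def concept_frequency(observations):
    """TF-style weights: count concept mentions, then 1-norm normalize."""
    counts = Counter(concept for concept, _ in observations)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


def time_sensitive(observations, now, half_life_days=7.0):
    """Like concept_frequency, but older mentions count less.

    Each mention contributes 0.5 ** (age / half_life); the decay
    function is an assumption made for this sketch.
    """
    raw = Counter()
    for concept, day in observations:
        age_days = (now - day).days
        raw[concept] += 0.5 ** (age_days / half_life_days)
    total = sum(raw.values())
    return {concept: w / total for concept, w in raw.items()}


# Hypothetical (concept, date) pairs extracted from a user's tweets.
observations = [
    ("Francesca Schiavone", date(2010, 6, 27)),
    ("French Open", date(2010, 6, 27)),
    ("Tennis", date(2010, 7, 4)),
    ("Tennis", date(2010, 7, 11)),
]
tf = concept_frequency(observations)
ts = time_sensitive(observations, now=date(2010, 7, 11))
print(tf)
print(ts)
```

With these observations the plain frequency profile gives Tennis a weight of 0.5, while the time-sensitive profile raises it to 0.75 because both Tennis mentions are recent.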
Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance
User Modeling
it is not about putting everything in a user profile; it is about making the right choices
User Adaptation
Knowledge about the user can be applied to adapt a system or interface to the user, in order to improve the system's functionality and the user experience
User-Adaptive Systems
[Diagram: observations, data and information about user → user modeling → user profile → profile analysis → adaptation decisions]
A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003.
Last.fm adapts to your music taste
[Diagram: history of songs (like, ban, pause, skip) → user modeling (infer current musical taste) → user profile (interests in genres, artists, tags) → compare profile with possible next songs to play → next song to be played]
Google Adaptive Search
http://www.google.com/goodtoknow
Issues in User-Adaptive Systems
• Overfitting, "bubble effects", loss of serendipity problem:
  • systems may adapt too strongly to the interests/behavior of the user
  • e.g. an adaptive radio station may always play the same or very similar songs
  • we search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem:
  • occurs when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion
What is good user modeling & personalization?
image source: http://www.flickr.com/photos/bellarosebyliz/4729613108
Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)
Evaluation Strategies
• User studies: clean-room study; ask/observe (selected) people to judge whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling:
  • measure the quality of profiles directly, e.g. measure the overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure the quality of the application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Chart: performance of strategy X vs. baseline over several runs] Is strategy X better than the baseline?
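Most of the metrics listed above take only a few lines to implement. The sketch below follows the definitions on the slide; the function names and the tiny sample runs are hypothetical, and each run is assumed to be a pair of (ranked list of recommended items, set of relevant items).

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and F-measure (harmonic mean)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k


def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item, or 0 if there is none."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs):
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def success_at_k(runs, k):
    """Share of runs with at least one relevant item in the top k."""
    hits = sum(1 for r, rel in runs if any(i in rel for i in r[:k]))
    return hits / len(runs)


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical runs: (ranking, relevant items).
runs = [
    (["b", "a", "c"], {"a"}),
    (["x", "y", "z"], {"x", "z"}),
]
print(mean_reciprocal_rank(runs), success_at_k(runs, 1))
```

For the two sample runs the first relevant items appear at ranks 2 and 1, so MRR is (1/2 + 1) / 2 = 0.75, and only one of the two runs has a relevant item at rank 1, so Success@1 is 0.5.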
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, use part of it as input, and test the recommendations on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]
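Steps 3 and 4 above can be illustrated with a small sketch. It computes average precision for one ranked tag list and a t statistic for the per-user score differences between a strategy and a baseline. The paired form of the t-test, the function names, and all sample numbers are assumptions of this sketch and are not taken from Rae et al.

```python
import math
from statistics import mean, stdev


def average_precision(ranking, relevant):
    """Average of precision@position over the positions of relevant items."""
    hits, total = 0, 0.0
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-user differences (paired test, n - 1 d.o.f.)."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical ranked tag recommendations and held-out (true) tags.
ap = average_precision(["t1", "t2", "t3", "t4"], {"t1", "t3"})

# Hypothetical per-user scores (e.g. P@5) of strategy X and the baseline.
strategy_x = [0.6, 0.7, 0.5, 0.8]
baseline = [0.5, 0.5, 0.4, 0.6]
t = paired_t_statistic(strategy_x, baseline)
print(round(ap, 3), round(t, 3))
```

Mean Average Precision (MAP) is the mean of `average_precision` over all test users. To decide significance, the t statistic is compared against the t-distribution with n - 1 degrees of freedom, for which a statistics library would normally be used.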
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the "most popular" tags strategy: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]
user interactions (level & type) instead of general social context - better for recommendations
does hybrid always work worse?
Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
it's often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Diagram: users u1 and u2 connected to items by "likes" edges, e.g. "u1 likes Pulp Fiction"]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will also like item B? Learn the probabilities from user ratings/preferences
• Others: rule-based and other data mining techniques
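The memory-based variant (user-item matrix, similarity between users, recommend items of similar users) can be sketched in a few lines. This is a minimal illustration under assumptions: cosine similarity over raw ratings and a similarity-weighted sum as the item score are choices made for this sketch, and the ratings are hypothetical.

```python
import math


def cosine(r1, r2):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical user-item rating matrix (sparse, as nested dicts).
ratings = {
    "u1": {"Pulp Fiction": 5, "Kill Bill": 4},
    "u2": {"Pulp Fiction": 5, "Kill Bill": 5, "Jackie Brown": 4},
    "u3": {"Amelie": 5, "Kill Bill": 1, "Notting Hill": 4},
}
print(recommend(ratings, "u1"))
```

Here u2 is far more similar to u1 than u3 is, so u2's unseen item "Jackie Brown" is ranked first. Note that all similarities are computed at request time from the complete rating data, which matches the memory-based approach's need for complete input data and its lack of pre-computation.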
Memory vs Model-based
bull complete input data is required
bull pre-computation not possible
bull does not scale well (ldquotricksrdquo are needed)
bull high quality of recommendations
bull abstraction (model) of input data
bull pre-computation (partially) possible (model has to be re-built from time to time)
bull scales better
bull abstraction may reduce recommendation quality
Monday February 27 12
Social Networks amp Interest Similarity
bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood
bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones
bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)
bull does a social connection indicate user interest similarity
bull how much users interest similarity depends on the strength of their connection
bull is it feasible to use a social network as a personalized recommendation
[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12
Conclusionsbull pairs unilaterally connected have more common information
items metadata and tags than non-connected pairs
bull the similarity was largest for direct connections and decreased with the increase of distance between users in the social networks
bull users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
bull traditional item-level similarity may be less reliable way to find similar users in social bookmarking systems
bull items collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Monday February 27 12
Social recommendations on conflicting groups
are all interests of our friends relevant Is it application generic
Monday February 27 12
Content-based Recommendations
bull Input characteristics of items amp interests of a user into characteristics of items =gt Recommend items that feature characteristics which meet the userrsquos interests
bull Techniques
bull Data mining methods Cluster items based on their characteristics =gt Infer usersrsquo interests into clusters
bull IR methods Represent items amp users as term vectors =gt Compute similarity between user profile vector and items
bull Utility-based methods Utility function that gets an item as input the parameters of the utility function are customized via preferences of a user
Monday February 27 12
Government stops renovation of tower bridge Oct 13th 2011
Tower Bridge today Under construction
Tower Bridge is a combined bascule and suspension bridge in London England over the River Thames
Category politics england Related Twiper news bob Why do they stop tohellip [more] mary London stops renohellip [more]
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
02 0 0
02 04 01 01
= a
Weighting strategy - occurrence frequency - normalize vectors (1-norm sum of vector equals 1)
Content Features
Monday February 27 12
RT Government stops renovation of tower bridge Oct 13th 2011
Userrsquos Twitter history
I am in London at the moment Oct 13th 2011
I am doing sports Oct 12th 2011
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
0 01 0
05 02 02 0
= u
Weighting strategy - occurrence frequency (eg smoothened by occurrence time recent concepts are more important - normalize vectors (1-norm sum of vector equals 1)
User Model
Monday February 27 12
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
u 0
01 0
05 02 02 0
candidate items user a
02 0 0
02 04 01 01
b 0 0 0
08 02 0 0
c 0
05 02 0 0 0
03
cosine similarities
a b c
u 067 092 014
Ranking of recommended items 1 b 2 a 3 c
Recommendations
Monday February 27 12
RecSys Problemsbull Cold-start problem New User problem no or little data is available to infer
the preferences of new users
bull Changing User Preferences user interests may change over time
bull Sparsity problem New Item problem item descriptions are sparse eg not many user rated or tagged an item
bull Lack of Diversity Overfitting for many applications good recommendations should be relevant and new to the user (exceptions predicting re-visiting behavior etc) When adapting too strongly to the preferences of a user the user might see again and again samesimilar recommendations
bull Use the right context users do lots of things which might not be relevant for their user model eg try out things do stuff for other people
bull Research challenge find right balance between serendipity amp personalization
bull Research challenge find right way to use the influence of the recommendations on the userrsquos behavior
Monday February 27 12
What is true personalization
Monday February 27 12
Monday February 27 12
Hands-on Teaser
bull Build your own recommender system 101
bull Recommend pages on delicious
bull Recommend pages to your Facebook friends
image source httpwwwflickrcomphotosbionicteaching1375254387
Monday February 27 12
Why are stereotypes
useful
httpfarm1staticflickrcom155413650229_31ef379b0b_bjpg
Monday February 27 12
Can we use Social Web
data for user
modeling
Monday February 27 12
Can we infer a Twitter-based user profile?
Diagram labels: Semantic Enrichment, Linkage and Alignment; User Modeling (4 building blocks); Profile; Personalized News Recommender
User: "I want my personalized news recommendations"
Example from Abel et al. (2011)

User Modeling Building Blocks
1. Temporal Constraints: Which tweets of the user should be analyzed?
(a) time period: start, end (timeline: June 27, July 4, July 11)
(b) temporal patterns: weekends; Morning, Afternoon, Night
Profile: (concept, weight)

User Modeling Building Blocks
2. Profile Type: What type of concepts should represent "interests"?
Example tweet: "Francesca Schiavone won French Open #fo2010"
• entity-based: Francesca Schiavone, French Open
• topic-based: Sport
• hashtag-based: #fo2010
Building blocks so far: 1. Temporal Constraints, 2. Profile Type
Profile: (concept, weight)

User Modeling Building Blocks
3. Semantic Enrichment: Further enrich the semantics of tweets?
Example tweet: "Francesca Schiavone won http://bit.ly/2f4t7a"
(a) tweet-based: Francesca Schiavone
(b) further enrichment, using the linked article ("Francesca wins French Open. Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …"): French Open, Tennis
Building blocks so far: 1. Temporal Constraints, 2. Profile Type, 3. Semantic Enrichment
Profile: (concept, weight)

User Modeling Building Blocks
4. Weighting Scheme: How to weight the concepts?
• Concept frequency (TF)
• TFxIDF
• Time-sensitive
Concepts to weight over the timeline (June 27, July 4, July 11): weight(Francesca Schiavone), weight(French Open), weight(Tennis); example values shown on the slide: 4, 3, 6
Building blocks: 1. Temporal Constraints, 2. Profile Type, 3. Semantic Enrichment, 4. Weighting Scheme
Profile: (concept, weight)

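As a rough illustration of the weighting step, the sketch below contrasts plain concept frequency with a time-sensitive variant. The exponential half-life decay, the function names and the sample observations are all assumptions made for illustration; the slides do not specify which time-sensitive formula Abel et al. use, and TFxIDF is omitted because it needs statistics over other users' profiles.

```python
from collections import Counter

def tf_profile(concepts):
    """Concept-frequency weights, normalized so that they sum to 1."""
    counts = Counter(concepts)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}

def time_sensitive_profile(observations, now, half_life_days=7.0):
    """Recency-weighted profile (assumed scheme: exponential decay).

    observations: iterable of (concept, day) pairs, day as a number.
    A mention that is `half_life_days` old counts half as much as a fresh one.
    """
    weights = {}
    for concept, day in observations:
        age = now - day
        weights[concept] = weights.get(concept, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical observations: (concept, day the tweet was posted).
observations = [("Tennis", 0), ("Tennis", 14), ("French Open", 14)]

tf = tf_profile(concept for concept, _ in observations)
recent = time_sensitive_profile(observations, now=14)

print(tf)      # Tennis dominates on raw frequency
print(recent)  # the old Tennis mention is discounted
```

In this toy data the raw frequency gives Tennis two thirds of the weight, while the decayed profile moves weight towards French Open because one of the Tennis mentions is two weeks old.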
Observations
• Profile characteristics
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance

User Modeling
It is not about putting everything in a user profile; it is about making the right choices.

User Adaptation
Knowledge about the user can be applied to adapt a system or interface to the user, improving the system's functionality and the user experience.

User-Adaptive Systems
observations, data and information about user → user modeling → user profile → profile analysis → adaptation decisions
A. Jameson. Adaptive interfaces and agents. The HCI Handbook: Fundamentals, Evolving Technologies and Emerging Applications, pp. 305–330, 2003.

Last.fm adapts to your music taste
history of songs (like, ban, pause, skip) → user modeling (infer current musical taste) → user profile (interests in genres, artists, tags) → compare profile with possible next songs to play → next song to be played

Google Adaptive Search
http://www.google.com/goodtoknow

Issues in User-Adaptive Systems
• Overfitting, "bubble effects", loss-of-serendipity problem
  • systems may adapt too strongly to the user's interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • we search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem
  • arises when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion

http://www.flickr.com/photos/bellarosebyliz/4729613108
What is good user modeling & personalization?

Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)

Evaluation Strategies
• User studies: clean-room study; ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling
  • measure the quality of profiles directly, e.g. measure the overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure the quality of an application that exploits the user profile, e.g. apply user modeling strategies in a recommender system

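A leave-one-out log analysis can be sketched as follows: hide one known "like" at a time, ask the recommender for its top k, and count how often the hidden item comes back. Everything here is hypothetical (the data, the function names, and the most-popular recommender used as a stand-in); it only illustrates the evaluation loop, not any particular system.

```python
from collections import Counter

def most_popular(train, user, k):
    """Stand-in recommender: globally most liked items the user has not seen."""
    counts = Counter(item for items in train.values() for item in items)
    seen = train.get(user, set())
    ranked = [item for item, _ in counts.most_common() if item not in seen]
    return ranked[:k]

def leave_one_out_hit_rate(likes, recommend, k=3):
    """Fraction of held-out likes that reappear in the top-k recommendations."""
    hits = trials = 0
    for user, items in likes.items():
        for held_out in items:
            # Copy the data and hide exactly one like of this user.
            train = {u: set(its) for u, its in likes.items()}
            train[user].discard(held_out)
            if held_out in recommend(train, user, k):
                hits += 1
            trials += 1
    return hits / trials if trials else 0.0

# Hypothetical click/like log: user -> set of liked items.
likes = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"a", "c"}}

rate = leave_one_out_hit_rate(likes, most_popular, k=2)
print(rate)
```

Any recommender with the same `(train, user, k)` signature can be passed in, so a personalized strategy and a baseline can be compared on identical splits.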
Possible Metrics
• The usual IR metrics
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings)
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Plot: performance over runs, strategy X vs. baseline] Is strategy X better than the baseline?

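Most of the metrics listed above fit in a few lines each. The following is a minimal sketch with illustrative function names and made-up test runs; a "run" is assumed to be a pair of (ranked recommendations, set of relevant items) for one user or query.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and their harmonic mean (F-measure)."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def mean_reciprocal_rank(runs):
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

def success_at_k(runs, k):
    """Share of runs with at least one relevant item in the top k."""
    return sum(1 for r, rel in runs if any(i in rel for i in r[:k])) / len(runs)

def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and true preference values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical runs: (ranking, relevant items).
runs = [(["b", "a", "c"], {"a"}), (["x", "y", "z"], {"x", "z"})]

mrr = mean_reciprocal_rank(runs)
s1 = success_at_k(runs, 1)
p, r, f1 = precision_recall_f1(["x", "y", "z"], {"x", "z"})
mae = mean_absolute_error([4, 3], [5, 3])
```

For the two hypothetical runs, the first relevant item sits at rank 2 and rank 1 respectively, so the MRR is the average of 1/2 and 1.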
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]

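Step 4 can be sketched with the standard library alone. The version below assumes a paired design (the same test users scored under both the strategy and the baseline), which the slide does not state explicitly; the function name and the P@5 scores are hypothetical. It returns only the t statistic and degrees of freedom, which then have to be compared against a t-distribution table or a statistics package to obtain a p-value.

```python
import math
import statistics

def paired_t_statistic(strategy_scores, baseline_scores):
    """Paired Student's t statistic and degrees of freedom.

    Both lists hold one score per test user (e.g. P@5), in the same order.
    """
    if len(strategy_scores) != len(baseline_scores):
        raise ValueError("score lists must have the same length")
    diffs = [s - b for s, b in zip(strategy_scores, baseline_scores)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired scores")
    spread = statistics.stdev(diffs)  # sample standard deviation
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (spread / math.sqrt(n))
    return t, n - 1

# Hypothetical P@5 scores for four test users.
strategy = [0.5, 0.6, 0.7, 0.8]
baseline = [0.4, 0.5, 0.5, 0.7]

t, df = paired_t_statistic(strategy, baseline)
print(round(t, 2), df)
```

With real data the number of test users would be far larger than four; the tiny sample here only keeps the arithmetic easy to follow.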
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the 'most popular' tags strategy, which is often used to compare the globally most popular tags with the tags predicted by a particular personalization strategy, thus investigating whether the personalization is worth the effort and can outperform the easily available baseline
[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]

user interactions (level & type) instead of general social context: better for recommendations
does hybrid always work worse?

Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
It's often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering
[Diagram: users u1 and u2 with "likes" edges to items, and the edge "u1 likes Pulp Fiction"]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques

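The memory-based variant above can be sketched directly on a user-item rating dictionary. This is one simple instance of the idea (cosine similarity between users, unseen items scored by similarity-weighted ratings); the ratings, names and scoring rule are assumptions chosen for illustration, and real systems typically add normalization and neighborhood selection.

```python
import math

def user_similarity(ratings_a, ratings_b):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(ratings_a) & set(ratings_b)
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in ratings_a.values()))
    norm_b = math.sqrt(sum(v * v for v in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(ratings, user, k=2):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    own = ratings[user]
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = user_similarity(own, theirs)
        if sim <= 0:
            continue
        for item, rating in theirs.items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical user-item matrix (ratings on a 1-5 scale).
ratings = {
    "u1": {"Kill Bill": 5, "Jackie Brown": 4},
    "u2": {"Kill Bill": 5, "Jackie Brown": 4, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Kill Bill": 1, "Notting Hill": 4},
}

print(recommend(ratings, "u1", k=3))
```

Because u2's ratings agree closely with u1's, the item u2 rated highly and u1 has not yet seen is ranked first for u1.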
Memory- vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality

Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike)
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?
[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions
• unilaterally connected pairs have more information items, metadata and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations
• Input: characteristics of items & the user's interests in those characteristics => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in the clusters
  • IR methods: represent items & users as term vectors => compute the similarity between the user profile vector and the items
  • Utility-based methods: a utility function that takes an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features
News item a: "Government stops renovation of tower bridge" (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
Category: politics, england. Related Twiper news: bob: "Why do they stop to…" [more]; mary: "London stops reno…" [more]
a = (dbPolitics 0.2, dbSports 0, dbEducation 0, dbLondon 0.2, dbTower_Bridge 0.4, dbGovernment 0.1, dbUK 0.1)
Weighting strategy:
• occurrence frequency
• normalize vectors (1-norm: the sum of the vector's entries equals 1)

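The weighting strategy on this slide (count concept occurrences, then divide by the total so the entries sum to 1) could look like the sketch below. The raw counts are hypothetical, chosen only so that the normalized result equals the vector a shown on the slide; the slide itself gives no counts, and the concept order is an assumption.

```python
# Concept order assumed from the slide layout.
CONCEPTS = ["dbPolitics", "dbSports", "dbEducation", "dbLondon",
            "dbTower_Bridge", "dbGovernment", "dbUK"]

def concept_vector(counts, concepts=CONCEPTS):
    """Turn raw concept occurrence counts into a 1-norm normalized vector."""
    total = sum(counts.get(c, 0) for c in concepts)
    if total == 0:
        return [0.0] * len(concepts)
    return [counts.get(c, 0) / total for c in concepts]

# Hypothetical occurrence counts extracted from the news item.
counts_a = {"dbPolitics": 2, "dbLondon": 2, "dbTower_Bridge": 4,
            "dbGovernment": 1, "dbUK": 1}

vector_a = concept_vector(counts_a)
print(dict(zip(CONCEPTS, vector_a)))
```

The same function can be applied to the concepts found in a user's tweets, which yields item and user vectors in one shared concept space.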
User Model
User's Twitter history:
• "RT Government stops renovation of tower bridge" (Oct 13th 2011)
• "I am in London at the moment" (Oct 13th 2011)
• "I am doing sports" (Oct 12th 2011)
u = (dbPolitics 0, dbSports 0.1, dbEducation 0, dbLondon 0.5, dbTower_Bridge 0.2, dbGovernment 0.2, dbUK 0)
Weighting strategy:
• occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
• normalize vectors (1-norm: the sum of the vector's entries equals 1)

dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
u 0
01 0
05 02 02 0
candidate items user a
02 0 0
02 04 01 01
b 0 0 0
08 02 0 0
c 0
05 02 0 0 0
03
cosine similarities
a b c
u 067 092 014
Ranking of recommended items 1 b 2 a 3 c
Recommendations
Monday February 27 12
RecSys Problemsbull Cold-start problem New User problem no or little data is available to infer
the preferences of new users
bull Changing User Preferences user interests may change over time
bull Sparsity problem New Item problem item descriptions are sparse eg not many user rated or tagged an item
bull Lack of Diversity Overfitting for many applications good recommendations should be relevant and new to the user (exceptions predicting re-visiting behavior etc) When adapting too strongly to the preferences of a user the user might see again and again samesimilar recommendations
bull Use the right context users do lots of things which might not be relevant for their user model eg try out things do stuff for other people
bull Research challenge find right balance between serendipity amp personalization
bull Research challenge find right way to use the influence of the recommendations on the userrsquos behavior
Monday February 27 12
What is true personalization
Monday February 27 12
Monday February 27 12
Hands-on Teaser
bull Build your own recommender system 101
bull Recommend pages on delicious
bull Recommend pages to your Facebook friends
image source httpwwwflickrcomphotosbionicteaching1375254387
Monday February 27 12
Can we infer a Twitter-based user profile
User Modeling (4 building blocks)
Semantic Enrichment Linkage and Alignment
Personalized News Recommender
Profile
I want my
personalized news recommendations
Example from Abel et al (2011)Monday February 27 12
User Modeling Building Blocks
Profile concept weight
time
1 Which tweets of the user should be
analyzed
Morning Afternoon Night
1 Temporal Constraints
June 27 July 4 July 11
(b) temporal patterns
weekends start end
(a) time period
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won French Open fo2010
Francesca Schiavone
French Open
Francesca Schiavone French Open entity-based
Sport T
T topic-based
2 What type of concepts should represent ldquointerestsrdquo
fo2010
fo2010 hashtag-based
1 Temporal Constraints
time
June 27 July 4 July 11
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
Francesca Schiavone won httpbitly2f4t7a
Francesca Schiavone
3 Further enrich the semantics of tweets
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
Francesca wins French Open Thirty in womens tennis is primordially old an age when agility and desire recedes as the hellip
French Open
Tennis
French Open
Tennis
(b) further enrichment
(a) tweet-based
Monday February 27 12
User Modeling Building Blocks
Profile concept weight
2 Profile Type
4 How to weight the concepts
1 Temporal Constraints
3 Semantic Enrichment
Francesca Schiavone
French Open
Tennis
4 Weighting Scheme
time
June 27 July 4 July 11
weight(Francesca Schiavone)
Concept frequency (TF)
4
3 6
TFxIDF Time-sensitive
weight(French Open)
weight(Tennis)
Monday February 27 12
Observationsbull Profile characteristics
bull Semantic enrichment solves sparsity problems
bull Profiles change over time fresh profiles reflect better current user demands
bull Temporal patterns weekend profiles differ significantly from weekday profiles
bull Impact on news recommendations
bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based
bull Semantic enrichment improves recommendation quality
bull Time-sensitivity (adapting to trends) improves performance
Monday February 27 12
User Modelingit is not about putting everything in a user profile
it is about making the right choices
Monday February 27 12
User AdaptationKnowing the user - this knowledge - can be applied to adapt
a system or interface to the user to improve the system functionality and user experience
Monday February 27 12
Monday February 27 12
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people
whether you did a good job
bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo
bull Evaluation of user modeling
bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles
bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system

Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Figure: performance of strategy X and of the baseline over several runs] Is strategy X better than the baseline?
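The metrics listed on this slide can be written down directly. The sketch below uses the standard definitions; the function names and the example ranking are assumptions for illustration. MRR is the mean of `reciprocal_rank` over all evaluated users or queries.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that occur in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, 0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def success_at_k(ranked, relevant, k):
    """1 if at least one relevant item occurs within the top k, else 0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical ranking of five tags and the set of truly relevant tags.
ranked = ["t1", "t2", "t3", "t4", "t5"]
relevant = {"t2", "t5", "t9"}
p5 = precision_at_k(ranked, relevant, 5)
r5 = recall_at_k(ranked, relevant, 5)
print(p5, r5, f_measure(p5, r5), reciprocal_rank(ranked, relevant))
```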

Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data; use part of it as input and test the recommendations on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO'10]
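Steps 3 and 4 can be sketched with the standard library only. The slide names Student's t-test without further detail; the paired variant over per-user scores used below is an assumption, as are the function names and the example scores. The sketch stops at the t statistic; turning it into a p-value needs the t distribution with n - 1 degrees of freedom, for example via SciPy's `scipy.stats.ttest_rel`.

```python
import math
import statistics

def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks of the relevant items."""
    hits, total = 0, 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-user score differences (strategy X minus baseline)."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# MAP is the mean of average_precision over all test users.
ap = average_precision(["a", "b", "c", "d"], {"a", "c"})

# Hypothetical per-user P@5 scores of strategy X and of the baseline.
strategy_x = [0.6, 0.7, 0.5, 0.8]
baseline = [0.5, 0.5, 0.5, 0.6]
t_value = paired_t_statistic(strategy_x, baseline)
print(ap, t_value)
```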

Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• Their baseline is the 'most popular' tags strategy: the globally most popular tags are compared with the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]

user interactions (level & type) instead of general social context - better for recommendations
does hybrid always work worse?

Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
it's often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering
[Figure: users u1 and u2 connected to items by "likes" edges; u1 likes Pulp Fiction]
• Memory-based: User-Item matrix of ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix of similarities (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
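A minimal sketch of the memory-based variant: compute the similarity between users on the user-item rating matrix and recommend what similar users rated. Cosine similarity and the similarity-weighted sum are one common choice, not the only one; the ratings and all movie titles except Pulp Fiction are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(ratings, user, k=3):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

# Hypothetical user-item matrix stored as nested dicts.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Notting Hill": 4},
}
print(recommend(ratings, "u1"))
```

Every request touches the complete rating data, which is why this family is called memory-based.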

Memory vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
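To illustrate the model-based side, the sketch below pre-computes an item-item model once and answers requests from the model alone. The model holds simple co-occurrence estimates of P(u likes item B | u likes item A); this is a naive stand-in for a learned Bayesian network, and the "likes" data and function names are assumptions for illustration.

```python
from collections import defaultdict
from itertools import permutations

def build_model(likes):
    """Pre-compute P(likes B | likes A) for all co-occurring item pairs."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in likes.values():
        for item in items:
            item_count[item] += 1
        for a, b in permutations(items, 2):
            pair_count[(a, b)] += 1
    return {(a, b): n / item_count[a] for (a, b), n in pair_count.items()}

def recommend(model, liked, k=3):
    """Rank unseen items by their best conditional probability given a liked item."""
    scores = defaultdict(float)
    for (a, b), probability in model.items():
        if a in liked and b not in liked:
            scores[b] = max(scores[b], probability)
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

# Hypothetical "likes" data: user -> set of liked items.
likes = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A"},
    "u4": {"B", "C"},
}
model = build_model(likes)  # re-built from time to time, not per request
print(recommend(model, {"A"}))
```

The model is an abstraction of the input data: it scales better, but it forgets which user liked what, which may cost recommendation quality.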

Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a source of personalized recommendations?
[Lee & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT'10]

Conclusions
• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
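The kind of comparison behind these conclusions can be sketched as follows: measure the overlap of the item collections of connected pairs and of non-connected pairs, then compare the averages. Jaccard overlap is an assumed similarity measure, and the users, items and connections are toy data for illustration; they do not reproduce the measurements of the paper.

```python
from itertools import combinations

def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mean_similarity(pairs, collections):
    """Average Jaccard overlap of the item collections over the given user pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(jaccard(collections[u], collections[v]) for u, v in pairs) / len(pairs)

# Toy data: bookmarked items per user and (undirected) social connections.
collections = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "dan": {"p1", "p5"},
    "eve": {"p5"},
}
connected = {frozenset(("ann", "bob")), frozenset(("dan", "eve"))}
all_pairs = [frozenset(pair) for pair in combinations(sorted(collections), 2)]
unconnected = [pair for pair in all_pairs if pair not in connected]

sim_connected = mean_similarity(connected, collections)
sim_unconnected = mean_similarity(unconnected, collections)
print(sim_connected, sim_unconnected)
```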

Social recommendations on conflicting groups
are all interests of our friends relevant? Is it application generic?

Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the items
  • Utility-based methods: a utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user
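A minimal sketch of the utility-based method: the user's preferences become the parameters of a function that maps an item to a utility value. The linear weighted sum, the feature names and the hotel items are assumptions for illustration; the slide does not prescribe a particular form of the utility function.

```python
def make_utility(preferences):
    """Build a utility function whose parameters are the user's feature weights."""
    def utility(item):
        # Weighted sum over the features the user cares about.
        return sum(weight * item.get(feature, 0.0)
                   for feature, weight in preferences.items())
    return utility

# Hypothetical user: dislikes high prices, values good ratings.
utility = make_utility({"price": -0.01, "rating": 1.0})

# Hypothetical items described by their characteristics.
items = {
    "hotel_a": {"price": 120, "rating": 4.5},
    "hotel_b": {"price": 80, "rating": 3.9},
}
ranked = sorted(items, key=lambda name: utility(items[name]), reverse=True)
print(ranked)
```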

Content Features
Item a: "Government stops renovation of tower bridge" (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england. Related Twiper news: bob: "Why do they stop to… [more]", mary: "London stops reno… [more]"
a = (dbPolitics 0.2, dbSports 0, dbEducation 0, dbLondon 0.2, dbTower_Bridge 0.4, dbGovernment 0.1, dbUK 0.1)
Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
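The weighting strategy of this slide (occurrence frequency, then 1-norm normalization) can be sketched in a few lines. The list of concept mentions is hypothetical; it is chosen so that the result matches the weights of vector a, but the slide does not state the underlying counts.

```python
from collections import Counter

def concept_vector(concept_mentions):
    """Occurrence frequency per concept, normalized so that the weights sum to 1."""
    counts = Counter(concept_mentions)
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()} if total else {}

# Hypothetical concept mentions extracted from the news item.
mentions = (["dbTower_Bridge"] * 4 + ["dbPolitics"] * 2 + ["dbLondon"] * 2
            + ["dbGovernment", "dbUK"])
a = concept_vector(mentions)
print(a)
```

Concepts that never occur (here dbSports and dbEducation) are simply absent, which is equivalent to a weight of 0.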

User Model
User's Twitter history:
- "RT Government stops renovation of tower bridge" (Oct 13th 2011)
- "I am in London at the moment" (Oct 13th 2011)
- "I am doing sports" (Oct 12th 2011)
u = (dbPolitics 0, dbSports 0.1, dbEducation 0, dbLondon 0.5, dbTower_Bridge 0.2, dbGovernment 0.2, dbUK 0)
Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
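One way to smooth occurrence frequency by occurrence time is an exponential decay, so that each mention counts less the older it is. The decay form, the half-life parameter and the example observations are assumptions for illustration; the slide only states that recent concepts are more important.

```python
from datetime import date

def time_decayed_profile(observations, today, half_life_days=7.0):
    """Sum a decayed count per concept, then normalize to 1-norm."""
    weights = {}
    for concept, day in observations:
        age_in_days = (today - day).days
        # A mention loses half of its weight every half_life_days.
        decayed = 0.5 ** (age_in_days / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decayed
    total = sum(weights.values())
    return {concept: w / total for concept, w in weights.items()} if total else {}

# Hypothetical observations: (concept, date of the tweet mentioning it).
observations = [
    ("dbLondon", date(2011, 10, 13)),
    ("dbSports", date(2011, 9, 29)),
]
u = time_decayed_profile(observations, today=date(2011, 10, 13))
print(u)
```

Without the decay both concepts would get weight 0.5; with a one-week half-life the two-week-old mention counts only a quarter as much as the fresh one.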

Recommendations
concepts (vector order): dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK
user:
u = (0, 0.1, 0, 0.5, 0.2, 0.2, 0)
candidate items:
a = (0.2, 0, 0, 0.2, 0.4, 0.1, 0.1)
b = (0, 0, 0, 0.8, 0.2, 0, 0)
c = (0, 0.5, 0.2, 0, 0, 0, 0.3)
cosine similarities:
     a     b     c
u    0.67  0.92  0.14
Ranking of recommended items: 1. b, 2. a, 3. c
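The similarities on this slide can be checked with a few lines of code. The vectors below are read off the slide in the concept order dbPolitics, dbSports, dbEducation, dbLondon, dbTower_Bridge, dbGovernment, dbUK; only the variable and function names are assumptions.

```python
import math

def cosine(x, y):
    """Cosine of the angle between two equally long weight vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm = math.sqrt(sum(xi * xi for xi in x)) * math.sqrt(sum(yi * yi for yi in y))
    return dot / norm if norm else 0.0

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]          # user profile
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],  # candidate items
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

similarities = {name: cosine(u, vector) for name, vector in items.items()}
ranking = sorted(similarities, key=similarities.get, reverse=True)
print({name: round(value, 2) for name, value in similarities.items()})
print(ranking)
```

Item b ranks first because it shares the user's two strongest concepts (dbLondon, dbTower_Bridge) and has no weight on concepts the user ignores.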

RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
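One possible way to trade relevance against diversity is a greedy re-ranking in the style of maximal marginal relevance (MMR). This technique is not named on the slide; it is swapped in here only to make the balance concrete. The relevance scores, the genre-based similarity and the `trade_off` parameter are assumptions for illustration.

```python
def rerank(candidates, relevance, similarity, k, trade_off=0.7):
    """Greedily pick items that are relevant but not too similar to earlier picks."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def marginal_value(item):
            # Redundancy is the similarity to the closest already selected item.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * redundancy
        best = max(remaining, key=marginal_value)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical relevance scores; the genre is encoded in the item name.
relevance = {"rock_1": 0.9, "rock_2": 0.85, "jazz_1": 0.6}

def same_genre(x, y):
    """Toy similarity: 1 if both items share the genre prefix, else 0."""
    return 1.0 if x.split("_")[0] == y.split("_")[0] else 0.0

balanced = rerank(relevance, relevance, same_genre, k=2, trade_off=0.5)
relevance_only = rerank(relevance, relevance, same_genre, k=2, trade_off=1.0)
print(balanced, relevance_only)
```

With `trade_off=1.0` the list is ordered by relevance alone; lower values push similar items down in favor of novel ones.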

What is true personalization?

Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387

User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints: which tweets of the user should be analyzed?
  (a) time period: start - end (timeline: June 27, July 4, July 11)
  (b) temporal patterns: weekends; Morning, Afternoon, Night
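Both kinds of temporal constraints can be sketched as simple filters over tweet timestamps. The year 2010, the hour boundaries for morning/afternoon/night and the example tweets are assumptions for illustration; the slide only shows the dates June 27, July 4 and July 11.

```python
from datetime import datetime

def in_period(timestamp, start, end):
    """(a) time period: keep tweets posted between start and end."""
    return start <= timestamp <= end

def is_weekend(timestamp):
    """(b) temporal pattern: Saturday (5) or Sunday (6)."""
    return timestamp.weekday() >= 5

def part_of_day(timestamp):
    """(b) temporal pattern: hypothetical boundaries for the parts of the day."""
    if 6 <= timestamp.hour < 12:
        return "morning"
    if 12 <= timestamp.hour < 18:
        return "afternoon"
    return "night"

# Hypothetical tweets as (timestamp, text) pairs.
tweets = [
    (datetime(2010, 6, 27, 10, 0), "tweet 1"),
    (datetime(2010, 6, 30, 15, 0), "tweet 2"),
    (datetime(2010, 7, 20, 23, 0), "tweet 3"),
]
start, end = datetime(2010, 6, 27), datetime(2010, 7, 11, 23, 59)
selected = [text for ts, text in tweets if in_period(ts, start, end)]
weekend_only = [text for ts, text in tweets
                if in_period(ts, start, end) and is_weekend(ts)]
print(selected, weekend_only)
```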

User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints (timeline: June 27, July 4, July 11)
2. Profile Type: what type of concepts should represent "interests"?
Example tweet: "Francesca Schiavone won French Open #fo2010"
  • entity-based: Francesca Schiavone, French Open
  • topic-based: Sport
  • hashtag-based: #fo2010
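The three profile types can be sketched on the example tweet. Hashtags can be extracted with a regular expression; entities and topics normally come from a semantic extraction service, which is replaced here by two small hypothetical lookup tables.

```python
import re

# Hypothetical lookup tables standing in for an entity/topic extraction service.
KNOWN_ENTITIES = {"Francesca Schiavone", "French Open"}
ENTITY_TOPIC = {"Francesca Schiavone": "Sport", "French Open": "Sport"}

def hashtag_profile(tweet):
    """Hashtag-based: every #tag that occurs in the tweet."""
    return sorted(set(re.findall(r"#(\w+)", tweet)))

def entity_profile(tweet):
    """Entity-based: known entities that are mentioned in the tweet."""
    return sorted(entity for entity in KNOWN_ENTITIES if entity in tweet)

def topic_profile(tweet):
    """Topic-based: the topics of the mentioned entities."""
    return sorted({ENTITY_TOPIC[entity] for entity in entity_profile(tweet)})

tweet = "Francesca Schiavone won French Open #fo2010"
print(hashtag_profile(tweet), entity_profile(tweet), topic_profile(tweet))
```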

User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints
2. Profile Type
3. Semantic Enrichment: further enrich the semantics of tweets?
Example tweet: "Francesca Schiavone won http://bit.ly/2f4t7a"
  (a) tweet-based: Francesca Schiavone
  (b) further enrichment with the linked article ("Francesca wins French Open: Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …"): Francesca Schiavone, French Open, Tennis

User Modeling Building Blocks
Profile: concept, weight
1. Temporal Constraints (timeline: June 27, July 4, July 11)
2. Profile Type
3. Semantic Enrichment
4. Weighting Scheme: how to weight the concepts?
  • Concept frequency (TF)
  • TFxIDF
  • Time-sensitive
Example profile: weight(Francesca Schiavone), weight(French Open), weight(Tennis); example weights shown on the slide: 4, 3, 6
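The first two weighting schemes can be sketched as follows: concept frequency counts the mentions in one user's tweets, and TFxIDF additionally discounts concepts that many users mention. The IDF variant log(N / n), the mention counts and the other users' concept lists are assumptions for illustration.

```python
import math
from collections import Counter

def tf(concepts):
    """Concept frequency: number of mentions per concept."""
    return Counter(concepts)

def tf_idf(user_concepts, all_users_concepts):
    """Concept frequency times log(N / n), n = number of users mentioning the concept."""
    n_users = len(all_users_concepts)
    weights = {}
    for concept, frequency in Counter(user_concepts).items():
        users_with = sum(1 for concepts in all_users_concepts if concept in concepts)
        weights[concept] = frequency * math.log(n_users / max(users_with, 1))
    return weights

# Hypothetical mention counts for one user and concept lists of three other users.
user = ["Tennis"] * 6 + ["Francesca Schiavone"] * 4 + ["French Open"] * 3
all_users = [user, ["Tennis", "Football"], ["Tennis"], ["French Open"]]

tf_weights = tf(user)
tfidf_weights = tf_idf(user, all_users)
print(max(tf_weights, key=tf_weights.get))
print(max(tfidf_weights, key=tfidf_weights.get))
```

Under TF the broad concept Tennis dominates; under TFxIDF the more specific concept Francesca Schiavone gets the highest weight, because fewer users mention it.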

Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance

User Modeling
it is not about putting everything in a user profile, it is about making the right choices

User Adaptation
Knowing the user: this knowledge can be applied to adapt a system or interface to the user, to improve the system functionality and user experience
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people
whether you did a good job
bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo
bull Evaluation of user modeling
bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles
bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system
Monday February 27 12
Possible Metricsbull The usual IR metrics
bull Precision fraction of retrieved items that are relevant
bull Recall fraction of relevant items that have been retrieved
bull F-Measure (harmonic) mean of precision and recall
bull Metrics for evaluating recommendation (rankings)
bull Mean Reciprocal Rank (MRR) of first relevant item
bull Successk probability that a relevant item occurs within the top k
bull If a true ranking is given rank correlations
bull Precisionk Recallk amp F-Measurek
bull Metrics for evaluating prediction of user preferences
bull MAE = Mean Absolute Error
bull TrueFalse PositivesNegatives
runs
performance strategy X baseline
Is strategy X better than the baseline
Monday February 27 12
Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a
proposal for improving (tag) recommendations (using social networks)
bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation
bull Steps
1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data
2 Use the input data and calculate for the different strategies the predictions
3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)
4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)
[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]
Monday February 27 12
Example Evaluationbull [Guy et al] shows another example of a similar evaluation
approach
bull Here the different strategies differ in the way people and tags are used in the strategies with these tag-based systems there are complex relationships between users tags and items and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
bull Here their baseline is the strategy of the lsquomost popularrsquo tags this is a strategy often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al Social Media Recommendation based on People and Tags SIGIRrsquo10]]
Monday February 27 12
user interactions (level amp type) instead of general social context - better for recommendations
does hybrid always work worseMonday February 27 12
Recommendation Systems
Predict items that are relevantusefulinteresting (and to what extent)
for given user (in a given context)
itrsquos often a ranking task
Monday February 27 12
Monday February 27 12
Monday February 27 12
Monday February 27 12
httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12
Collaborative Filtering
u1 likes u2
likes likes u1 likes Pulp Fiction
bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users
bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes
bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster
bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences
bull Others rule-based other data mining techniques
bull
Monday February 27 12
Memory vs Model-based
bull complete input data is required
bull pre-computation not possible
bull does not scale well (ldquotricksrdquo are needed)
bull high quality of recommendations
bull abstraction (model) of input data
bull pre-computation (partially) possible (model has to be re-built from time to time)
bull scales better
bull abstraction may reduce recommendation quality
Monday February 27 12
Social Networks amp Interest Similarity
bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood
bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones
bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)
bull does a social connection indicate user interest similarity
bull how much users interest similarity depends on the strength of their connection
bull is it feasible to use a social network as a personalized recommendation
[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12
Conclusions
• unilaterally connected pairs have more common information items, metadata, and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application-generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user’s interests
• Techniques:
  • Data mining methods: Cluster items based on their characteristics => Infer users’ interests in clusters
  • IR methods: Represent items & users as term vectors => Compute similarity between user profile vector and items
  • Utility-based methods: Utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user
Content Features
Example item: “Government stops renovation of tower bridge”, Oct 13th 2011
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england. Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]
Item vector a:
db:Politics      0.2
db:Sports        0
db:Education     0
db:London        0.2
db:Tower_Bridge  0.4
db:Government    0.1
db:UK            0.1
Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
User Model
User’s Twitter history:
RT: Government stops renovation of tower bridge, Oct 13th 2011
I am in London at the moment, Oct 13th 2011
I am doing sports, Oct 12th 2011
User vector u:
db:Politics      0
db:Sports        0.1
db:Education     0
db:London        0.5
db:Tower_Bridge  0.2
db:Government    0.2
db:UK            0
Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
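The weighting strategy (occurrence frequency, optionally smoothed by time, then 1-norm normalization) can be sketched as follows. This is a minimal illustration under stated assumptions: the exponential half-life decay is one possible smoothing (the slide only says that recent concepts are more important), and the observation list, day numbers, and function name are hypothetical.

```python
from collections import defaultdict


def build_profile(observations, now, half_life_days=None):
    """Return a concept vector (dict) whose weights sum to 1 (1-norm).

    observations: list of (concept, day) pairs, one per concept occurrence.
    half_life_days: if given, an occurrence that many days old counts half
    (assumed exponential decay; not prescribed by the lecture).
    """
    weights = defaultdict(float)
    for concept, day in observations:
        age = now - day
        decay = 1.0 if half_life_days is None else 0.5 ** (age / half_life_days)
        weights[concept] += decay
    total = sum(weights.values())
    if total == 0:
        return {}
    return {concept: w / total for concept, w in weights.items()}


# Hypothetical concept occurrences, loosely following the tweets on the slide
obs = [("London", 13), ("Tower_Bridge", 13), ("Government", 13), ("Sports", 12)]
plain = build_profile(obs, now=13)                     # pure occurrence frequency
recent = build_profile(obs, now=13, half_life_days=1)  # older concepts count less
print(plain)
print(recent)
```

With the decay switched on, the one-day-old "Sports" occurrence receives half the weight of the concepts observed on the current day before normalization.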
Recommendations
User vector u and candidate items a, b, c:
Concept          u     a     b     c
db:Politics      0     0.2   0     0
db:Sports        0.1   0     0     0.5
db:Education     0     0     0     0.2
db:London        0.5   0.2   0.8   0
db:Tower_Bridge  0.2   0.4   0.2   0
db:Government    0.2   0.1   0     0
db:UK            0     0.1   0     0.3
Cosine similarities:
     a     b     c
u    0.67  0.92  0.14
Ranking of recommended items: 1. b, 2. a, 3. c
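The cosine similarities and the ranking on this slide can be reproduced with a few lines of code. This is a minimal sketch; the vectors are read off the slide, assuming the values are listed in the same order as the concepts (Politics, Sports, Education, London, Tower_Bridge, Government, UK).

```python
import math


def cosine(x, y):
    """Cosine similarity between two equally long weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


# Concept order: Politics, Sports, Education, London, Tower_Bridge, Government, UK
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

scores = {name: cosine(u, vec) for name, vec in items.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print({name: round(s, 2) for name, s in scores.items()})  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)  # ['b', 'a', 'c']
```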
RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same/similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior
What is true personalization?
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387
User Modeling Building Blocks
Profile: (concept, weight)
Building blocks: 1. Temporal Constraints, 2. Profile Type, 3. Semantic Enrichment
3. Further enrich the semantics of tweets?
Tweet: “Francesca Schiavone won http://bit.ly/2f4t7a”
Linked text: “Francesca wins French Open: Thirty in women’s tennis is primordially old, an age when agility and desire recedes as the …”
(a) tweet-based: Francesca Schiavone
(b) further enrichment: Francesca Schiavone, French Open, Tennis
User Modeling Building Blocks
Profile: (concept, weight)
Building blocks: 1. Temporal Constraints, 2. Profile Type, 3. Semantic Enrichment, 4. Weighting Scheme
4. How to weight the concepts?
Concepts: Francesca Schiavone, French Open, Tennis
Weighting schemes: Concept frequency (TF), TFxIDF, Time-sensitive
[Figure: weight(Francesca Schiavone), weight(French Open) and weight(Tennis) over time (June 27, July 4, July 11); concept-frequency values shown: 4, 3, 6]
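The difference between plain concept frequency (TF) and TFxIDF weighting can be sketched as follows. This is an illustrative example only: the profiles are hypothetical, the assignment of the counts 4, 3 and 6 to the three concepts is an assumption, and `log(n / df)` is just one common IDF variant rather than the formula used in the lecture.

```python
import math
from collections import Counter


def tf_idf(user, profiles):
    """Weight each concept of `user` by frequency times inverse user frequency.

    profiles: dict mapping a user to the list of concepts mentioned by that user.
    IDF is computed as log(n / df), with df = number of users mentioning the concept.
    """
    n = len(profiles)
    tf = Counter(profiles[user])
    weights = {}
    for concept, freq in tf.items():
        df = sum(1 for concepts in profiles.values() if concept in concepts)
        weights[concept] = freq * math.log(n / df)
    return weights


# Hypothetical data (not from the slides)
profiles = {
    "alice": ["Francesca Schiavone"] * 4 + ["French Open"] * 3 + ["Tennis"] * 6,
    "bob": ["Tennis", "Tennis", "Football"],
    "carol": ["Tennis", "French Open"],
}

tf_weights = Counter(profiles["alice"])  # plain concept frequency (TF)
tfidf_weights = tf_idf("alice", profiles)
print(tf_weights)
print(tfidf_weights)
```

Under TF, "Tennis" is the heaviest concept; under TFxIDF it drops to zero because every user mentions it, while the rarer "Francesca Schiavone" becomes the most distinctive concept.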
Observations
• Profile characteristics:
  • Semantic enrichment solves sparsity problems
  • Profiles change over time: fresh profiles better reflect current user demands
  • Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on news recommendations:
  • The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
  • Semantic enrichment improves recommendation quality
  • Time-sensitivity (adapting to trends) improves performance
User Modeling
It is not about putting everything in a user profile;
it is about making the right choices.
User Adaptation
Knowing the user: this knowledge can be applied to adapt a system or interface to the user, to improve the system functionality and user experience
User-Adaptive Systems
observations, data and information about user → user modeling → user profile → profile analysis → adaptation decisions
[A. Jameson, Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003]
Last.fm adapts to your music taste
history of songs: like, ban, pause, skip → user modeling (infer current musical taste) → user profile: interests in genres, artists, tags → compare profile with possible next songs to play → next song to be played
Google Adaptive Search
http://www.google.com/goodtoknow
Issues in User-Adaptive Systems
• Overfitting, “bubble effects”, loss of serendipity problem:
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• “Lost in Hyperspace” problem:
  • when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion
http://www.flickr.com/photos/bellarosebyliz/4729613108
What is good user modeling & personalization?
Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)
Evaluation Strategies
• User studies: Clean-room study: ask/observe (selected) people whether you did a good job
• Log analysis: Analyze (click) data and infer whether you did a good job, e.g. cross-validation by “Leave-one-out”
• Evaluation of user modeling:
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
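A leave-one-out log analysis can be sketched as follows: each observed (user, item) pair is hidden in turn, a recommender is run on the remaining data, and we check whether the hidden item comes back in the top k. This is a minimal sketch under stated assumptions: the recommender used here is a simple global most-popular baseline, and the data and function names are hypothetical.

```python
from collections import Counter


def most_popular(train, exclude, k):
    """Baseline: the k globally most frequent items that are not in `exclude`."""
    counts = Counter(item for items in train.values() for item in items)
    ranked = sorted(counts, key=lambda i: (-counts[i], i))  # ties broken alphabetically
    return [i for i in ranked if i not in exclude][:k]


def leave_one_out_success(data, k):
    """Fraction of hidden (user, item) pairs that are recommended within the top k."""
    hits = trials = 0
    for user, items in data.items():
        for held_out in items:
            # Training data = everything except the single hidden pair
            train = {u: [i for i in its if not (u == user and i == held_out)]
                     for u, its in data.items()}
            recs = most_popular(train, exclude=set(train[user]), k=k)
            hits += held_out in recs
            trials += 1
    return hits / trials if trials else 0.0


# Hypothetical bookmarking data: user -> tagged items
data = {
    "u1": ["python", "web"],
    "u2": ["python", "semantic"],
    "u3": ["python", "web", "social"],
}

print(leave_one_out_success(data, k=1))
```

Any other recommender can be plugged in instead of `most_popular`, which makes the same loop usable for comparing a personalization strategy against the baseline.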
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Figure: performance of strategy X vs. baseline over runs] Is strategy X better than the baseline?
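The metrics listed on this slide are short to implement. The following is a minimal sketch using their standard definitions; the function names and the example runs are illustrative assumptions, where a "run" is a pair of (ranked recommendation list, set of relevant items).

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k if k else 0.0


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def success_at_k(runs, k):
    """Share of runs with at least one relevant item in the top k."""
    return sum(1 for r, rel in runs if any(i in rel for i in r[:k])) / len(runs)


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical runs: (ranked recommendations, relevant items)
runs = [(["b", "a", "c"], {"a"}), (["x", "y", "z"], {"x", "z"})]
print(mean_reciprocal_rank(runs))  # 0.75
print(success_at_k(runs, 1))       # 0.5
```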
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
2. Use the input data and calculate the predictions for the different strategies
3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
4. Test the results for statistical significance using Student’s t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al., Improving Tag Recommendations Using Social Networks, RIAO’10]
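Step 4 can be illustrated with a paired t-test over per-user scores. This is a sketch under stated assumptions: the slide only names Student's t-test, so the paired variant (both strategies evaluated on the same test users) is an assumption, and the P@5 scores below are hypothetical, not results from the paper.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_x, scores_baseline):
    """Return (t, degrees of freedom) for paired per-user scores."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical P@5 per test user for strategy X and for the baseline
strategy_x = [0.6, 0.4, 0.8, 0.6, 0.4]
baseline = [0.4, 0.4, 0.6, 0.2, 0.2]

t, df = paired_t_statistic(strategy_x, baseline)
print(round(t, 2), df)  # 3.16 4
```

The resulting t is compared against the critical value for the chosen significance level; for 4 degrees of freedom and a two-sided test at 0.05 that value is 2.776, so this toy difference would count as significant.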
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here, the different strategies differ in the way people and tags are used: with these tag-based systems, there are complex relationships between users, tags and items, and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Here, their baseline is the strategy of the ‘most popular’ tags: this strategy is often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy, thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al., Social Media Recommendation based on People and Tags, SIGIR’10]
user interactions (level & type) instead of general social context: better for recommendations
does hybrid always work worse?
Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
It’s often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Figure: users u1 and u2 linked to items by “likes” edges, including u1 likes Pulp Fiction]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will also like item B? Learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
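The memory-based (user-item matrix) variant can be sketched in a few lines: compute the similarity between the target user and every other user, then score unseen items by the similarity-weighted ratings of those users. This is a minimal sketch, assuming cosine similarity over rating dictionaries and hypothetical toy ratings; production systems usually add rating normalization and neighborhood selection.

```python
import math


def cosine(r1, r2):
    """Cosine similarity between two users' rating dictionaries."""
    common = set(r1) & set(r2)
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def recommend(ratings, user, k=3):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


# Hypothetical user-item matrix (user -> {item: rating})
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Amelie": 5, "Kill Bill": 1},
}

print(recommend(ratings, "u1"))  # ['Pulp Fiction', 'Amelie']
```

Because u2's ratings are much closer to u1's than u3's are, u2's favorite "Pulp Fiction" is ranked above u3's "Amelie".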
Observationsbull Profile characteristics
bull Semantic enrichment solves sparsity problems
bull Profiles change over time fresh profiles reflect better current user demands
bull Temporal patterns weekend profiles differ significantly from weekday profiles
bull Impact on news recommendations
bull The more fine-grained the concepts the better the recommendation performance entity-based gt topic-based gt hashtag-based
bull Semantic enrichment improves recommendation quality
bull Time-sensitivity (adapting to trends) improves performance
Monday February 27 12
User Modelingit is not about putting everything in a user profile
it is about making the right choices
Monday February 27 12
User AdaptationKnowing the user - this knowledge - can be applied to adapt
a system or interface to the user to improve the system functionality and user experience
Monday February 27 12
Monday February 27 12
user modeling
user profile
observations data and information about user
profile analysis
adaptation decisions
User-Adaptive Systems
A Jameson Adaptive interfaces and agents The HCI handbook fundamentals evolving technologies and emerging applications pp 305ndash330 2003
Monday February 27 12
user modeling (infer current musical taste)
user profile interests in
genres artists tags
history of songs like ban pause skip
compare profile with possible next
songs to play
next song to be played
Lastfm adapts to your music taste
Monday February 27 12
Google Adaptive Search
httpwwwgooglecomgoodtoknow
Monday February 27 12
Issues in User-Adaptive Systems
bull Overfitting ldquobubble effectsrdquo loss of serendipity problem
bull systems may adapt too strongly to the interestsbehavior
bull eg an adaptive radio station may always play the same or very similar songs
bull We search for the right balance between novelty and relevance for the user
bull ldquoLost in Hyperspacerdquo problem
bull when adapting the navigation ndash ie the links on which users can click to findaccess information
bull eg re-orderinghiding of menu items may lead to confusion
Monday February 27 12
httpwwwflickrcomphotosbellarosebyliz4729613108
What is good user modeling amp personalization
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people
whether you did a good job
bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo
bull Evaluation of user modeling
bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles
bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system
Monday February 27 12
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • if a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Figure: performance of strategy X and of the baseline, plotted over runs]
Is strategy X better than the baseline?

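A minimal sketch of how the metrics above could be computed from their textbook definitions. The function names and the toy inputs are assumptions for illustration; relevance is treated as binary.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall and their harmonic mean (F-measure)."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)                  # true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

def reciprocal_rank(ranking, relevant):
    """1 / position of the first relevant item, 0 if there is none."""
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    scores = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)

def success_at_k(rankings, relevants, k):
    """Share of rankings with at least one relevant item in the top k."""
    hits = [any(i in rel for i in r[:k]) for r, rel in zip(rankings, relevants)]
    return sum(hits) / len(hits)

def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], {"a", "c", "e"})
rankings, relevants = [["x", "a"], ["c", "y"]], [{"a"}, {"c"}]
mrr = mean_reciprocal_rank(rankings, relevants)
s1 = success_at_k(rankings, relevants, k=1)
mae = mean_absolute_error([4, 3], [5, 1])
```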
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which is used as input; test the recommendations on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. an existing approach or a competing approach)
[Rae et al.: Improving Tag Recommendations Using Social Networks. RIAO'10]

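Steps 3 and 4 might look roughly like this in code. It is a sketch under assumptions, not the evaluation code of Rae et al.: the tags and scores are invented, and a paired t-test over per-user scores is assumed, since the slide only says "Student's t-test".

```python
import math
import statistics

def precision_at_k(ranking, relevant, k=5):
    """P@k: share of the top-k recommended tags that are relevant."""
    return sum(1 for tag in ranking[:k] if tag in relevant) / k

def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant tag occurs."""
    hits, total = 0, 0.0
    for pos, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic of the per-user score differences (n - 1 degrees of freedom).
    Undefined (division by zero) if all differences are identical."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

# Invented example: one recommended tag list and the held-out true tags.
ranking = ["london", "bridge", "cat", "thames", "dog"]
held_out = {"london", "thames", "uk"}
p5 = precision_at_k(ranking, held_out)
ap = average_precision(ranking, held_out)

# Invented per-user average precision for strategy X and for the baseline;
# MAP is the mean of such per-user values.
ap_x = [0.5, 0.7, 0.6, 0.9]
ap_base = [0.4, 0.5, 0.6, 0.6]
t = paired_t_statistic(ap_x, ap_base)
```

The resulting `t` would then be compared against a t-distribution with n - 1 degrees of freedom to obtain a p-value.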
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items; the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• Their baseline is the 'most popular' tags strategy. It is often used as a baseline: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and able to outperform the easily available baseline
[Guy et al.: Social Media Recommendation based on People and Tags. SIGIR'10]

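A 'most popular' baseline of the kind described above is easy to build, which is why it is a common point of comparison. A minimal sketch, assuming tag data in the form of (user, item, tag) triples; the data and function names are invented for illustration and are not taken from Guy et al.

```python
from collections import Counter

def most_popular_tags(tag_assignments, k=3):
    """Non-personalized baseline: the k globally most frequent tags."""
    counts = Counter(tag for _user, _item, tag in tag_assignments)
    # Descending count, then alphabetical, so ties are resolved deterministically.
    return sorted(counts, key=lambda t: (-counts[t], t))[:k]

def hits(recommended, held_out_tags):
    """Number of recommended tags that the user really applied."""
    return len(set(recommended) & set(held_out_tags))

assignments = [
    ("ann", "p1", "london"), ("bob", "p1", "london"), ("bob", "p2", "bridge"),
    ("cat", "p3", "london"), ("cat", "p3", "bridge"), ("cat", "p4", "sports"),
]
baseline = most_popular_tags(assignments, k=2)
baseline_hits = hits(baseline, {"bridge", "thames"})
```

A personalized strategy is worth its effort only if it scores better than `baseline` on the same held-out tags.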
User interactions (level & type), rather than general social context, are better for recommendations.
Does a hybrid always work worse?

Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context).
It's often a ranking task.
http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering
[Figure: users u1 and u2 linked to items by "likes" edges; inferred edge: u1 likes Pulp Fiction]
• Memory-based, user-item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based, item-item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based, clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based, Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques

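The memory-based variant can be sketched in a few lines. This is one simple formulation under assumptions (cosine similarity between users, similarity-weighted sum of ratings); the ratings and all movie titles other than Pulp Fiction are invented for the example.

```python
import math

def cosine(r1, r2):
    """Cosine similarity between two users' rating dictionaries."""
    dot = sum(r1[i] * r2[i] for i in set(r1) & set(r2))
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def recommend_user_based(ratings, user, k=2):
    """Score items the user has not rated by the ratings of similar users."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        if sim <= 0:
            continue  # users without overlapping taste contribute nothing
        for item, rating in their.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

# Toy user-item matrix (invented ratings on a 1-5 scale).
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Notting Hill": 4},
}
recs = recommend_user_based(ratings, "u1")
```

Because u2 rates the same films as u1 highly, u2's other favorite is suggested to u1; u3 shares nothing with u1 and is ignored.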
Memory vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality

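The pre-computation point can be illustrated with an item-item model: an offline step builds the similarity table once, and the online step only looks it up. A minimal sketch with invented ratings and hypothetical function names; real systems would prune the table and rebuild it periodically.

```python
import math
from collections import defaultdict

def build_item_model(ratings):
    """Offline step: cosine similarity between the rating columns of items."""
    columns = defaultdict(dict)                      # item -> {user: rating}
    for user, items in ratings.items():
        for item, rating in items.items():
            columns[item][user] = rating
    norms = {i: math.sqrt(sum(r * r for r in col.values()))
             for i, col in columns.items()}
    model = {i: {} for i in columns}
    for a in columns:
        for b in columns:
            common = set(columns[a]) & set(columns[b])
            if a != b and common:
                dot = sum(columns[a][u] * columns[b][u] for u in common)
                model[a][b] = dot / (norms[a] * norms[b])
    return model

def recommend_item_based(model, liked, k=2):
    """Online step: sum the pre-computed similarities to the liked items."""
    scores = defaultdict(float)
    for item in liked:
        for other, sim in model.get(item, {}).items():
            if other not in liked:
                scores[other] += sim
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5, "C": 5},
    "u3": {"C": 4, "D": 5},
}
model = build_item_model(ratings)                    # rebuilt from time to time
recs = recommend_item_based(model, liked={"A", "B"})
```

The model only keeps pairwise item similarities, so information in the raw ratings is lost; that is the abstraction that may reduce recommendation quality.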
Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore there is interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a source of personalized recommendations?
[Lee & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike. HT'10]

Conclusions
• unilaterally connected pairs have more information items, metadata and tags in common than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

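The kind of comparison behind these conclusions can be sketched as follows: measure the overlap of users' item collections and average it per connection type. This only illustrates the measurement idea; Jaccard overlap is an assumed similarity measure, and the libraries and connections are invented toy data, not results from the CiteULike study.

```python
def jaccard(a, b):
    """Overlap of two item collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def mean_similarity(pairs, libraries):
    sims = [jaccard(libraries[u], libraries[v]) for u, v in pairs]
    return sum(sims) / len(sims) if sims else 0.0

# Invented bookmark libraries and directed connections (u watches v).
libraries = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "cat": {"p3", "p5"},
    "dan": {"p6"},
}
watches = {("ann", "bob"), ("bob", "ann"), ("cat", "ann")}

reciprocal = [(u, v) for u, v in watches if (v, u) in watches and u < v]
unidirectional = [(u, v) for u, v in watches if (v, u) not in watches]
unconnected = [("ann", "dan"), ("bob", "dan"), ("cat", "dan"), ("bob", "cat")]

sim_reciprocal = mean_similarity(reciprocal, libraries)
sim_unidirectional = mean_similarity(unidirectional, libraries)
sim_unconnected = mean_similarity(unconnected, libraries)
```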
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations
• Input: characteristics of items & the user's interests in characteristics of items => recommend items whose characteristics match the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute the similarity between the user profile vector and the items
  • Utility-based methods: a utility function takes an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features
Example item a (news article): "Government stops renovation of tower bridge", Oct 13th 2011
  Image caption: "Tower Bridge today: Under construction"
  Text: "Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames"
  Category: politics, england
  Related Twiper news: bob: "Why do they stop to… [more]"; mary: "London stops reno… [more]"
Weighting strategy: occurrence frequency; normalize vectors (1-norm: sum of vector equals 1)
  concept            a
  db:Politics        0.2
  db:Sports          0
  db:Education       0
  db:London          0.2
  db:Tower_Bridge    0.4
  db:Government      0.1
  db:UK              0.1

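The weighting strategy on this slide (occurrence frequency, then 1-norm normalization) can be sketched as below. The concept list follows the slide, with the `db:` prefix assumed; the list of concept mentions is hypothetical and was chosen only so that the result reproduces the vector for item a.

```python
from collections import Counter

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

def concept_vector(mentions):
    """Occurrence-frequency weights, normalized so that the vector sums to 1."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in CONCEPTS)
    return [counts[c] / total if total else 0.0 for c in CONCEPTS]

# Hypothetical concept mentions extracted from the article (10 in total).
mentions = (["db:Politics"] * 2 + ["db:London"] * 2 + ["db:Tower_Bridge"] * 4
            + ["db:Government"] + ["db:UK"])
a = concept_vector(mentions)
```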
User Model
User's Twitter history:
• "RT Government stops renovation of tower bridge" (Oct 13th 2011)
• "I am in London at the moment" (Oct 13th 2011)
• "I am doing sports" (Oct 12th 2011)
Weighting strategy: occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important); normalize vectors (1-norm: sum of vector equals 1)
  concept            u
  db:Politics        0
  db:Sports          0.1
  db:Education       0
  db:London          0.5
  db:Tower_Bridge    0.2
  db:Government      0.2
  db:UK              0

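One possible way to let recent concepts count more is an exponential decay on the age of each observation. The slide does not say which smoothing is used, so the half-life decay, the function name and the parameter values below are assumptions for illustration only.

```python
from datetime import date

def user_profile(observations, today, half_life_days=7.0):
    """observations: list of (concept, date). Each occurrence is weighted by
    0.5 ** (age / half_life); the weights are then 1-norm normalized."""
    weights = {}
    for concept, day in observations:
        age = (today - day).days
        decay = 0.5 ** (age / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decay
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# Two of the tweets from the slide, reduced to one assumed concept each.
observations = [
    ("db:London", date(2011, 10, 13)),   # "I am in London at the moment"
    ("db:Sports", date(2011, 10, 12)),   # "I am doing sports"
]
profile = user_profile(observations, today=date(2011, 10, 13), half_life_days=1.0)
```

With a half-life of one day, the day-old sports tweet counts half as much as today's London tweet.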
Recommendations
                     user    candidate items
  concept            u       a       b       c
  db:Politics        0       0.2     0       0
  db:Sports          0.1     0       0       0.5
  db:Education       0       0       0       0.2
  db:London          0.5     0.2     0.8     0
  db:Tower_Bridge    0.2     0.4     0.2     0
  db:Government      0.2     0.1     0       0
  db:UK              0       0.1     0       0.3
Cosine similarities: sim(u, a) = 0.67, sim(u, b) = 0.92, sim(u, c) = 0.14
Ranking of recommended items: 1. b, 2. a, 3. c

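The ranking on this slide can be re-computed with a few lines of code. The vectors are read off the slide, assuming the decimal points lost in the transcript are restored and the components are ordered Politics, Sports, Education, London, Tower_Bridge, Government, UK.

```python
import math

def cosine(x, y):
    """Cosine similarity between two equally long weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]            # user profile
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

sims = {name: round(cosine(u, vec), 2) for name, vec in items.items()}
ranking = sorted(sims, key=sims.get, reverse=True)
print(sims)      # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)   # ['b', 'a', 'c']
```

Item b wins because most of its weight is on London, which also dominates the user profile; item c overlaps with the user only on Sports.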
RecSys Problems
• Cold-start problem / new-user problem: no or little data is available to infer the preferences of new users
• Changing user preferences: user interests may change over time
• Sparsity problem / new-item problem: item descriptions are sparse, e.g. not many users have rated or tagged an item
• Lack of diversity / overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things that might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

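One common way to counter the lack-of-diversity problem is to re-rank candidates so that each pick is relevant but not redundant with what is already selected. The sketch below uses a greedy, MMR-style rule; this technique is not named in the lecture and is brought in here only as an illustration, with invented songs, scores and a hypothetical `trade_off` parameter.

```python
def rerank_with_diversity(candidates, relevance, similarity, k=3, trade_off=0.7):
    """Greedily pick items, scoring each by
    trade_off * relevance - (1 - trade_off) * similarity to the picks so far."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(item):
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

# Invented example: two near-identical songs and one different song.
relevance = {"song1": 0.9, "song1_remix": 0.85, "song2": 0.6}
artist = {"song1": "X", "song1_remix": "X", "song2": "Y"}

def same_artist(s, t):
    return 1.0 if artist[s] == artist[t] else 0.0

playlist = rerank_with_diversity(relevance, relevance, same_artist,
                                 k=2, trade_off=0.5)
```

A purely relevance-based top 2 would contain both versions of the same song; with the redundancy penalty, the second slot goes to the different artist.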
What is true personalization?

Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on Delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387

User Modeling
It is not about putting everything in a user profile; it is about making the right choices.

User Adaptation
Knowledge about the user can be applied to adapt a system or interface to the user, in order to improve the system's functionality and the user experience.

User-Adaptive Systems
[Diagram: observations, data and information about the user -> user modeling -> user profile -> profile analysis -> adaptation decisions]
[A. Jameson: Adaptive interfaces and agents. The HCI Handbook: Fundamentals, Evolving Technologies and Emerging Applications, pp. 305–330, 2003]

Last.fm adapts to your music taste
[Diagram: history of songs (like, ban, pause, skip) -> user modeling (infer current musical taste) -> user profile (interests in genres, artists, tags) -> compare profile with possible next songs to play -> next song to be played]

Google Adaptive Search
http://www.google.com/goodtoknow

Issues in User-Adaptive Systems
• Overfitting, "bubble effects", loss-of-serendipity problem:
  • systems may adapt too strongly to the user's interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • we search for the right balance between novelty and relevance for the user
• "Lost in hyperspace" problem:
  • arises when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion
What is good user modeling & personalization?
Image source: http://www.flickr.com/photos/bellarosebyliz/4729613108
Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes the satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)
Evaluation Strategies
• User studies: clean-room study; ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling:
  • measure the quality of profiles directly, e.g. measure the overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure the quality of an application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
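The "leave-one-out" idea from the log-analysis bullet can be sketched as follows. This is only a minimal illustration: it assumes each user's log is a list of distinct item ids, and `most_popular` is a hypothetical stand-in for whatever recommendation strategy is being evaluated. The toy logs are invented.

```python
from collections import Counter

def leave_one_out_hit_rate(user_logs, recommend, k=3):
    """Hide one logged item at a time, recommend from the remaining data,
    and count how often the hidden item shows up in the top k."""
    hits, trials = 0, 0
    for user, items in user_logs.items():
        for held_out in items:
            # Training data = all logs, minus the single held-out item.
            training = {u: list(its) for u, its in user_logs.items()}
            training[user] = [i for i in items if i != held_out]
            top_k = recommend(user, training, k)
            hits += held_out in top_k
            trials += 1
    return hits / trials if trials else 0.0

def most_popular(user, training, k):
    """Hypothetical strategy: globally most frequent items the user has not seen."""
    seen = set(training[user])
    counts = Counter(i for items in training.values() for i in items)
    return [i for i, _ in counts.most_common() if i not in seen][:k]

# Invented toy logs, for illustration only.
logs = {"ann": ["a", "b", "c"], "bob": ["a", "b"], "cat": ["b", "c", "d"]}
rate = leave_one_out_hit_rate(logs, most_popular, k=3)
print(rate)  # 0.875: only item "d" (seen by one user) cannot be recovered
```

Any other strategy with the same `(user, training, k)` signature could be plugged in and compared on the same logs.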
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Chart: performance of strategy X vs. baseline over several runs] Is strategy X better than the baseline?
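The metrics listed on this slide can be written down in a few lines. The sketch below uses the standard textbook definitions; the function names and the toy data are illustrative assumptions, not taken from the slides.

```python
def precision_recall_f(retrieved, relevant):
    """Precision, recall and their harmonic mean (F-measure) for one result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)  # true positives
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, or 0 if none is found."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)

def success_at_k(rankings, relevants, k):
    """Share of rankings with at least one relevant item in the top k."""
    pairs = list(zip(rankings, relevants))
    return sum(1 for r, rel in pairs if any(i in rel for i in r[:k])) / len(pairs)

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Toy data, invented for illustration.
rankings = [["a", "b", "c"], ["x", "y", "z"]]
relevants = [{"b"}, {"z", "q"}]
mrr = mean_reciprocal_rank(rankings, relevants)   # (1/2 + 1/3) / 2
s_at_2 = success_at_k(rankings, relevants, k=2)   # only the first ranking hits
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])   # (1 + 0 + 2) / 3
```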
Example Evaluation
• [Rae et al.] show a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al. Improving Tag Recommendations Using Social Networks. RIAO'10]
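Steps 3 and 4 can be sketched roughly as below. This is not the evaluation code of Rae et al.; it is a minimal illustration that assumes each test case has a set of held-out tags and one ranked tag list per strategy. The tag data is invented, and only the paired t statistic is computed (the p-value would be read from a t-distribution with n-1 degrees of freedom).

```python
import math
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Mean of precision values at the ranks where relevant tags appear."""
    hits, score = 0, 0.0
    for rank, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            score += hits / rank
    return score / len(relevant) if relevant else 0.0

def paired_t_statistic(scores_x, scores_baseline):
    """Student's paired t statistic over per-test-case score differences."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Invented toy data: held-out tags and the ranked predictions of two strategies.
held_out = [{"sun", "beach"}, {"cat"}, {"city", "night"}]
strategy_x = [["sun", "beach", "sea"], ["cat", "dog", "pet"], ["city", "lights", "night"]]
baseline = [["sea", "sun", "beach"], ["dog", "cat", "pet"], ["lights", "night", "city"]]

ap_x = [average_precision(r, rel) for r, rel in zip(strategy_x, held_out)]
ap_base = [average_precision(r, rel) for r, rel in zip(baseline, held_out)]
map_x, map_base = mean(ap_x), mean(ap_base)   # Mean Average Precision per strategy
t = paired_t_statistic(ap_x, ap_base)         # compare against a t-table, df = n - 1
```

With only three test cases the statistic says little; a real evaluation would use many more cases per strategy.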
Example Evaluation
• [Guy et al.] show another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• Their baseline is the 'most popular' tags strategy: the globally most popular tags are often compared with the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al. Social Media Recommendation based on People and Tags. SIGIR'10]
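A 'most popular' baseline of this kind is easy to build, which is what makes it a useful yardstick. The sketch below is an illustration under assumptions: tag assignments are (user, item, tag) triples, the "personalized" strategy is simply the user's own most frequent tags (a placeholder, not the method of Guy et al.), and all data is invented.

```python
from collections import Counter

def most_popular_tags(assignments, k=2):
    """Baseline: the k globally most frequent tags, the same for every user."""
    counts = Counter(tag for _, _, tag in assignments)
    return [tag for tag, _ in counts.most_common(k)]

def personal_tags(assignments, user, k=2):
    """Placeholder personalized strategy: the user's own most frequent tags."""
    counts = Counter(tag for u, _, tag in assignments if u == user)
    return [tag for tag, _ in counts.most_common(k)]

def precision_at_k(predicted, relevant, k):
    return sum(1 for tag in predicted[:k] if tag in relevant) / k

# Invented (user, item, tag) triples.
assignments = [
    ("ann", "p1", "web"), ("ann", "p2", "python"), ("ann", "p3", "python"),
    ("bob", "p1", "web"), ("bob", "p4", "news"),
    ("cat", "p1", "web"), ("cat", "p4", "news"),
    ("dan", "p1", "web"), ("dan", "p4", "news"),
]
ann_future_tags = {"python", "web"}  # hypothetical held-out tags for user ann

p_baseline = precision_at_k(most_popular_tags(assignments), ann_future_tags, 2)
p_personal = precision_at_k(personal_tags(assignments, "ann"), ann_future_tags, 2)
print(p_baseline, p_personal)  # 0.5 1.0
```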
User interactions (level & type) instead of general social context - better for recommendations
Does hybrid always work worse?
Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
It's often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Diagram: users u1 and u2 connected to items by "likes" edges; u1 likes Pulp Fiction]
• Memory-based: user-item matrix of ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: item-item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
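The memory-based variant and the conditional probability P(u likes item B | u likes item A) can be sketched as follows. This is a minimal illustration under assumptions: ratings are on a 1-5 scale, "likes" means a rating of at least 4, candidate items are scored by a similarity-weighted sum, and the probability is a plain co-occurrence estimate rather than a full Bayesian network. All ratings, and every title except Pulp Fiction, are invented.

```python
import math
from collections import defaultdict

# Invented user-item rating matrix (sparse, as dict of dicts).
ratings = {
    "u1": {"Pulp Fiction": 5, "Kill Bill": 4},
    "u2": {"Pulp Fiction": 5, "Kill Bill": 5, "Jackie Brown": 4},
    "u3": {"Titanic": 5, "Kill Bill": 1},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(a[i] * b[i] for i in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user, ratings, k=3):
    """Memory-based CF: score unseen items by similarity-weighted ratings of other users."""
    scores = defaultdict(float)
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] += sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

def p_likes_b_given_a(ratings, a, b, like=4):
    """Co-occurrence estimate of P(likes b | likes a); 'like' threshold is an assumption."""
    likes_a = [u for u, r in ratings.items() if r.get(a, 0) >= like]
    if not likes_a:
        return 0.0
    return sum(1 for u in likes_a if ratings[u].get(b, 0) >= like) / len(likes_a)

top = recommend("u1", ratings)
prob = p_likes_b_given_a(ratings, "Pulp Fiction", "Jackie Brown")
```

Here u2 is far more similar to u1 than u3 is, so u2's items dominate the ranking for u1.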
Memory- vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
Social Networks & Interest Similarity
• Collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• Limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• Therefore: interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?
[Lin & Brusilovsky. Social Networks and Interest Similarity: The Case of CiteULike. HT'10]
Conclusions
• unilaterally connected pairs have more common information items, metadata, and tags than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application-generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between the user profile vector and the items
  • Utility-based methods: a utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user
Content Features
Example item a (news article): "Government stops renovation of tower bridge", Oct 13th 2011
  Tower Bridge today: under construction
  Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
  Category: politics, england
  Related Twiper news: bob: "Why do they stop to… [more]"; mary: "London stops reno… [more]"
Concept vector a:
  db:Politics  db:Sports  db:Education  db:London  db:Tower_Bridge  db:Government  db:UK
  0.2          0          0             0.2        0.4              0.1            0.1
Weighting strategy:
  - occurrence frequency
  - normalize vectors (1-norm: sum of vector equals 1)
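The weighting strategy (occurrence frequency, then 1-norm normalization) can be sketched as below. The slide does not give raw concept counts, so the counts used here are hypothetical, chosen so that the result matches the item vector once its decimal points are read as 0.2, 0, 0, 0.2, 0.4, 0.1, 0.1. The `db:` spelling of the concept names is likewise an assumption.

```python
from collections import Counter

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

def concept_vector(mentions):
    """Count concept occurrences and normalize so the weights sum to 1 (1-norm)."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in CONCEPTS)
    return [counts[c] / total if total else 0.0 for c in CONCEPTS]

# Hypothetical concept mentions extracted from the news article.
mentions = (["db:Politics"] * 2 + ["db:London"] * 2 + ["db:Tower_Bridge"] * 4
            + ["db:Government"] + ["db:UK"])
a = concept_vector(mentions)
print(a)  # [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1]
```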
User Model
User's Twitter history:
  "RT: Government stops renovation of tower bridge", Oct 13th 2011
  "I am in London at the moment", Oct 13th 2011
  "I am doing sports", Oct 12th 2011
Concept vector u:
  db:Politics  db:Sports  db:Education  db:London  db:Tower_Bridge  db:Government  db:UK
  0            0.1        0             0.5        0.2              0.2            0
Weighting strategy:
  - occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
  - normalize vectors (1-norm: sum of vector equals 1)
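One way to make recent concepts count more is to decay each occurrence by its age before normalizing. The slide does not say which smoothing function is used, so the exponential decay, the `half_life_days` parameter, and the concept annotations of the tweets below are all assumptions; the sketch does not reproduce the slide's exact user vector.

```python
from datetime import date

def decayed_profile(observations, today, half_life_days=7.0):
    """Sum time-decayed concept occurrences, then normalize to 1-norm.

    observations: list of (concept, date) pairs; an occurrence loses half
    of its weight every `half_life_days` (assumed decay scheme).
    """
    weights = {}
    for concept, day in observations:
        age = (today - day).days
        weights[concept] = weights.get(concept, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical concept annotations of the three tweets in the user's history.
observations = [
    ("db:Government", date(2011, 10, 13)),
    ("db:Tower_Bridge", date(2011, 10, 13)),
    ("db:London", date(2011, 10, 13)),
    ("db:London", date(2011, 10, 13)),
    ("db:Sports", date(2011, 10, 12)),
]
profile = decayed_profile(observations, today=date(2011, 10, 13))
```

A shorter half-life makes the profile follow changing interests faster, at the cost of forgetting stable long-term interests sooner.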
Recommendations
Concept vectors (u = user; a, b, c = candidate items):
                   u     a     b     c
  db:Politics      0     0.2   0     0
  db:Sports        0.1   0     0     0.5
  db:Education     0     0     0     0.2
  db:London        0.5   0.2   0.8   0
  db:Tower_Bridge  0.2   0.4   0.2   0
  db:Government    0.2   0.1   0     0
  db:UK            0     0.1   0     0.3
Cosine similarities: sim(u, a) = 0.67, sim(u, b) = 0.92, sim(u, c) = 0.14
Ranking of recommended items: 1. b, 2. a, 3. c
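The similarity values and the ranking on this slide can be reproduced with a few lines of code. The vectors below read the slide's numbers with their decimal points restored (for example "02" as 0.2) and in the concept order Politics, Sports, Education, London, Tower_Bridge, Government, UK. That reading is an assumption, though it does give back the similarities stated on the slide.

```python
import math

# Concept order: Politics, Sports, Education, London, Tower_Bridge, Government, UK
u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]          # user profile
items = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

sims = {name: cosine(u, vec) for name, vec in items.items()}
ranking = sorted(sims, key=sims.get, reverse=True)

print({name: round(s, 2) for name, s in sims.items()})  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)                                          # ['b', 'a', 'c']
```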
RecSys Problemsbull Cold-start problem New User problem no or little data is available to infer
the preferences of new users
bull Changing User Preferences user interests may change over time
bull Sparsity problem New Item problem item descriptions are sparse eg not many user rated or tagged an item
bull Lack of Diversity Overfitting for many applications good recommendations should be relevant and new to the user (exceptions predicting re-visiting behavior etc) When adapting too strongly to the preferences of a user the user might see again and again samesimilar recommendations
bull Use the right context users do lots of things which might not be relevant for their user model eg try out things do stuff for other people
bull Research challenge find right balance between serendipity amp personalization
bull Research challenge find right way to use the influence of the recommendations on the userrsquos behavior
Monday February 27 12
What is true personalization
Monday February 27 12
Monday February 27 12
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387
Google Adaptive Search
http://www.google.com/goodtoknow
Issues in User-Adaptive Systems
• Overfitting, "bubble effects", loss of serendipity problem:
  • systems may adapt too strongly to the interests/behavior
  • e.g. an adaptive radio station may always play the same or very similar songs
  • We search for the right balance between novelty and relevance for the user
• "Lost in Hyperspace" problem:
  • when adapting the navigation, i.e. the links on which users can click to find/access information
  • e.g. re-ordering/hiding of menu items may lead to confusion
image source: http://www.flickr.com/photos/bellarosebyliz/4729613108
What is good user modeling & personalization?
Success perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (influence of UM & personalization may be hard to measure/obtain)
Evaluation Strategies
• User studies: clean-room study, ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job, e.g. cross-validation by "leave-one-out"
• Evaluation of user modeling:
  • measure quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
  • measure quality of the application that exploits the user profile, e.g. apply user modeling strategies in a recommender system
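A leave-one-out log analysis can be sketched as follows. This is a minimal illustration, not a prescribed protocol: the log format, the function names and the toy "most popular" recommender with its fixed popularity order are all assumptions made for the example.

```python
def leave_one_out_hit_rate(user_items, recommend, k=5):
    """Hide each logged item in turn and check whether it is recommended.

    user_items: dict mapping a user to the list of items in their log.
    recommend: callable (user, remaining_items, k) -> list of items.
    Returns the fraction of hidden items that appear in the top k.
    """
    hits = 0
    trials = 0
    for user, items in user_items.items():
        for i, held_out in enumerate(items):
            remaining = items[:i] + items[i + 1:]
            if held_out in recommend(user, remaining, k):
                hits += 1
            trials += 1
    return hits / trials if trials else 0.0

# Hypothetical click logs and a hypothetical global popularity order.
logs = {
    "alice": ["p1", "p2", "p3"],
    "bob": ["p1", "p3"],
    "carol": ["p2", "p4"],
}
popularity_order = ["p1", "p3", "p2", "p4"]

def most_popular(user, remaining, k):
    """Toy recommender: the most popular pages the user has not seen yet."""
    return [p for p in popularity_order if p not in remaining][:k]

print(leave_one_out_hit_rate(logs, most_popular, k=1))
```

In a careful evaluation the recommender would be rebuilt without the hidden interaction each time; the fixed popularity order here skips that step to keep the sketch short.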
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
[Figure: performance of strategy X and of the baseline, plotted over runs]
Is strategy X better than the baseline?
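The metrics listed above can be written down compactly in Python. The function names and the input conventions (rankings as lists, relevant items as sets) are choices made for this sketch.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and F-measure (harmonic mean)."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, or 0 if there is none."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over several users or queries."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)

def success_at_k(rankings, relevants, k):
    """Share of rankings with at least one relevant item in the top k."""
    pairs = list(zip(rankings, relevants))
    hits = sum(1 for r, rel in pairs if any(i in rel for i in r[:k]))
    return hits / len(pairs)

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

rankings = [["a", "b"], ["b", "a"]]
relevants = [{"a"}, {"a"}]
print(mean_reciprocal_rank(rankings, relevants))   # 0.75
print(success_at_k(rankings, relevants, k=1))      # 0.5
print(mean_absolute_error([4, 3], [5, 1]))         # 1.5
```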
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which can be used as input, and aim to test the recommendation on the remaining tag data
  2. Use the input data and calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student's t-test, relative to the baseline (e.g. existing approach, competitive approach)
[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO'10]
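Steps 3 and 4 can be sketched in Python with the standard library only. Two assumptions are made here that the slide does not state: the t-test is taken to be the paired variant (both strategies scored on the same test cases), and only the t statistic and degrees of freedom are computed, so the p-value still has to be looked up in a t-table or obtained from a statistics package. The per-user scores are made-up numbers.

```python
import math
from statistics import mean, stdev

def average_precision(ranking, relevant):
    """Average of precision@rank over the ranks of the relevant items."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevants):
    """MAP: average precision averaged over all test cases."""
    pairs = list(zip(rankings, relevants))
    return sum(average_precision(r, rel) for r, rel in pairs) / len(pairs)

def paired_t_statistic(scores_x, scores_baseline):
    """t statistic and degrees of freedom of a paired t-test."""
    diffs = [x - b for x, b in zip(scores_x, scores_baseline)]
    n = len(diffs)
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (spread / math.sqrt(n)), n - 1

# Hypothetical per-user P@5 scores of strategy X and of the baseline.
strategy_x = [0.5, 0.6, 0.7, 0.8]
baseline = [0.4, 0.4, 0.5, 0.5]

t, degrees_of_freedom = paired_t_statistic(strategy_x, baseline)
print(round(t, 2), degrees_of_freedom)
```

A large positive t for the given degrees of freedom suggests that strategy X beats the baseline by more than chance would explain.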
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used: in these tag-based systems there are complex relationships between users, tags and items, and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Their baseline is the 'most popular' tags strategy, which is often used: the globally most popular tags are compared to the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR'10]
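The 'most popular' baseline is easy to reproduce, which is why it is a common reference point. A minimal sketch, assuming tag assignments are available as (user, item, tag) triples; the data below is invented for illustration.

```python
from collections import Counter

def most_popular_tags(assignments, k=5):
    """Non-personalized baseline: the k globally most frequent tags.

    assignments: iterable of (user, item, tag) triples.
    """
    counts = Counter(tag for _user, _item, tag in assignments)
    return [tag for tag, _count in counts.most_common(k)]

# Hypothetical tag assignments.
assignments = [
    ("u1", "i1", "london"), ("u2", "i1", "london"), ("u3", "i3", "london"),
    ("u1", "i2", "bridge"), ("u2", "i2", "bridge"),
    ("u3", "i4", "sports"),
]

print(most_popular_tags(assignments, k=2))
# ['london', 'bridge']
```

Every user receives the same list, so any personalized strategy has to score better than this list on the evaluation metrics to justify its extra cost.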
user interactions (level & type) instead of general social context: better for recommendations
does hybrid always work worse?
Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
it's often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1
Collaborative Filtering
[Figure: users u1 and u2 linked to items by "likes" edges; u1 likes Pulp Fiction]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
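The memory-based variant (user-item matrix, similar users) can be sketched as below. The rating data, the movie titles other than Pulp Fiction, and the choice of cosine similarity with similarity-weighted rating sums are assumptions made for this illustration; other similarity measures and aggregation schemes are equally possible.

```python
import math

def user_similarity(ratings_a, ratings_b):
    """Cosine similarity between two users' rating dicts (item -> rating)."""
    common = set(ratings_a) & set(ratings_b)
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in ratings_a.values()))
    norm_b = math.sqrt(sum(v * v for v in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(user, ratings, k=3):
    """Rank unseen items by the similarity-weighted ratings of other users."""
    own = ratings[user]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = user_similarity(own, other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical user-item matrix, stored sparsely as nested dicts.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Kill Bill": 1, "Notting Hill": 4},
}

print(recommend("u1", ratings, k=1))
# ['Pulp Fiction']
```

Because u2 rates the same films as u1 in a similar way, u2's other favourite dominates the recommendation for u1.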
Memory vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well ("tricks" are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
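The pre-computation point can be made concrete with an item-item model: the similarities are computed once, offline, and recommending afterwards needs only lookups in that model. The sketch below is one possible realization with assumed data and function names, not a reference implementation.

```python
import math
from itertools import combinations

def build_item_model(ratings):
    """Offline step: pre-compute item-item cosine similarities.

    ratings: dict user -> {item: rating}.
    Returns dict item -> {other_item: similarity} (only non-zero entries).
    """
    item_vectors = {}
    for user, rated in ratings.items():
        for item, rating in rated.items():
            item_vectors.setdefault(item, {})[user] = rating
    model = {item: {} for item in item_vectors}
    for i, j in combinations(item_vectors, 2):
        vec_i, vec_j = item_vectors[i], item_vectors[j]
        dot = sum(vec_i[u] * vec_j[u] for u in set(vec_i) & set(vec_j))
        if dot == 0:
            continue
        norm_i = math.sqrt(sum(v * v for v in vec_i.values()))
        norm_j = math.sqrt(sum(v * v for v in vec_j.values()))
        model[i][j] = model[j][i] = dot / (norm_i * norm_j)
    return model

def recommend_from_model(liked_items, model, k=3):
    """Online step: only model lookups, the raw ratings are not needed."""
    scores = {}
    for item in liked_items:
        for other, sim in model.get(item, {}).items():
            if other not in liked_items:
                scores[other] = scores.get(other, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical ratings; the model must be re-built when they change a lot.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Titanic": 5, "Kill Bill": 1, "Notting Hill": 4},
}
model = build_item_model(ratings)
print(recommend_from_model({"Kill Bill", "Reservoir Dogs"}, model, k=1))
# ['Pulp Fiction']
```

The trade-off from the slide shows up directly: the online step is cheap, but new ratings are invisible until `build_item_model` runs again.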
Social Networks & Interest Similarity
• collaborative filtering: 'neighborhoods' of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to 'cold start' and 'sparsity', the lack of control (over one's neighborhood) is also a problem, i.e. one cannot add 'trusted' people nor exclude 'strange' ones
• therefore, interest in 'social recommenders', where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users' interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?
[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT'10]
Conclusions
• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Social recommendations on conflicting groups
are all interests of our friends relevant? Is it application generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user
Content Features
Example item: Government stops renovation of tower bridge (Oct 13th, 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames
Category: politics, england
Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]
Item vector a:
db:Politics  db:Sports  db:Education  db:London  db:Tower_Bridge  db:Government  db:UK
0.2          0          0             0.2        0.4              0.1            0.1
Weighting strategy: occurrence frequency; normalize vectors (1-norm: sum of vector equals 1)
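The 1-norm normalization of the item vector can be illustrated as follows. The raw concept counts are hypothetical: they are chosen only so that their proportions match the weights of item a (read as decimals), since the slide does not show the counts themselves.

```python
def l1_normalize(counts):
    """Scale non-negative concept counts so that the weights sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {concept: 0.0 for concept in counts}
    return {concept: n / total for concept, n in counts.items()}

# Hypothetical occurrence counts of each concept in the item's content.
counts_a = {
    "db:Politics": 2,
    "db:Sports": 0,
    "db:Education": 0,
    "db:London": 2,
    "db:Tower_Bridge": 4,
    "db:Government": 1,
    "db:UK": 1,
}

a = l1_normalize(counts_a)
print(a["db:Tower_Bridge"], sum(a.values()))
```

Normalizing this way makes long and short items comparable, because only the relative frequency of each concept is kept.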
RT Government stops renovation of tower bridge Oct 13th 2011
Userrsquos Twitter history
I am in London at the moment Oct 13th 2011
I am doing sports Oct 12th 2011
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
0 01 0
05 02 02 0
= u
Weighting strategy - occurrence frequency (eg smoothened by occurrence time recent concepts are more important - normalize vectors (1-norm sum of vector equals 1)
User Model
Monday February 27 12
dbPolitics dbSports
dbEducation dbLondon
dbTower_Bridge dbGovernment
dbUK
u 0
01 0
05 02 02 0
candidate items user a
02 0 0
02 04 01 01
b 0 0 0
08 02 0 0
c 0
05 02 0 0 0
03
cosine similarities
a b c
u 067 092 014
Ranking of recommended items 1 b 2 a 3 c
Recommendations
Monday February 27 12
RecSys Problemsbull Cold-start problem New User problem no or little data is available to infer
the preferences of new users
bull Changing User Preferences user interests may change over time
bull Sparsity problem New Item problem item descriptions are sparse eg not many user rated or tagged an item
bull Lack of Diversity Overfitting for many applications good recommendations should be relevant and new to the user (exceptions predicting re-visiting behavior etc) When adapting too strongly to the preferences of a user the user might see again and again samesimilar recommendations
bull Use the right context users do lots of things which might not be relevant for their user model eg try out things do stuff for other people
bull Research challenge find right balance between serendipity amp personalization
bull Research challenge find right way to use the influence of the recommendations on the userrsquos behavior
Monday February 27 12
What is true personalization
Monday February 27 12
Monday February 27 12
Hands-on Teaser
bull Build your own recommender system 101
bull Recommend pages on delicious
bull Recommend pages to your Facebook friends
image source httpwwwflickrcomphotosbionicteaching1375254387
Monday February 27 12
Success perspectives
bull From the consumer perspective of an adaptive system
bull From the provider perspective of an adaptive system
Adaptive system maximizes satisfaction of the user
hard to measureobtain
Adaptive system maximizes the profit
influence of UM amp personalization may be hard to measureobtain
Monday February 27 12
Evaluation Strategiesbull User studies Clean-room study askobserve (selected) people
whether you did a good job
bull Log analysis Analyze (click) data and infer whether you did a good job eg cross-validation by ldquoLeave-one-outrdquo
bull Evaluation of user modeling
bull measure quality of profiles directly eg measure overlap with existing (true) profiles or let people judge the quality of the generated user profiles
bull measure quality of application that exploits the user profile eg apply user modeling strategies in a recommender system
Monday February 27 12
Possible Metrics
• The usual IR metrics:
  • Precision: fraction of retrieved items that are relevant
  • Recall: fraction of relevant items that have been retrieved
  • F-measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
  • Mean Reciprocal Rank (MRR) of the first relevant item
  • Success@k: probability that a relevant item occurs within the top k
  • If a true ranking is given: rank correlations
  • Precision@k, Recall@k & F-measure@k
• Metrics for evaluating the prediction of user preferences:
  • MAE = Mean Absolute Error
  • True/False Positives/Negatives
Figure: performance of strategy X and the baseline across runs. Is strategy X better than the baseline?

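The metrics listed above can be written down compactly. The following is a minimal sketch assuming that items are identified by strings, that relevance is binary, and that Precision@k divides by k; all function names are illustrative and not taken from the lecture.

```python
from typing import List, Sequence, Set


def precision(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of retrieved items that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for i in retrieved if i in relevant) / len(retrieved)


def recall(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of relevant items that have been retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


def precision_at_k(ranking: Sequence[str], relevant: Set[str], k: int) -> float:
    """Relevant items among the top k, divided by k."""
    return sum(1 for i in ranking[:k] if i in relevant) / k


def reciprocal_rank(ranking: Sequence[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant item, or 0 if there is none."""
    for pos, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / pos
    return 0.0


def mean_reciprocal_rank(rankings: List[Sequence[str]],
                         relevants: List[Set[str]]) -> float:
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)


def success_at_k(rankings: List[Sequence[str]],
                 relevants: List[Set[str]], k: int) -> float:
    """Share of rankings with at least one relevant item in the top k."""
    pairs = list(zip(rankings, relevants))
    return sum(1 for r, rel in pairs if any(i in rel for i in r[:k])) / len(pairs)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

For example, `precision(["a", "b", "c", "d"], {"a", "c", "x"})` is 0.5 and the corresponding recall is 2/3, because two of the four retrieved items are relevant and two of the three relevant items were found.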
Example Evaluation
• [Rae et al.] shows a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (here: different tag contexts) can be used for tag prediction/recommendation
• Steps:
  1. Gather a dataset of tag data, part of which is used as input; the recommendations are tested on the remaining tag data
  2. Use the input data to calculate the predictions for the different strategies
  3. Measure the performance using standard (IR) metrics: precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
  4. Test the results for statistical significance using Student’s t-test relative to the baseline (e.g. an existing approach or a competitive approach)
[Rae et al.: Improving Tag Recommendations Using Social Networks, RIAO’10]

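Steps 3 and 4 can be sketched in a few lines. This is an illustration under assumptions, not the procedure of Rae et al.: the paired form of the t-test is assumed because both strategies would be scored on the same test cases, only the t statistic is computed (the p-value would still have to be looked up for n - 1 degrees of freedom), and all names and scores are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Set


def precision_at_k(ranking: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """P@k: relevant tags among the top k recommendations, divided by k."""
    return sum(1 for tag in ranking[:k] if tag in relevant) / k


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant tag occurs."""
    hits = 0
    total = 0.0
    for pos, tag in enumerate(ranking, start=1):
        if tag in relevant:
            hits += 1
            total += hits / pos
    return total / len(relevant) if relevant else 0.0


def paired_t_statistic(scores_x: Sequence[float],
                       scores_base: Sequence[float]) -> float:
    """t statistic of the per-test-case score differences (strategy X - baseline)."""
    diffs = [x - b for x, b in zip(scores_x, scores_base)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


if __name__ == "__main__":
    # Hypothetical per-test-case scores for a strategy X and a baseline.
    strategy_x = [0.6, 0.8, 0.7, 0.9]
    baseline = [0.5, 0.6, 0.6, 0.7]
    print(paired_t_statistic(strategy_x, baseline))
```

MAP is then simply the mean of `average_precision` over all test cases, in the same way that MRR is the mean of the reciprocal ranks.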
Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here the strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• The baseline is the ‘most popular’ tags strategy: the globally most popular tags are compared with the tags predicted by a particular personalization strategy, to investigate whether the personalization is worth the effort and can outperform this easily available baseline
[Guy et al.: Social Media Recommendation based on People and Tags, SIGIR’10]

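A ‘most popular’ baseline of this kind takes only a few lines. The sketch below assumes tag assignments are available as (user, item, tag) triples; `user_top_tags` is a deliberately simple, hypothetical personalized strategy used only for comparison, not one of the strategies from Guy et al.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Set, Tuple

Assignment = Tuple[str, str, str]  # (user, item, tag)

# Hypothetical training data (not from the paper).
TRAIN: List[Assignment] = [
    ("u1", "i1", "web"), ("u2", "i1", "web"), ("u2", "i2", "art"),
    ("u3", "i3", "web"), ("u3", "i3", "art"), ("u3", "i4", "music"),
]


def most_popular_tags(assignments: Iterable[Assignment], k: int = 5) -> List[str]:
    """Baseline: the k globally most frequent tags, identical for every user."""
    counts = Counter(tag for _user, _item, tag in assignments)
    return sorted(counts, key=lambda t: (-counts[t], t))[:k]


def user_top_tags(assignments: Iterable[Assignment], user: str, k: int = 5) -> List[str]:
    """Hypothetical personalized strategy: the user's own most frequent tags."""
    counts = Counter(tag for u, _item, tag in assignments if u == user)
    return sorted(counts, key=lambda t: (-counts[t], t))[:k]


def precision_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    return sum(1 for t in recommended[:k] if t in relevant) / k


if __name__ == "__main__":
    held_out = {"music"}  # hypothetical tag that u3 uses next
    print(precision_at_k(most_popular_tags(TRAIN, 2), held_out, 2))
    print(precision_at_k(user_top_tags(TRAIN, "u3", 2), held_out, 2))
```

The point of the comparison is the one made on the slide: a personalization strategy is only worth its cost if it beats this cheap, non-personalized list on held-out data.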
User interactions (level & type) instead of general social context: better for recommendations
Does hybrid always work worse?

Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
It’s often a ranking task
http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering
Figure: ‘likes’ relations between users u1 and u2 and items such as Pulp Fiction
• Memory-based: user-item matrix of ratings/preferences of users => compute the similarity between users & recommend items of similar users
• Model-based: item-item matrix of similarities (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will also like item B; learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques

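The memory-based variant (user-user similarity on the user-item matrix) can be sketched as follows. This is a minimal illustration, assuming binary ‘likes’ stored as a dictionary and cosine similarity between users; the data and all titles other than Pulp Fiction are hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical user-item matrix: 1.0 means "likes" (sparse, missing = unknown).
LIKES: Dict[str, Dict[str, float]] = {
    "u1": {"Kill Bill": 1.0},
    "u2": {"Kill Bill": 1.0, "Pulp Fiction": 1.0},
    "u3": {"Titanic": 1.0},
}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse preference vectors."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def recommend(matrix: Dict[str, Dict[str, float]], user: str, k: int = 3) -> List[str]:
    """Score unseen items by the similarity-weighted preferences of other users."""
    scores: Dict[str, float] = {}
    for other, prefs in matrix.items():
        if other == user:
            continue
        sim = cosine(matrix[user], prefs)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for item, value in prefs.items():
            if item not in matrix[user]:
                scores[item] = scores.get(item, 0.0) + sim * value
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


if __name__ == "__main__":
    print(recommend(LIKES, "u1"))
```

Here u1 and u2 share one liked item, so u2's other liked item is recommended to u1, while u3 has no overlap with u1 and is ignored.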
Memory- vs. Model-based
Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well (“tricks” are needed)
• high quality of recommendations
Model-based:
• abstraction (model) of input data
• pre-computation (partially) possible (the model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality

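The pre-computation point can be illustrated with an item-item model: the similarity matrix is built once offline, and a recommendation then only needs the model and the current user's ratings. This is a sketch under assumptions (cosine similarity over positive ratings, hypothetical data and names), not a reference implementation.

```python
import math
from collections import defaultdict
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> item -> rating


def build_item_model(ratings: Ratings) -> Dict[str, Dict[str, float]]:
    """Offline step: cosine similarity between the rating columns of items."""
    columns: Dict[str, Dict[str, float]] = defaultdict(dict)
    for user, items in ratings.items():
        for item, value in items.items():
            columns[item][user] = value
    model: Dict[str, Dict[str, float]] = defaultdict(dict)
    names = sorted(columns)
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            common = set(columns[a]) & set(columns[b])
            if not common:
                continue
            dot = sum(columns[a][u] * columns[b][u] for u in common)
            na = math.sqrt(sum(v * v for v in columns[a].values()))
            nb = math.sqrt(sum(v * v for v in columns[b].values()))
            model[a][b] = model[b][a] = dot / (na * nb)
    return model


def recommend_from_model(model: Dict[str, Dict[str, float]],
                         user_items: Dict[str, float], k: int = 3) -> List[str]:
    """Online step: uses only the model and the current user's ratings."""
    scores: Dict[str, float] = defaultdict(float)
    for item, rating in user_items.items():
        for other, sim in model.get(item, {}).items():
            if other not in user_items:
                scores[other] += sim * rating
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


if __name__ == "__main__":
    history = {
        "u1": {"A": 5, "B": 4},
        "u2": {"A": 4, "B": 5, "C": 2},
        "u3": {"B": 1, "C": 5},
    }
    model = build_item_model(history)            # re-built from time to time
    print(recommend_from_model(model, {"A": 5}))  # new user who rated only A
```

The trade-off from the slide is visible here: the online step never touches the full rating data, but ratings added after the last rebuild are invisible until the model is recomputed.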
Social Networks & Interest Similarity
• Collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on the likings within the neighborhood
• Limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people or exclude ‘strange’ ones
• Therefore: interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging, CiteULike):
  • does a social connection indicate user interest similarity?
  • how much does users’ interest similarity depend on the strength of their connection?
  • is it feasible to use a social network as a personalized recommendation?
[Lin & Brusilovsky: Social Networks and Interest Similarity: The Case of CiteULike, HT’10]

Conclusions
• unilaterally connected pairs have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased as the distance between users in the social network increased
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• the item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations
• Input: characteristics of items & the interests of a user in characteristics of items => recommend items that feature characteristics which meet the user’s interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users’ interests in clusters
  • IR methods: represent items & users as term vectors => compute the similarity between the user profile vector and items
  • Utility-based methods: a utility function takes an item as input; the parameters of the utility function are customized via the preferences of a user

Content Features
Example item a (news article): “Government stops renovation of tower bridge”, Oct 13th 2011
  Tower Bridge today: Under construction
  Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
  Category: politics, england
  Related Twiper news: bob: “Why do they stop to… [more]”; mary: “London stops reno… [more]”
Concept vector a:
  db:Politics       0.2
  db:Sports         0
  db:Education      0
  db:London         0.2
  db:Tower_Bridge   0.4
  db:Government     0.1
  db:UK             0.1
Weighting strategy:
• occurrence frequency
• normalize vectors (1-norm: the sum of the vector equals 1)

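The weighting strategy (occurrence frequency, then 1-norm normalization) can be sketched as below. The sketch assumes that concept extraction has already happened and that each mention of a concept is one list entry; the mention counts are hypothetical values chosen so that the result matches the vector a on the slide.

```python
from collections import Counter
from typing import List, Sequence

VOCABULARY = ["db:Politics", "db:Sports", "db:Education", "db:London",
              "db:Tower_Bridge", "db:Government", "db:UK"]


def concept_vector(mentions: Sequence[str], vocabulary: Sequence[str]) -> List[float]:
    """Occurrence frequency per concept, scaled so that the entries sum to 1."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[c] / total for c in vocabulary]


if __name__ == "__main__":
    # Hypothetical concept mentions extracted from the article (10 in total).
    mentions = (["db:Politics"] * 2 + ["db:London"] * 2 + ["db:Tower_Bridge"] * 4
                + ["db:Government"] + ["db:UK"])
    print(concept_vector(mentions, VOCABULARY))
```

Dividing by the total count is what makes vectors of long and short documents comparable: only the relative share of each concept matters.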
User Model
User’s Twitter history:
  RT: Government stops renovation of tower bridge (Oct 13th 2011)
  I am in London at the moment (Oct 13th 2011)
  I am doing sports (Oct 12th 2011)
User profile vector u:
  db:Politics       0
  db:Sports         0.1
  db:Education      0
  db:London         0.5
  db:Tower_Bridge   0.2
  db:Government     0.2
  db:UK             0
Weighting strategy:
• occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important)
• normalize vectors (1-norm: the sum of the vector equals 1)

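One way to make recent concepts count more is to weight each occurrence by its age before normalizing. The slide does not say which smoothing function was used, so the exponential decay and the `half_life_days` parameter below are assumptions for illustration, and the sketch does not try to reproduce the numbers of the vector u.

```python
from datetime import date
from typing import List, Sequence, Tuple


def user_profile(observations: Sequence[Tuple[str, date]],
                 vocabulary: Sequence[str],
                 today: date,
                 half_life_days: float = 30.0) -> List[float]:
    """Time-decayed concept frequencies, normalized so that the entries sum to 1."""
    weights = {concept: 0.0 for concept in vocabulary}
    for concept, day in observations:
        if concept in weights:
            age = max(0, (today - day).days)
            # An occurrence loses half of its weight every half_life_days.
            weights[concept] += 0.5 ** (age / half_life_days)
    total = sum(weights.values())
    return [weights[c] / total if total else 0.0 for c in vocabulary]


if __name__ == "__main__":
    vocab = ["db:Sports", "db:London"]
    # Hypothetical concept occurrences with the dates of the tweets.
    seen = [("db:London", date(2011, 10, 13)), ("db:Sports", date(2011, 10, 12))]
    print(user_profile(seen, vocab, today=date(2011, 10, 13)))
```

With a long half-life the profile approaches plain occurrence frequency; with a short one it follows the user's most recent activity, which is one way to react to changing user preferences.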
Recommendations
Concept vectors (u = user; a, b, c = candidate items):
                    u     a     b     c
  db:Politics       0     0.2   0     0
  db:Sports         0.1   0     0     0.5
  db:Education      0     0     0     0.2
  db:London         0.5   0.2   0.8   0
  db:Tower_Bridge   0.2   0.4   0.2   0
  db:Government     0.2   0.1   0     0
  db:UK             0     0.1   0     0.3
Cosine similarities:
        a      b      c
  u     0.67   0.92   0.14
Ranking of recommended items: 1. b, 2. a, 3. c

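The cosine similarities and the ranking of this example can be reproduced with a short script. The vectors are the ones from the slide, in the order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK; the function names are illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Vectors from the slide; one entry per concept in the order given above.
USER = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
ITEMS: Dict[str, List[float]] = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}


def cosine(x: Sequence[float], y: Sequence[float]) -> float:
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(p * q for p, q in zip(x, y))
    nx = math.sqrt(sum(p * p for p in x))
    ny = math.sqrt(sum(q * q for q in y))
    return dot / (nx * ny) if nx and ny else 0.0


def rank_items(user: Sequence[float],
               items: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Candidate items sorted by decreasing similarity to the user profile."""
    sims = {name: cosine(user, vec) for name, vec in items.items()}
    return sorted(sims.items(), key=lambda kv: -kv[1])


if __name__ == "__main__":
    for name, sim in rank_items(USER, ITEMS):
        print(name, round(sim, 2))
    # prints: b 0.92, a 0.67, c 0.14 (one pair per line)
```

Item b wins although it covers only two concepts, because both are the user's strongest interests; item c overlaps with the profile only on db:Sports and therefore ranks last.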
RecSys Problems
• Cold-start problem / new-user problem: no or little data is available to infer the preferences of new users
• Changing user preferences: user interests may change over time
• Sparsity problem / new-item problem: item descriptions are sparse, e.g. not many users have rated or tagged an item
• Lack of diversity (overfitting): for many applications, good recommendations should be both relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When a system adapts too strongly to the preferences of a user, the user may see the same or similar recommendations again and again.
• Use the right context: users do lots of things that might not be relevant for their user model, e.g. trying things out or doing things for other people.
• Research challenge: find the right balance between serendipity & personalization.
• Research challenge: find the right way to use the influence of the recommendations on the user’s behavior.
Example Evaluationbull [Rae et al] shows a typical example of how to investigate and evaluate a
proposal for improving (tag) recommendations (using social networks)
bull Task test how well the different strategies (here different tag contexts) can be used for tag predictionrecommendation
bull Steps
1 Gather a dataset of tag data part of which can be used as input and aim to test the recommendation on the remaining tag data
2 Use the input data and calculate for the different strategies the predictions
3 Measure the performance using standard (IR) metrics Precision of the top 5 recommended tags (P5) Mean Reciprocal Rank (MRR) Mean Average Precision (MAP)
4 Test the results for statistical significance using Studentrsquos T-test relative to the baseline (eg existing approach competitive approach)
[Rae et al Improving Tag Recommendations Using Social Networks RIAOrsquo10]]
Monday February 27 12
Example Evaluationbull [Guy et al] shows another example of a similar evaluation
approach
bull Here the different strategies differ in the way people and tags are used in the strategies with these tag-based systems there are complex relationships between users tags and items and strategies aim to find the relevant aspects of these relationships for modeling and recommendation
bull Here their baseline is the strategy of the lsquomost popularrsquo tags this is a strategy often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al Social Media Recommendation based on People and Tags SIGIRrsquo10]]
Monday February 27 12
user interactions (level amp type) instead of general social context - better for recommendations
does hybrid always work worseMonday February 27 12
Recommendation Systems
Predict items that are relevantusefulinteresting (and to what extent)
for given user (in a given context)
itrsquos often a ranking task
Monday February 27 12
Monday February 27 12
Monday February 27 12
Monday February 27 12
httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12
Collaborative Filtering
u1 likes u2
likes likes u1 likes Pulp Fiction
bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users
bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes
bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster
bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences
bull Others rule-based other data mining techniques
bull
Monday February 27 12
Memory vs Model-based
bull complete input data is required
bull pre-computation not possible
bull does not scale well (ldquotricksrdquo are needed)
bull high quality of recommendations
bull abstraction (model) of input data
bull pre-computation (partially) possible (model has to be re-built from time to time)
bull scales better
bull abstraction may reduce recommendation quality
Monday February 27 12
Social Networks amp Interest Similarity
bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood
bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones
bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)
bull does a social connection indicate user interest similarity
bull how much users interest similarity depends on the strength of their connection
bull is it feasible to use a social network as a personalized recommendation
[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12
Conclusionsbull pairs unilaterally connected have more common information
items metadata and tags than non-connected pairs
bull the similarity was largest for direct connections and decreased with the increase of distance between users in the social networks
bull users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
bull traditional item-level similarity may be less reliable way to find similar users in social bookmarking systems
bull items collections of peers connected by self-defined social connections could be a useful source for cross-recommendation

Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application generic?

Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => recommend items that feature characteristics which meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: utility function that gets an item as input; the parameters of the utility function are customized via the preferences of a user
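As a rough illustration of the utility-based technique, the sketch below builds a utility function whose parameters are a user's preference weights and uses it to rank items. The item names, characteristics, weights and the linear form of the utility are all assumptions made for this example; the slide does not prescribe a particular utility function.

```python
# Hypothetical item descriptions: characteristic -> value in [0, 1].
items = {
    "article_1": {"politics": 0.9, "sports": 0.0, "recency": 0.8},
    "article_2": {"politics": 0.1, "sports": 0.9, "recency": 0.5},
    "article_3": {"politics": 0.5, "sports": 0.5, "recency": 0.1},
}

def make_utility(preferences):
    """Return a utility function parameterized by the user's preference weights."""
    def utility(item_features):
        # Assumed linear utility: weighted sum of the item's characteristics.
        return sum(preferences.get(name, 0.0) * value
                   for name, value in item_features.items())
    return utility

# Hypothetical user who cares mostly about sports and somewhat about recency.
utility = make_utility({"sports": 0.7, "recency": 0.3})
ranked = sorted(items, key=lambda name: utility(items[name]), reverse=True)
print(ranked)
```

Changing the preference dictionary changes the ranking without touching the item data, which is the sense in which the utility function is "customized via the preferences of a user".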

Content Features
Government stops renovation of tower bridge (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england
Related Twiper news: bob: "Why do they stop to… [more]"; mary: "London stops reno… [more]"

Concept vector a of the item:
  db:Politics       0.2
  db:Sports         0
  db:Education      0
  db:London         0.2
  db:Tower_Bridge   0.4
  db:Government     0.1
  db:UK             0.1

Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm: sum of vector equals 1)
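The weighting strategy on this slide (occurrence frequency, then 1-norm normalization) can be sketched as follows. The concept names are written here with a `db:` prefix, and the raw mention counts are hypothetical: they were chosen only so that the result matches the vector a shown on the slide, which does not give the underlying counts.

```python
from collections import Counter

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

def concept_vector(mentions):
    """Count concept occurrences and scale the counts so they sum to 1 (1-norm)."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in CONCEPTS)
    if total == 0:
        return [0.0] * len(CONCEPTS)
    return [counts[c] / total for c in CONCEPTS]

# Hypothetical concept mentions extracted from the news item.
mentions = (["db:Tower_Bridge"] * 4 + ["db:Politics"] * 2 + ["db:London"] * 2
            + ["db:Government", "db:UK"])
a = concept_vector(mentions)
print(a)  # [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1]
```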

User Model
User's Twitter history:
  RT: Government stops renovation of tower bridge (Oct 13th 2011)
  I am in London at the moment (Oct 13th 2011)
  I am doing sports (Oct 12th 2011)

Concept vector u of the user:
  db:Politics       0
  db:Sports         0.1
  db:Education      0
  db:London         0.5
  db:Tower_Bridge   0.2
  db:Government     0.2
  db:UK             0

Weighting strategy:
- occurrence frequency (e.g. smoothened by occurrence time: recent concepts are more important)
- normalize vectors (1-norm: sum of vector equals 1)
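One possible way to smooth occurrence frequency by occurrence time is an exponential decay, sketched below. The decay form, the `half_life_days` parameter and the mapping from tweets to concepts are assumptions for illustration; the slide only states that recent concepts should count more, and this sketch does not reproduce the slide's exact numbers.

```python
from datetime import date

def decayed_profile(observations, today, half_life_days=7.0):
    """Build a user profile from (concept, date) pairs.

    Each occurrence contributes 0.5 ** (age / half_life_days), so older
    occurrences count less. The weights are then 1-norm normalized.
    """
    raw = {}
    for concept, seen_on in observations:
        age = (today - seen_on).days
        raw[concept] = raw.get(concept, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()} if total else {}

# Hypothetical concepts extracted from the three tweets in the history.
observations = [
    ("db:Sports", date(2011, 10, 12)),
    ("db:London", date(2011, 10, 13)),
    ("db:London", date(2011, 10, 13)),
    ("db:Tower_Bridge", date(2011, 10, 13)),
    ("db:Government", date(2011, 10, 13)),
]
u = decayed_profile(observations, today=date(2011, 10, 13), half_life_days=1.0)
for concept, weight in sorted(u.items(), key=lambda kv: -kv[1]):
    print(concept, round(weight, 2))
```

With a one-day half-life, the sports tweet from the previous day counts half as much as a concept mentioned once today; a longer half-life makes the profile change more slowly.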

Recommendations
                    user    candidate items
                    u       a       b       c
  db:Politics       0       0.2     0       0
  db:Sports         0.1     0       0       0.5
  db:Education      0       0       0       0.2
  db:London         0.5     0.2     0.8     0
  db:Tower_Bridge   0.2     0.4     0.2     0
  db:Government     0.2     0.1     0       0
  db:UK             0       0.1     0       0.3

Cosine similarities:
        a       b       c
  u     0.67    0.92    0.14

Ranking of recommended items: 1. b  2. a  3. c
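The cosine similarities and the ranking on this slide can be checked with a few lines of code. The vectors below are the slide's values for u, a, b and c, in the concept order Politics, Sports, Education, London, Tower_Bridge, Government, UK; the function name `cosine` is simply a label chosen for this sketch.

```python
import math

def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the 2-norms."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]           # user profile
candidates = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

scores = {name: cosine(u, vec) for name, vec in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print({name: round(s, 2) for name, s in scores.items()})  # {'a': 0.67, 'b': 0.92, 'c': 0.14}
print(ranking)  # ['b', 'a', 'c']
```

Item b ranks first because almost all of its weight sits on db:London, the user's strongest concept, while c shares only db:Sports with the user.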

RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications, good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior

What is true personalization?

Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on Delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387

Example Evaluation
• [Guy et al.] shows another example of a similar evaluation approach
• Here, the different strategies differ in the way people and tags are used. In these tag-based systems there are complex relationships between users, tags and items, and the strategies aim to find the relevant aspects of these relationships for modeling and recommendation
• Here, their baseline is the strategy of the 'most popular' tags. This baseline is often used to compare the globally most popular tags to the tags predicted by a particular personalization strategy, thus investigating whether the personalization is worth the effort and is able to outperform the easily available baseline
[Guy et al., Social Media Recommendation based on People and Tags, SIGIR'10]
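The idea of comparing a personalization strategy against a 'most popular' baseline can be sketched on toy data. Everything below is hypothetical: the tagging data, the simple per-user frequency strategy and the precision@k metric are assumptions for illustration and do not reproduce the strategies or measures used by Guy et al.

```python
from collections import Counter

# Hypothetical training data: user -> tags used so far.
train = {
    "ann": ["python", "recsys", "python", "recsys", "web"],
    "bob": ["web", "news", "web", "news"],
    "cat": ["music", "web", "music", "web", "news"],
}
# Hypothetical held-out data: tags each user actually used afterwards.
test = {
    "ann": {"python", "recsys"},
    "bob": {"web", "news"},
    "cat": {"music", "news"},
}

def most_popular(k):
    """Baseline: the k globally most frequent tags, the same for every user."""
    counts = Counter(tag for tags in train.values() for tag in tags)
    return [tag for tag, _ in counts.most_common(k)]

def personal(user, k):
    """Toy personalization: the k tags this user used most often."""
    return [tag for tag, _ in Counter(train[user]).most_common(k)]

def precision_at_k(predicted, relevant, k):
    return sum(1 for tag in predicted[:k] if tag in relevant) / k

def mean_precision(strategy, k=2):
    return sum(precision_at_k(strategy(user), test[user], k)
               for user in test) / len(test)

baseline_score = mean_precision(lambda user: most_popular(2))
personal_score = mean_precision(lambda user: personal(user, 2))
print(baseline_score, round(personal_score, 2))
```

If the personalized strategy does not clearly beat the baseline on held-out data, the extra modeling effort is hard to justify, which is the point of including the baseline.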

user interactions (level & type) instead of general social context: better for recommendations
does hybrid always work worse?

Recommendation Systems
Predict items that are relevant/useful/interesting (and to what extent) for a given user (in a given context)
It's often a ranking task

http://www.wired.com/magazine/2011/11/mf_artsy/all/1

Collaborative Filtering
[Figure: users u1 and u2 linked to items by "likes" edges; does u1 like Pulp Fiction?]
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B; learn probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
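The memory-based variant (compute similarity between users on the User-Item matrix, then recommend items of similar users) can be sketched as below. The ratings, the movie titles other than Pulp Fiction, and the choice of cosine similarity with a similarity-weighted average are assumptions for illustration, not the lecture's prescribed method.

```python
import math

# Hypothetical user-item rating matrix (1 to 5 stars), stored as nested dicts.
ratings = {
    "u1": {"Kill Bill": 5, "Reservoir Dogs": 4},
    "u2": {"Kill Bill": 5, "Reservoir Dogs": 5, "Pulp Fiction": 5},
    "u3": {"Kill Bill": 1, "Titanic": 3, "Pulp Fiction": 2},
}

def user_similarity(a, b):
    """Cosine similarity of two users' rating vectors (unrated items count as 0)."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in ratings[a].values()))
    norm_b = math.sqrt(sum(v * v for v in ratings[b].values()))
    return dot / (norm_a * norm_b)

def predict(user, item):
    """Similarity-weighted average of the ratings other users gave to the item."""
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        sim = user_similarity(user, other)
        num += sim * ratings[other][item]
        den += abs(sim)
    return num / den if den else None

def recommend(user, k=3):
    """Rank the items the user has not rated yet by predicted rating."""
    seen = set(ratings[user])
    candidates = {i for r in ratings.values() for i in r} - seen
    scored = {i: predict(user, i) for i in candidates}
    scored = {i: s for i, s in scored.items() if s is not None}
    return sorted(scored, key=lambda i: (-scored[i], i))[:k]

print(recommend("u1"))  # ['Pulp Fiction', 'Titanic']
```

Every call to `predict` reads the full rating data, which is why memory-based approaches need the complete input and are harder to scale than a pre-computed model.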
Monday February 27 12
Monday February 27 12
httpwwwwiredcommagazine201111mf_artsyall1Monday February 27 12
Collaborative Filtering
u1 likes u2
likes likes u1 likes Pulp Fiction
bull Memory-based User-Item matrix ratingspreferences of users =gt compute similarity between users amp recommend items of similar users
bull Model-based Item-Item matrix similarity (eg based on user ratings) between items =gt recommend items that are similar to the ones the user likes
bull Model-based Clustering cluster users according to their preferences =gt recommend items of users that belong to the same cluster
bull Model-based Bayesian networks P(u likes item B | u likes item A) = how likely is it that a user who likes item A will like item B learn probabilities from user ratingspreferences
bull Others rule-based other data mining techniques
bull
Monday February 27 12
Memory vs Model-based
bull complete input data is required
bull pre-computation not possible
bull does not scale well (ldquotricksrdquo are needed)
bull high quality of recommendations
bull abstraction (model) of input data
bull pre-computation (partially) possible (model has to be re-built from time to time)
bull scales better
bull abstraction may reduce recommendation quality
Monday February 27 12
Social Networks amp Interest Similarity
bull collaborative filtering lsquoneighborhoodsrsquo of people with similar interest amp recommending items based on likings in neighborhood
bull limitations next to lsquocold startrsquo and lsquosparsityrsquo the lack of control (over onersquos neighborhood) is also a problem ie cannot add lsquotrustedrsquo people nor exclude lsquostrangersquo ones
bull therefore interest in lsquosocial recommendersrsquo where presence of social connections defines the similarity in interests (eg social tagging CiteULike)
bull does a social connection indicate user interest similarity
bull how much users interest similarity depends on the strength of their connection
bull is it feasible to use a social network as a personalized recommendation
[Lin amp Brusilovsky Social Networks and Interest Similarity The Case of CiteULike HTrsquo10]Monday February 27 12
Conclusions
• pairs unilaterally connected have more common information items, metadata and tags than non-connected pairs
• the similarity was largest for direct connections and decreased with increasing distance between users in the social network
• users involved in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
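The kind of comparison behind these conclusions can be sketched as follows: compute an item-overlap similarity for every pair of users and average it per connection type. The bookmarks, the directed connections and the use of Jaccard similarity are all hypothetical choices for illustration. The toy numbers do not reproduce the study's data or results.

```python
from itertools import combinations

# Hypothetical bookmark collections and directed "watch" connections.
bookmarks = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "cat": {"p3", "p5"},
    "dan": {"p6"},
}
follows = {("ann", "bob"), ("bob", "ann"), ("ann", "cat")}

def jaccard(a, b):
    """Share of items two users have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def connection_type(u, v, follows):
    forward, backward = (u, v) in follows, (v, u) in follows
    if forward and backward:
        return "reciprocal"
    if forward or backward:
        return "unidirectional"
    return "none"

def mean_similarity_by_connection(bookmarks, follows):
    """Average pairwise Jaccard similarity, grouped by connection type."""
    buckets = {}
    for u, v in combinations(sorted(bookmarks), 2):
        kind = connection_type(u, v, follows)
        buckets.setdefault(kind, []).append(jaccard(bookmarks[u], bookmarks[v]))
    return {kind: sum(vals) / len(vals) for kind, vals in buckets.items()}

summary = mean_similarity_by_connection(bookmarks, follows)
```

On real data one would also test whether the differences between the groups are statistically significant before drawing conclusions.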
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application-generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => Recommend items that feature characteristics which meet the user's interests
• Techniques:
• Data mining methods: Cluster items based on their characteristics => Infer users' interests in clusters
• IR methods: Represent items & users as term vectors => Compute similarity between user profile vector and items
• Utility-based methods: Utility function that gets an item as input; the parameters of the utility function are customized via preferences of a user
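A utility-based method can be sketched as a function that takes an item as input and whose parameters come from the user's preferences. The weighted-sum form, the feature names and all values below are assumptions chosen for illustration.

```python
# Hypothetical item characteristics, each scored in [0, 1].
items = {
    "hostel": {"low_price": 0.9, "central_location": 0.4},
    "hotel": {"low_price": 0.3, "central_location": 0.9},
}

def utility(item_features, user_weights):
    """Weighted sum of item characteristics; the weights are the user's preferences."""
    return sum(user_weights.get(f, 0.0) * v for f, v in item_features.items())

def rank(items, user_weights):
    """Order item names by descending utility for one user."""
    return sorted(items, key=lambda name: utility(items[name], user_weights),
                  reverse=True)

# Two hypothetical users who customize the same utility function differently.
budget_traveller = {"low_price": 0.8, "central_location": 0.2}
city_tripper = {"low_price": 0.1, "central_location": 0.9}

ranking_budget = rank(items, budget_traveller)
ranking_city = rank(items, city_tripper)
```

The same items are ranked differently for the two users because only the parameters of the utility function change, not the function itself.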
Content Features
Example item a (news article):
Government stops renovation of tower bridge (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england. Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]
Concept vector a:
db:Politics 0.2, db:Sports 0, db:Education 0, db:London 0.2, db:Tower_Bridge 0.4, db:Government 0.1, db:UK 0.1
Weighting strategy: occurrence frequency; normalize vectors (1-norm: sum of vector equals 1)
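The weighting strategy (occurrence frequency, then 1-norm normalization) can be sketched as below. The concept order and the resulting vector follow the slide, read with decimal points restored. The raw mention counts are an assumption picked so that the result matches item a, since the slide does not give the counts.

```python
from collections import Counter

CONCEPTS = ["db:Politics", "db:Sports", "db:Education", "db:London",
            "db:Tower_Bridge", "db:Government", "db:UK"]

def concept_vector(mentions, concepts=CONCEPTS):
    """Occurrence frequency per concept, scaled so the vector sums to 1."""
    counts = Counter(mentions)
    total = sum(counts[c] for c in concepts)
    if total == 0:
        return [0.0] * len(concepts)
    return [counts[c] / total for c in concepts]

# Hypothetical concept mentions extracted from the article (10 in total).
mentions_a = (["db:Politics"] * 2 + ["db:London"] * 2 + ["db:Tower_Bridge"] * 4
              + ["db:Government"] + ["db:UK"])

a = concept_vector(mentions_a)
```

With these assumed counts, `a` comes out as 0.2, 0, 0, 0.2, 0.4, 0.1, 0.1, the vector shown for item a.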
User Model
User's Twitter history:
RT: Government stops renovation of tower bridge (Oct 13th 2011)
I am in London at the moment (Oct 13th 2011)
I am doing sports (Oct 12th 2011)
User vector u:
db:Politics 0, db:Sports 0.1, db:Education 0, db:London 0.5, db:Tower_Bridge 0.2, db:Government 0.2, db:UK 0
Weighting strategy: occurrence frequency (e.g. smoothed by occurrence time: recent concepts are more important); normalize vectors (1-norm: sum of vector equals 1)
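One possible way to make recent concepts count more is to weight each occurrence by an exponential decay on its age before normalizing. The half-life form, the parameter value and the mapping from tweets to concepts are assumptions. The slide only states that recent concepts are more important, so this sketch does not reproduce the exact values of u.

```python
from datetime import date

def decayed_profile(events, today, half_life_days=7.0):
    """Sum time-decayed occurrences per concept, then normalize to 1-norm."""
    weights = {}
    for concept, day in events:
        age_days = (today - day).days
        # An occurrence loses half its weight every half_life_days.
        decay = 0.5 ** (age_days / half_life_days)
        weights[concept] = weights.get(concept, 0.0) + decay
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}

# Hypothetical concept annotations of the three tweets in the user's history.
events = [
    ("db:Government", date(2011, 10, 13)),
    ("db:Tower_Bridge", date(2011, 10, 13)),
    ("db:London", date(2011, 10, 13)),
    ("db:London", date(2011, 10, 13)),
    ("db:Sports", date(2011, 10, 12)),
]

profile = decayed_profile(events, today=date(2011, 10, 13))
```

A shorter half-life makes the profile follow changing interests faster, at the cost of forgetting stable long-term interests.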
Recommendations
Concept vectors (columns: db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK)
user u:            0    0.1  0    0.5  0.2  0.2  0
candidate item a:  0.2  0    0    0.2  0.4  0.1  0.1
candidate item b:  0    0    0    0.8  0.2  0    0
candidate item c:  0    0.5  0.2  0    0    0    0.3
cosine similarities:
     a     b     c
u    0.67  0.92  0.14
Ranking of recommended items: 1. b, 2. a, 3. c
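The similarities and the ranking on this slide can be checked with a few lines of code. The vectors are the slide's values read with decimal points restored, in the concept order db:Politics, db:Sports, db:Education, db:London, db:Tower_Bridge, db:Government, db:UK. The variable and function names are illustrative.

```python
import math

u = [0, 0.1, 0, 0.5, 0.2, 0.2, 0]
candidates = {
    "a": [0.2, 0, 0, 0.2, 0.4, 0.1, 0.1],
    "b": [0, 0, 0, 0.8, 0.2, 0, 0],
    "c": [0, 0.5, 0.2, 0, 0, 0, 0.3],
}

def cosine(x, y):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

scores = {name: cosine(u, vec) for name, vec in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Rounded to two decimals, the scores are 0.67 for a, 0.92 for b and 0.14 for c, which gives the ranking b, a, c.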
RecSys Problems
• Cold-start problem / New User problem: no or little data is available to infer the preferences of new users
• Changing User Preferences: user interests may change over time
• Sparsity problem / New Item problem: item descriptions are sparse, e.g. not many users rated or tagged an item
• Lack of Diversity / Overfitting: for many applications good recommendations should be relevant and new to the user (exceptions: predicting re-visiting behavior, etc.). When adapting too strongly to the preferences of a user, the user might see the same or similar recommendations again and again
• Use the right context: users do lots of things which might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: find the right balance between serendipity & personalization
• Research challenge: find the right way to use the influence of the recommendations on the user's behavior
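One common way to trade relevance against diversity, not taken from the lecture, is a greedy re-ranking in the style of maximal marginal relevance: each next item is chosen for its relevance minus a penalty for resembling items already selected. The relevance scores, the topic-based similarity and the `trade_off` parameter below are hypothetical.

```python
def rerank_with_diversity(relevance, similarity, k, trade_off=0.7):
    """Greedily pick k items, balancing relevance against redundancy."""
    selected = []
    remaining = set(relevance)
    while remaining and len(selected) < k:
        def marginal_score(item):
            # Redundancy = highest similarity to anything already selected.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * redundancy
        best = max(sorted(remaining), key=marginal_score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical personalized relevance scores and item topics.
relevance = {"a1": 0.9, "a2": 0.85, "b1": 0.6}
topic = {"a1": "A", "a2": "A", "b1": "B"}

def same_topic(x, y):
    return 1.0 if topic[x] == topic[y] else 0.0

diverse = rerank_with_diversity(relevance, same_topic, k=2, trade_off=0.5)
pure_relevance = rerank_with_diversity(relevance, same_topic, k=2, trade_off=1.0)
```

With `trade_off=1.0` the list is purely personalized (a1, a2). Lowering it swaps in the less similar item b1, which is one simple lever for the balance between serendipity and personalization.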
What is true personalization?
Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on delicious
• Recommend pages to your Facebook friends
image source: http://www.flickr.com/photos/bionicteaching/1375254387
Conclusions
• Pairs of users that are unilaterally connected have more information items, metadata, and tags in common than non-connected pairs
• Similarity was largest for direct connections and decreased as the distance between users in the social network increased
• Users in a reciprocal relationship exhibited significantly larger similarity than users in a unidirectional relationship
• Traditional item-level similarity may be a less reliable way to find similar users in social bookmarking systems
• The item collections of peers connected by self-defined social connections could be a useful source for cross-recommendation
Social recommendations on conflicting groups
Are all interests of our friends relevant? Is it application-generic?
Content-based Recommendations
• Input: characteristics of items & interests of a user in characteristics of items => Recommend items whose characteristics meet the user's interests
• Techniques:
  • Data mining methods: cluster items based on their characteristics => infer users' interests in clusters
  • IR methods: represent items & users as term vectors => compute similarity between user profile vector and items
  • Utility-based methods: a utility function takes an item as input; the parameters of the utility function are customized via the preferences of a user
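To make the utility-based variant more concrete, here is a minimal sketch. It assumes the simplest possible form, a linear weighted sum over item characteristics; the slide does not prescribe any particular utility function. The name `make_utility`, the preference weights and the item feature values are all hypothetical and chosen only for illustration.

```python
def make_utility(preferences):
    """Build a utility function whose parameters are a user's preference weights.

    preferences maps a characteristic name to a weight (negative = disliked).
    """
    def utility(item):
        # Linear utility (an assumption): sum of weight * feature value.
        return sum(weight * item.get(feature, 0.0)
                   for feature, weight in preferences.items())
    return utility


# Hypothetical preference weights for one user
preferences = {"dbLondon": 1.0, "dbSports": 0.5, "dbPolitics": -0.5}

# Hypothetical items described by weighted characteristics
items = {
    "a": {"dbPolitics": 0.2, "dbLondon": 0.2, "dbTower_Bridge": 0.4},
    "b": {"dbLondon": 0.8, "dbTower_Bridge": 0.2},
    "c": {"dbSports": 0.5, "dbEducation": 0.2, "dbUK": 0.3},
}

utility = make_utility(preferences)

# Recommend items in order of decreasing utility for this user
ranked = sorted(items, key=lambda name: utility(items[name]), reverse=True)
print(ranked)  # ['b', 'c', 'a']
```

Unlike the IR approach, the user is represented here only through the parameters of the function, so the same items can be re-ranked for another user by swapping in a different `preferences` dictionary.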
Content Features
Government stops renovation of tower bridge (Oct 13th 2011)
Tower Bridge today: Under construction
Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames.
Category: politics, england. Related Twiper news: bob: Why do they stop to… [more]; mary: London stops reno… [more]
Item vector a:
dbPolitics       0.2
dbSports         0
dbEducation      0
dbLondon         0.2
dbTower_Bridge   0.4
dbGovernment     0.1
dbUK             0.1
Weighting strategy: occurrence frequency; normalize vectors (1-norm: sum of vector equals 1)
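The weighting strategy (occurrence frequency, then 1-norm normalization) can be sketched as follows. The occurrence counts below are hypothetical: they were picked so that the result matches the item vector a on the slide, and the slide itself does not say how often each concept was detected. The helper name `concept_vector` is also an assumption.

```python
from collections import Counter

CONCEPTS = ["dbPolitics", "dbSports", "dbEducation", "dbLondon",
            "dbTower_Bridge", "dbGovernment", "dbUK"]


def concept_vector(occurrences, concepts=CONCEPTS):
    """Count concept occurrences and scale them so the weights sum to 1."""
    counts = Counter(occurrences)
    total = sum(counts[c] for c in concepts)  # concepts outside the list are ignored
    if total == 0:
        return [0.0] * len(concepts)
    return [counts[c] / total for c in concepts]


# Hypothetical concepts detected in the article text, one entry per mention
occurrences = (["dbTower_Bridge"] * 4 + ["dbPolitics"] * 2 + ["dbLondon"] * 2
               + ["dbGovernment"] + ["dbUK"])

a = concept_vector(occurrences)
print(a)  # [0.2, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1]
```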
User Model
User's Twitter history:
• RT: Government stops renovation of tower bridge (Oct 13th 2011)
• I am in London at the moment (Oct 13th 2011)
• I am doing sports (Oct 12th 2011)
User vector u:
dbPolitics       0
dbSports         0.1
dbEducation      0
dbLondon         0.5
dbTower_Bridge   0.2
dbGovernment     0.2
dbUK             0
Weighting strategy: occurrence frequency (e.g. smoothed by occurrence time, so that recent concepts are more important); normalize vectors (1-norm: sum of vector equals 1)
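One way to let recent concepts count more is to decay each occurrence by its age before normalizing. The sketch below assumes exponential decay with a configurable half-life; the slide only says weights are smoothed by occurrence time and does not name a decay function. The mapping from tweets to concepts, the half-life value and the name `decayed_user_vector` are hypothetical, so the result is not expected to reproduce the vector u from the slide.

```python
from datetime import date

CONCEPTS = ["dbPolitics", "dbSports", "dbEducation", "dbLondon",
            "dbTower_Bridge", "dbGovernment", "dbUK"]


def decayed_user_vector(observations, today, half_life_days=7.0, concepts=CONCEPTS):
    """Build a 1-norm normalized profile in which older occurrences weigh less.

    observations is a list of (concept, date_seen) pairs.
    """
    weights = {c: 0.0 for c in concepts}
    for concept, seen_on in observations:
        if concept not in weights:
            continue  # skip concepts outside the chosen vocabulary
        age_days = (today - seen_on).days
        # Assumed decay: an occurrence loses half its weight every half_life_days
        weights[concept] += 0.5 ** (age_days / half_life_days)
    total = sum(weights.values())
    if total == 0:
        return [0.0] * len(concepts)
    return [weights[c] / total for c in concepts]


# Hypothetical concepts extracted from the three tweets
observations = [
    ("dbSports", date(2011, 10, 12)),
    ("dbLondon", date(2011, 10, 13)),
    ("dbTower_Bridge", date(2011, 10, 13)),
    ("dbGovernment", date(2011, 10, 13)),
]

u = decayed_user_vector(observations, today=date(2011, 10, 13), half_life_days=1.0)
print([round(w, 3) for w in u])
```

With a one-day half-life, the sports tweet from the day before contributes half as much as each concept mentioned on the current day. A longer half-life makes the profile more stable, a shorter one makes it follow changing interests more quickly.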