department of computer science & engineering university of ...€¦ · user eric has to decide...
TRANSCRIPT
RecentAdvancesinRecommenderSystemsandFutureDirec5ons
GeorgeKarypis DepartmentofComputerScience&EngineeringUniversityofMinnesota
1
OVERVIEWOFRECOMMENDERSYSTEMS
2
RecommenderSystems
Recommendersystemshaveemergedasakeyenablingtechnologyforecommerce.
– Virtualexpertsthatarekeenlyawareofacustomer’spreferences.– Theyareusedtofiltervastamountsofdatainordertoiden5fyinforma5onthatisrelevanttoauser.
Recommender system Recommendation
3
Datasources
• Informa5onabouttheitems:– Productdescrip5ons,productspecifica5ons,ar5clecontent,media
aRributes,etc.
• Informa5onabouttheusers:– Profiledatasuchasdemographic,social-economic,andbehavioral
informa5on.– Social/trustnetworks;eitherimplicitorexplicit.
• Transac5onalinforma5on:– Historyofuser-itempurchases.– Productra5ngs,productreviews.– Implicitvs.explicitinforma5on.
4
Problems&approaches
• Ra5ngpredic5on– Predictthera5ngthatauserwillgivetoa
par5cularitem.
• Top-Nrecommenda5on– Iden5fyasetofitemsthattheuserwilllike.
• Cold-startrecommenda5ons– Computerecommenda5onsforpreviously
unseenusersand/oritems.
• Grouprecommenda5ons– Computerecommenda5onsthatsa5sfya
groupofusers.
• Assortmentrecommenda5ons– Recommendanautoma5callyconstructedassortmentofitems:e.g.,vaca5onpackage,ouYit,playlist.
• Contextawarerecommenda5ons.
• Non-personalizedmethods– Rule-basedsystems,popularity-based
models,methodsbasedonglobalmodels.
• Methodsrelyingonauser’shistoricalprofile– Predic5vemodelses5matedusingauser’s
individualpreferences.
• Collabora5vefilteringmethods– Approachesthatleverageinforma5onfrom
otherusersanditems.
Recommenda7onproblems Recommenda7onapproaches
5
Collabora5veFiltering(CF)
§ Derivesrecommenda5onsbyexploi5ngthe“wisdomofthecrowd”.
§ Itcanbeviewedasaninstanceofmul5-tasklearningandtransferlearning.
§ Itisthemostprominentapproachtoday:– Usedbylarge,commerciale-commercesites.– Well-understood,variousalgorithmsandvaria5onsexist.– Applicableinmanydomains(book,movies,DVDs,..).– Itcanoperatebothinacontentagnos5candcontent
awareseang.– Itcanhandlebothexplicitandimplicitpreference
informa5on.
6
OverviewofCFmethods
• User-,item-,andgraph-basedneighborhoodmethods.– Lazylearners.
• Methodsthatusevariousmachinelearningapproachestolearnapredic5vemodelfromhistoricaldata.– Latent-spacemodelsbasedonmatrix/tensorcomple5on.– Linearandnon-linearmul5-regressionmodels.– Probabilis5cmodels.– Auto-encoder-basedneuralnetworks.– ….
7
Item-basednearest-neighbor(1)
Thebasictechnique:– GivenEric(“ac5veuser”)andTitanic(“ac5veitem”)thatErichasnotyet
seen.– Findasetofitems(nearestneighbors)thatEricalreadyratedsuchthat
otheruserslikedeachoftheminasimilarfashionastheylikedTitanic.
– UseEric’sra5ngsontheseitemstopredicthowmuchEricwilllikeTitanic.
Fortop-N,dothisforeveryunrateditemandreturntheNitemswiththehighestpredictedra5ng.
2 A Comprehensive Survey of Neighborhood-Based Recommendation Methods 43
by their “interestingness” to u. If the test set is built by randomly selecting, for eachuser u, a single item iu of Iu, the performance of L can be evaluated with the AverageReciprocal Hit-Rank (ARHR):
ARHR.L/ D 1
jUjX
u2U
1
rank.iu;L.u//; (2.5)
where rank.iu;L.u// is the rank of item iu in L.u/, equal to1 if iu 62 L.u/. A moreextensive description of evaluation measures for recommender systems can be foundin Chap. 8 of this book.
2.3 Neighborhood-Based Recommendation
Recommender systems based on neighborhood automate the common principle thatsimilar users prefer similar items, and similar items are preferred by similar users.To illustrate this, consider the following example based on the ratings of Fig. 2.1.
Example 2.1. User Eric has to decide whether or not to rent the movie “Titanic”that he has not yet seen. He knows that Lucy has very similar tastes when it comesto movies, as both of them hated “The Matrix” and loved “Forrest Gump”, so heasks her opinion on this movie. On the other hand, Eric finds out he and Diane havedifferent tastes, Diane likes action movies while he does not, and he discards heropinion or considers the opposite in his decision.
2.3.1 User-Based Rating Prediction
User-based neighborhood recommendation methods predict the rating rui of a useru for a new item i using the ratings given to i by users most similar to u, callednearest-neighbors. Suppose we have for each user v ¤ u a value wuv representingthe preference similarity between u and v (how this similarity can be computed will
TheTitanic
Die ForrestWall-E
Matrix Hard Gump
John 5 1 2 2Lucy 1 5 2 5 5Eric 2 ? 3 5 4
Diane 4 3 5 3
Fig. 2.1 A “toy example” showing the ratings of four users for five movies
Keyassump5ons:• Itemsbelonginto(overlapping)groupsthatelicitsimilarlikes/dislikes.
• Users’preferencesremainstableandconsistentover5me.
8
Item-basednearest-neighbor(2)
• Item-basedmethodspre-computeandstorethekmostsimilarotheritemsforeachitem.
• Thepredic5onscoresarees5matedusingonlythosemostsimilaritems.
• Similari5es:cosine,Jaccard,correla5oncoeeficient,condi5onalprobability,…
9
rui = f(ru:, s:i) ;ĞŐ rui = ru: s:iͿ
ǁŚĞƌĞ
R Rnm ŝƐ ƚŚĞ ŵĂƚƌŝdž ŽĨ ŚŝƐƚŽƌŝĐĂů ƌĂƟŶŐƐ ĂŶĚ
S Rmm ŝƐ ƚŚĞ ŝƚĞŵͲŝƚĞŵ ƐŝŵŝůĂƌŝƚLJ ŵĂƚƌŝdž ƐƚŽƌŝŶŐ ƚŚĞ kŚŝŐŚĞƐƚ ƐŝŵŝůĂƌŝƟĞƐ ĂůŽŶŐ ĞĂĐŚ ƌŽǁ
Low-rankmatrixapproxima5on
Thera5ngmatrixcanbeapproximatedastheproductoftwolow-rankmatrices:
rui = puq0i
Thislow-rankdecomposi5oncanbeusedtocomputethera5ngsforunseenitemsas:
10
Interpreta5onoflatentfactors
Thereisalowdimensionalfeaturespace(whosedimensionalityistherankofthedecomposi5on)onwhichbothusersanditemscanbeembeddedin.Latent variable view
Geared towards females
Geared towards males
serious
escapist
The PrincessDiaries
The Lion King
Braveheart
Lethal Weapon
Independence Day
AmadeusThe Color Purple
Dumb and Dumber
Ocean’s 11
Sense and Sensibility
Gus
Dave
Latent factor models
• Theitemvectorcanbethoughtofascapturinghowmuchofeachofthelatentfeaturestheitempossess.
• Theuservectorcanbethoughtofascapturingtheuser’spreferencefortheselatentfeatures.
11
Latentfactorsviamatrixcomple5on
• Es5matethelatentfactormatricesbasedonlyontheobservedentriesofthematrix.
• Thisisanon-convexop5miza5onproblemandcanconvergetoalocalminima.
>Ğƚ ďĞ ƚŚĞ ƐĞƚ ŽĨ ŽďƐĞƌǀĞĚ ĞŶƚƌŝĞƐ ŝŶ ƚŚĞ ƌĂƟŶŐ ŵĂƚƌŝdž RdŚĞ P ĂŶĚ Q ĨĂĐƚŽƌƐ ĂƌĞ ĞƐƟŵĂƚĞĚ ďLJ ƐŽůǀŝŶŐ ƚŚĞ ĨŽůůŽǁŝŶŐŽƉƟŵŝnjĂƟŽŶ ƉƌŽďůĞŵ
ŵŝŶŝŵŝnjĞP,Q
=
(u,i)
(rui puqi)
2.
12
Improvingaccuracyoflatentfactormodels
Theperformanceofthera5ngpredic5oncanbeimprovedby
– Explicitlymodelingglobal,user,anditembiases• Biases:thebaseline/expectedra5ngsforallitemsbyallusers,fortheitemsratedbyauser,andfortheusersforagivenitem.
– Reducingmodelover-fiangviaregulariza5on:
ŵŝŶŝŵŝnjĞP,Q,µ,b∗
=!
(u,i)∈Ω
(rui−µ− bu− bi− puq′i)
2+λ(µ+ ||b∗||22+ ||P ||2F + ||Q||2F ).
dŚĞ ĞƐƟŵĂƚĞĚ ƌĂƟŶŐ ĨŽƌ ƵƐĞƌ u ŽŶ ŝƚĞŵ i ŝƐ ŐŝǀĞŶ ďLJrui = µ + bu + bi + pq
13
Recenttrends
• Theboundariesbetweentradi5onalneighborandlatentfactormodelshavebecomelessseparated.
• Neighborschemesrelyonthelatentspacetocomputeitem-itemsimilari5es.
• Latentfactoriza5onmethodsrelyonsimilaritemstoderivealatentrepresenta5onofauser.
R ≈ PQ′, Ɛŝŵ(i, j) = qiq′j .
14
rui = µ+ bu + bi +
⎛
⎝ 1
|U|α∑
j∈Uruiqj
⎞
⎠ q′i.
ASMALLDETOUR
15
• Chemicalgene5csisapromisingapproachforstudyingbiologicalsystems.
• Ithasanumberofkeyadvantagesoverapproachesbasedonmoleculargene5cs:– smallmoleculescanworkrapidly,– theirac5onisreversible,– canmodulatesinglefunc5onsofmul5-func5onproteins,– candisruptprotein-proteininterac5ons,and– ifthetargetispharmaceu5calrelevant,itcanleadtothediscoveryofnewdrugs.
Chemical Gene,cs (Genomics): The research field that is designed to discover and synthesizeprotein-binding small organicmolecules that can alter the func5on of all the proteins and usethemtostudybiologicalsystems.
(Na5onalIns5tuteofGeneralMedicalSciences)
Chemicalgene5cs(genomics)
16
RS&CG–Datasimilari5es
Target-ligandac7vitymatrix User-itemra7ngmatrix
Target-intrinsicinforma5on• sequence,structure,homology,etc.
User-intrinsicinforma5on• profile,socialnetwork,etc.
Ligand-intrinsicinforma5on• chemicalstructure,pharmacophores,etc.
Item-intrinsicinforma5on• descrip5ons,content,specifica5on,aRributes,etc.
Ac5vityinforma5on• IC50,dose-response,etc.
Transac5oninforma5on• ra5ng,view,review,etc.
17
RS&CG–Similarproblems
• Ac5vitypredic5on=>Ra5ngpredic5on
• Secondaryscreeninglibrarydesign=>Top-Nrecommenda5on
• Librarydesignforanoveltarget=>Cold-startrecommenda5on
• Denovocompounddesign=>Assortmentrecommenda5on
18
RS&CG–Similarprinciples
• Ligandbindingisaprocessthatinvolves:– Structureofprotein’sbindingsiteandthestructureoftheligand.– Non-covalentinterac5onsbetweentheatomsoftheprotein’s
bindingsiteandthatoftheligand.
• Asaresult:– Thesameligandwillbindtosimilartargets.– Similarligandswillbindtothesametarget.– Similarligandswillbindtosimilartargets.
• Thesearethesameprinciplesandassump5onsbehindrecommendersystems.
19
RS&CG–Similarmethods
• Target-specificStructure-Ac5vityRela5onship(SAR)models.– Thebiologicalac5vityofachemical
compoundismathema5callyexpressedasafunc5onofitschemicalstructure.
• Chemogenomicapproaches.– Proteinsofthesamefamilytendto
bindtoligandswithcertaincommoncharacteris5cs.
• Mul5-tasklearningapproaches.– Modelsthates5materela5ons
betweenprotein-andligand-derivedfeatures.
• Personalizedcontent-basedrecommendersystems.– Theuser’spasttransac5onsare
usedtoderivecontent-basedmodelsforlike/dislikepredic5on.
• Cluster-basedrecommendersystems.– Content-basedmodelsforsimilar
groupsofusers.
• Collabora5vefilteringapproacheswithsideinforma5on.
20
RECENTDEVELOPMENTSINITEM-BASEDAPPROACHES
21
Es5ma5ngthesimilaritymatrix
Goal:Insteadofusingapre-determinedfunc5ontocomputetheitem-itemsimilari5es,“learn”themdirectlyfromthedata.
Learningproblemformula5on:Es5matealinearmodeltopredicteachitembasedonthehistoricalinforma5onoftheotheritems.
>Ğƚ wi ďĞ ƚŚĞ ůŝŶĞĂƌ ŵŽĚĞů ĨŽƌ ŝƚĞŵ i ůĞƚ R¬i ďĞ ƚŚĞ n (m 1)ƐƵďͲŵĂƚƌŝdž ŽĨ R ŽďƚĂŝŶĞĚ ďLJ ƌĞŵŽǀŝŶŐ ĐŽůƵŵŶ i ;ŝĞ ƚŚĞ ŚŝƐƚŽƌŝĐĂůŝŶĨŽƌŵĂƟŽŶ ĨŽƌ ŝƚĞŵ iͿ ĂŶĚ ůĞƚ r:i ďĞ ƚŚĞ iƚŚ ĐŽůƵŵŶ ŽĨ RdŚĞ wi ŝƐ ĞƐƟŵĂƚĞĚ ĂƐ
ŵŝŶŝŵŝnjĞwi
1
2||r:i R¬iwi||2 + ƌĞŐ(wi)
.
dŚŝƐ ŵŽĚĞů ďĞĐŽŵĞƐ ƚŚĞ iƚŚ ĐŽůƵŵŶ ŽĨ ƚŚĞ ŝƚĞŵͲŝƚĞŵ ƐŝŵŝůĂƌŝƚLJ ŵĂͲƚƌŝdž S
r:i
R
22
SLIM–SparseLinearMethod
• Themodel(S)ises5matedas:
• Goodperformanceisopenachievedwith50-200non-zerospercolumn.– Lowstoragerequirementsandlowrecommenda5on5me.
• Themodelcanbees5matedefficiently:– EachcolumnofScanbecomputedindependently.– Thesolu5onofacolumncan“warmstart”othersimilarcolumns.– Regulariza5onspacecanbeexploredefficiently.– Itemneighborscanbeusedforini5al“feature”selec5onandrestrictsparsity
structure.
ThisisaStructuralEqua5onModel(SEM)withnoexogenousvariables.Itcanalsobeviewedasanetworkinferencemodel.
ŵŝŶŝŵŝnjĞS
1
2||R RS||2F +
2||S||2F + ||S||1
ƐƵďũĞĐƚ ƚŽ S 0
ĚŝĂŐ(S) = 0
23
SLIM–Performance
HitRate@10
AverageReciprocalHitRate
Hit Rate (HR) @ N =tp
N
2 A Comprehensive Survey of Neighborhood-Based Recommendation Methods 43
by their “interestingness” to u. If the test set is built by randomly selecting, for eachuser u, a single item iu of Iu, the performance of L can be evaluated with the AverageReciprocal Hit-Rank (ARHR):
ARHR.L/ D 1
jUjX
u2U
1
rank.iu;L.u//; (2.5)
where rank.iu;L.u// is the rank of item iu in L.u/, equal to1 if iu 62 L.u/. A moreextensive description of evaluation measures for recommender systems can be foundin Chap. 8 of this book.
2.3 Neighborhood-Based Recommendation
Recommender systems based on neighborhood automate the common principle thatsimilar users prefer similar items, and similar items are preferred by similar users.To illustrate this, consider the following example based on the ratings of Fig. 2.1.
Example 2.1. User Eric has to decide whether or not to rent the movie “Titanic”that he has not yet seen. He knows that Lucy has very similar tastes when it comesto movies, as both of them hated “The Matrix” and loved “Forrest Gump”, so heasks her opinion on this movie. On the other hand, Eric finds out he and Diane havedifferent tastes, Diane likes action movies while he does not, and he discards heropinion or considers the opposite in his decision.
2.3.1 User-Based Rating Prediction
User-based neighborhood recommendation methods predict the rating rui of a useru for a new item i using the ratings given to i by users most similar to u, callednearest-neighbors. Suppose we have for each user v ¤ u a value wuv representingthe preference similarity between u and v (how this similarity can be computed will
TheTitanic
Die ForrestWall-E
Matrix Hard Gump
John 5 1 2 2Lucy 1 5 2 5 5Eric 2 ? 3 5 4
Diane 4 3 5 3
Fig. 2.1 A “toy example” showing the ratings of four users for five movies
24
SLIM–Long-tailperformance
Ra5ngDistribu5oninML10M
HR@10inML10MLongTail
ARHRinML10MLongTail
25
SLIMextensions
• Low-rankconstraintsonSasanalternatewaytocontrolmodelcomplexity.– Eitherviatheproductoftwolowrankmatricesorbyminimizing
thenuclearnormofS.• Incorpora5onofitemsideinforma5on.• Incorpora5onofdifferentcontexts.• Incorpora5onoftemporalinforma5on.• Higher-orderregressionmodels.• FusionofglobalandlocalSLIMmodels.
ŵŝŶŝŵŝnjĞS
1
2||R RS||2F +
2||S||2F + ||S||1
ƐƵďũĞĐƚ ƚŽ S 0
ĚŝĂŐ(S) = 0
26
Factoredsimilaritymatrix
• SLIM(andtradi5onalitem-itemapproaches)cannotcompute/learnrela5onsbetweenpairsofitemsthatarenotco-rated.– Thees5matedsimilari5esforsuchpairofitemswill
alwaysbe0.
• Theycannotproducemeaningfulrecommenda5onsthatrelyontransi5vitywithintheitem-itemsimilaritygraph.
• Canleadtopoorrecommenda5ons,especiallyforsparserdatasets,inwhichtherearefewpairsofco-rateditems.
r:i
R
27
FISM–FactoredSLIM
0
10
20
30
40
50
60
Sparse(-1) Sparser(-2) Sparsest(-3)
Percen
tageIm
provem
ent
ML100K NeUlix Yahoo
Performanceoverbestcompe5ngmethodamong(SLIM,itemKNN,PureSVD,BPRMF,BPRkNN)
rui =X
j2Ru
rujpjqTi
ŵŝŶŝŵŝnjĞP,Q
!1
2||R−RPQT ||2F +
β
2(||P ||2F + ||Q||2F )
"
Sincetheoverallproblemisnotconvex,op5miza5onandsearchovertheregulariza5onspacebecomesconsiderablymoreexpensive.ScalableapproachesrelyonSGD.
28
Learningfromhigher-orderrela5ons
l Ifacustomerbuysacertaingroupofitems,theyaremore(orless)likelytobuysomeotheritems.
l Thejointdistribu5onofasetofitemscanbedifferentfromthedistribu5onsoftheindividualitemsintheset.
l Higherordermodelsareneededtocapturesuchrela5ons.
29
HOSLIM—Higher-orderSLIM
• KeyIdea:Usefrequentitemsetstocapturehigherorderrela5ons.• Two-stepapproach:
– Givenauser-itempurchasematrixR,findallfrequentitemsetsandcreateauser-itemsetmatrixRf.
– LearntwosimilaritymatricesSandSfthatcaptureitem-itemanditemset-itemrela5ons.
• Predictnewitemsbycombininginforma5onfromSandSf.
Modeles5ma5on:
Predic5on: rui = ru: s:i + rfu: s
f:i
ŵŝŶŝŵŝnjĞS
1
2||R RS RfSf ||2F +
2(||S||2F + ||Sf ||2F ) + (||S||1 + ||Sf ||1)
ƐƵďũĞĐƚ ƚŽ S 0, Sf 0,
ĚŝĂŐ(S) = 0, ĂŶĚ sfji = 0 ĨŽƌ i Ij
30
HOSLIMperformance
Considerablegainscanbeobtainedfordatasetsthathavesuchcharacteris5cs.
31
Onemodeldoesnotfitall
• Initem-basedmethods,personaliza5onoccursbasedontheitemsthattheuserhaspreviouslyactedon.– The“recommenda5ons”thattheseitemstriggerarenotspecifictoeachuserbutareglobal.
• BeRerperformancecanpoten5allybeachievedbyhavinguser-specificitem-basedmodels.
32
GLSLIM–FusionofglobalandlocalSLIMmodels
ŵŝŶŝŵŝnjĞS,S1,...,SK ,c,g
1
2
ui
(rui rui)2 +
2(||S||2F +
k
||Sk||2F ) + (||S||1 +
k
||Sk||1)
ƐƵďũĞĐƚ ƚŽ S 0, Sk 0 ĨŽƌ k = 1, . . . , K,
ĚŝĂŐ(S) = 0, ĚŝĂŐ(Sk) = 0 ĨŽƌ k = 1, . . . , K,
u, 0 gu 1, ĂŶĚu, cu 1, . . . , K.
globalmodel
localmodel
user’smembership
rui =!
j∈Ru
ruj"gusji + (1− gu)s
cuji
#
Op5mizedusingalternateop5miza5onbetweensolvingforthevariousSLIMmodelsandsolvingfortheclusteringandusermembership.
33
S1
S2
S
GLSLIM–FusionofglobalandlocalSLIMmodels
15%
11%
1%
7%
18%
12%
6%8%
0%
5%
10%
15%
20%
groceries(65) ml10m(20) flixter(5) jester(60)
ImprovementofGL-SLIMoverSLIM
HR ARHR
34
GLSLIM–FusionofglobalandlocalSLIMmodels
Smartini5aliza5onreducesthenumberofitera5onsbutdoesnotsignificantlyimpactthefinalsolu5on.
35
WRAPPINGUP
36
Currentstateoftheart
• Ra5ngpredic5on– Factoriza5on-basedapproachesarehighlyeffec5ve:
• Goodpredic5onperformance,fasttraining5me,theycanincorporatesideinforma5on,diverseobjec5ves,etc.
• Top-Nrecommenda5onmethods– Noclearwinningstrategyhasemerged.
• Item-basedmethodsoutperformsignificantlymoresophis5catedmethods.
– Thereissignificantongoingresearchonrankinglossfunc5ons.
37
Scalingup
• Es5mateonlyitemfactorsfromasubsetofusers&computeuserfactorsonthefly.
• Warm-startthesearchovertheregulariza5onspace:– ForMFuseSVDtoeliminatebadlocalminima.– ForSLIMuseprevioussolu5onsand/orsolu5onsforsimilaritems.
• Sampletheusers.Thegoalistogetareliablesetofitem-basedmodels:– Itemfactorsoritem-itemmodels.
38
Futuredirec5ons
• Deeppersonaliza5on– Context,loca5on,5me,etc.
• Behaviorsteering– Acquiringtaste,modelingstate,futurebenefits,etc.
• Contentcrea5on– Packagerecommenda5on,ar5clegenera5on,etc.
• Liveevalua5ons– OpenA/Btes5ngplaYorms.
39
Finalwords…
• Recommendersystemshaveextensiveapplica5ons.– Bothcommercial,scien5fic,andsocietalapplica5ons.
• Therearealreadyhigh-qualitysopwareimplementa5onsofmanyofthesealgorithms.– Theycanbeusedasis,orusedtoquicklyexperimentwithnewmodelingapproachesanddatasources.
• Mul5-tasklearningunderliesmanyofthelearningmethods.– Averyac5veareaofresearchwithbroadapplica5ons.
• Thefieldisripefornewmethodologicaladvancesthatwillgetustothenextlevel.
40
Acknowledgments
Students(current&former)§ XiaNing§ EvangeliaChristakopoulou§ SantoshKabbur§ AsmaaElBadrawy§ AgoritsaPolyzou
Funding§ NSF,ARO,NIH,PayPal,Samsung,Intel.References:
Recommender Systems Handbook, 2015 http://www.springer.com/us/book/9781489976369
Papers at hRp://www.cs.umn.edu/~karypis
41