
Post on 17-Oct-2020


Recent Advances in Recommender Systems and Future Directions

George Karypis
Department of Computer Science & Engineering
University of Minnesota

OVERVIEW OF RECOMMENDER SYSTEMS

Recommender Systems

Recommender systems have emerged as a key enabling technology for e-commerce.

–  Virtual experts that are keenly aware of a customer's preferences.
–  They are used to filter vast amounts of data in order to identify information that is relevant to a user.

[Figure labels: Recommender system, Recommendation]

Data sources

•  Information about the items:
   –  Product descriptions, product specifications, article content, media attributes, etc.
•  Information about the users:
   –  Profile data such as demographic, socio-economic, and behavioral information.
   –  Social/trust networks; either implicit or explicit.
•  Transactional information:
   –  History of user-item purchases.
   –  Product ratings, product reviews.
   –  Implicit vs. explicit information.

Problems & approaches

Recommendation problems

•  Rating prediction
   –  Predict the rating that a user will give to a particular item.
•  Top-N recommendation
   –  Identify a set of items that the user will like.
•  Cold-start recommendations
   –  Compute recommendations for previously unseen users and/or items.
•  Group recommendations
   –  Compute recommendations that satisfy a group of users.
•  Assortment recommendations
   –  Recommend an automatically constructed assortment of items: e.g., vacation package, outfit, playlist.
•  Context-aware recommendations.

Recommendation approaches

•  Non-personalized methods
   –  Rule-based systems, popularity-based models, methods based on global models.
•  Methods relying on a user's historical profile
   –  Predictive models estimated using a user's individual preferences.
•  Collaborative filtering methods
   –  Approaches that leverage information from other users and items.

Collaborative Filtering (CF)

§  Derives recommendations by exploiting the "wisdom of the crowd".
§  It can be viewed as an instance of multi-task learning and transfer learning.
§  It is the most prominent approach today:
   –  Used by large, commercial e-commerce sites.
   –  Well-understood; various algorithms and variations exist.
   –  Applicable in many domains (books, movies, DVDs, ...).
   –  It can operate in both a content-agnostic and a content-aware setting.
   –  It can handle both explicit and implicit preference information.

Overview of CF methods

•  User-, item-, and graph-based neighborhood methods.
   –  Lazy learners.
•  Methods that use various machine learning approaches to learn a predictive model from historical data.
   –  Latent-space models based on matrix/tensor completion.
   –  Linear and non-linear multi-regression models.
   –  Probabilistic models.
   –  Auto-encoder-based neural networks.
   –  …

Item-based nearest-neighbor (1)

The basic technique:
–  Given Eric ("active user") and Titanic ("active item") that Eric has not yet seen.
–  Find a set of items (nearest neighbors) that Eric already rated such that other users liked each of them in a similar fashion as they liked Titanic.
–  Use Eric's ratings on these items to predict how much Eric will like Titanic.

For top-N, do this for every unrated item and return the N items with the highest predicted rating.

[Excerpt shown on the slide: Chapter 2, "A Comprehensive Survey of Neighborhood-Based Recommendation Methods", p. 43]

by their "interestingness" to u. If the test set is built by randomly selecting, for each user u, a single item $i_u$ of $I_u$, the performance of L can be evaluated with the Average Reciprocal Hit-Rank (ARHR):

$$\mathrm{ARHR}(L) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{\mathrm{rank}(i_u, L(u))}, \tag{2.5}$$

where $\mathrm{rank}(i_u, L(u))$ is the rank of item $i_u$ in $L(u)$, equal to $\infty$ if $i_u \notin L(u)$. A more extensive description of evaluation measures for recommender systems can be found in Chap. 8 of this book.

2.3 Neighborhood-Based Recommendation

Recommender systems based on neighborhood automate the common principle that similar users prefer similar items, and similar items are preferred by similar users. To illustrate this, consider the following example based on the ratings of Fig. 2.1.

Example 2.1. User Eric has to decide whether or not to rent the movie "Titanic" that he has not yet seen. He knows that Lucy has very similar tastes when it comes to movies, as both of them hated "The Matrix" and loved "Forrest Gump", so he asks her opinion on this movie. On the other hand, Eric finds out he and Diane have different tastes, Diane likes action movies while he does not, and he discards her opinion or considers the opposite in his decision.

2.3.1 User-Based Rating Prediction

User-based neighborhood recommendation methods predict the rating $r_{ui}$ of a user u for a new item i using the ratings given to i by users most similar to u, called nearest-neighbors. Suppose we have for each user $v \neq u$ a value $w_{uv}$ representing the preference similarity between u and v (how this similarity can be computed will …

         The Matrix   Titanic   Die Hard   Forrest Gump   Wall-E
John         5           1         -            2           2
Lucy         1           5         2            5           5
Eric         2           ?         3            5           4
Diane        4           3         5            3           -

Fig. 2.1 A "toy example" showing the ratings of four users for five movies

Key assumptions:
•  Items belong to (overlapping) groups that elicit similar likes/dislikes.
•  Users' preferences remain stable and consistent over time.

Item-based nearest-neighbor (2)

•  Item-based methods pre-compute and store the k most similar other items for each item.
•  The prediction scores are estimated using only those most similar items.
•  Similarities: cosine, Jaccard, correlation coefficient, conditional probability, …

$$\hat{r}_{ui} = f(r_{u:}, s_{:i}) \quad (\text{e.g., } \hat{r}_{ui} = r_{u:}\, s_{:i}),$$

where $R \in \mathbb{R}^{n \times m}$ is the matrix of historical ratings and $S \in \mathbb{R}^{m \times m}$ is the item-item similarity matrix storing the $k$ highest similarities along each row.
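The item-based scheme can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the rating matrix is made up, cosine is picked from the list of similarities, unrated entries are encoded as 0, and the k neighbors of item i are stored in column i of the similarity matrix (one possible layout). The function names `cosine_item_similarity`, `keep_top_k`, and `top_n` are hypothetical.

```python
import numpy as np

def cosine_item_similarity(R):
    """Cosine similarity between the item columns of R (0 means 'not rated')."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0                  # guard against items nobody rated
    S = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)                 # an item is not its own neighbor
    return S

def keep_top_k(S, k):
    """Keep the k largest similarities in each column and zero out the rest."""
    S_k = np.zeros_like(S)
    for i in range(S.shape[1]):
        top = np.argsort(S[:, i])[-k:]
        S_k[top, i] = S[top, i]
    return S_k

def top_n(R, S_k, u, n):
    """Score every item as r_u: s_:i and return the n best unrated items."""
    scores = R[u] @ S_k
    scores[R[u] > 0] = -np.inf               # never recommend rated items
    ranked = np.argsort(scores)[::-1]
    return [int(i) for i in ranked[:n] if np.isfinite(scores[i])]

# Hypothetical 4-user x 5-item rating matrix; 0 marks an unrated item.
R = np.array([[5, 1, 0, 2, 2],
              [1, 5, 2, 5, 5],
              [2, 0, 3, 5, 4],
              [4, 3, 5, 3, 0]], dtype=float)
S_k = keep_top_k(cosine_item_similarity(R), k=2)
print(top_n(R, S_k, u=2, n=1))               # item 1 is the only item user 2 has not rated
```

Because the similarity matrix is computed once and kept sparse, scoring a user is a single sparse vector-matrix product, which is what makes the approach cheap at recommendation time.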

Low-rank matrix approximation

The rating matrix can be approximated as the product of two low-rank matrices:

$$R \approx P Q^T$$

This low-rank decomposition can be used to compute the ratings for unseen items as:

$$\hat{r}_{ui} = p_u q_i^T$$

Interpretation of latent factors

There is a low-dimensional feature space (whose dimensionality is the rank of the decomposition) in which both users and items can be embedded.

[Figure: latent variable view. Movies (The Princess Diaries, The Lion King, Braveheart, Lethal Weapon, Independence Day, Amadeus, The Color Purple, Dumb and Dumber, Ocean's 11, Sense and Sensibility) and users (Gus, Dave) placed in a two-dimensional latent space; one axis runs between "geared towards females" and "geared towards males", the other between "serious" and "escapist".]

Latent factor models

•  The item vector can be thought of as capturing how much of each of the latent features the item possesses.
•  The user vector can be thought of as capturing the user's preference for these latent features.

Latent factors via matrix completion

•  Estimate the latent factor matrices based only on the observed entries of the matrix.
•  This is a non-convex optimization problem, and the optimization can converge to a local minimum.

Let $\Omega$ be the set of observed entries in the rating matrix $R$. The $P$ and $Q$ factors are estimated by solving the following optimization problem:

$$\operatorname*{minimize}_{P,Q} \sum_{(u,i) \in \Omega} \left(r_{ui} - p_u q_i^T\right)^2.$$
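A minimal sketch of fitting the factors to the observed entries only is given below. It uses plain full-batch gradient descent rather than any particular solver from the talk; the function name `factorize`, the 0/1 `mask` encoding of the observed set, and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np

def factorize(R, mask, rank=2, lr=0.01, epochs=2000, seed=0):
    """Gradient descent on the squared error over observed entries (mask == 1)."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    P = 0.1 * rng.standard_normal((n, rank))   # user factors
    Q = 0.1 * rng.standard_normal((m, rank))   # item factors
    for _ in range(epochs):
        E = mask * (R - P @ Q.T)               # errors on observed entries only
        P_new = P + lr * E @ Q                 # descent step for P (factor 2 folded into lr)
        Q = Q + lr * E.T @ P                   # descent step for Q, using the old P
        P = P_new
    return P, Q

# Hypothetical rank-1 ratings with two entries hidden from training.
R = np.outer([1.0, 2.0, 3.0], [1.0, 2.0, 1.0, 2.0])
mask = np.ones_like(R)
mask[0, 3] = 0
mask[2, 1] = 0
P, Q = factorize(R, mask)
print(np.round(P @ Q.T, 2))                    # hidden cells are filled in by the model
```

Since the problem is non-convex, the result depends on the random initialization; a different `seed` can lead to different factors that fit the observed entries equally well.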

Improving accuracy of latent factor models

The performance of the rating prediction can be improved by

–  Explicitly modeling global, user, and item biases
   •  Biases: the baseline/expected ratings for all items by all users, for the items rated by a user, and for the users for a given item.
–  Reducing model over-fitting via regularization:

$$\operatorname*{minimize}_{P,Q,\mu,b_*} \sum_{(u,i) \in \Omega} \left(r_{ui} - \mu - b_u - b_i - p_u q_i^T\right)^2 + \lambda\left(\mu + \|b_*\|_2^2 + \|P\|_F^2 + \|Q\|_F^2\right).$$

The estimated rating for user $u$ on item $i$ is given by

$$\hat{r}_{ui} = \mu + b_u + b_i + p_u q_i^T.$$
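The biased, regularized model can be trained with stochastic gradient descent, one observed rating at a time. The sketch below is an assumption-laden illustration: it fixes the global bias to the mean rating instead of optimizing it, and the names `predict` and `sgd_epoch`, the learning rate, and the regularization weight are all hypothetical.

```python
import numpy as np

def predict(mu, bu, bi, P, Q, u, i):
    """Estimated rating: global bias + user bias + item bias + p_u q_i^T."""
    return mu + bu[u] + bi[i] + P[u] @ Q[i]

def sgd_epoch(ratings, mu, bu, bi, P, Q, lr=0.01, lam=0.05):
    """One pass over the observed (user, item, rating) triples, updating in place."""
    for u, i, r in ratings:
        err = r - predict(mu, bu, bi, P, Q, u, i)
        bu[u] += lr * (err - lam * bu[u])       # user bias, shrunk towards 0
        bi[i] += lr * (err - lam * bi[i])       # item bias, shrunk towards 0
        p_old = P[u].copy()                     # keep old p_u for the q_i update
        P[u] += lr * (err * Q[i] - lam * P[u])
        Q[i] += lr * (err * p_old - lam * Q[i])

# Hypothetical observed ratings for 4 users and 3 items.
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 2, 2.0),
           (2, 1, 2.0), (2, 2, 5.0), (3, 0, 5.0), (3, 2, 1.0)]
mu = sum(r for _, _, r in ratings) / len(ratings)   # assumption: mu fixed to the mean
rng = np.random.default_rng(0)
bu, bi = np.zeros(4), np.zeros(3)
P = 0.1 * rng.standard_normal((4, 2))
Q = 0.1 * rng.standard_normal((3, 2))
for _ in range(200):
    sgd_epoch(ratings, mu, bu, bi, P, Q)
```

The bias terms absorb effects such as a generous user or a universally liked item, so the factors only need to explain the remaining user-item interaction.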

Recent trends

•  The boundaries between traditional neighbor and latent factor models have become less distinct.
•  Neighbor schemes rely on the latent space to compute item-item similarities.
•  Latent factorization methods rely on similar items to derive a latent representation of a user.

$$R \approx P Q^T, \qquad \mathrm{sim}(i, j) = q_i q_j^T.$$

$$\hat{r}_{ui} = \mu + b_u + b_i + \left( \frac{1}{|U|^{\alpha}} \sum_{j \in U} r_{ui}\, q_j \right) q_i^T.$$

A SMALL DETOUR

Chemical genetics (genomics)

•  Chemical genetics is a promising approach for studying biological systems.
•  It has a number of key advantages over approaches based on molecular genetics:
   –  small molecules can work rapidly,
   –  their action is reversible,
   –  they can modulate single functions of multi-function proteins,
   –  they can disrupt protein-protein interactions, and
   –  if the target is pharmaceutically relevant, it can lead to the discovery of new drugs.

Chemical Genetics (Genomics): The research field that is designed to discover and synthesize protein-binding small organic molecules that can alter the function of all the proteins and use them to study biological systems.
(National Institute of General Medical Sciences)

RS & CG – Data similarities

Target-ligand activity matrix

Target-intrinsic information
•  sequence, structure, homology, etc.
Ligand-intrinsic information
•  chemical structure, pharmacophores, etc.
Activity information
•  IC50, dose-response, etc.

User-item rating matrix

User-intrinsic information
•  profile, social network, etc.
Item-intrinsic information
•  descriptions, content, specification, attributes, etc.
Transaction information
•  rating, view, review, etc.

RS & CG – Similar problems

•  Activity prediction => Rating prediction
•  Secondary screening library design => Top-N recommendation
•  Library design for a novel target => Cold-start recommendation
•  De novo compound design => Assortment recommendation

RS & CG – Similar principles

•  Ligand binding is a process that involves:
   –  The structure of the protein's binding site and the structure of the ligand.
   –  Non-covalent interactions between the atoms of the protein's binding site and those of the ligand.
•  As a result:
   –  The same ligand will bind to similar targets.
   –  Similar ligands will bind to the same target.
   –  Similar ligands will bind to similar targets.
•  These are the same principles and assumptions behind recommender systems.

RS & CG – Similar methods

•  Target-specific Structure-Activity Relationship (SAR) models.
   –  The biological activity of a chemical compound is mathematically expressed as a function of its chemical structure.
•  Chemogenomic approaches.
   –  Proteins of the same family tend to bind to ligands with certain common characteristics.
•  Multi-task learning approaches.
   –  Models that estimate relations between protein- and ligand-derived features.

•  Personalized content-based recommender systems.
   –  The user's past transactions are used to derive content-based models for like/dislike prediction.
•  Cluster-based recommender systems.
   –  Content-based models for similar groups of users.
•  Collaborative filtering approaches with side information.

RECENT DEVELOPMENTS IN ITEM-BASED APPROACHES

Estimating the similarity matrix

Goal: Instead of using a pre-determined function to compute the item-item similarities, "learn" them directly from the data.

Learning problem formulation: Estimate a linear model to predict each item based on the historical information of the other items.

Let $w_i$ be the linear model for item $i$, let $R_{\neg i}$ be the $n \times (m-1)$ sub-matrix of $R$ obtained by removing column $i$ (i.e., the historical information for item $i$), and let $r_{:i}$ be the $i$th column of $R$. The $w_i$ is estimated as

$$\operatorname*{minimize}_{w_i} \ \frac{1}{2}\left\|r_{:i} - R_{\neg i}\, w_i\right\|^2 + \mathrm{reg}(w_i).$$

This model becomes the $i$th column of the item-item similarity matrix $S$.
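As one concrete instance of this per-item regression, the sketch below assumes the regularizer is a plain ridge penalty, reg(w) = (lam/2)·||w||², which admits a closed-form solution. The slide leaves the regularizer unspecified, so this choice, the function name `fit_item_model`, and the value of `lam` are assumptions.

```python
import numpy as np

def fit_item_model(R, i, lam=1.0):
    """Ridge estimate of w_i from all columns of R except column i."""
    R_rest = np.delete(R, i, axis=1)              # n x (m-1) matrix without item i
    target = R[:, i]                              # column to be predicted
    A = R_rest.T @ R_rest + lam * np.eye(R_rest.shape[1])
    w = np.linalg.solve(A, R_rest.T @ target)     # normal equations of the ridge problem
    return np.insert(w, i, 0.0)                   # pad with 0 at position i (zero diagonal)

# Hypothetical 5-user x 4-item implicit-feedback matrix.
R = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)
S = np.column_stack([fit_item_model(R, i) for i in range(R.shape[1])])
```

Stacking the padded vectors column by column yields an item-item matrix with a zero diagonal, so an item never predicts itself.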

SLIM – Sparse Linear Method

•  The model (S) is estimated as:

$$\operatorname*{minimize}_{S} \ \frac{1}{2}\|R - RS\|_F^2 + \frac{\beta}{2}\|S\|_F^2 + \lambda\|S\|_1 \quad \text{subject to } S \ge 0,\ \mathrm{diag}(S) = 0$$

•  Good performance is often achieved with 50-200 non-zeros per column.
   –  Low storage requirements and low recommendation time.
•  The model can be estimated efficiently:
   –  Each column of S can be computed independently.
   –  The solution of a column can "warm start" other similar columns.
   –  Regularization space can be explored efficiently.
   –  Item neighbors can be used for initial "feature" selection and to restrict the sparsity structure.

This is a Structural Equation Model (SEM) with no exogenous variables. It can also be viewed as a network inference model.
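Because each column of S is an independent non-negative elastic-net regression, a small coordinate-descent solver is enough to illustrate the estimation. The following is a sketch, not the solver used for the reported results; `slim_column`, `slim`, the fixed number of sweeps, and the values of `beta` and `lam` are assumptions.

```python
import numpy as np

def slim_column(R, i, beta=1.0, lam=0.1, sweeps=50):
    """Coordinate descent for column i: squared loss + L2 + L1, w >= 0, w[i] = 0."""
    n, m = R.shape
    w = np.zeros(m)
    col_sq = (R ** 2).sum(axis=0)              # squared norm of every item column
    residual = R[:, i].copy()                  # r_:i - R w, starting from w = 0
    for _ in range(sweeps):
        for j in range(m):
            if j == i:
                continue                       # keeps the diagonal of S at zero
            rho = R[:, j] @ residual + col_sq[j] * w[j]
            w_new = max(0.0, rho - lam) / (col_sq[j] + beta)   # soft-threshold, clip at 0
            residual -= R[:, j] * (w_new - w[j])
            w[j] = w_new
    return w

def slim(R, **kwargs):
    """Estimate S one column at a time; the columns are independent."""
    return np.column_stack([slim_column(R, i, **kwargs) for i in range(R.shape[1])])

# Hypothetical 5-user x 4-item implicit-feedback matrix.
R = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)
S = slim(R, beta=1.0, lam=0.1)
scores = R @ S                                 # top-N: rank unrated items by score
```

The loop over columns is embarrassingly parallel, and a column's solution can seed the solver for a similar column, which is the "warm start" idea mentioned above.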

SLIM – Performance

[Figure panels: "Hit Rate@10" and "Average Reciprocal Hit Rate".]

$$\text{Hit Rate (HR) @ } N = \frac{tp}{N}$$
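Both metrics are easy to compute in a leave-one-out evaluation. The sketch below assumes one held-out item per user, takes the hit rate as the number of hits divided by the number of test users, and averages the reciprocal rank of the held-out item (contributing 0 when it is missing from the list). The function name and the dictionary-based inputs are hypothetical.

```python
def hit_rate_and_arhr(ranked_lists, held_out):
    """ranked_lists: user -> top-N item list (best first); held_out: user -> hidden item."""
    hits, reciprocal_ranks = 0, 0.0
    for user, item in held_out.items():
        top_n = ranked_lists[user]
        if item in top_n:
            hits += 1
            reciprocal_ranks += 1.0 / (top_n.index(item) + 1)   # ranks start at 1
    n_users = len(held_out)
    return hits / n_users, reciprocal_ranks / n_users

# Hypothetical top-3 lists and held-out items for three users.
ranked = {"a": [3, 1, 2], "b": [5, 6, 7], "c": [9, 8, 4]}
hidden = {"a": 1, "b": 8, "c": 9}
hr, arhr = hit_rate_and_arhr(ranked, hidden)
```

Here users "a" and "c" are hits (at ranks 2 and 1) and user "b" is a miss, so the hit rate is 2/3 and the ARHR is (1/2 + 0 + 1)/3 = 0.5; ARHR rewards placing the held-out item near the top, which the hit rate ignores.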


SLIM – Long-tail performance

[Figure panels: "Rating Distribution in ML10M", "HR@10 in ML10M Long Tail", "ARHR in ML10M Long Tail".]

SLIM extensions

•  Low-rank constraints on S as an alternate way to control model complexity.
   –  Either via the product of two low-rank matrices or by minimizing the nuclear norm of S.
•  Incorporation of item side information.
•  Incorporation of different contexts.
•  Incorporation of temporal information.
•  Higher-order regression models.
•  Fusion of global and local SLIM models.

$$\operatorname*{minimize}_{S} \ \frac{1}{2}\|R - RS\|_F^2 + \frac{\beta}{2}\|S\|_F^2 + \lambda\|S\|_1 \quad \text{subject to } S \ge 0,\ \mathrm{diag}(S) = 0$$

Factored similarity matrix

•  SLIM (and traditional item-item approaches) cannot compute/learn relations between pairs of items that are not co-rated.
   –  The estimated similarities for such pairs of items will always be 0.
•  They cannot produce meaningful recommendations that rely on transitivity within the item-item similarity graph.
•  This can lead to poor recommendations, especially for sparser datasets, in which there are few pairs of co-rated items.

FISM – Factored SLIM

$$\hat{r}_{ui} = \sum_{j \in R_u} r_{uj}\, p_j q_i^T$$

$$\operatorname*{minimize}_{P,Q} \ \frac{1}{2}\|R - RPQ^T\|_F^2 + \frac{\beta}{2}\left(\|P\|_F^2 + \|Q\|_F^2\right)$$

Since the overall problem is not convex, optimization and search over the regularization space become considerably more expensive. Scalable approaches rely on SGD.

[Figure: percentage improvement over the best competing method among (SLIM, itemKNN, PureSVD, BPRMF, BPRkNN) on ML100K, Netflix, and Yahoo at three sparsity levels: Sparse (-1), Sparser (-2), Sparsest (-3); vertical axis from 0 to 60.]
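The factored prediction can be sketched as two thin matrix products. The example below is illustrative only: the factors are random placeholders rather than learned values, and `fism_scores` is a hypothetical name. It also shows why factoring helps on sparse data: two items that are never co-rated have a co-rating count of 0, yet their factored similarity is generally non-zero.

```python
import numpy as np

def fism_scores(R, P, Q):
    """Scores sum_{j in R_u} r_uj * p_j q_i^T for every user u and item i."""
    user_profiles = R @ P          # row u: sum over rated items j of r_uj * p_j
    return user_profiles @ Q.T     # avoids forming the m x m matrix P Q^T

rng = np.random.default_rng(0)
R = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)   # items 0 and 2 are never co-rated
P = rng.standard_normal((4, 2))    # placeholder item factors, not learned
Q = rng.standard_normal((4, 2))    # placeholder item factors, not learned

co_rated = R.T @ R                 # co_rated[0, 2] is 0: no user rated both items
factored_sim = P @ Q.T             # entry [0, 2] is generally non-zero
scores = fism_scores(R, P, Q)
```

Multiplying `R @ P` first keeps the cost proportional to the number of ratings times the rank, instead of materializing a dense item-item matrix.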

Learning from higher-order relations

•  If a customer buys a certain group of items, they are more (or less) likely to buy some other items.
•  The joint distribution of a set of items can be different from the distributions of the individual items in the set.
•  Higher-order models are needed to capture such relations.

HOSLIM – Higher-order SLIM

•  Key idea: Use frequent itemsets to capture higher-order relations.
•  Two-step approach:
   –  Given a user-item purchase matrix $R$, find all frequent itemsets and create a user-itemset matrix $R^f$.
   –  Learn two similarity matrices $S$ and $S^f$ that capture item-item and itemset-item relations.
•  Predict new items by combining information from $S$ and $S^f$.

Model estimation:

$$\operatorname*{minimize}_{S,\,S^f} \ \frac{1}{2}\|R - RS - R^f S^f\|_F^2 + \frac{\beta}{2}\left(\|S\|_F^2 + \|S^f\|_F^2\right) + \lambda\left(\|S\|_1 + \|S^f\|_1\right)$$

$$\text{subject to } S \ge 0,\ S^f \ge 0,\ \mathrm{diag}(S) = 0,\ \text{and } s^f_{ji} = 0 \text{ for } i \in I_j$$

Prediction:

$$\hat{r}_{ui} = r_{u:}\, s_{:i} + r^f_{u:}\, s^f_{:i}$$
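The two data structures HOSLIM adds on top of SLIM can be sketched as follows. The frequent itemsets are taken as given (mining them is out of scope here), a user is assumed to "contain" an itemset when all of its items were purchased, and the function names and toy matrices are hypothetical.

```python
import numpy as np

def user_itemset_matrix(R, itemsets):
    """R^f: entry (u, j) is 1 when user u purchased every item of itemset j."""
    Rf = np.zeros((R.shape[0], len(itemsets)))
    for j, items in enumerate(itemsets):
        Rf[:, j] = np.all(R[:, list(items)] > 0, axis=1)
    return Rf

def mask_itemset_weights(Sf, itemsets):
    """Enforce s^f_ji = 0 for every item i that belongs to itemset j."""
    Sf = Sf.copy()
    for j, items in enumerate(itemsets):
        Sf[j, list(items)] = 0.0
    return Sf

def hoslim_scores(R, Rf, S, Sf):
    """Combine the item-item and the itemset-item contributions."""
    return R @ S + Rf @ Sf

# Hypothetical purchases of 3 users over 3 items and one frequent itemset {0, 1}.
R = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)
itemsets = [(0, 1)]
Rf = user_itemset_matrix(R, itemsets)
Sf = mask_itemset_weights(np.ones((1, 3)), itemsets)   # placeholder weights
scores = hoslim_scores(R, Rf, np.zeros((3, 3)), Sf)
```

The mask keeps an itemset from recommending its own members, which plays the same role for $S^f$ as the zero diagonal does for $S$.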

HOSLIM performance

Considerable gains can be obtained for datasets that have such characteristics.

One model does not fit all

•  In item-based methods, personalization occurs based on the items that the user has previously acted on.
   –  The "recommendations" that these items trigger are not specific to each user but are global.
•  Better performance can potentially be achieved by having user-specific item-based models.

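GLSLIM – Fusion of global and local SLIM models

$$\hat{r}_{ui} = \sum_{j \in R_u} r_{uj} \left( g_u\, s_{ji} + (1 - g_u)\, s^{c_u}_{ji} \right)$$

[Equation annotations: global model, local model, user's membership.]

$$\operatorname*{minimize}_{S,\,S^1,\ldots,S^K,\,c,\,g} \ \frac{1}{2}\sum_{u,i} \left(r_{ui} - \hat{r}_{ui}\right)^2 + \frac{\beta}{2}\Big(\|S\|_F^2 + \sum_{k} \|S^k\|_F^2\Big) + \lambda\Big(\|S\|_1 + \sum_{k} \|S^k\|_1\Big)$$

$$\text{subject to } S \ge 0,\ S^k \ge 0 \text{ for } k = 1, \ldots, K,$$

$$\mathrm{diag}(S) = 0,\ \mathrm{diag}(S^k) = 0 \text{ for } k = 1, \ldots, K,$$

$$\forall u,\ 0 \le g_u \le 1, \text{ and } \forall u,\ c_u \in \{1, \ldots, K\}.$$

Optimized using alternating optimization between solving for the various SLIM models and solving for the clustering and user membership.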
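The GLSLIM prediction blends a global item-item model with the local model of the user's cluster. The sketch below only covers scoring, with random placeholder matrices instead of learned ones; the name `glslim_score`, the 0-based cluster ids, and the list-of-matrices representation of the local models are assumptions.

```python
import numpy as np

def glslim_score(R, S, S_local, g, c, u, i):
    """Score of item i for user u: per-user blend of global and local similarities."""
    rated = np.nonzero(R[u])[0]                       # items in the user's history
    blended = g[u] * S[rated, i] + (1.0 - g[u]) * S_local[c[u]][rated, i]
    return R[u, rated] @ blended

rng = np.random.default_rng(1)
R = np.array([[1, 0, 1],
              [0, 1, 1]], dtype=float)                # hypothetical purchases
S = rng.random((3, 3))                                # placeholder global model
S_local = [rng.random((3, 3)), rng.random((3, 3))]    # placeholder local models, K = 2
c = [1, 0]                                            # cluster of each user (0-based)
g = np.array([0.7, 0.2])                              # per-user weight of the global model
print(glslim_score(R, S, S_local, g, c, u=0, i=1))
```

With `g[u] = 1` the score reduces to plain SLIM on the global model, and with `g[u] = 0` it uses only the local model of the user's cluster.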

GLSLIM – Fusion of global and local SLIM models

[Figure: bar chart "Improvement of GL-SLIM over SLIM" in HR and ARHR on groceries (65), ml10m (20), flixter (5), and jester (60); vertical axis from 0% to 20%. Bar labels as extracted: 15%, 11%, 1%, 7%, 18%, 12%, 6%, 8%.]

GLSLIM – Fusion of global and local SLIM models

Smart initialization reduces the number of iterations but does not significantly impact the final solution.

WRAPPING UP

Current state of the art

•  Rating prediction
   –  Factorization-based approaches are highly effective:
      •  Good prediction performance, fast training time, they can incorporate side information, diverse objectives, etc.
•  Top-N recommendation methods
   –  No clear winning strategy has emerged.
      •  Item-based methods outperform significantly more sophisticated methods.
   –  There is significant ongoing research on ranking loss functions.

Scaling up

•  Estimate only item factors from a subset of users & compute user factors on the fly.
•  Warm-start the search over the regularization space:
   –  For MF use SVD to eliminate bad local minima.
   –  For SLIM use previous solutions and/or solutions for similar items.
•  Sample the users. The goal is to get a reliable set of item-based models:
   –  Item factors or item-item models.
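Computing user factors on the fly can be done by solving a small regularized least-squares problem against fixed item factors. This is one plausible reading of the first bullet, not necessarily the approach used in the talk; `fold_in_user`, the ridge weight `lam`, and the toy factors are assumptions.

```python
import numpy as np

def fold_in_user(r_u, observed, Q, lam=0.1):
    """Ridge solution for p_u given fixed item factors Q and the user's observed ratings."""
    Q_obs = Q[observed]                               # factors of the items the user rated
    A = Q_obs.T @ Q_obs + lam * np.eye(Q.shape[1])
    return np.linalg.solve(A, Q_obs.T @ r_u[observed])

# Hypothetical item factors (4 items, rank 2) and a user generated from p = (2, -1).
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
r_u = Q @ np.array([2.0, -1.0])
p_u = fold_in_user(r_u, observed=[0, 1, 3], Q=Q, lam=1e-8)
predicted = Q @ p_u                                   # includes the unobserved item 2
```

Only a rank-by-rank system is solved per user, so new or sampled-out users can be scored without retraining the item factors.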

Future directions

•  Deep personalization
   –  Context, location, time, etc.
•  Behavior steering
   –  Acquiring taste, modeling state, future benefits, etc.
•  Content creation
   –  Package recommendation, article generation, etc.
•  Live evaluations
   –  Open A/B testing platforms.

Final words…

•  Recommender systems have extensive applications.
   –  Commercial, scientific, and societal applications.
•  There are already high-quality software implementations of many of these algorithms.
   –  They can be used as is, or used to quickly experiment with new modeling approaches and data sources.
•  Multi-task learning underlies many of the learning methods.
   –  A very active area of research with broad applications.
•  The field is ripe for new methodological advances that will get us to the next level.

Acknowledgments

Students (current & former)
§  Xia Ning
§  Evangelia Christakopoulou
§  Santosh Kabbur
§  Asmaa ElBadrawy
§  Agoritsa Polyzou

Funding
§  NSF, ARO, NIH, PayPal, Samsung, Intel.

References:

Recommender Systems Handbook, 2015 http://www.springer.com/us/book/9781489976369

Papers at http://www.cs.umn.edu/~karypis
