Context-aware Image Tweet Modelling and Recommendation Tao Chen, Xiangnan He and Min-Yen Kan
School of Computing, National University of Singapore
Introduction
Research Questions
• How to represent the semantics of an image tweet?
• Are visual objects sufficient?
• How to utilize such representations for personalized image tweet recommendation task?
ACM MM 2016, Amsterdam, The Netherlands
Personalized Image Tweet Recommendation
CITING: Context-aware Image Tweet Modelling
Tao Chen: [email protected]
Experiments
Example tweet: Coming soon!!! imdb.to/IGxE9f
Four sources of contextual text:
1. Hashtag enhanced text (Word Breaker)
2. Text in image (OCR tool)
3. Extended context (URLs)
4. Similar contexts (Google Image Search)
Split      Twitter Users   Retweets   All Tweets   Ratings
Training   926             174,765    1,316,645    1,592,837
Test       (same users)    9,021      77,061       82,743
#  Method     Features                  P@1      P@3    P@5    MAP
1  Random     -                         0.114**  0.115  0.115  0.156**
2  Length     Post's text               0.176**  0.158  0.150  0.173**
3  Profiling  Post's text               0.336**  0.227  0.197  0.202**
4  FAMF       Visual objects            0.211**  0.205  0.192  0.211**
5  FAMF       Post's text               0.359*   0.325  0.287  0.275**
6  FAMF       Non-filtered context      0.413    0.352  0.319  0.296
7  FAMF       CITING                    0.419    0.355  0.319  0.298
8  FAMF       CITING + Visual objects   0.425    0.350  0.313  0.298
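Figure: user-item interaction matrix (users U1-U4, items I1-I6); items I5 and I6 have no observed interactions (cold start).
     I1  I2  I3  I4  I5  I6
U1   1   1   0   0   ?   ?
U2   1   1   0   0   ?   ?
U3   0   1   1   0   ?   ?
U4   1   1   0   ?   ?   ?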
Our Contributions
• We propose a CITING framework to model image tweet by its contextual text
• We develop a feature-aware matrix factorization model to capture user's personal interest
• We have released code and datasets: https://github.com/kite1988/famf
• Conflate the variants of hashtags
• #icebucket, #ALSIceBucketChallenge
• Overlaid text and screenshot of pure text
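To illustrate how splitting hashtags into words lets variants such as #icebucket and #ALSIceBucketChallenge share vocabulary, here is a minimal sketch. It is a dictionary-based stand-in, not the word-breaking tool named on the poster; the function name `segment_hashtag` and the toy vocabulary `VOCAB` are hypothetical.

```python
def segment_hashtag(tag, vocabulary):
    """Split a hashtag into known words using dynamic programming.

    Returns the segmentation with the fewest words, or the lowercased
    tag as a single token when no full segmentation exists.
    """
    text = tag.lstrip("#").lower()
    n = len(text)
    # best[i] holds the shortest word list covering text[:i], or None.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            word = text[start:end]
            if best[start] is not None and word in vocabulary:
                candidate = best[start] + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n] if best[n] is not None else [text]


# Hypothetical toy vocabulary; a real system would rely on a large lexicon.
VOCAB = {"als", "ice", "bucket", "challenge"}

short_tag = segment_hashtag("#icebucket", VOCAB)
long_tag = segment_hashtag("#ALSIceBucketChallenge", VOCAB)
# Words shared by both variants after segmentation.
shared = set(short_tag) & set(long_tag)
```

Once segmented, the two variants overlap on the words they have in common, so a bag-of-words representation treats them as related rather than as two unrelated tokens.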
82% contain the image
22.7%
24.6%
14.3%
icebucket ALSicebucketchallenge
70.6% Named entity
Best guess
Pages that contain the image
Propose filtered rules to fuse the four contextual text
• Text quality: 1 > 3 > 2 > 4
• Save acquisition cost by 18% and improve representation quality
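The poster does not spell out the fusion rules, so the sketch below only illustrates the general idea of using the quality ordering to avoid unnecessary acquisition. The fallback policy, the `min_words` threshold, and the names `fuse_contexts` and `make_fetcher` are all assumptions made for illustration, not the authors' rules.

```python
# Source ids in decreasing text quality, as stated on the poster (1 > 3 > 2 > 4).
QUALITY_ORDER = [1, 3, 2, 4]


def fuse_contexts(fetchers, min_words=3):
    """Return (text, sources_used), querying sources in quality order.

    `fetchers` maps a source id to a zero-argument callable returning text
    (possibly empty). A lower-quality source is queried only when the text
    gathered so far has fewer than `min_words` words (assumed threshold).
    """
    words, used = [], []
    for source in QUALITY_ORDER:
        if len(words) >= min_words:
            break  # enough text: skip the remaining sources and their cost
        text = fetchers[source]()
        if text:
            words.extend(text.split())
            used.append(source)
    return " ".join(words), used


calls = []  # records which sources were actually queried


def make_fetcher(source, text):
    """Build a hypothetical fetcher that logs its call and returns fixed text."""
    def fetch():
        calls.append(source)
        return text
    return fetch


fetchers = {
    1: make_fetcher(1, "als ice bucket challenge"),
    2: make_fetcher(2, "overlaid caption"),
    3: make_fetcher(3, "page title"),
    4: make_fetcher(4, "best guess"),
}
fused_text, used_sources = fuse_contexts(fetchers)
```

In this toy run the hashtag text alone is long enough, so the other three sources are never queried, which is the kind of acquisition saving the bullet above refers to.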
• The traditional Matrix Factorization model does not work well due to serious cold start problems
• To alleviate this, we propose feature-aware Matrix Factorization (FAMF) to model user's interest
• Decompose user-item interaction to user-feature interaction
• Pair-wise Learning to Rank
• Positive tweets (retweets) should be ranked better than negative tweets (non-retweets)
• Bayesian Personalized Ranking [Rendle et al. 2009]
• Infer the parameters via stochastic gradient descent (SGD)
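The pairwise training idea above can be sketched as follows. This is a simplified illustration, not the released FAMF code: it assumes a single feature type, represents each tweet as the average of its features' latent vectors, and uses hypothetical names (`item_vector`, `bpr_step`) and hyperparameters (`lr`, `reg`).

```python
import math
import random


def item_vector(features, feat_factors, dim):
    """Average the latent vectors of an item's features."""
    vec = [0.0] * dim
    for f in features:
        for k in range(dim):
            vec[k] += feat_factors[f][k] / len(features)
    return vec


def score(user_vec, item_vec):
    """Dot product between a user vector and an item vector."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


def bpr_step(user_vec, pos_feats, neg_feats, feat_factors, lr=0.05, reg=0.01):
    """One SGD step on -ln sigmoid(score(pos) - score(neg)) with an L2 penalty."""
    dim = len(user_vec)
    pos_vec = item_vector(pos_feats, feat_factors, dim)
    neg_vec = item_vector(neg_feats, feat_factors, dim)
    diff = score(user_vec, pos_vec) - score(user_vec, neg_vec)
    g = 1.0 / (1.0 + math.exp(diff))  # equals 1 - sigmoid(diff)
    old_user = list(user_vec)  # use pre-update values for feature gradients
    for k in range(dim):
        user_vec[k] += lr * (g * (pos_vec[k] - neg_vec[k]) - reg * old_user[k])
    for feats, sign in ((pos_feats, 1.0), (neg_feats, -1.0)):
        for f in feats:
            for k in range(dim):
                grad = sign * g * old_user[k] / len(feats)
                feat_factors[f][k] += lr * (grad - reg * feat_factors[f][k])


random.seed(0)
DIM = 4
# Hypothetical toy features and tweets.
feat_factors = {w: [random.uniform(-0.1, 0.1) for _ in range(DIM)]
                for w in ["ice", "bucket", "cat", "meme"]}
user = [random.uniform(-0.1, 0.1) for _ in range(DIM)]
retweeted, ignored = ["ice", "bucket"], ["cat", "meme"]


def margin():
    """Score gap between the retweeted and the ignored tweet for this user."""
    return (score(user, item_vector(retweeted, feat_factors, DIM))
            - score(user, item_vector(ignored, feat_factors, DIM)))


margin_before = margin()
for _ in range(300):
    bpr_step(user, retweeted, ignored, feat_factors)
margin_after = margin()
```

Because the score depends only on feature vectors, a brand-new tweet can still be scored as long as its features were seen during training, which is how a feature-based decomposition sidesteps item cold start.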
Model annotations: item's latent factor; user's latent factor; a feature's latent factor; N types of features (e.g., CITING text)
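The formula these labels annotate did not survive in the text. One plausible reading, offered purely as an assumption, is that standard matrix factorization scores a user-item pair by the inner product of their latent factors, and the feature-aware variant expresses the item through the latent factors of its features. All symbols and the averaging normalization below are hypothetical notation, not the poster's.

```latex
% Assumed notation: p_u user's latent factor, q_i item's latent factor,
% w^{(n)}_f latent factor of feature f of type n, F_n(i) features of type n in item i.
\hat{y}_{ui} = \mathbf{p}_u^{\top} \mathbf{q}_i,
\qquad
\mathbf{q}_i \approx \sum_{n=1}^{N} \frac{1}{|F_n(i)|} \sum_{f \in F_n(i)} \mathbf{w}^{(n)}_f
```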
Dataset from Twitter
Evaluation
• For each user, keep the recent 10 retweets as test set and the rest as training set
• Report mean average precision (MAP) and precision at top positions
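The protocol above can be sketched with standard definitions of precision at k and average precision. The function names and the toy ranking are hypothetical, and the split helper assumes each user's retweets are already in chronological order.

```python
def split_recent(retweets, n_test=10):
    """Split chronologically ordered retweets into (train, test)."""
    return retweets[:-n_test], retweets[-n_test:]


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def average_precision(ranked, relevant):
    """Mean of precision-at-rank over the ranks where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings):
    """Average AP over users; `rankings` is a list of (ranked, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in rankings) / len(rankings)


# Toy example: one user's ranked candidate tweets and held-out retweets.
ranked = ["t1", "t2", "t3", "t4", "t5"]
relevant = {"t1", "t4"}
p1 = precision_at_k(ranked, relevant, 1)
p3 = precision_at_k(ranked, relevant, 3)
ap = average_precision(ranked, relevant)
train, test = split_recent(list(range(12)))
```

Here the relevant tweets sit at ranks 1 and 4, so the average precision is the mean of 1/1 and 2/4.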
• 4-8 vs. 1-3: FAMF is effective in modelling user interest
• 4 vs. 5: Visual objects are not sufficient to model Twitter image's semantics
• 7 vs. other: Our CITING text significantly outperforms the others
• 7 vs. 6: The filtered fusion rules improve text quality
• 8 vs. 7: The further incorporation of visual objects does not consistently improve the performance (P@1 rises from 0.419 to 0.425, while P@3 and P@5 drop and MAP is unchanged)
**: p<0.01, *: p<0.05