Ranking Item Features by Mining Online User-Item Interactions
Sofiane Abbar (2), Habibur Rahman (1, 2), Saravanan Thirumuruganathan (1, 2), Carlos Castillo (2), and Gautam Das (1, 2)
(1) The University of Texas at Arlington, (2) Qatar Computing Research Institute
Motivation
[Figure: diagram with the nodes Business Owners, Advertised Products, User, and Social Media, and the edge labels Sell, Consume, and Feedback.]
- Business owners rely on users' feedback for the success of their businesses.
- It is important for them to understand which features make an item popular.
- Users give feedback on items in the form of reviews, tags, likes, +1's, etc.
- Can we leverage this information to rank the features of an item?
- Can we find the global ranking, or popularity, of the features?
Motivating Example
[Figure: bipartite graph linking the movies A Few Good Men (10000) and Sleepers (10) to the actors Tom Cruise, Kevin Bacon, and Brad Pitt.]
Bipartite Graph representing Items and Features.
Ranking of actors within an item (Goal 1).
Global ranking of the actors (Goal 2).
Challenges
Actor          VisitCount (Naive Transfer)   Rank (Naive)
Tom Cruise     10000                         2
Kevin Bacon    10000 + 10                    1
Brad Pitt      10                            3
Ranking of actors using Tag-Cloud Approach.
I like the movie “A Few Good Men” because of “Tom Cruise”.
Users do not write elaborate reviews about their preferences.
- Users do not give any direct cue about why an item is liked.
- Transferring visits/likes from items to features leads to an incorrect ranking.
- Although Kevin Bacon gets the highest ranking, that does not explain why Sleepers gets so few likes.
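The naive transfer can be reproduced in a few lines. This is only a sketch: the visit counts come from the motivating example, the cast lists are the ones implied by the table above, and the helper names `naive_transfer` and `rank` are hypothetical.

```python
# Visit counts per movie, taken from the motivating example.
visits = {"A Few Good Men": 10000, "Sleepers": 10}

# Cast lists restricted to the three actors of the example
# (assumption: read off the naive-transfer table).
cast = {
    "A Few Good Men": ["Tom Cruise", "Kevin Bacon"],
    "Sleepers": ["Kevin Bacon", "Brad Pitt"],
}


def naive_transfer(visits, cast):
    """Credit every actor with the full visit count of each movie they appear in."""
    totals = {}
    for movie, actors in cast.items():
        for actor in actors:
            totals[actor] = totals.get(actor, 0) + visits[movie]
    return totals


def rank(totals):
    """Map each actor to a rank, rank 1 being the largest total."""
    ordered = sorted(totals, key=totals.get, reverse=True)
    return {actor: position for position, actor in enumerate(ordered, start=1)}


totals = naive_transfer(visits, cast)
ranks = rank(totals)
for actor in sorted(ranks, key=ranks.get):
    print(ranks[actor], actor, totals[actor])
```

Kevin Bacon comes out on top only because he appears in both movies, which says nothing about why Sleepers itself attracts so few visits.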
Model
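[Figure: a user u reaches the actors TC, KB, and BP with probabilities h_TC, h_KB, and h_BP; each actor is linked to the movies A Few Good Men (AF, 10000) and Sleepers (SL, 10) through the weights W_AF,TC, W_AF,KB, W_AF,BP, W_SL,TC, W_SL,KB, and W_SL,BP.]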
- A typical user picks an actor $j$ according to an initial preference probability $h_j$.
- Once an actor is selected, the user picks an item (movie) $i$ according to the transition probability $W_{ij}$.
- Hence, we can model this problem as $Wh \approx v$.
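To make the relation between W, h, and v concrete, the sketch below evaluates the forward model on the three-actor example. All numbers in `W` and `h` are invented for illustration, and normalising each column of `W` to sum to 1 is a choice made here so that the result is a distribution over movies.

```python
# Forward model: P(visit movie i) = sum_j W[i][j] * h[j].
actors = ["TC", "KB", "BP"]
movies = ["AF", "SL"]

# h[j]: probability that a user first picks actor j (made-up values).
h = [0.5, 0.3, 0.2]

# W[i][j]: probability of moving to movie i once actor j is picked
# (made-up values; each column sums to 1 in this illustration).
W = [
    [1.0, 0.9, 0.0],  # A Few Good Men
    [0.0, 0.1, 1.0],  # Sleepers
]


def visit_distribution(W, h):
    """Return the vector W h as a plain list."""
    return [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) for row in W]


v = visit_distribution(W, h)
for movie, share in zip(movies, v):
    print(movie, round(share, 3))
```

In the ranking problem the direction is reversed: v is observed and W or h has to be recovered from it.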
Network Flow Modelling
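[Figure: flow network running from a Source through the user u and the features to the items A Few Good Men (AF) and Sleepers (SL) and on to a Sink. Three edges carry a flow of 0.33 each, six feature-to-item edges are unknown (?), and two edges are labelled 0.7 and 0.3.]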
- We can model this problem as a maximum flow in a network.
- We assume a total flow of 1.0 in the network. The user is modelled as a source from which a uniform flow is directed towards the features.
- The algorithm finds the feature-to-item transition matrix W by minimizing the error $|v - Wh|$.
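One way to read the flow formulation is sketched below; it is not necessarily the construction used by the authors. The user sends a uniform flow of 1/3 to each actor, each actor may only feed the movies it appears in, and each movie forwards at most its share of visits to the sink. The shares 0.7 and 0.3, the cast memberships, and the function name `max_flow` are assumptions of this sketch. Exact fractions are used to avoid rounding issues.

```python
from collections import deque
from fractions import Fraction


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a graph given as {u: {v: capacity}}."""
    residual = {}
    for u, edges in capacity.items():
        residual.setdefault(u, {})
        for v, cap in edges.items():
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        # Walk back from the sink and push the bottleneck amount.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck
    flow = {}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            if cap - residual[u][v] > 0:
                flow[(u, v)] = cap - residual[u][v]
    return total, flow


third = Fraction(1, 3)
capacity = {
    "user": {"TC": third, "KB": third, "BP": third},  # uniform flow to features
    "TC": {"AF": third},                              # Tom Cruise: A Few Good Men
    "KB": {"AF": third, "SL": third},                 # Kevin Bacon: both movies
    "BP": {"SL": third},                              # Brad Pitt: Sleepers
    "AF": {"sink": Fraction(7, 10)},                  # assumed visit share
    "SL": {"sink": Fraction(3, 10)},                  # assumed visit share
}

value, flow = max_flow(capacity, "user", "sink")
# Share of each actor's flow sent to each movie, a candidate for W.
W = {(movie, actor): amount / third
     for (actor, movie), amount in flow.items()
     if actor in ("TC", "KB", "BP")}
print(value, W)
```

The flow that cannot be routed (here 1 minus the flow value) is the part of v that no choice of W can explain under these capacities.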
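Solution (Aggregated User-Item Interaction)
[Figure: $v \approx Wh$ with v (Items x User) holding the counts 1000, 200, 400, 50, and 10, the entries of W (Items x Features) known (X), and the entries of h (Features x User) unknown (?). Caption: If h is unknown.]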
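[Figure: $v \approx Wh$ with v (Items x User) holding the counts 1000, 200, 400, 50, and 10, the entries of W (Items x Features) unknown (?), and the entries of h (Features x User) known (X). Caption: If W is unknown.]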
- If h is unknown: solve for h by minimizing $\|v - Wh\|_2$ using ordinary least squares, subject to $\sum_{i=1}^{|h|} h_i = 1$.
- If W is unknown: solve for W by minimizing $\|v - Wh\|_2$ using ordinary least squares, subject to $\sum_{j=1}^{|h|} W_{ij} = 1$ for every item $i$.
- Once we have found W and h, we can find the ranking of features for each item.
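The case where h is unknown can be sketched as an equality-constrained least-squares problem solved through its KKT system. The matrix `W`, the vector `v`, and the function name `solve_h` are invented for illustration, non-negativity of h is not enforced here, and ranking the features of an item by the contribution $W_{ij} h_j$ is an assumption, since the text does not spell out the ranking rule.

```python
import numpy as np


def solve_h(W, v):
    """Minimize ||v - W h||^2 subject to sum(h) = 1 via the KKT system."""
    n_features = W.shape[1]
    ones = np.ones((n_features, 1))
    # Stationarity: 2 W^T W h + lambda * 1 = 2 W^T v ; feasibility: 1^T h = 1.
    kkt = np.block([[2 * W.T @ W, ones],
                    [ones.T, np.zeros((1, 1))]])
    rhs = np.concatenate([2 * W.T @ v, [1.0]])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n_features]  # drop the Lagrange multiplier


# Made-up known W (4 items x 3 features) and normalized visit vector v.
W = np.array([[0.6, 0.4, 0.0],
              [0.0, 0.5, 0.5],
              [0.2, 0.3, 0.5],
              [1.0, 0.0, 0.0]])
v = np.array([0.42, 0.25, 0.29, 0.50])

h = solve_h(W, v)
global_ranking = np.argsort(-h)           # feature indices, most preferred first

# Assumed per-item rule: rank features of item 1 by their contribution W[1, j] * h[j].
item_ranking = np.argsort(-(W[1] * h))
print(h, global_ranking, item_ranking)
```

The linear solve needs the KKT matrix to be non-singular, which holds in this example because W has full column rank.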
Solution (Individual User-Item Interaction)
[Figure: $V \approx WH$ with V (Items x Users) observed and the entries of both W (Items x Features) and H (Features x Users) unknown (?).]
- Solve for W and H using marginal Non-Negative Matrix Factorization with stochasticity and sparsity constraints.
- H gives each individual user's preference vector over the features.
- From H, we can calculate h, the global preference vector over the features. This can then be used to find the ranking of features for each item.
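The factorization step can be illustrated with plain NMF using the standard multiplicative updates. This is a swapped-in, simpler technique: it does not implement the marginal NMF or the stochasticity and sparsity constraints named above. The synthetic matrix `V`, the function name `nmf`, and the choice to obtain h by averaging the normalized columns of H are all assumptions of this sketch.

```python
import numpy as np


def nmf(V, n_features, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF with multiplicative updates: V (items x users) ~ W @ H."""
    rng = np.random.default_rng(seed)
    n_items, n_users = V.shape
    W = rng.random((n_items, n_features)) + eps
    H = rng.random((n_features, n_users)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic non-negative interaction matrix: 6 items, 8 users, rank 3.
rng = np.random.default_rng(1)
V = rng.random((6, 3)) @ rng.random((3, 8))

W, H = nmf(V, n_features=3)

# Normalize each user's column so it reads as a preference distribution.
H_users = H / H.sum(axis=0, keepdims=True)

# Assumed aggregation: global preference h is the average user preference.
h = H_users.mean(axis=1)
global_ranking = np.argsort(-h)
print(np.linalg.norm(V - W @ H), h, global_ranking)
```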
Experiments
- Dataset: we used 1500 movies (items) and 3500 distinct actors (features) from MovieLens, where each movie has at least 50 ratings.
[Figure: two panels plotting Precision@1 against pco (prolificity cut-off) for FR-AGG-W-LS, FR-AGG-h-LS, FR-INDIV-MNMF, and BLnb.]
[Figure: two panels plotting running time in seconds (linear and log scale) against n and against l for FR-AGG-W-LS, FR-AGG-h-LS, FR-INDIV-MNMF, and FR-AGG-h-NF.]
- The Aggregated User-Item Interaction and Individual User-Item Interaction methods produced better rankings than the baseline methods.
- The network-flow method outperforms the other methods in terms of efficiency.
References
[1] P. O. Hoyer. Non-negative matrix factorization with sparseness constraints. JMLR, 5:1457-1469, Dec. 2004.
http://dbxlab.uta.edu/ [email protected]