TRANSCRIPT
SOFIANE ABBAR, HABIBUR RAHMAN, SARAVANAN THIRUMURUGANATHAN, CARLOS CASTILLO, GAUTAM DAS
QATAR COMPUTING RESEARCH INSTITUTE
UNIVERSITY OF TEXAS AT ARLINGTON
Ranking Item Features by Mining Online User-Item Interactions
Outline
Introduction
Motivation and Challenges
Model and Extensions
Experimental Evaluation
Related Works
Conclusion
Business owners rely on users' feedback for the success of their businesses.
It is important for them to understand which features make an item popular.
Users give feedback on items in the form of reviews, tags, likes, +1's, etc.
Can we leverage this information to find the ranking of features in an item?
Can we find the global ranking or popularity of the features?
Introduction
The main focus of this paper is the investigation of a novel problem: how to rank the features of each item from user-item interactions.
The principal problem investigated in this paper is the FEATURE RANKING (FR) PROBLEM: given a set of features and rudimentary user-item interactions (at either the aggregate or the individual level), identify the most important features per item (alternatively, a ranked list of features per item).
In this paper, the authors propose a probabilistic model that describes user-item interactions in terms of a user preference distribution over features and a feature-item transition matrix that determines the probability that an item will be chosen, given a feature.
The paper uses a database of items, where each item is described by a set of attributes, some of which are multi-valued. Each of the distinct attribute values of an item is referred to as a feature (equivalently, an item can be described as a set of features).
Sparsity assumption. This paper assumes that, among all the features available, each user expresses a preference over only a relatively small fraction of them.
Motivation
For example, on Netflix, a simple user-item interaction would involve whether the user watched a movie. While some users could have watched a movie because it starred Tom Hanks, others could have watched it because, in addition, it was also directed by Steven Spielberg. Similarly, while some users might buy a car due to its manufacturer, others might buy it for the model and transmission type.
Example
Challenges
Models
A ranking is a relationship between a set of items such that, for any two items, the first is either "ranked higher than", "ranked lower than", or "ranked equal to" the second. Here, the ranking reflects the popularity of items' features and is used to suggest popular item features.
Feature Ranking with Aggregate Interaction Information
This model assumes that user u first picks a single feature j based on their individual preference vector h_u, and then selects an item i containing j with probability proportional to W_ij.
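This generative process can be sketched as a small simulation (a minimal sketch with hypothetical toy matrices, not the paper's code): each interaction first draws a feature j from the user's preference vector h_u, then an item i from column j of the transition matrix W, so the expected visit distribution over items is simply W @ h_u.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 3 items, 2 features.
# W[i, j] = probability that item i is chosen given feature j
# (each column of W sums to 1).
W = np.array([[0.7, 0.1],
              [0.2, 0.3],
              [0.1, 0.6]])

# h_u: user u's preference distribution over the 2 features.
h_u = np.array([0.25, 0.75])

def interact(W, h_u, rng):
    """One user-item interaction under the model: pick a feature j
    from h_u, then pick an item i with probability W[i, j]."""
    j = rng.choice(len(h_u), p=h_u)
    i = rng.choice(W.shape[0], p=W[:, j])
    return i

# Empirical visit frequencies converge to the model's prediction W @ h_u.
visits = np.bincount([interact(W, h_u, rng) for _ in range(20000)],
                     minlength=3) / 20000
print(visits)   # empirical, close to W @ h_u
print(W @ h_u)  # [0.25, 0.275, 0.475]
```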
FR-AGG-W
Algorithm:
Input: Database D and aggregate visit vector v
1: W = Estimate feature-item transition matrix
2: constraints = { ∀i ∈ [1, n]: h_i ≥ 0, ||h||_1 = 1 }
3: h = argmin_h Error(v, Wh) subject to constraints
4: Compute X_i = W_i· ◦ h ∀i ∈ [1, n]
5: return X = {X_1, X_2, . . . , X_n}
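Line 3 of FR-AGG-W is a constrained least-squares fit over the probability simplex. A minimal sketch using SciPy's SLSQP solver (the toy W and v are hypothetical, and the paper does not prescribe this particular solver):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs: W (items x features) estimated elsewhere,
# v = observed aggregate visit fractions per item.
W = np.array([[0.7, 0.1],
              [0.2, 0.3],
              [0.1, 0.6]])
v = np.array([0.25, 0.275, 0.475])   # here generated as W @ [0.25, 0.75]

# Line 3 of the algorithm: h = argmin ||v - W h||^2
# subject to h >= 0 and ||h||_1 = 1 (a probability simplex).
res = minimize(
    lambda h: np.sum((v - W @ h) ** 2),
    x0=np.full(2, 0.5),
    method="SLSQP",
    bounds=[(0.0, 1.0)] * 2,
    constraints=[{"type": "eq", "fun": lambda h: h.sum() - 1.0}],
)
h = res.x          # recovers ~[0.25, 0.75]

# Lines 4-5: per-item feature scores X_i = W_i. ∘ h,
# computed for all items at once via row-wise broadcasting.
X = W * h
print(h, X)
```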
FR-AGG-h
Algorithm:
Input: Database D and aggregate visit vector v
1: W̄ = Estimate feature-item presence matrix
2: h = Estimate aggregate preference vector
3: constraints = { W ≤ W̄, ∀j: ||W_·j||_1 = 1, ∀i, j: W_ij ≥ 0 }
4: W = argmin_W Error(v, Wh) subject to constraints
5: Compute X_i = W_i· ◦ h ∀i ∈ [1, n]
6: return X = {X_1, X_2, . . . , X_n}
Variant Problem 1 (FR-AGG): Given a database D and an aggregate interaction vector v, estimate the item-feature visit vector X_i (where X_i = W_i· ◦ h) for each item i such that Error(v, Wh) is minimized.
Variant Problem 2 (FR-INDIV): Given a database D and an individual interaction matrix V, estimate the item-feature visit vector X_i for each item i (where X_i = W_i· ◦ h, and h is the average of the columns of H) such that Error(V, WH) is minimized.
Network Flow
Here, they consider a graph-based representation of the problem that maps it to a network flow formulation. This algorithm finds the feature-to-item transition matrix W by minimizing the |v − Wh| error.
Extensions
Feature Ranking with Composite Features
Baselines
Algorithms:
- FR-AGG-W-LS
- FR-AGG-h-LS
- FR-AGG-h-NF
Evaluation Metrics (ranking quality):
- precision@1
- nDCG@k
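The two metrics can be sketched as follows (hypothetical feature names and gains; this is a standard formulation of precision@1 and nDCG@k, not the paper's exact evaluation code):

```python
import math

def precision_at_1(predicted, relevant):
    """1 if the top predicted feature is the true most-prominent one."""
    return 1.0 if predicted[0] == relevant[0] else 0.0

def ndcg_at_k(predicted, gains, k):
    """nDCG@k: DCG of the predicted order divided by the ideal DCG.
    `gains` maps each feature to its true relevance/gain."""
    dcg = sum(gains.get(f, 0.0) / math.log2(r + 2)
              for r, f in enumerate(predicted[:k]))
    ideal = sum(g / math.log2(r + 2)
                for r, g in enumerate(sorted(gains.values(), reverse=True)[:k]))
    return dcg / ideal if ideal > 0 else 0.0

# Hypothetical example: true gains and a predicted feature ranking
# that swaps the second- and third-ranked features.
gains = {"actor": 3.0, "director": 2.0, "genre": 1.0}
predicted = ["actor", "genre", "director"]
print(precision_at_1(predicted, ["actor"]))   # 1.0
print(ndcg_at_k(predicted, gains, k=3))       # < 1.0 because of the swap
```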
Proposed Method (FR-INDIV-MNMF)
They choose the Kullback-Leibler divergence D(V||WH) to measure the reconstruction error between V and WH. This choice (instead of other measures such as the L2 distance) allows them to design an algorithm that preserves the column-stochasticity constraints in the solution. In what follows, they propose a four-step algorithm to solve the problem of ranking item features in the presence of an individual interaction matrix.
Step 1: Imposing sparsity constraints over H. They impose a (row) sparsity constraint over the factor W by assuming a sparse binary matrix W̄ such that W ≤ W̄. An entry (W̄)_ij = 0 iff item i does not contain feature j.
A seemingly similar approach can be used to impose (column) sparsity constraints over the factor H by defining a sparse binary matrix H̄ such that H ≤ H̄, where an entry (H̄)_jk = 0 if user k has not visited any item that contains feature j.
However, this straightforward approach may not generate adequate sparsity constraints, since the union of distinct features of the items that a user has visited may be quite large.
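The straightforward mask construction described above can be sketched with toy data (hypothetical matrices; (H̄)_jk = 1 iff user k visited at least one item containing feature j — the paper's set-cover refinement is not shown here):

```python
import numpy as np

# Hypothetical toy data: 3 items x 2 features presence matrix W̄,
# and a 3 items x 2 users interaction matrix V (visit counts).
W_bar = np.array([[1, 0],
                  [1, 1],
                  [0, 1]])
V = np.array([[2, 0],
              [0, 0],
              [0, 5]])   # user 0 visited item 0; user 1 visited item 2

# Straightforward (column) sparsity mask over H:
# (H̄)[j, k] = 1 iff user k visited at least one item with feature j.
H_bar = (W_bar.T @ (V > 0) > 0).astype(int)
print(H_bar)
# user 0 visited item 0, which has only feature 0 -> column [1, 0]
# user 1 visited item 2, which has only feature 1 -> column [0, 1]
```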
Step 2: Iterative algorithm with multiplicative update rules.
In the second step, they propose modifications to the algorithm to discover factors W and H such that the reconstruction error D(V||WH) is minimized.
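Minimizing D(V||WH) with multiplicative updates can be sketched with the classic Lee-Seung rules for KL-divergence NMF (a sketch of the general technique on hypothetical data; the paper's M-NMF modifications are not reproduced). Note that multiplicative updates preserve zeros, so the sparsity masks from Step 1 survive the iterations:

```python
import numpy as np

rng = np.random.default_rng(1)
eps = 1e-12   # guards against division by zero and log(0)

def kl_div(V, WH):
    """Generalized KL divergence D(V || WH)."""
    return np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH)

def mnmf_step(V, W, H):
    """One round of the classic multiplicative updates for
    KL-divergence NMF. Zero entries in W or H stay zero, which is
    why such updates can respect Step 1's sparsity masks."""
    H = H * (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
    W = W * ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Hypothetical toy problem: 4 items x 5 users, factor rank 2.
V = rng.random((4, 5)) + 0.1
W, H = rng.random((4, 2)), rng.random((2, 5))

before = kl_div(V, W @ H)
for _ in range(200):
    W, H = mnmf_step(V, W, H)
after = kl_div(V, W @ H)
print(before, after)   # the reconstruction error decreases
```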
Step 3: Imposing stochastic constraints on W and H. The matrices W and H produced by Step 2 satisfy the sparsity requirements; however, they may not satisfy the column-stochastic constraints, which require that the weights of each column of W and of H sum to 1. In this step, they describe a procedure for further modifying W and H so that the stochastic constraints are satisfied, making use of a theorem by Ho and Van Dooren.
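The basic rescaling idea behind this step can be sketched as follows: push the column sums of W into H (leaving the product WH unchanged), then normalize the columns of H. This is a simplified illustration on hypothetical factors, not the exact procedure from the Ho and Van Dooren theorem; note that per-column (per-user) rankings are preserved:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical nonnegative factors from Step 2 (4 items x 2 features,
# 2 features x 3 users); neither is column-stochastic yet.
W = rng.random((4, 2))
H = rng.random((2, 3))

d = W.sum(axis=0)            # column sums of W
W_s = W / d                  # column-stochastic W
H_s = d[:, None] * H         # compensate: W_s @ H_s == W @ H exactly here
H_s = H_s / H_s.sum(axis=0)  # now make H column-stochastic too

# A product of column-stochastic matrices is column-stochastic, and each
# column of P stays proportional to the matching column of W @ H.
P = W_s @ H_s
print(P.sum(axis=0))         # every column sums to 1
```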
Step 4: Computing item-feature visit vectors X_i. Once the feature-item transition matrix W and the individual preference matrix H are obtained, the feature ranking of any item can be computed as follows. First, compute the aggregate preference vector h by averaging all the column vectors H_·j of H; then perform a component-wise multiplication between the item's feature transition vector W_i· and h, i.e., X_i = W_i· ◦ h.
FR-INDIV-MNMF
Algorithm:
Input: Database D and individual interaction matrix V
1: W̄ = Estimate feature-item presence matrix
2: H_0 = Initialize a column-wise sparse individual preference matrix using setCover (Step 1)
3: Compute W_1, H_1 = M-NMF(W̄, H_0) (Step 2)
4: W, H = Impose stochastic constraints (Step 3)
5: Compute h = average(H)
6: Compute X_i = W_i· ◦ h ∀i ∈ [1, n] (Step 4)
7: return X = {X_1, X_2, . . . , X_n}
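The final two lines of the algorithm (Step 4) reduce to a column average followed by a row-wise component-wise product, as in this minimal sketch with hypothetical column-stochastic factors:

```python
import numpy as np

# Hypothetical outputs of Steps 2-3: column-stochastic W (3 items x
# 2 features) and H (2 features x 3 users).
W = np.array([[0.6, 0.2],
              [0.3, 0.3],
              [0.1, 0.5]])
H = np.array([[0.8, 0.4, 0.3],
              [0.2, 0.6, 0.7]])

# Step 4: aggregate preference vector h = column average of H,
# then X_i = W_i. ∘ h for every item i (one broadcasted product).
h = H.mean(axis=1)           # [0.5, 0.5]
X = W * h                    # row i is X_i

# Feature ranking for item 0: sort X_0 in decreasing order.
ranking = np.argsort(-X[0])
print(h)       # [0.5 0.5]
print(X[0])    # [0.3 0.1] -> feature 0 ranks first for item 0
```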
Experiment
They conduct a comprehensive set of experiments to evaluate the effectiveness and efficiency of various methods for ranking item features. Ranking quality is measured in two scenarios: prediction of the most prominent feature (precision@1) and overall ranking of item features (nDCG@k).
Dataset: MovieLens joined with cast data from IMDB
Result
Related Work
Nonnegative Matrix Factorization (NMF)
Attribute Ranking
Feature Ranking
Conclusion
In this paper, they consider the feature ranking problem, which ranks the features of an item by considering only user-item interaction information such as visits. They defined two problem variants based on the granularity of the interaction information available, and proposed different algorithms (based on constrained convex optimization, network flow approximation, and marginal NMF) to solve these variants. In the future, they wish to investigate a variant where users can choose an item through a weighted combination of features.
Thank You