cstalks-quaternary semantics recommendation system-24 aug
A Unified Framework for
Recommendations Based on
Quaternary Semantic Analysis
Wei Chen*, Wynne Hsu*, Mong Li Lee*
*School of Computing, National University of Singapore
Introduction
The amount of information on the web is increasing
at a lightning pace, e.g., products on Amazon, videos
on YouTube, and movies on Netflix.
Recommendation is necessary.
Introduction
Recommendation systems are typically classified
into four types:
User recommendation
Item recommendation
Tag recommendation
Item rating prediction
Related Work
Most of the work on recommendation systems utilizes only ternary relationships in generating recommendations.
Collaborative filtering-based recommendation systems use <user, rating, item>
[B. Sarwar, WWW’01, SIGIR’09].
Tag-based recommendation systems utilize <user, tag, item>.
Motivation
We argue that recommendations based on ternary
relationships can be inaccurate because they miss
important associations.
Motivation Example
Beautiful Mind and Groundhog Day will be recommended to U3.
Motivation Example
Groundhog Day and Toy Story will be recommended to
U3.
Motivation Example
Groundhog Day is recommended to U3.
Motivation
A quaternary relationship is necessary. This is reinforced by the following observations:
Users may use the same tag for an item but have different ratings for it.
Items may have multiple tags indicating their different facets.
Some tags may carry implicit semantics that can reveal the users’ preferences.
Overview of the paper
We propose a model that uses a tensor to represent the
quaternary relationship.
Higher-Order Singular Value Decomposition
(HOSVD) is applied to the 4-order tensor to reveal
the latent semantic associations among users,
items, tags and ratings.
BACKGROUND – Tensor
A tensor is a multidimensional array. An N-order
tensor is denoted as $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$.
BACKGROUND – Tensor unfolding
The matrix unfolding $A_{(i)}$ of an N-order tensor
$\mathcal{A} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ along dimension i is the matrix whose rows are the vectors
obtained by keeping index i fixed while varying
the other indices.
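A minimal sketch of the unfolding described above, assuming NumPy and a small made-up 2 x 3 x 4 tensor. The helper name `unfold` is hypothetical, and the column ordering follows NumPy's row-major layout, which may differ from the ordering used in the cited HOSVD literature (the rows are the same either way).

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: one row per value of the chosen index,
    columns enumerate every combination of the remaining indices."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# Hypothetical 3-order tensor used only for illustration.
A = np.arange(24).reshape(2, 3, 4)

A1 = unfold(A, 0)  # shape (2, 12)
A2 = unfold(A, 1)  # shape (3, 8)
A3 = unfold(A, 2)  # shape (4, 6)
```

Each row of `A2`, for example, collects all entries of `A` whose second index is held fixed.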
BACKGROUND – n-mode product
BACKGROUND – HOSVD
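HOSVD is a generalization of Singular Value
Decomposition (SVD) to higher-order tensors and
can be written as an n-mode product:
$\mathcal{A} = \mathcal{S} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)}$
where $U^{(n)}$ contains the orthonormal vectors (n-
mode singular vectors) spanning the column space
of the unfolding $A_{(n)}$, and $\mathcal{S}$ is the core tensor.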
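The decomposition can be sketched in a few lines, assuming NumPy and a small random tensor as stand-in data. The helper names `unfold`, `mode_product` and `hosvd` are hypothetical; this is an illustration of the standard HOSVD construction, not the authors' implementation.

```python
import numpy as np

def unfold(tensor, mode):
    # Rows are indexed by `mode`, columns by all remaining indices.
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """n-mode product: apply `matrix` (J x I_n) along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)   # bring I_n to the last axis
    return np.moveaxis(moved @ matrix.T, -1, mode)

def hosvd(tensor):
    # U(n): left singular vectors of each mode-n unfolding.
    factors = [np.linalg.svd(unfold(tensor, n), full_matrices=False)[0]
               for n in range(tensor.ndim)]
    # Core tensor: project the tensor onto every U(n).
    core = tensor
    for n, U in enumerate(factors):
        core = mode_product(core, U.T, n)
    return core, factors

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 2))  # hypothetical data
core, factors = hosvd(A)

# Multiplying the core back by each U(n) recovers the tensor.
rebuilt = core
for n, U in enumerate(factors):
    rebuilt = mode_product(rebuilt, U, n)
```

Without truncation the reconstruction is exact up to floating-point error, which is a convenient sanity check before any rank reduction.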
BACKGROUND – HOSVD
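With this, the core tensor $\mathcal{S}$ can be
constructed as described in [L. D., SIAM 2000], that is,
$\mathcal{S} = \mathcal{A} \times_1 U^{(1)T} \times_2 U^{(2)T} \cdots \times_N U^{(N)T}$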
and we can get:
BACKGROUND – Rank, Low-Rank
Approximation
BACKGROUND
Suppose we want the rank-(2,3,3)
approximation. We first retain the first c_i columns of
matrix U(i) at mode i, as follows:
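BACKGROUND – Tensor
Approximation
We can now construct the approximate core tensor
using the truncated matrices $U_{c_i}^{(i)}$:
$\hat{\mathcal{S}} = \mathcal{A} \times_1 U_{c_1}^{(1)T} \times_2 U_{c_2}^{(2)T} \times_3 U_{c_3}^{(3)T}$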
BACKGROUND
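Finally, we obtain the rank-(2,3,3) approximation from the approximate core tensor $\hat{\mathcal{S}}$ and the truncated matrices $U_{c_i}^{(i)}$:
$\hat{\mathcal{A}} = \hat{\mathcal{S}} \times_1 U_{c_1}^{(1)} \times_2 U_{c_2}^{(2)} \times_3 U_{c_3}^{(3)}$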
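The three steps (truncate each U(i), build the approximate core, multiply back) can be sketched as follows. This assumes NumPy and a random 3 x 4 x 4 tensor as placeholder data, since the slides' numeric example is not reproduced here; the function name `low_rank_approximation` is hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    # Apply `matrix` (J x I_n) along axis `mode` of `tensor`.
    return np.moveaxis(np.moveaxis(tensor, mode, -1) @ matrix.T, -1, mode)

def low_rank_approximation(tensor, ranks):
    # Step 1: keep the first c_i left singular vectors at each mode.
    factors = [np.linalg.svd(unfold(tensor, n), full_matrices=False)[0][:, :c]
               for n, c in enumerate(ranks)]
    # Step 2: approximate core tensor from the truncated matrices.
    core = tensor
    for n, U in enumerate(factors):
        core = mode_product(core, U.T, n)
    # Step 3: map the small core back to the original dimensions.
    approx = core
    for n, U in enumerate(factors):
        approx = mode_product(approx, U, n)
    return approx, core

rng = np.random.default_rng(0)
A = rng.random((3, 4, 4))  # hypothetical 3-order tensor
A_hat, S_hat = low_rank_approximation(A, (2, 3, 3))
```

`A_hat` has the same shape as `A`, while `S_hat` has shape (2, 3, 3). The approximation is generally dense even when the input is sparse, which is what allows new associations to surface.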
QUATERNARY SEMANTIC
ANALYSIS
The main idea is to capture the underlying
relationships among users, tags, items and ratings by
reducing the rank of the original tensor, which minimizes
the effect of noise on the underlying population
and reduces sparseness.
QUATERNARY SEMANTIC
ANALYSIS - Initialization
Input: a list of quadruples <user, tag, rating, item>.
Construct the tensor $\mathcal{A} \in \mathbb{R}^{|U| \times |T| \times |R| \times |V|}$,
where |U|, |T|, |R| and |V| are the numbers of users, tags, ratings
and items, respectively.
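A minimal sketch of the initialization step, assuming NumPy, a handful of made-up quadruples, and the assumption that an observed quadruple is marked with the value 1 (the slides do not state the entry values). The helper `index_of` and all identifiers in the sample data are hypothetical.

```python
import numpy as np

# Hypothetical quadruples in the order <user, tag, rating, item>.
quadruples = [
    ("u1", "comedy", 4, "m1"),
    ("u2", "comedy", 5, "m2"),
    ("u1", "drama", 5, "m3"),
]

def index_of(values):
    """Map each distinct value to a position along one tensor axis."""
    return {v: i for i, v in enumerate(sorted(set(values)))}

users = index_of(q[0] for q in quadruples)
tags = index_of(q[1] for q in quadruples)
ratings = index_of(q[2] for q in quadruples)
items = index_of(q[3] for q in quadruples)

# 4-order tensor of size |U| x |T| x |R| x |V|.
A = np.zeros((len(users), len(tags), len(ratings), len(items)))
for u, t, r, v in quadruples:
    A[users[u], tags[t], ratings[r], items[v]] = 1.0  # assumed weight
```

Treating the rating as its own axis, rather than as the cell value, is what makes the relationship quaternary instead of ternary.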
QUATERNARY SEMANTIC
ANALYSIS
Calculate the matrix unfoldings A(1), A(2), A(3) and
A(4) from the tensor.
Perform SVD on each matrix unfolding to get the
left singular matrices U(1), U(2), U(3) and U(4).
QUATERNARY SEMANTIC
ANALYSIS
Remove the least significant |U|-c1, |V|-c2, |T|-c3
and |R|-c4 columns from U(1), U(2), U(3) and U(4), respectively. We
choose c1 = 4, c2 = 4, c3 = 4, c4 = 2.
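QUATERNARY SEMANTIC
ANALYSIS
Calculate the approximate core tensor, where $U_{c_i}^{(i)}$ holds the first $c_i$ columns of U(i):
$\hat{\mathcal{S}} = \mathcal{A} \times_1 U_{c_1}^{(1)T} \times_2 U_{c_2}^{(2)T} \times_3 U_{c_3}^{(3)T} \times_4 U_{c_4}^{(4)T}$
Approximate the original tensor by:
$\hat{\mathcal{A}} = \hat{\mathcal{S}} \times_1 U_{c_1}^{(1)} \times_2 U_{c_2}^{(2)} \times_3 U_{c_3}^{(3)} \times_4 U_{c_4}^{(4)}$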
QUATERNARY SEMANTIC
ANALYSIS
Latent associations such as the newly added
quadruples in Table 6 may not be found if the
tensor data is sparse.
We overcome this problem by applying a
smoothing technique to the tensor in Algorithm.
RECOMMENDATION
GENERATION
Experimental result – dataset
description
Dataset: MovieLens data
The first file contains users’ tags on different movies. The second file contains users’ ratings of different movies on a scale of 1 to 5.
By joining these two files over user and movie, we obtain the quadruples <user, movie, tag, rating>.
After preprocessing, the dataset has 11122 tuples with 201 users, 501 movies, and 404 tags.
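The join over user and movie can be sketched in plain Python. The rows below are made up and only stand in for the two files; the variable names and the in-memory tuple layout are assumptions, not the actual MovieLens file format.

```python
# Hypothetical rows standing in for the tag file: (user, movie, tag).
tag_rows = [
    (1, 10, "funny"),
    (1, 10, "classic"),
    (2, 20, "sad"),
]
# Hypothetical rows standing in for the rating file: (user, movie, rating).
rating_rows = [
    (1, 10, 4),
    (2, 30, 5),
]

# Index ratings by (user, movie) so each tag row can look up its rating.
rating_by_pair = {(u, m): r for u, m, r in rating_rows}

# Inner join: keep a tag row only if the same user also rated that movie.
quadruples = [
    (u, m, t, rating_by_pair[(u, m)])
    for u, m, t in tag_rows
    if (u, m) in rating_by_pair
]
```

Here user 2 tagged movie 20 but rated movie 30, so neither row yields a quadruple; only pairs present in both files survive the join.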
Experimental result – Item
Recommendation
Compared methods:
UPCC: User-based recommendation
IPCC: Item-based recommendation
Probabilistic Matrix Factorization (PMF)
Experimental result – Rating
Prediction
Experimental result – Tag
Recommendation
Compared methods:
TSA [TKDE’10]: Ternary Semantic Analysis
RTF [KDD’09]: Optimal ranking using tensor
factorization
Experimental result – User
recommendation
Conclusion
We have shown that quaternary semantic analysis
can lead to more accurate recommendations.
We have proposed using a 4-order tensor to model
the four heterogeneous entities: users, items, tags
and ratings.
A unified framework is proposed that utilizes the
quaternary relation for user recommendation, item
recommendation, tag recommendation and rating
prediction.
Thank you very much!
Q/A