collaborative filtering with ccam

COLLABORATIVE FILTERING WITH CCAM. Presenter: Meng-Lun Wu. Authors: Meng-Lun Wu, Chia-Hui Chang and Rei-Zhe Liu. Date: 2011/12/21. ICMLA'11, Honolulu, Hawaii.

Upload: allenwu

Post on 03-Jul-2015

DESCRIPTION

Published by ICMLA'11 in Honolulu, Hawaii.

TRANSCRIPT

Page 1: Collaborative filtering with CCAM

COLLABORATIVE FILTERING WITH CCAM

Presenter: Meng-Lun Wu

Authors: Meng-Lun Wu, Chia-Hui Chang and Rei-Zhe Liu

Date: 2011/12/21

ICMLA'11, Honolulu, Hawaii 1

Page 2: Collaborative filtering with CCAM

Outline

• Introduction

• Related Work

• Preliminary

• Collaborative Filtering with CCAM

• Experiment

• Conclusion

Page 3: Collaborative filtering with CCAM

Introduction (1/2)

• In any recommender system, the number of ratings already obtained is usually very small compared to the number of ratings that need to be predicted.

• Dimensionality reduction methods are a possible solution, since they can alleviate data sparsity.

• Clustering is typically the simplest such method to apply to recommender systems: it yields a compact model and avoids the sparsity problem.

Page 4: Collaborative filtering with CCAM

Introduction (2/2)

• In recent years, co-clustering based on information theory has attracted increasing attention.

• We have extended an information-theoretic co-clustering algorithm to augmented data matrices; the result is called Co-Clustering with Augmented data Matrix (CCAM).

• In this paper, we consider how to alleviate the sparsity problem and achieve precise predictions through Collaborative Filtering with CCAM.

Page 5: Collaborative filtering with CCAM

Related Work

• Information-theoretic co-clustering

• Dhillon et al. (2003) developed a co-clustering algorithm from information theory that optimizes an objective function based on the loss of mutual information between the clustered random variables.

• Matrix factorization co-clustering

• Chen et al. (2008) linearly combined the results of user-based CF, item-based CF, and matrix factorization to predict ratings; the matrix factorization relied on ONMTF.

• Li et al. (2009) presented a novel cross-domain collaborative filtering method which co-clusters movie information via ONMTF and reconstructs the knowledge for recommending books and movies.

Page 6: Collaborative filtering with CCAM

Preliminary (1/2)

• Suppose that we are given a clicking information matrix R, defined over a set of users U = {u1, u2, …, unu} and a set of ads A = {a1, a2, …, ana}.

• nu and na represent the number of users and ads, respectively.

• Memory-based CF methods inevitably run into sparsity in the required data before similar neighbors can be found.

• Dhillon et al. (2003) proposed a co-clustering algorithm which monotonically decreases the information loss of tabular data to form a compact model.

Page 7: Collaborative filtering with CCAM

Preliminary (2/2)

• Assume U and A are random variable sets with a joint probability distribution p(U, A) and marginal distributions p(U) and p(A). The mutual information I(U; A) is defined as

$$I(U; A) = \sum_{u \in U} \sum_{a \in A} p(u, a) \log \frac{p(u, a)}{p(u)\, p(a)}$$

• Suppose there are G1 user clusters CU = {cu(1), cu(2), …, cu(G1)} and G2 ad clusters CA = {ca(1), ca(2), …, ca(G2)}. To judge the quality of a co-clustering, we define the loss in mutual information as

• PROPOSITION 1. The following properties are also stated and proven:
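The loss formula itself is not reproduced in this transcript. As a minimal sketch, assuming the definition used by Dhillon et al. (2003), the loss is the mutual information of the original distribution minus that of the clustered one, I(U; A) − I(CU; CA). The function names and the toy distribution below are illustrative assumptions, not taken from the slides.

```python
import math

def mutual_information(joint):
    """Return I(X; Y) in nats for a joint distribution given as a list of rows."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # 0 * log(0) is taken as 0
                total += p * math.log(p / (row_marginals[i] * col_marginals[j]))
    return total

def cluster_joint(joint, row_labels, col_labels, g1, g2):
    """Aggregate p(u, a) into p(cu, ca) by summing the cells of each co-cluster."""
    clustered = [[0.0] * g2 for _ in range(g1)]
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            clustered[row_labels[i]][col_labels[j]] += p
    return clustered

def mutual_information_loss(joint, row_labels, col_labels, g1, g2):
    """Assumed loss: I(U; A) - I(CU; CA)."""
    clustered = cluster_joint(joint, row_labels, col_labels, g1, g2)
    return mutual_information(joint) - mutual_information(clustered)

# Toy joint distribution over 3 users (rows) and 4 ads (columns); made-up numbers.
p_ua = [
    [0.20, 0.20, 0.00, 0.00],
    [0.10, 0.10, 0.00, 0.00],
    [0.00, 0.00, 0.15, 0.25],
]

# A co-clustering that matches the block structure loses no information.
loss_good = mutual_information_loss(p_ua, [0, 0, 1], [0, 0, 1, 1], 2, 2)
# A co-clustering that mixes the blocks loses most of it.
loss_bad = mutual_information_loss(p_ua, [0, 1, 1], [0, 1, 0, 1], 2, 2)
print(loss_good, loss_bad)
```

A lower loss means the co-clusters preserve more of the dependence between users and ads, which is why the loss can serve as a quality measure for a co-clustering.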

Page 8: Collaborative filtering with CCAM

Co-Clustering with Augmented data Matrix, CCAM (1/4)

• When Dhillon et al. (2003) first proposed the optimization problem of loss in mutual information, it was designed for and applied to a single data table.

• However, in many cases there exist related tables besides the main data set which may provide useful information.

• In this co-clustering approach, Co-Clustering with Augmented data Matrix (CCAM), we simultaneously adjust the co-clusters over multiple augmented data matrices to reduce the information loss.

• Two further sets, the feature set F = {f1, f2, …, fnf} and the profile set P = {p1, p2, …, pnp}, provide additional information about ads and users and form the augmented matrices (ad-feature and user-profile),

• where nf and np denote the number of features and profiles, respectively.

Page 9: Collaborative filtering with CCAM

Co-Clustering with Augmented data Matrix, CCAM (2/4)

• PROPOSITION 2. Analogous properties hold when p(A, F) and p(U, P) are considered,

• which are also stated and proven.

• DEFINITION 1. The optimal co-clustering (CU, CA) that we seek minimizes
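The objective in Definition 1 is not reproduced in this transcript. The sketch below assumes it is a weighted sum of three mutual-information losses: the user-ad loss plus the ad-feature and user-profile losses, weighted by the λ and φ that appear in the parameter tuning slides. Which weight applies to which augmented matrix is an assumption here, as are all function names and toy numbers.

```python
import math

def mutual_information(joint):
    """I(X; Y) in nats for a joint distribution given as a list of rows."""
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return sum(
        p * math.log(p / (rows[i] * cols[j]))
        for i, r in enumerate(joint)
        for j, p in enumerate(r)
        if p > 0
    )

def aggregate(joint, row_labels=None, col_labels=None):
    """Sum cells into clusters; a missing label list leaves that axis unclustered."""
    n_rows, n_cols = len(joint), len(joint[0])
    row_labels = list(range(n_rows)) if row_labels is None else row_labels
    col_labels = list(range(n_cols)) if col_labels is None else col_labels
    out = [[0.0] * (max(col_labels) + 1) for _ in range(max(row_labels) + 1)]
    for i in range(n_rows):
        for j in range(n_cols):
            out[row_labels[i]][col_labels[j]] += joint[i][j]
    return out

def mi_loss(joint, row_labels=None, col_labels=None):
    """Mutual information lost by clustering rows and/or columns."""
    return mutual_information(joint) - mutual_information(
        aggregate(joint, row_labels, col_labels)
    )

def ccam_objective(p_ua, p_af, p_up, user_labels, ad_labels, lam, phi):
    """Assumed form: loss(U, A) + lam * loss(A, F) + phi * loss(U, P)."""
    return (
        mi_loss(p_ua, user_labels, ad_labels)
        + lam * mi_loss(p_af, row_labels=ad_labels)   # only ads are clustered
        + phi * mi_loss(p_up, row_labels=user_labels)  # only users are clustered
    )

# Made-up toy distributions: 3 users, 4 ads, 2 ad features, 2 profile answers.
p_ua = [[0.20, 0.20, 0.00, 0.00], [0.10, 0.10, 0.00, 0.00], [0.00, 0.00, 0.15, 0.25]]
p_af = [[0.25, 0.00], [0.25, 0.00], [0.00, 0.25], [0.00, 0.25]]
p_up = [[0.30, 0.00], [0.30, 0.00], [0.00, 0.40]]

good = ccam_objective(p_ua, p_af, p_up, [0, 0, 1], [0, 0, 1, 1], lam=0.2, phi=0.1)
bad = ccam_objective(p_ua, p_af, p_up, [0, 1, 1], [0, 1, 0, 1], lam=0.2, phi=0.1)
print(good, bad)
```

Under this assumed form, setting λ = φ = 0 reduces the objective to the single-table loss of Dhillon et al. (2003), so the augmented matrices only influence the result through the two weights.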

Page 10: Collaborative filtering with CCAM

Co-Clustering with Augmented data Matrix, CCAM (3/4)

Page 11: Collaborative filtering with CCAM

Algorithm 1: Co-Clustering with Augmented data Matrix algorithm

Page 12: Collaborative filtering with CCAM

Collaborative filtering with CCAM (1/5)

Page 13: Collaborative filtering with CCAM

Collaborative filtering with CCAM (2/5)

• DEFINITION 3. Since CCAM is built on KL-divergence, the distance metrics take a similar form.

• Here we define the distance between each user and each user cluster, and between each ad and each ad cluster.

• Note that the ad cluster prototype and user cluster prototype of CCAM are taken to be
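The exact distance and prototype formulas are not reproduced in this transcript. As a rough sketch of a KL-divergence-based distance, the code below compares a user's conditional distribution over ads, p(A | u), with a cluster prototype that is assumed to be the cluster-level conditional p(A | cu). Both the prototype choice and the function names are assumptions made for illustration; the paper's definition may include further terms for the augmented matrices.

```python
import math

def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) in nats; eps keeps the result finite when q has zero entries."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def cluster_prototypes(joint, labels, n_clusters):
    """Assumed prototype: p(A | cu), the normalized sum of the member rows."""
    sums = [[0.0] * len(joint[0]) for _ in range(n_clusters)]
    for row, cluster in zip(joint, labels):
        for j, p in enumerate(row):
            sums[cluster][j] += p
    return [normalize(s) for s in sums]

def nearest_cluster(row, prototypes):
    """Return the index of the closest prototype and all distances."""
    p = normalize(row)
    distances = [kl_divergence(p, proto) for proto in prototypes]
    best = min(range(len(prototypes)), key=distances.__getitem__)
    return best, distances

# Made-up joint distribution over 3 users and 4 ads, with users 0 and 1 in cluster 0.
p_ua = [[0.20, 0.20, 0.00, 0.00], [0.10, 0.10, 0.00, 0.00], [0.00, 0.00, 0.15, 0.25]]
prototypes = cluster_prototypes(p_ua, [0, 0, 1], 2)
best, distances = nearest_cluster(p_ua[1], prototypes)
print(best, distances)
```

The same functions apply to ads by passing the transposed matrix, so that each row is an ad's distribution over users.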

Page 14: Collaborative filtering with CCAM

Collaborative filtering with CCAM (3/5)

Page 15: Collaborative filtering with CCAM

Collaborative filtering with CCAM (4/5)

Page 16: Collaborative filtering with CCAM

Collaborative filtering with CCAM (5/5)

Page 17: Collaborative filtering with CCAM

Data set

• The data set used in the experiments is obtained from a financial social website, Ad$Mart, and covers 2009/09/01 to 2010/03/31.

• For each test user, 15 observed clicking rates (Given15) are provided to find nearest neighbors, and the remaining clicking rates are used for evaluation.

• To ensure that each test user has clicked at least 15 ads, only users with more than 20 clicked ads and ads with more than 10 clicked user-ad pairs are retained.

• User-Ad: The pre-processing clicking data covers 1786 users and 520 ads. After preprocessing, we turn it into a joint probability distribution over users and ads, and also convert it into a clicking rate matrix scaled from 1 to 5.

• Ad-Feature: An advertisement feature data set compiling 37 statistics of 530 ads.

• User-Profile: A questionnaire data set provided by 520 users on 24 survey questions.
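The preprocessing and the Given15 protocol described above can be sketched as follows. This is a minimal illustration under several assumptions: clicks are held in a nested dictionary, "more than 10 clicked user-ad pairs" is read as "clicked by more than 10 distinct users", ads are filtered before users in a single pass, and the observed ads are drawn at random. The 1-to-5 scaling is omitted because the slides do not say how it is computed. All names and toy values are hypothetical.

```python
import random

def filter_clicks(clicks, min_ads_per_user=20, min_users_per_ad=10):
    """Keep ads clicked by more than min_users_per_ad users, then users
    with more than min_ads_per_user of those ads.

    clicks maps user -> {ad: click count}.
    """
    users_per_ad = {}
    for ads in clicks.values():
        for ad, count in ads.items():
            if count > 0:
                users_per_ad[ad] = users_per_ad.get(ad, 0) + 1
    kept_ads = {ad for ad, n in users_per_ad.items() if n > min_users_per_ad}

    filtered = {}
    for user, ads in clicks.items():
        kept = {ad: c for ad, c in ads.items() if ad in kept_ads and c > 0}
        if len(kept) > min_ads_per_user:
            filtered[user] = kept
    return filtered

def joint_distribution(clicks):
    """Normalize click counts so that they sum to 1 over all (user, ad) pairs."""
    total = sum(c for ads in clicks.values() for c in ads.values())
    return {(u, a): c / total for u, ads in clicks.items() for a, c in ads.items()}

def given_n_split(user_clicks, n=15, seed=0):
    """Split one test user's ads into n observed entries and a held-out rest."""
    rng = random.Random(seed)
    ads = sorted(user_clicks)
    observed = set(rng.sample(ads, n))
    given = {a: user_clicks[a] for a in ads if a in observed}
    held_out = {a: user_clicks[a] for a in ads if a not in observed}
    return given, held_out

# Tiny made-up example with lowered thresholds so the effect is visible.
toy_clicks = {
    "u1": {"a1": 2, "a2": 1, "a3": 1},
    "u2": {"a1": 1, "a2": 3},
    "u3": {"a3": 5},
}
filtered = filter_clicks(toy_clicks, min_ads_per_user=1, min_users_per_ad=1)
p_ua = joint_distribution(filtered)
given, held_out = given_n_split(filtered["u1"], n=2)
print(sorted(filtered), p_ua[("u1", "a1")], len(given), len(held_out))
```

With the thresholds from the slide (20 and 10), every retained user has more than 20 clicked ads, which leaves at least 6 held-out entries per test user after the 15 observed ones are set aside.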

Page 18: Collaborative filtering with CCAM

Evaluation methodology (1/2)

Page 19: Collaborative filtering with CCAM

Evaluation methodology (2/2)

Page 20: Collaborative filtering with CCAM

K1 and K2 tuning based on k-NN

Page 21: Collaborative filtering with CCAM

G1 and G2 tuning based on k-Means

• We also have to determine which value of G1 yields a good MAE.

• To avoid tuning too many parameters, we simply fix G2 = 10 and K1 = K2 = 5.

• We therefore observe how k-Means responds to different values of G1 (7, 15, 30, 60) and keep the best one to apply to the other algorithms.
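MAE here is the mean absolute error between predicted and held-out clicking rates. The tuning loop described above can be sketched as "evaluate each candidate setting, keep the one with the lowest MAE". The helper names, the placeholder setting names and the toy predictions below are illustrative assumptions and do not correspond to results from the paper.

```python
def mean_absolute_error(predicted, actual):
    """Average of |prediction - truth| over all held-out entries."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def pick_best(candidates, evaluate):
    """Evaluate every candidate setting and return the one with the lowest score."""
    scores = {candidate: evaluate(candidate) for candidate in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Made-up held-out clicking rates on the 1-to-5 scale and made-up predictions
# for two placeholder settings; in the experiments each candidate would be a
# value of G1 and the predictions would come from the k-Means-based model.
held_out = [4, 2, 5, 1]
toy_predictions = {
    "setting_a": [3.5, 2.5, 4.0, 2.0],
    "setting_b": [4.0, 2.0, 4.5, 1.5],
}

best, scores = pick_best(
    toy_predictions,
    lambda name: mean_absolute_error(toy_predictions[name], held_out),
)
print(best, scores)
```

Fixing G2, K1 and K2 as described keeps this a one-dimensional search, so only four models have to be built and evaluated.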

Page 22: Collaborative filtering with CCAM

Parameter tuning with CCAM (1/2)

• To evaluate the result of co-clustering, we apply a classification algorithm (Weka J48) to the user data and measure the F-measure under 10-fold cross-validation; the ad side is evaluated in the same way.

• We use the clustering result of the user data (user-ad matrix and user-profile matrix) as the target labels for evaluating the user clustering, and likewise for the ad data (ad-user matrix and ad-feature matrix).

• To examine the effectiveness of co-clustering, we reduce the columns of the user-ad matrix to a smaller user-ad cluster matrix. The reduced data is then inserted into our user data for classification, and the same is done for the ad data.

Figure labels: User data; User-ad cluster; Clustering result of user-ad and user-profile
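The column reduction described above can be sketched as follows: every ad column is merged into the column of its ad cluster, and each reduced user row is paired with that user's cluster label to form a classification data set. Summing the merged columns is an assumption (averaging would be another option), and the function names and toy values are hypothetical.

```python
def reduce_columns(matrix, col_labels, n_clusters):
    """Merge the columns of a user-ad matrix into one column per ad cluster."""
    reduced = [[0.0] * n_clusters for _ in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            reduced[i][col_labels[j]] += value
    return reduced

def labeled_rows(features, row_labels):
    """Pair each reduced row with its cluster label as the classification target."""
    return [(row, label) for row, label in zip(features, row_labels)]

# Made-up user-ad matrix (3 users, 4 ads) with ads 0-1 and 2-3 forming two clusters.
user_ad = [
    [1, 0, 2, 0],
    [0, 3, 0, 1],
    [2, 2, 0, 0],
]
ad_labels = [0, 0, 1, 1]
user_labels = [0, 1, 0]  # made-up user clustering result

user_ad_cluster = reduce_columns(user_ad, ad_labels, n_clusters=2)
training_data = labeled_rows(user_ad_cluster, user_labels)
print(training_data)
```

The resulting rows can be exported to a classifier such as Weka J48; a high cross-validated F-measure then indicates that the cluster labels are well separated in the reduced feature space.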

Page 23: Collaborative filtering with CCAM

Parameter tuning with CCAM (2/2)

• We find that when G1 = 60, the best setting is λ = 0.2, φ = 0.1.

• We therefore use these optimal CCAM parameters in the next section when comparing with the other algorithms.

Page 24: Collaborative filtering with CCAM

Results

• Table 3 compares the model-based approaches.

• Table 4 compares the hybrid model approaches under the previous parameter settings.

Page 25: Collaborative filtering with CCAM

Conclusion

• In this paper, we applied Chen's rating framework to evaluate the performance of hybrid CF with various model constructions.

• To give a fair comparison, we started by tuning each individual approach for its best performance.

• We then compared four algorithms: CCAM, ITCC, k-Means and k-NN. The MAE metric shows that CCAM outperformed the other three algorithms.

• In the future, for a more thorough evaluation, we will test our algorithm on other real-world data sets,

• such as the MovieLens, EachMovie and Book-Crossing data sets, which contain users' movie and book ratings.

Page 26: Collaborative filtering with CCAM

THANK YOU FOR LISTENING.

Q & A
