phd defense

Post on 06-May-2015


DESCRIPTION

My PhD defense presentation in Computer Science.

TRANSCRIPT

1

Mathematical Methods of Tensor Factorization

Applied to Recommender Systems

Dott. Giuseppe Ricci

Scuola di Dottorato in Informatica

XXVI Ciclo

PhD Defense – 26 May 2014

Semantic Web Access and Personalization research group

http://www.di.uniba.it/~swap

Dipartimento di Informatica

2

Outline

Motivations and Contributions

Information Overload & Recommender Systems

Matrix and Tensor Factorization in RS literature

Proposed solutions

Experimental Evaluation

Summary and Future Work

3

Motivations and Contributions 1/2

Matrix Factorization (MF) techniques have proved to be a quite promising solution to the problem of designing efficient filtering algorithms in the Big Data era.

Several challenges remain in the Recommender Systems (RS) research area:
• missing values: data sparsity;
• incorporating contextual information: CARS;
• context relevance (weighting) in CARS.

This work focuses on CARS.

Objective: to propose new methods to understand which contextual information is relevant, and to use this information to improve the quality of the recommendations.

4

Motivations and Contributions 2/2

• Matrix and Tensor Factorization literature review.
• CP-WOPT algorithm as a solution for the sparsity of RS data.
• CARS and context weighting:
  • 2 proposed solutions to introduce only relevant contextual information in the recommendation process;
  • empirical evaluation of the 2 solutions.

5

Information Overload & Recommender Systems

6

Information Overload

Source: www.go-globe.com

A surplus of content compared to the user's ability to find relevant information: the result is that you are either late in making decisions or you make the wrong decisions.

The term “Information Overload” was used by the futurologist Alvin Toffler in 1970, when he predicted that the rapidly increasing amounts of information being produced would eventually cause people problems.

7

Recommender Systems 1/2

Recommender Systems (RS) represent a response to the problem of Information Overload and are now a widely recognized field of research [Ricci].

RS fall in the area of information filtering. With the growing amount of information available on the web, a very sensitive issue is to develop methods that can effectively and efficiently handle large amounts of data.

Mathematical methods have recently proved useful in dealing with this problem in the context of RS.

The search for methods more effective and efficient than those known in the literature is also driven by the interest of industrial research in this field, as evidenced by the Netflix Prize competition.

[Ricci] Francesco Ricci, Lior Rokach, Bracha Shapira, and Paul B. Kantor, editors. Recommender Systems Handbook. Springer, 2011.

8

Recommender Systems 2/2

Ratings are usually stored in a matrix called the user-item matrix or rating matrix.

RS calculate a rating estimate for each item/product not yet purchased/tried, and suggest the list of items with the highest estimated ratings.

9

Examples of RS

Applications:
• e-commerce
• advertising
• e-mail filtering
• social networks
• …

10

Basics of Recommender Systems

11

Recommender Systems: definitions

The area of RS is relatively new: it dates back to the mid-1990s.

Concept: tools and techniques able to provide personalized information access to large collections of structured and unstructured data, and to provide users with advice about items they might be interested in.

Some definitions:

[Olsson]: “RS is a system that helps a user to select a suitable item among a set of selectable items using a knowledge-base that can be hand-coded by experts or learned from recommendations generated by the users”.

[Burke]: “RS have the effect of guiding the user in a personalized way to interesting or useful objects in a large space of possible options”.

[Olsson] Tomas Olsson. Bootstrapping and Decentralizing Recommender Systems. PhD thesis, Department of Information Technology, Uppsala University and SICS, 2003.

[Burke] R. Burke. Hybrid Recommender Systems: Survey and Experiments. User Modeling and User-Adapted Interaction, 12(4):331–370, 2002.

12

RS Classification [Burke]

[Burke] Robin Burke. Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction , 12(4):331–370, 2002.

Context Aware Recommender Systems (CARS)

13

Content-Based RS (CBRS)

Assumption: user preferences remain stable over time.

They suggest items similar to those previously labeled as relevant by the target user.

Based on the analysis and exploitation of textual contents since each item to be recommended has to be described by means of textual features.

Needs 2 pieces of information: a textual description of the item and a user profile describing user interests in terms of textual features.

14

Collaborative Filtering RS

Assumption: users that shared similar tastes in the past will have similar tastes in the future as well (nearest neighbors).

They rely on a matrix where each user is mapped to a row and each item is represented by a column: the user/item or rating matrix.

A recent trend is to exploit matrix factorization methods. A common technique applied in CFRS is Singular Value Decomposition (SVD).

15

Hybrid Recommender Systems

Combining 2 or more classes of algorithms in order to emphasize their strengths and to level out their corresponding weaknesses.

For example, a collaborative system and a content-based system might be combined to compensate for the new user problem, providing recommendations to users whose profiles are too poor to trigger the collaborative recommendation process.

Burke proposed an analytical classification of hybrid systems, listing a number of hybridization methods to combine pairs of recommender algorithms. In [Burke] 7 different hybridization techniques are introduced.

[Burke] Robin Burke. Hybrid Web Recommender Systems. In The Adaptive Web, pages 377–408. Springer-Verlag, Berlin, Heidelberg, 2007.

16

Context

What is the context?

One of the most cited definitions of context is that of Dey [Dey], who defines context as:

“Any information that can be used to characterize the situation of an entity. An entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and the applications themselves”.

Bazire and Brézillon [Bazire] examined and compared some 150 different definitions of context from a number of different fields and concluded that the multifaceted nature of the concept makes it difficult to find a unifying definition.

Li et al. [Li] define 5 context dimensions: who (user), what (object), how (activities), where (location) and when (time).

[Dey] Anind K. Dey. Understanding and using context. Personal Ubiquitous Comput., 5(1):4–7, 2001.

[Bazire] Mary Bazire and Patrick Brézillon. Understanding context before using it. In Proceedings of the 5th International Conference on Modeling and Using Context, CONTEXT’05, pages 29–40, Berlin, Heidelberg, 2005. Springer-Verlag.

[Li] Luyi Li, Yanlin Zheng, Hiroaki Ogata, and Yoneo Yano. A framework of ubiquitous learning environment. In CIT, pages 345–350. IEEE Computer Society, 2004.

17

Context Aware RS (CARS)

Context-Aware Recommender Systems (CARS) take into account contextual factors, such as available time, location, people nearby, etc., that identify the context in which the product is tried.

We suppose these factors may have a structure: for example, "location" may be defined in terms of home, public place, theatre, cinema, etc.

18

Context Aware RS

Challenges of a CARS are:
• relevance of contextual factors: it is important to decide which contextual variables are relevant in the recommendation process;
• availability of contextual information: relevant contextual factors can be considered as a part of the data collection, but such historical contextual information is often not available when designing the system;
• extraction of contextual information from the user’s activities: these data need to be recorded;
• evaluation and lack of publicly available datasets.

19

Context Aware RS

CARS incorporate user and item information as well as other types of data such as context, using these to infer unknown ratings:

f: Users × Items × Contexts → Rating

CARS deal with a quadruple input, <user, item, context, rating>, in which the recommender records the user's preference for the selected item together with the context information describing the situation in which the product is consumed.

20

Paradigms to incorporate context

Pre-filtering: in a movie RS, if a user wants to see a film one day during the holidays, only the ratings assigned during holidays are used.

Post-filtering

Contextual Modeling: data are used in the estimation of the ratings by a multidimensional function, or by heuristic calculations, to incorporate contextual information in addition to the user and item data.
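The pre-filtering example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy quadruples, the context labels "holiday" and "workday", and the helper names `prefilter` and `item_means` are hypothetical, and a per-item mean stands in for whatever 2D recommender would actually be run on the filtered data.

```python
from collections import defaultdict

# Hypothetical toy data: <user, item, context, rating> quadruples.
RATINGS = [
    ("u1", "m1", "holiday", 5),
    ("u1", "m1", "workday", 2),
    ("u2", "m1", "holiday", 4),
    ("u2", "m2", "workday", 3),
    ("u3", "m2", "holiday", 1),
]


def prefilter(ratings, target_context):
    """Keep only the ratings given in the target context, as (user, item, rating)."""
    return [(u, i, r) for (u, i, c, r) in ratings if c == target_context]


def item_means(triples):
    """Deliberately simple 2D stand-in recommender: mean rating of each item."""
    sums = defaultdict(lambda: [0.0, 0])
    for _user, item, rating in triples:
        sums[item][0] += rating
        sums[item][1] += 1
    return {item: total / count for item, (total, count) in sums.items()}


# Only holiday ratings contribute to the scores used for a holiday request.
holiday_scores = item_means(prefilter(RATINGS, "holiday"))
```

The point of the design is that the context is consumed before the recommender runs, so any context-unaware algorithm can be reused unchanged.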

21

Context Weighting

It is not always simple to establish which contextual information is important for a specific purpose.

Many parameters can be acquired, in different manners. Not all the acquired contextual information is important for the recommendation process: some contextual variables can introduce noise and degrade the quality of suggestions.

Question: for each user, which contextual information helps to give more precise and reliable recommendations?

PROBLEM: users may rate items in different contexts, but it is not guaranteed that we can find dense contextual ratings under the same context, i.e. there may be very few users who have rated the items in the same contexts.

Solutions: 2 branches, Context Selection (survey) and Context Relaxation (binary selection).

22

Matrix Factorization in RS literature

23

Background

With the ever-increasing information available, the challenge of implementing personalized filters has become the challenge of designing algorithms able to manage huge amounts of data for the elicitation of user needs and preferences.

Matrix Factorization techniques have proved to be a quite promising solution.

MF techniques fall into the class of CF methods and, in particular, into the class of latent factor models: similarity between users and items is induced by some factors hidden in the data.

We will focus our attention on Singular Value Decomposition (SVD).

24

Basics of MF

U: set of users

D: set of items

R: the matrix of ratings.

MF aims to factorize R into two matrices P and Q such that their product approximates R.

A factorization used in RS literature is Singular Value Decomposition (SVD), in the variant popularized by Simon Funk during the Netflix Prize.

SVD objective: reducing the dimensionality, i.e. the rank, of the user-item matrix, in order to capture latent relationships between users and items.
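As an illustration of the idea, the following Python sketch learns P and Q by stochastic gradient descent on the known ratings only, in the spirit of the Funk-style factorization. It is a minimal sketch, not the implementation used in the thesis: the orientation (one row of P per user, one row of Q per item, prediction as their dot product), the hyperparameters and the toy ratings are all assumptions.

```python
import random


def factorize(ratings, n_users, n_items, k=2, steps=3000, lr=0.01, reg=0.02, seed=0):
    """Learn P (n_users x k) and Q (n_items x k) so that P[u] . Q[i] ~ r_ui.

    `ratings` is a list of (user_index, item_index, rating); unknown ratings
    are simply absent, so they never contribute to the error.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Estimated rating of user u for item i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


# Hypothetical 3 users x 3 items example with 6 known ratings out of 9.
known = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
P, Q = factorize(known, n_users=3, n_items=3)
estimate = predict(P, Q, 0, 2)  # a cell with no known rating
```

The number of latent factors k plays the role of the reduced rank mentioned above: a small k forces users and items to share a few hidden dimensions.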

25

SVD in RS Literature 1/2

Sarwar:
• SVD based algorithm
• Low-rank approximation: retaining only the k << r largest singular values and discarding the other entries.

Koren:
• SVD based algorithms (Asymmetric-SVD, SVD++)
• Explicit and implicit feedback
• Baseline estimates.

Julià:
• Alternation Algorithm
• An alternative to SVD
• The aim is the same as that of SVD
• Alternation makes it possible to deal with missing values.

user-factors vector p_u

item-factors vector q_i

26

SVD in RS Literature 2/2

Advantages:
• limited computational cost and good quality recommendations (Sarwar)
• good algorithms and high accuracy (Koren)
• the Alternation Algorithm deals with missing values and requires modest computational resources (Julià).

Problems:
• technique not applicable to frequently updated databases (Sarwar)
• models are not justified by a formal model (previous ratings are not explained) (Koren)
• r known values are needed in each row/column (Julià).

[Sarwar] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems. 5th International Conference on Computer and Information Technology (ICCIT), 2002.

[Koren] Yehuda Koren. Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model. ACM Int. Conference on Knowledge Discovery and Data Mining (KDD'08), 2008.

[Julià] Carme Julià, Angel D. Sappa, Felipe Lumbreras, Joan Serrat, and Antonio López. Predicting Missing Ratings in Recommender Systems: Adapted Factorization Approach. International Journal of Electronic Commerce, 2009.

27

Summary

• We analyzed MF techniques
• We focused our attention on SVD techniques
• The main limitations of MF techniques:
  • they take into account only the standard profile of the users
  • they do not allow further information, such as the context, to be integrated.

28

Matrix 2 Tensor

Matrices and MF can’t be used in a CARS based on the contextual modeling paradigm: context information is used in the recommendation process and matrices are not adequate for this purpose.

We need to introduce tensors.

users

contexts

items

<user, item, context, rating>

29

Tensor Factorization: HOSVD and PARAFAC in RS literature

30

Tensors

Tensors, higher-dimensional arrays of numbers, might be exploited in order to include additional contextual information in the recommendation process.

In standard multivariate data analysis, data are arranged in a 2D structure, but for a wide variety of domains more appropriate structures are required to take more dimensions into account:

$x_{ijk}$, i = 1,..,I, j = 1,..,J, k = 1,..,K.

2 particular TF can be considered to be higher-order extensions of the matrix Singular Value Decomposition:

1. Higher-Order Singular Value Decomposition (HOSVD), which is a generalization of the matrix SVD;

2. PARallel FACtor analysis or CANonical DECOMPosition (PARAFAC/CANDECOMP), a higher-order form of Principal Component Analysis.

31

Tensor Factorization

HOSVD decomposes the initial tensor into N matrices (where N is the order of the tensor) and a tensor whose size is smaller than the original one (core tensor).

In RS literature, the most frequently used technique for tensor factorization is HOSVD.

32

HOSVD in RS Literature 1/2

Baltrunas:
• Multiverse Recommendation algorithm
• HOSVD TF based algorithm
• data: users, movies, contextual information and user ratings (3-order tensor).

Rendle:
• RTF algorithm
• social tagging system
• Reconstructed tensor: measures the strength of association between users, items and tags.

Chen:
• CubeSVD
• Personalized web search
• Hidden relationships <user, query, web pages>
• Output: <u, q, p, w>, where w measures the popularity of page p as a result of query q made by the user u.

33

HOSVD in RS Literature 2/2

Advantages:
• good algorithm with improvement of results (Baltrunas)
• good algorithm with improvement of results (Rendle)
• CubeSVD, tested on MSN clickthrough data, gives good results (Chen).

Problems:
• high computational cost (all)
• time consuming algorithm (Chen).

[Baltrunas] Alexandros Karatzoglou, Xavier Amatriain, Linas Baltrunas, and Nuria Oliver. Multiverse recommendation: n-dimensional tensor factorization for context-aware collaborative filtering. In Proceedings of the fourth ACM conference on Recommender systems, RecSys ’10, pages 79–86, New York, NY, USA, 2010. ACM.

[Rendle] Steffen Rendle, Leandro Balby Marinho, Alexandros Nanopoulos, and Lars Schmidt-Thieme. Learning optimal ranking with tensor factorization for tag recommendation. In KDD, pages 727–736, 2009.

[Chen] Jian-Tao Sun, Hua-Jun Zeng, Huan Liu, Yuchang Lu, and Zheng Chen. CubeSVD: a novel approach to personalized web search. In Proceedings of the 14th international conference on World Wide Web, WWW’05, pages 382–390, New York, NY, USA, 2005. ACM.

34

PARAFAC (PARallel FACtor analysis)

PARAFAC (PARallel FACtor analysis) is a decomposition method. The PARAFAC model was independently proposed by Harshman and by Carroll and Chang.

A PARAFAC model of a 3D array is given by 3 loading matrices A, B, and C with typical elements $a_{if}$, $b_{jf}$, and $c_{kf}$.
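In the standard trilinear PARAFAC model, each entry of the array is approximated by the sum over the factors f of the product a_if · b_jf · c_kf. The short Python sketch below rebuilds a tensor from three loading matrices in this way; the function name `cp_reconstruct` and the toy matrices are hypothetical and only serve to make the indexing concrete.

```python
def cp_reconstruct(A, B, C):
    """Rebuild the I x J x K tensor with entries sum_f A[i][f] * B[j][f] * C[k][f]."""
    n_factors = len(A[0])
    return [[[sum(A[i][f] * B[j][f] * C[k][f] for f in range(n_factors))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


# Hypothetical loading matrices with F = 2 factors.
A = [[1.0, 0.0], [0.0, 2.0]]              # 2 users    x 2 factors
B = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]  # 3 items    x 2 factors
C = [[1.0, 2.0]]                          # 1 context  x 2 factors

X = cp_reconstruct(A, B, C)  # X[i][j][k] is the modelled value for (user i, item j, context k)
```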

35

HOSVD Vs PARAFAC

HOSVD (strengths):
• it is an extension of the SVD to higher-order dimensions;
• it can take several dimensions into account simultaneously;
• better data modeling than standard SVD;
• dimension reduction can be performed not only in one dimension but also separately for each dimension.

HOSVD (limitations):
• it is not an optimal tensor decomposition: HOSVD does not require an iterative algorithm, but needs standard SVD computations only;
• it does not have the truncation property of the SVD, where truncating to the first n singular values yields the best rank-n approximation of a given matrix;
• HOSVD cannot deal with missing values: they are treated as 0;
• to prevent overfitting, HOSVD should use regularization.

36

PARAFAC Vs HOSVD

PARAFAC:
• is faster than HOSVD: linear computation time in comparison to HOSVD;
• does not collapse data, but retains its natural three-dimensional structure;
• despite the PARAFAC model’s lack of orthogonality, Kruskal showed that components are unique, up to permutation and scaling, under mild conditions.

37

PARAFAC in [Baltrunas12]

TFMAP: PARAFAC for top-N context-aware recommendations of mobile applications. A tensor of 3 dimensions is factorized:
• users
• items
• context types.

The 3 dimensions yield 3 factor matrices, used to calculate user m’s preference for item i under context type k.

The authors introduced an optimization process using gradient ascent to avoid overfitting.

[Baltrunas12] Yue Shi, Alexandros Karatzoglou, Linas Baltrunas, Martha Larson, Alan Hanjalic, and Nuria Oliver. Tfmap: optimizing map for top-n context-aware recommendation. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval , SIGIR ’12, pages 155–164, New York, NY, USA, 2012. ACM

38

PARAFAC in [Baltrunas12]

Advantages:
• TFMAP, tested on the Appazaar project dataset, improves MAP and Precision compared to other algorithms;
• good scalability: the training time of TFMAP increases almost linearly.

Problems:
• TFMAP is tested only on 1 dataset;
• significance of results?

39

PARAFAC in [Acar]

[Acar] Evrim Acar, Daniel M. Dunlavy, Tamara G. Kolda, and Morten Mørup. Scalable tensor factorizations with missing data. In SDM10: Proceedings of the 2010 SIAM International Conference on Data Mining , pages 701–712, Philadelphia, April 2010. SIAM.

PARAFAC goal: to capture the latent structure of the data via a higher-order factorization, even in the presence of missing data. The authors develop a scalable algorithm called CP-WOPT (CP Weighted OPTimization).

Numerical experiments on simulated data sets show that CP-WOPT can successfully factor tensors with noise and up to 70% missing data.
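The core idea of the weighted formulation in [Acar] is that the squared error is summed over the known entries only: each entry has weight 1 if observed and 0 if missing. The Python sketch below illustrates that objective on a tiny 3-way tensor. It is a sketch under assumptions: plain batch gradient descent is swapped in as the optimizer for simplicity, and the function names, learning rate and toy data are hypothetical.

```python
import random


def cp_value(A, B, C, i, j, k):
    """PARAFAC estimate of entry (i, j, k): sum over f of a_if * b_jf * c_kf."""
    return sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))


def weighted_loss(known, A, B, C):
    """Half the squared error over known entries (weight 1); missing entries weigh 0."""
    return 0.5 * sum((x - cp_value(A, B, C, i, j, k)) ** 2
                     for (i, j, k), x in known.items())


def fit(known, shape, n_factors=2, steps=4000, lr=0.005, seed=0):
    """Minimise weighted_loss by batch gradient descent (a stand-in optimiser)."""
    rng = random.Random(seed)
    A, B, C = ([[rng.uniform(0.5, 1.0) for _ in range(n_factors)] for _ in range(n)]
               for n in shape)
    for _ in range(steps):
        gA = [[0.0] * n_factors for _ in A]
        gB = [[0.0] * n_factors for _ in B]
        gC = [[0.0] * n_factors for _ in C]
        for (i, j, k), x in known.items():  # missing entries are never visited
            err = cp_value(A, B, C, i, j, k) - x
            for f in range(n_factors):
                gA[i][f] += err * B[j][f] * C[k][f]
                gB[j][f] += err * A[i][f] * C[k][f]
                gC[k][f] += err * A[i][f] * B[j][f]
        for M, G in ((A, gA), (B, gB), (C, gC)):
            for row, grad_row in zip(M, G):
                for f in range(n_factors):
                    row[f] -= lr * grad_row[f]
    return A, B, C


# Hypothetical 2 x 2 x 2 tensor with 6 known entries; (0,1,1) and (1,0,1) are missing.
known = {(0, 0, 0): 1.0, (0, 1, 0): 2.0, (1, 0, 0): 2.0,
         (1, 1, 0): 4.0, (0, 0, 1): 1.5, (1, 1, 1): 6.0}
A, B, C = fit(known, shape=(2, 2, 2))
estimate = cp_value(A, B, C, 0, 1, 1)  # reconstructed value for a missing entry
```

Storing only the known entries in a dictionary is one way to keep the cost proportional to the number of observed values rather than to the full tensor size.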

40

PARAFAC in [Acar]

CP-WOPT is tested on an EEG dataset:
• it is not uncommon in EEG analysis that the signals from some channels are ignored due to the malfunctioning of the electrodes;
• the factors extracted by the CP-WOPT algorithm can capture brain dynamics in EEG analysis even if signals are missing from some channels.

41

PARAFAC in [Acar]

Advantages:
• CP-WOPT deals with missing values;
• CP-WOPT uses a weighted factorization based on PARAFAC;
• good results on the tested datasets.

IDEA: CP-WOPT → RS

Problems:
• computational cost?

42

Proposed solutions for missing values and context weighting

43

Scenario

CARS represent an evolution of the traditional CF paradigm.

The state of the art is based on TF as a generalization of the classical user-item MF that accommodates the contextual information.

We are interested in the PARAFAC technique for its ability to deal with missing values.

We propose the use of the CP-WOPT algorithm: our target is to identify the most promising method of factorization (PARAFAC) and the best algorithm implementing this factorization.

We propose 2 solutions to the problem of context weighting.

44

CP-WOPT Algorithm

W tensor

Rank of the tensor X

Gradient Matrices

45

Implementation Details

The CP-WOPT algorithm is implemented in Java.

The input tensor is read from a CSV file.

Values range from 1 to 5.

Missing values are conventionally represented by 0.

The output, an approximation of the input tensor with the reconstructed missing data, is stored in a CSV file.

Values less than 0 are set to 0.
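The input and output conventions above can be illustrated with a short sketch. The original implementation is in Java; Python is used here only for brevity. The CSV column layout (user, item, context, rating) and the function names are assumptions, since the slides do not describe the file format.

```python
import csv
import io


def read_known_entries(csv_text):
    """Parse 'user,item,context,rating' rows (assumed layout).

    A rating of 0 is the conventional marker for a missing value, so such
    rows are skipped instead of being treated as real ratings.
    """
    known = {}
    for user, item, context, rating in csv.reader(io.StringIO(csv_text)):
        value = float(rating)
        if value != 0:
            known[(int(user), int(item), int(context))] = value
    return known


def clamp_negative(values):
    """Post-process reconstructed values: anything below 0 is set to 0."""
    return [max(0.0, v) for v in values]


sample = "0,0,0,5\n0,1,0,0\n1,0,1,3\n"
entries = read_known_entries(sample)        # the second row is a missing value
cleaned = clamp_negative([-0.3, 0.0, 2.5])  # negative estimate is raised to 0
```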

46

CWBPA (Context Weighting with Bayesian Probabilistic Approach) 1/4

Idea: Conditional Probability + Bayes’ Theorem.

1) Compute the conditional probability for each user and each context.

2) Compare this distribution with an equiprobable distribution, using a divergence measure.

• If the 2 distributions are similar, the context does not influence the user’s rating;
• If they are very different, the rating is influenced by the context for which the divergence measure is the highest.

47

CWBPA 2/4

E.g.: c_i = “weather”, with values c_ij = “clearly”, “sunny”, “cloudy”, “rainy”

Assumption: liking (= rating) is influenced by context

L: Liking variable

Contingency table for the context c_i

n tables (number of contexts) × 1 user

48

CWBPA 3/4

P(c_i = c_ij | L = 1), j = 1,..,m_i → Bayes’ Theorem

49

CWBPA 4/4

• Comparing the 2 distributions: are they divergent?
• Degree of divergence: divergence index.

DEF.: given 2 distributions A and B, which both refer to the same qualitative character X, and calling $f^A_k$ and $f^B_k$ the relative frequencies of the k-th modality, k = 1,..,K, in the A and B distributions, a possible family of divergence indices is:
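The two CWBPA steps (conditional probability via Bayes' Theorem, then comparison with the equiprobable distribution) can be sketched as follows. The index family defined on the slide is not reproduced in this transcript, so the sketch substitutes the total variation distance as an assumed, generic divergence; it should not be read as the author's index. The contingency counts and function names are also hypothetical.

```python
def posterior_given_liked(table):
    """Return P(c = value | L = 1) for every context value, via Bayes' Theorem.

    `table` maps each context value to (n_not_liked, n_liked) counts for one
    user. Every value needs at least one observation, and at least one item
    must be liked, otherwise a division by zero occurs.
    """
    total = sum(not_liked + liked for not_liked, liked in table.values())
    p_liked = sum(liked for _, liked in table.values()) / total    # P(L = 1)
    posterior = {}
    for value, (not_liked, liked) in table.items():
        p_value = (not_liked + liked) / total                      # P(c = value)
        p_liked_given_value = liked / (not_liked + liked)          # P(L = 1 | c = value)
        posterior[value] = p_liked_given_value * p_value / p_liked
    return posterior


def divergence_from_uniform(dist):
    """Assumed divergence: total variation distance from the 1/K distribution."""
    k = len(dist)
    return 0.5 * sum(abs(p - 1.0 / k) for p in dist.values())


# Hypothetical contingency table for one user and the context "weather".
weather = {"sunny": (1, 6), "cloudy": (2, 2), "rainy": (4, 0)}
posterior = posterior_given_liked(weather)
score = divergence_from_uniform(posterior)
```

A score close to 0 means liked items are spread evenly over the context values, which is the "context does not influence the rating" case; a larger score marks a candidate influential context.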

50

CWAIC (Context Weighting Association Index Calculation) 1/2

• Idea: for each user and each context we want to calculate Cramér’s Association Index between liking and context.
• Objective: to determine if context influences the rating.
• We establish a threshold below which there is no rating-context dependency, and above which there is influence or dependency.
• Association measures are based on the value of χ², obtained from an r × c contingency table.
• The χ² test is helpful to verify the independence hypothesis (corresponding to zero association) between:
  • the modalities of the row variable
  • the modalities of the column variable.

51

CWAIC 2/2

Cramér’s Index Φc

Cramér’s index is computed on a contingency table of dimensions r × c. It is based on χ², the most widely applied index for association measures, and is calculated as:

$$\Phi_c = \sqrt{\frac{\chi^2}{N\,(k-1)}}$$

where N is the total number of observations and k = min(r, c).

Φc = 0: no association.

Φc = 1: perfect association, but only if the table is square.
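A minimal Python sketch of the computation, using the standard definitions of χ² and of Cramér's index. The function names are hypothetical, and the threshold is left as a required argument because its value is not given in the slides.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic of an r x c table of counts, plus N.

    Assumes every row and column total is positive, otherwise an expected
    count of zero would cause a division by zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_index(table):
    """Cramér's index: sqrt(chi2 / (N * (k - 1))) with k = min(r, c)."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


def is_influential(table, threshold):
    """Apply a caller-supplied threshold to decide whether context matters."""
    return cramers_index(table) > threshold


# Rows: context values; columns: liking = 0 / liking = 1 (hypothetical counts).
independent = [[5, 5], [5, 5]]
dependent = [[10, 0], [0, 10]]
```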

52

Using CWBPA and CWAIC

Tensor – all context

CWBPA

CWAIC

Influential Variable

NOT Influential Variable

Output

REDUCED TENSOR

Factorization with CP-WOPT

53

Experimental Evaluation

54

Evaluation of RS 1/3

Standard metrics have been defined by judging how much the prediction deviates from the actual rating.

Predictive accuracy metrics:

Mean Absolute Error (MAE): this metric measures the average absolute deviation between the prediction and the actual rating provided by the user.

Root Mean Squared Error (RMSE): follows the same principle as MAE, but it squares the error before summing. Consequently, it penalizes large errors, since they become much more pronounced than small ones.
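Both metrics follow their standard definitions and can be sketched in a few lines of Python; the function names and the sample values are illustrative only.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error: average of |prediction - actual rating|."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Squared Error: square root of the average squared error."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


predicted = [4, 2, 5]
actual = [5, 2, 1]
# The single large error (5 predicted, 1 actual) weighs more in RMSE than in MAE.
errors = (mae(predicted, actual), rmse(predicted, actual))
```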

55

Evaluation of RS 2/3

Classification metrics: these metrics evaluate how well a RS can split the item space into relevant and non-relevant items.

Precision: the fraction of recommended items that are actually relevant for the target user.

Recall: the fraction of the items relevant for the target user that are actually recommended.

                      Recommended Content     NOT Recommended Content
Relevant Content      True Positive (TP)      False Negative (FN)
Irrelevant Content    False Positive (FP)     True Negative (TN)

56

Evaluation of RS 3/3

F-Measure: a metric defined as the weighted harmonic mean of precision (P) and recall (R). Letting β be a parameter that determines the relative influence of precision and recall, the F-Measure is calculated as follows:

$$F_\beta = \frac{(1+\beta^2)\,P\,R}{\beta^2\,P + R}$$

β = 1
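The classification metrics can be sketched as follows, using the standard definitions. The function name and the toy item identifiers are hypothetical; returning 0 for an empty denominator is a convention chosen here, not something stated in the slides.

```python
def precision_recall_f(recommended, relevant, beta=1.0):
    """Precision, recall and F-measure for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    true_positives = len(recommended & relevant)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


# 4 recommended items, 3 relevant items, 2 of them in common.
p, r, f1 = precision_recall_f(["a", "b", "c", "d"], ["a", "c", "e"])
```

Cutting the recommendation list at the first N items before calling the function gives the P@N, R@N and F-score@N variants used later in the evaluation.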

57

Introduction 1/2

• 3 preliminary tests of CP-WOPT, to verify the effectiveness of this algorithm and to evaluate standard metrics;

• 1 evaluation without context;

• 2 evaluations to test our solutions CWBPA and CWAIC for context weighting.

58

Introduction 2/2

Why 2 Baselines?

• 1 without contextual information on 1 dataset
• 1 with all contextual information available on 1 dataset.

Do the proposed solutions work as a “filter” for contextual information?

59

CP-WOPT: preliminary evaluations 1/5

Preliminary user study:
• 7 real users
• rated a fixed number of movies (11)
• 3 contextual factors.

3 contextual factors:
i) whether they like to watch the movie at home or at the cinema;
ii) with friends or with a partner;
iii) with or without family.

Ratings range: 1-5, with “encoding” of context into the rating:
• ratings 1 and 2 express a strong and a modest preference, respectively, for the first context term;
• rating 3 expresses neutrality;
• ratings 4 and 5 express a modest and a strong preference, respectively, for the second context term.

60

CP-WOPT: preliminary evaluations 2/5

Metrics used: accuracy and coverage.

Accuracy: the percentage of known values correctly reconstructed:

Coverage: the percentage of non-zero values returned:
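The formulas for these two metrics are not preserved in the transcript, so the following Python sketch is only one possible reading of the verbal definitions. It assumes that a known value counts as "correctly reconstructed" when the rounded estimate equals the original rating, and that coverage is computed over all entries of the returned tensor; both choices, like the function names, are assumptions.

```python
def accuracy(known, reconstructed):
    """Percentage of known ratings whose rounded estimate equals the original.

    Note: Python's round() rounds halves to the nearest even integer.
    """
    hits = sum(1 for key, value in known.items() if round(reconstructed[key]) == value)
    return 100.0 * hits / len(known)


def coverage(reconstructed):
    """Percentage of returned entries that are non-zero."""
    non_zero = sum(1 for value in reconstructed.values() if value != 0)
    return 100.0 * non_zero / len(reconstructed)


# Hypothetical known ratings and reconstructed entries, keyed by position.
known = {(0, 0): 5, (0, 1): 3}
reconstructed = {(0, 0): 4.8, (0, 1): 2.2, (1, 0): 0.0, (1, 1): 3.7}
acc, cov = accuracy(known, reconstructed), coverage(reconstructed)
```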

61

CP-WOPT: preliminary evaluations 3/5

The experiment shows that it is possible to express, through the n-dimensional factorization, not only recommendations to the single user, but also more general considerations such as the mode of using an item, i.e. its trend of use.

62

CP-WOPT: preliminary evaluations 4/5

• Dataset used: subset of MovieLens 100K
• Input: tensor of dimensions 100 users × 150 movies × 21 occupations.
• Contextual information: occupation (the only information in the dataset usable as contextual information)
• Results:
  • acc = 92.09%
  • cov = 99.96%
  • MAE = 0.60
  • RMSE = 0.93.

Acceptable accuracy.

Coverage is very good.

63

CP-WOPT: preliminary evaluations 5/5

Baseline: MyMediaLite* RS
• UserItem-Baseline: CF algorithm
• SVDPlusPlus: MF algorithm based on Singular Value Decomposition

* http://www.mymedialite.net

64

Evaluation of an explicit context dataset

Dataset: LDOS-CoMoDa**

LDOS-CoMoDa contains:
• ratings for the movies
• the 12 pieces of contextual information describing the situation in which the movies were watched.

Properties:
• ratings and contextual information are explicitly acquired from the users immediately after they consumed the item;
• the ratings and the contextual information come from real user-item interaction;
• users are able to rate the same item more than once if they consumed the item multiple times.

** www.ldos.si/comoda.html

65

LDOS-CoMoDa

The LDOS-CoMoDa dataset has been in development since 15 September 2010. It contains 3 main groups of information:
• general user information: provided by the user upon registering in the system (user’s age, sex, country and city);
• item metadata: inserted into the dataset for each movie rated by at least one user (director’s name and surname, country, language, year);
• contextual information.

66

Evaluation on explicit context dataset 1/2

We ran CP-WOPT on the LDOS-CoMoDa dataset with ALL CONTEXT selected (19 contextual features).

Accuracy Metrics

We use 70% of the ratings, replacing the remaining 30% of known ratings with zero values. This 30% is randomly chosen.
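The 70/30 protocol described above can be sketched as follows: a random 30% of the known ratings is overwritten with the missing-value marker 0, and the original values are kept aside for comparison with the reconstruction. The function name, the fixed seed and the toy data are assumptions made for reproducibility of the example.

```python
import random


def hide_ratings(known, hidden_fraction=0.3, seed=0):
    """Replace a random fraction of the known ratings with 0 (missing).

    Returns (train, held_out): `train` has the same keys as `known`, with the
    hidden ratings set to 0; `held_out` keeps the true values of those ratings.
    """
    rng = random.Random(seed)
    keys = sorted(known)
    hidden_keys = set(rng.sample(keys, round(hidden_fraction * len(keys))))
    train = {key: (0 if key in hidden_keys else value) for key, value in known.items()}
    held_out = {key: known[key] for key in hidden_keys}
    return train, held_out


# Hypothetical (user, item, context) -> rating entries.
ratings = {(u, i, 0): 4 for u in range(5) for i in range(2)}
train, held_out = hide_ratings(ratings)
```

After factorization, error metrics are computed only on the keys of `held_out`, comparing the reconstructed values with the hidden true ratings.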

67

Evaluation of explicit context dataset 2/2

[Bar chart: RMSE of CAMF (CAMF_C), DCW, Splitting Approaches (UI Splitting) and CP-WOPT; the one data label that survives is 1.017.]

68

Baseline without context

This experiment aims at creating a baseline for comparison with standard recommendation algorithms which do not exploit contextual information, so we want to use a 2D recommender.

For this purpose we run Mahout algorithms on the LDOS-CoMoDa dataset.

The Mahout recommender requires an input file or data. We use a CSV file in which the users’ ratings assigned under some contextual situations are stored.

We neglect contextual information.

We remove the cases in which the same item is rated under different contexts: we consider the first rating in temporal order, ignoring the others.

We rearrange the data as triplets: <id user, id item, rating>.
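The preparation of the 2D baseline data can be sketched as follows. The row layout (timestamp, user, item, context, rating) is an assumption: the slides only state that the first rating in temporal order is kept, so some ordering field must exist, but its name and position are hypothetical.

```python
def to_triplets(rows):
    """Drop the context and keep the earliest rating per (user, item) pair.

    `rows` are (timestamp, user, item, context, rating) tuples (assumed layout).
    Returns a list of <user, item, rating> triplets.
    """
    first = {}
    for _timestamp, user, item, _context, rating in sorted(rows, key=lambda row: row[0]):
        # setdefault keeps the rating already stored, i.e. the earliest one.
        first.setdefault((user, item), rating)
    return [(user, item, rating) for (user, item), rating in first.items()]


rows = [
    (3, "u1", "m1", "home", 2),    # later rating of the same item: ignored
    (1, "u1", "m1", "cinema", 5),  # earliest rating of (u1, m1): kept
    (2, "u2", "m1", "home", 4),
]
triplets = to_triplets(rows)
```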

69

Mahout algorithms compared

Some standard collaborative filtering algorithms are compared:
• Singular Value Decomposition
• Different algorithms based on several user similarity measures (Spearman Correlation, Pearson Correlation, Euclidean Distance, Tanimoto Coefficient)
• Algorithms based on item similarity (Log Likelihood, Euclidean Distance, Pearson Correlation)
• Slope One Recommender.

For user similarity we use neighborhoods of 10 users.

We use 60% of the data as training set and 40% as test set.

70

Experimental Evaluation 1/6

[Bar chart: MAE and RMSE for SVD, Pearson User Similarity, Euclidean User Similarity, Tanimoto User Similarity, Spearman User Similarity, Euclidean Item Similarity, Pearson Item Similarity, Tanimoto Item Similarity, LogLikelihood Item Similarity and SlopeOne.]

71

Experimental Evaluation 2/6

[Bar chart: P@5, R@5 and F-score@5 for SVD, Pearson User Similarity, Euclidean User Similarity, Tanimoto User Similarity, Spearman User Similarity, Euclidean Item Similarity, Pearson Item Similarity, Tanimoto Item Similarity, LogLikelihood Item Similarity and SlopeOne.]

72

Experimental Evaluation 3/6

[Bar chart: P@10, R@10 and F-score@10 for SVD, Pearson User Similarity, Euclidean User Similarity, Tanimoto User Similarity, Spearman User Similarity, Euclidean Item Similarity, Pearson Item Similarity, Tanimoto Item Similarity, LogLikelihood Item Similarity and SlopeOne.]

73

Experimental Evaluation 4/6

[Bar chart: P@20, R@20 and F-score@20 for the compared algorithms.]

74

Experimental Evaluation 5/6

[Bar chart: P@50, R@50 and F-score@50 for SVD, Pearson User Similarity, Euclidean User Similarity, Tanimoto User Similarity, Spearman User Similarity, Euclidean Item Similarity, Pearson Item Similarity, Tanimoto Item Similarity, LogLikelihood Item Similarity and SlopeOne.]

75

Experimental Evaluation 6/6

In general, the low values are due to the fact that the methodology used for evaluating the ranked item lists includes unrated items in the test set.

These items are tagged as not relevant, therefore leading to likely underestimated performance compared to a situation where all ratings are available.

This is not a problem in our evaluation, since the goal is just to compare algorithms, and performance is equally underestimated for all of them.

As baselines we select the Spearman User Similarity algorithm, which gave the lowest error, and the Euclidean User Similarity algorithm, which gave the best accuracy.

76

CW Evaluation: Preliminary Phase

LDOS-CoMoDa dataset: d = 19 contextual features.

Users’ ratings with context information are stored in a CSV file.

We use 70% of the ratings, replacing the remaining 30% of known ratings with zero values. This 30% is randomly chosen.

CW Proposed Solutions

Reduced Tensor

77

CWBPA Evaluation 1/2

This experiment is performed to test the 2 proposed solutions, CWBPA and CWAIC, for context weighting. We apply the 2 methods to the LDOS-CoMoDa dataset to evaluate the standard metrics MAE, RMSE, accuracy, coverage, P and R.

Contingency table L=1

We compare the probability distribution obtained from the previous calculations with the probability distribution 1/K, K = number of context variables.

Divergence measure:

78

CWBPA Evaluation 2/2

79

CWAIC Evaluation

Contingency table L=1 for each context and each user.

For each table we calculate the χ² coefficient and Cramér’s index.

Threshold.

80

CWBPA Vs CWAIC

7 runs of the 2 algorithms:
• 4 for CWBPA
• 3 for CWAIC

We select the most significant contextual configurations.

81

CWBPA Vs CWAIC 1/2

[Bar chart: MAE and RMSE for Spearman User Similarity, Euclidean User Similarity, CWAIC, CWBPA and CP-WOPT.]

82

CWBPA Vs CWAIC 2/2

[Bar chart: P and R for the compared algorithms.]

83

CWBPA Vs CWAIC – All users 1/2

[Bar chart: MAE and RMSE for Spearman User Similarity, Euclidean User Similarity, CWAIC, CWBPA and CP-WOPT.]

84

CWBPA Vs CWAIC – All users 2/2

[Bar chart: P and R for Spearman User Similarity, Euclidean User Similarity, CWAIC, CWBPA and CP-WOPT.]

85

Result Analysis 1/2

• We evaluated the CP-WOPT algorithm as a possible solution to the missing values problem:
  • with a small dataset
  • on a MovieLens 100K subset we had good results, with a low error and a good coverage value: CP-WOPT is able to reconstruct the tensor leaving only a few values as missing data;

• On MovieLens: the results reached are in line with those known in the literature;

• CP-WOPT on the LDOS-CoMoDa dataset is better than other state-of-the-art recommendation algorithms;

• Neglecting the contextual information by using a regular 2D RS, the CF algorithms Spearman User Similarity and Euclidean User Similarity provided better performance.

86

Result Analysis 2/2

• CWBPA and CWAIC give different responses to the problem of context weighting;

• CWBPA and CWAIC are evaluated on the LDOS-CoMoDa dataset, showing their effectiveness;

• Using only some contextual variables leads to more precise recommendations;

• CWAIC has better performance than CWBPA.

87

Summary and Future Work

88

Recap

Information Overload

89

Recap

Recommender Systems

90

Recap

CF MF Tensors

TF - Context

Proposals:
• CP-WOPT
• CWBPA
• CWAIC

91

Recap – Experimental Evaluation

5 Evaluations to test:

• Effectiveness of CP-WOPT in RS;
• 2 proposed solutions for context weighting:
  • both approaches seem effective;
  • using only relevant contexts leads to better recommendations compared to a traditional 2D RS or to using all the contextual information available.

92

Future Work 1/3

LDOS-CoMoDa dataset experiment on all context available.

• 12 contextual variables in the LDOS-CoMoDa dataset;

• We used only 5 of them to reduce the computational effort;

• New extended evaluation of the Bayesian Probabilistic Approach and of the Association Index to minimize the dimensions of the tensor.

93

Future Work 2/3

Test on another contextual dataset.

We want to test CP-WOPT, CWBPA and CWAIC on other datasets having explicit contextual information such as:

• AIST Food dataset
• TripAdvisor dataset

to improve the significance of the results.

94

Future Work 3/3

A Real Application.

We want to implement a web-based system to acquire data and test our proposed solutions in a concrete scenario, such as:

Personalized Context-Aware Electronic Program Guides.

95

Publications

Most of the work presented is collected in the publications:

Giuseppe Ricci, Marco de Gemmis, Giovanni Semeraro. Matrix and Tensor Factorization Techniques applied to Recommender Systems: a Survey. International Journal of Computer and Information Technology (2277 – 0764), Volume 01 – Issue 01, September 2012.

Giuseppe Ricci, Marco de Gemmis, Giovanni Semeraro. Mathematical Methods of Tensor Factorization Applied to Recommender Systems. New Trends in Databases and Information Systems, 17th East European Conference on Advances in Databases and Information Systems, Volume 241, ISBN 978-3-319-01862-1, 2013, pp 383-388.

The results of the Experimental Evaluation are being prepared for submission.

96

Questions?

“In things which are absolutely indifferent there can be no choice and consequently no option or will.”

Gottfried Wilhelm von Leibniz
