CHAPTER 3

ANALYSIS OF EXISTING SYSTEM AND LIMITATION

3.1 REVIEW OF PREVIOUS RESEARCH FINDINGS AND COMPARATIVE STUDY

1. Fab: Content-based, collaborative recommendation

Author: M. Balabanović and Y. Shoham

Year: 1997

Approach: Content-boosted

Summary:

In this paper, the authors combine the CB and CF methods, a design that addresses the scalability problem. Both CB and CF have limitations when used alone. A pure content-based approach suffers from over-specialization and captures only certain aspects or features of a product, while a pure CF approach has its own disadvantages, such as the new item/user problem and data sparsity.

Fab System[30]: The process of recommendation can be partitioned into two steps:

1. Collection of items to form a manageable database or index, and

2. Subsequent selection of items from this database for particular users.

FAB has three main components:

1. Selection agent: searches pages for a specific user.

2. Collection agent: searches pages for a specific topic.

3. Central router.

Role of agents in FAB: collection agents send the pages they find to the central router. The central router forwards each page to the users whose profiles match it. The selection agent discards pages that the user has already seen.

After users have requested, received, and looked over their recommendations, they are required to rate them on a 7-point scale. Each rating is stored in the profile of the user's personal agent for further recommendation and is also forwarded to the profile of the collection agent. A collection agent's profile represents a topic of interest to a dynamically changing group of users, whereas a user's profile represents multiple interests, possibly served by several collection agents. For future work, the authors identify two main research issues: studying the effects of massively scaling up the number of users, and continuing the investigation of the dynamic processes involved.


2. Recommendation as classification: using social and content-based information in recommendation

Author: Chumki Basu, Haym Hirsh, and William Cohen

Approach: CF and content-based information

Year: 1998

Summary:

In this paper, the authors describe an inductive learning approach to recommendation that can use both ratings information and other types of information about each product in predicting user preferences. They use hybrid features [25] that combine elements of social and content-based information, which makes it possible to achieve more accurate recommendations.

3. Combining content-based and collaborative filters in an online newspaper

Author: Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., & Sartin, M.

Year: 1999

Application: Online Newspaper

Summary

The authors present a new approach that combines content filters with the depth of collaborative filtering. Pure collaborative filtering suffers from the early-rater problem, the sparsity problem, and the gray sheep problem. CB does not suffer from these problems, but it has its own disadvantage: it has difficulty distinguishing between high-quality and low-quality content. Further, as the number of items in a content category increases, the effectiveness of the CB approach decreases. The approach combines the collaborative filtering prediction with the content-based prediction using a weighted average. CF gives inaccurate results when history data is lacking, and CB gives inaccurate results when users have not specified explicit keywords. The experimental study shows that the combination of CB and CF is more accurate (lowest MAE) [26] than either pure CB or pure CF.

In future work, this approach can be extended with a demographic approach to improve accuracy, and the prediction strength of CF can be studied further.
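The weighted-average combination described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `weighted_hybrid`, the parameter `cf_weight`, and the rule of falling back to the other filter when one has no prediction are all assumptions made for the example.

```python
def weighted_hybrid(cf_pred, cb_pred, cf_weight=0.5):
    """Blend a CF and a CB rating prediction with a weighted average.

    Either prediction may be None when that filter has nothing to offer
    (for example, CF with no rating history). In that case the other
    prediction is used alone; this fallback is an assumption of the sketch.
    """
    if not 0.0 <= cf_weight <= 1.0:
        raise ValueError("cf_weight must lie in [0, 1]")
    if cf_pred is None and cb_pred is None:
        return None
    if cf_pred is None:
        return cb_pred
    if cb_pred is None:
        return cf_pred
    return cf_weight * cf_pred + (1.0 - cf_weight) * cb_pred


# Example: CF predicts 4.0, CB predicts 2.0, both trusted equally.
blended = weighted_hybrid(4.0, 2.0, cf_weight=0.5)
```

In practice the weight would be tuned, for instance by choosing the value that minimizes MAE on held-out ratings.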


4. Combining collaborative filtering with personal agents for better recommendation

Author: Nathaniel Good, J. Ben Schafer, Joseph A. Konstan, Al Borchers,

Badrul Sarwar, Jon Herlocker, and John Riedl

Year: 1999

Summary

The authors show that a CF framework can be used to combine personal IF (information filtering) agents and the opinions of a community of users to produce better recommendations than either agents or users can produce alone.

Four key models are presented here [20-jour]:

1. Pure collaborative filtering using the opinions of other community members

2. A single personalized "agent" – a machine learning or syntactic filter

3. A combination of many "agents"

4. A combination of multiple agents and community member opinions

Four primary hypotheses are given below:

1. H1: The opinions of a community of users provide better recommendations than a

single personalized agent.

2. H2: A personalized combination of several agents provides better recommendations

than a single personalized agent.

3. H3: The opinions from a community of users provide better recommendations than a

personalized combination of several agents.

4. H4 : A personalized combination of several agents and community opinions provides

better recommendations than either agents or user opinions alone

The most important results they have found were the value of combining agents with CF and of

combining agents and users with CF. In essence, these results suggest that an effective

mechanism for producing high-quality recommendations is to throw in any available data and

allow the CF engine to sort out which information is useful to each user. In effect, it becomes

less important to invent a brilliant agent; instead we can simply invent a collection of useful

ones. To take advantage of learning agents, these engines must be redesigned to accommodate

"users" with dynamic rating habits.

They examined several different CF engine designs that could efficiently use filterbots. CF outperformed linear regression as a combining mechanism for agents. While linear regression should provide an optimal linear fit, it appears that CF's non-optimal mechanism actually does a better job of avoiding overfitting the data when the number of columns approaches the number of rows.

CF also has the advantage of functioning on incomplete (and indeed very sparse) data sets,

suggesting that it retains its value as a useful combination tool whenever human or agents are

unlikely to rate each item.

Future work should both incorporate larger user sets (other experiments have consistently shown MAE values in the range of 0.71-0.73 and ROC sensitivity values near 0.72 for MovieLens communities with thousands of users) and look explicitly at closer-knit communities to see whether a smaller but more homogeneous community would benefit more from collaborative filtering.

In the future, they plan to examine further combinations of users and agents in recommender

systems. In particular, they are interested in developing a combined community where large

numbers of users and agents co-exist. One question they hope to answer is whether users who

agree with each other would also benefit from the opinions of each other's trained agents.

5. PTV: Intelligent Personalized TV Guides

Title: PTV: Intelligent Personalised TV Guides

Author: Paul Cotter and Barry Smyth

Year: 2000

Summary

PTV represents a convergence of technologies that provides an effective solution to the

very real problem of providing people with relevant TV listings information as digital TV

becomes a reality. PTV personalises TV information to meet the viewing preferences of

individual users by integrating two different information-filtering strategies, case-based

reasoning and collaborative filtering, with user profiling techniques.

The resulting hybrid personalisation technique allows programme recommendations to be made according to the type of programmes a target user has enjoyed in the past as well as the programmes that other similar users have enjoyed. In the near future, a WAP version of PTV will be formally launched, and the authors believe a similar success story will unfold as mobile phone users recognise the real benefits of high-quality content personalisation on their restricted mobile handsets [27]. They argue that traditional TV listings services are not appropriate given the screen and bandwidth limitations of the current generation of WAP devices, and that a personalised service such as PTV is the best available solution. The PTV systems are built around a content personalisation engine that can be readily adapted to practically any source of information content, and Changing Worlds is currently using this technology to develop the next generation of intelligent, personalised information services.

6. Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments

Author: Alexandrin Popescul, Lyle H. Ungar, David M. Pennock, Steve Lawrence

Year: 2001

Summary

Authors have proposed a unified probabilistic framework for merging collaborative and

content-based recommendations.

This model incorporates three-way co-occurrence data by presuming that users are

interested in a set of latent topics which in turn “generate” both items and item content

information. Model parameters are learned using expectation maximization (EM), so the

relative contributions of collaborative and content-based data are determined in a sound

statistical manner.

The paper presents three probabilistic mixture models for recommending items based on collaborative and content-based evidence merged in a unified manner. Incorporating content into a collaborative filtering system can increase the flexibility and quality of the recommender. Moreover, when data is extremely sparse, as is typical in many real-world applications, additional content information seems almost necessary to fit global probabilistic models at all.

The authors found that a particularly good way to include content information in a document recommendation system is to treat users as reading the words of a document rather than the document itself. In their case, this increased the density from 0.38% to almost 9% [12], resulting in recommendations superior to ANN.

There are many areas for future research. Similar methods to those presented here might be

used to recommend items such as movies which have attributes other than text. A movie


can be viewed as consisting of the director and the actors in it, just as a document contains

words. Both of the sparsity reduction techniques, similarity-based smoothing and an

equivalent of a user-words aspect model, can be used.

EM is guaranteed to reach only a local maximum of the training data log-likelihood, so multiple restarts are needed if a higher-quality model is desired. The authors plan to investigate ways to seed EM intelligently to reduce the need for multiple restarts, which can be costly when fitting datasets of non-trivial size.

The user-words model does not explicitly use the popularity of items. Including such

information may further improve the quality of the recommendations made by the model,

but requires additional work on combining and calibrating model predictions with

document popularity.

7. Content-boosted collaborative filtering for improved recommendations

Title: Content-Boosted Collaborative Filtering for Improved Recommendations

Author: P. Melville, R.J. Mooney, and R. Nagarajan

Year: 2002

Summary

The authors merge content-based and collaborative filtering in a hybrid manner. The approach overcomes disadvantages of CF systems by exploiting the content information of the items already rated, so it can produce recommendations even on sparse data and thereby addresses the cold-start problem. A missing rating history for some products is also not an issue, because the approach deals efficiently with sparse data.

CBCF Approach:

1. The content predictor (CP) takes the sparse user rating data as input.

2. The CP converts the sparse rating matrix into a full rating matrix (pseudo user-ratings vectors).

3. The pseudo user-ratings vectors are provided as input to CF.

4. CF produces the top recommendations.

The experimental study shows that CBCF performs better than pure CB and pure CF algorithms. It also addresses the first-rater problem and tackles data sparsity. As a point for future work, although CBCF performs consistently better than pure CF, the difference in performance is not very large (4%) [2].
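The four CBCF steps can be illustrated with a small, self-contained Python sketch. It is a simplification under stated assumptions and not the algorithm as published: the content predictor here is a hypothetical stand-in (the mean of the user's ratings on items that share a genre), the `genres` mapping is invented for the example, and the CF step is a plain Pearson-weighted neighbourhood prediction over the dense pseudo-ratings.

```python
from math import sqrt


def content_predict(user_ratings, item, genres):
    """Stand-in content predictor (an assumption, not the paper's classifier):
    mean of the user's ratings on items sharing a genre with `item`,
    falling back to the user's overall mean rating."""
    similar = [r for other, r in user_ratings.items()
               if genres[other] & genres[item]]
    pool = similar or list(user_ratings.values())
    return sum(pool) / len(pool)


def pseudo_ratings(ratings, genres):
    """Steps 1-2: keep actual ratings, fill gaps with content predictions."""
    items = sorted(genres)
    return {
        user: {i: rated[i] if i in rated else content_predict(rated, i, genres)
               for i in items}
        for user, rated in ratings.items()
    }


def pearson(x, y, items):
    """Pearson correlation of two dense rating rows over `items`."""
    mx = sum(x[i] for i in items) / len(items)
    my = sum(y[i] for i in items) / len(items)
    num = sum((x[i] - mx) * (y[i] - my) for i in items)
    den = sqrt(sum((x[i] - mx) ** 2 for i in items)
               * sum((y[i] - my) ** 2 for i in items))
    return num / den if den else 0.0


def cbcf_predict(ratings, genres, user, item):
    """Steps 3-4: neighbourhood CF computed on the pseudo-rating matrix."""
    dense = pseudo_ratings(ratings, genres)
    items = sorted(genres)

    def mean(u):
        return sum(dense[u].values()) / len(items)

    num = den = 0.0
    for other in dense:
        if other == user:
            continue
        w = pearson(dense[user], dense[other], items)
        num += w * (dense[other][item] - mean(other))
        den += abs(w)
    return mean(user) + (num / den if den else 0.0)


ratings = {"ann": {"m1": 5, "m2": 1},
           "bob": {"m1": 4, "m3": 2},
           "cat": {"m2": 2, "m3": 5}}
genres = {"m1": {"action"}, "m2": {"drama"}, "m3": {"drama"}}
prediction = cbcf_predict(ratings, genres, "ann", "m3")
```

Because every row of the pseudo-rating matrix is dense, the similarity between any two users is defined even when they have no co-rated items, which is the point of the content-boosting step.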


8. Clustering Approach for Hybrid Recommender System

Author: Qing Li, Byeong Man Kim

Year: 2003

Summary:

The authors present clustering techniques to solve the cold-start problem, which mainly occurs when products are recommended using CF. CF cannot recommend new items that have no past ratings to users, and it completely ignores any information that can be extracted from the content of items. The authors therefore propose a hybrid system that combines CB and CF.

ICHM (item-based clustering hybrid model) [28]:

1. Apply a clustering algorithm to group the items, then use the result, which is represented by a fuzzy set, to create a group-rating matrix.

2. Compute the similarity: first calculate the sub-similarity from the group-rating matrix, then calculate the sub-similarity from the item-rating matrix. The total similarity is the linear combination of the two.

3. Make a prediction for an item by performing a weighted average of deviations from the neighbour's mean.

After the new rating matrix has been created by grouping the items, the next step is to calculate the similarity. The Pearson correlation-based algorithm is used on the item-rating matrix, and the adjusted cosine algorithm is used to calculate similarity from the group-rating matrix. The total similarity is then a linear combination of the two. The formulas are listed below:

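Pearson correlation-based similarity:

$$S(a,u) = \frac{\sum_{i}(r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i}(r_{a,i} - \bar{r}_a)^2 \sum_{i}(r_{u,i} - \bar{r}_u)^2}}$$

Adjusted cosine similarity:

$$S(a,u) = \frac{\sum_{i}(r_{a,i} - \bar{r}_i)(r_{u,i} - \bar{r}_i)}{\sqrt{\sum_{i}(r_{a,i} - \bar{r}_i)^2}\,\sqrt{\sum_{i}(r_{u,i} - \bar{r}_i)^2}}$$

Here $r_{a,i}$ and $r_{u,i}$ are the ratings that $a$ and $u$ hold at position $i$ of the rating matrix, $\bar{r}_a$ and $\bar{r}_u$ are the mean ratings of $a$ and $u$, and $\bar{r}_i$ is the mean of all ratings at position $i$.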

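In the last step, collaborative prediction is used to predict the top-N products. The general formula for a prediction on item i for user k is:

$$P_{k,i} = \bar{r}_k + \frac{\sum_{u} S(k,u)\,(r_{u,i} - \bar{r}_u)}{\sum_{u} |S(k,u)|}$$

where the sums run over the neighbours $u$, $S(k,u)$ is the total similarity, and $\bar{r}_k$ and $\bar{r}_u$ are mean ratings.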

For items that have no rating history, predictions can still be made for users based on the group-rating matrix.
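The similarity and prediction steps of this section can be sketched as follows. This is a hedged illustration rather than the ICHM implementation: the combination weight `c` and the function names are assumptions, and for brevity Pearson correlation stands in for both sub-similarities.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length rating vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def total_similarity(sim_rating, sim_group, c=0.4):
    """Linear combination of the item-rating and group-rating
    sub-similarities; the weight c is a hypothetical tuning parameter."""
    return (1 - c) * sim_rating + c * sim_group


def predict(target_mean, neighbours):
    """Weighted average of deviations from each neighbour's mean.

    neighbours: list of (similarity, neighbour_rating, neighbour_mean).
    """
    den = sum(abs(s) for s, _, _ in neighbours)
    if den == 0:
        return target_mean
    num = sum(s * (r - m) for s, r, m in neighbours)
    return target_mean + num / den


# A target with mean 3.0 and two neighbours of similarity 1.0 and 0.5.
estimate = predict(3.0, [(1.0, 5.0, 4.0), (0.5, 2.0, 3.0)])
```

For an item with no ratings, `sim_rating` is unavailable, so only the group-rating term contributes; that is how the hybrid keeps working in the cold-start case.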

9. CinemaScreen Recommender Agent: Combining Collaborative and Content-Based Filtering

Author: James Salter and Nick Antonopoulos

Year: 2006

Summary:

CRA system:

Collaborative filtering first:

The process first finds a subset of users with film tastes similar to those of the current user. By comparing the current user's rating history with the history of every other user, the system finds the current user's potential peers, that is, other users who have rated films the current user has rated (using Pearson's product-moment correlation coefficient, r).

Although this approach might generate a larger set of films for making recommendations, it would likely also reduce the prediction accuracy. To make its predictions, the collaborative filtering process uses the peer ratings and gives a weighted average to each film according to the strength of each peer's correlation with the current user.

Once all calculations are complete, the agent stores the list of films and predicted ratings. The

system also stores the number of significant peers who rated the film because it gives an

indication of the potential recommendation’s strength. The system can therefore use this

number as a secondary sorting field when it produces recommendation lists. The system then

feeds the predicted ratings into the content-based filtering algorithms.

Content-based filtering on collaborative results [29]

The authors designed the content-based filtering to use information about each film with a content-based rating as input to the process of finding links to other similar films. There are several ways to find links; the authors used a simple scoring mechanism in which the agent adds the film's rating (either predicted or user-given) to the score for each film element.


Once it completes this process for all ratings, the agent calculates the average score for each

actor, director, and genre. This score indicates how much the user likes or dislikes each

element. The agent can then compute the predicted rating for each film. In a process similar to

that for finding links, the element’s average score is added to the film’s score. System

administrators who are configuring the recommender system can also assign weights to the

elements. The agent can then compute the predicted rating by dividing the film’s total score by

the number of elements used to calculate it. The agent can augment the list of films and

predicted ratings with any predictions that resulted from the initial collaborative-filtering

process but didn’t appear in the final prediction set (because of incomplete film information in

the database).
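The element-scoring mechanism described above lends itself to a short sketch. The data layout (element strings such as `"dir:A"`), the function names, and the way weights enter the average (dividing by the total weight, which reduces to dividing by the number of elements when all weights are 1) are assumptions made for illustration, not details confirmed by the source.

```python
from collections import defaultdict


def element_scores(rated_films, film_elements):
    """Average rating per element (actor, director, genre) over rated films."""
    totals, counts = defaultdict(float), defaultdict(int)
    for film, rating in rated_films.items():
        for element in film_elements.get(film, ()):
            totals[element] += rating
            counts[element] += 1
    return {e: totals[e] / counts[e] for e in totals}


def predict_film(film, film_elements, scores, weights=None):
    """Weighted mean of the known element scores for a film.

    Returns (predicted_rating, number_of_elements_used); the count can
    serve as a secondary sort key. Returns (None, 0) if nothing is known.
    """
    weights = weights or {}
    used = [e for e in film_elements.get(film, ()) if e in scores]
    if not used:
        return None, 0
    total_weight = sum(weights.get(e, 1.0) for e in used)
    total_score = sum(weights.get(e, 1.0) * scores[e] for e in used)
    return total_score / total_weight, len(used)


rated = {"f1": 5, "f2": 3}            # predicted or user-given ratings
elements = {
    "f1": ["dir:A", "genre:X"],
    "f2": ["dir:A", "genre:Y"],
    "f3": ["dir:A", "genre:Y", "actor:Z"],
}
scores = element_scores(rated, elements)
rating, strength = predict_film("f3", elements, scores)
```

Elements the user has never encountered (here `actor:Z`) are simply skipped, so the strength value shows how much evidence stands behind each prediction.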

The agent also records the number of elements for each film as an indicator of the prediction’s

strength, again so it can use the information as a secondary sort field when it creates

recommendation lists.

In future work, the authors want to enable the assignment of different weightings to each filtering technique's results according to certain parameters. The order of the algorithms could also be reversed. They also want to apply and evaluate the hybrid recommendation method in other domains and emerging technologies.

10. Hybrid collaborative filtering algorithms using a mixture of experts

Author: Xiaoyuan Su, Russell Greiner, Taghi M. Khoshgoftaar, Xingquan Zhu

Year: 2007

Summary:

The authors propose two hybrid CF algorithms, sequential mixture CF and joint mixture CF. These approaches perform well in sparse-data environments. A hybrid recommender system combines CF and content-based techniques to overcome the limitations of either recommender system and thereby improve recommendation performance. One shortcoming of hybrid recommender systems is that content information is not always available, for reasons such as privacy protection.

Sequential mixture CF (SMCF) [30] first uses the predictions from a TAN-ELR content-based predictor (TAN: tree augmented naive Bayes network; ELR: extended logistic regression) instead of NB to fill in the missing values of the CF rating matrix and form a pseudo rating matrix. It then predicts user ratings by applying the Pearson CF algorithm, instead of weighted Pearson CF, to the pseudo rating matrix. It is similar to the content-boosted CF algorithm.

Fig 3.1: SMCF

Joint mixture CF (JMCF) [11] combines the predictions from three independent experts: Pearson correlation-based CF, a pure TAN-ELR content-based predictor, and a pure TAN-ELR model-based CF algorithm.

11. HYDRA-A hybrid recommendation system

Author: Stephan Spiegel, Jérôme Kunegis, Fang Li

Year: 2009

Summary:

The authors combine the CB and CF approaches and utilize supplementary content features in order to improve prediction accuracy. In HYDRA, data normalization, feature combination, and matrix factorization are all preliminary steps to rating estimation.

Data normalization:

Subtractive normalization: some users tend to give higher ratings than others, and some items receive more positive feedback than other items. To compute accurate rating predictions, these global effects need to be removed from the data.
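One common way to remove such global effects is to subtract the overall mean, then a per-user offset, then a per-item offset. The sketch below follows that scheme; the order of the steps and all names are assumptions for illustration, and the source does not state that HYDRA computes the offsets exactly this way.

```python
from collections import defaultdict


def _group_means(values, index):
    """Mean of `values` grouped by one component of the (user, item) key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, v in values.items():
        sums[key[index]] += v
        counts[key[index]] += 1
    return {k: sums[k] / counts[k] for k in sums}


def subtract_global_effects(ratings):
    """ratings: dict mapping (user, item) -> rating.

    Returns (residuals, global_mean, user_bias, item_bias) such that
    rating = global_mean + user_bias[user] + item_bias[item] + residual.
    """
    mu = sum(ratings.values()) / len(ratings)
    centred = {k: r - mu for k, r in ratings.items()}
    user_bias = _group_means(centred, 0)
    no_user = {k: v - user_bias[k[0]] for k, v in centred.items()}
    item_bias = _group_means(no_user, 1)
    residuals = {k: v - item_bias[k[1]] for k, v in no_user.items()}
    return residuals, mu, user_bias, item_bias


ratings = {("u1", "a"): 5, ("u1", "b"): 3,
           ("u2", "a"): 3, ("u2", "b"): 1}
residuals, mu, user_bias, item_bias = subtract_global_effects(ratings)
```

In this toy data, user `u1` rates one point above average and item `a` is rated one point above average, so after both offsets are removed the residuals are all zero.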

Multiplicative normalization:

With subtractive normalization alone, the feature values would just be shifted instead of being regularized. Multiplicative normalization is therefore used, which regularizes all entries within a feature matrix F according to their respective row and column length.

Feature combination:

The purpose is to identify the features, and appropriate weights for them, that can improve the prediction accuracy of the hybrid recommender system.


Matrix factorization:

Matrix factorization techniques are used to reduce the dimension of the item space and/or to

retrieve latent relations between items of the observed dataset.

The HYB-SVD-KNN algorithm [13] (also referred to as the HYDRA system) is able to raise prediction accuracy by incorporating weighted user and item features. The pure SVD approach requires the lowest computational effort, and the hybrid method is about four times faster than traditional collaborative filtering (KNN approach).

The hybrid approach is special in that rating data and content information are joined in a unified model, which leads to fewer parameters and more reasonable prediction results. To minimize the runtime of the hybrid recommender system and to extract latent user and movie relations, the unified model is factorized by means of singular value decomposition. The dimensionally reduced data can be employed to estimate unknown ratings directly (pure SVD approach) or to accelerate collaborative filtering (SVD-KNN and HYB-SVD-KNN algorithms).

As future work, it would be interesting to apply this algorithm to datasets from different domains, because different content features might achieve even higher improvements in prediction accuracy.

12. A hybrid Recommendation Method with Reduced Data for large Scale

Author: Sang Hyun Choi, Young-Seon Jeong, and Myong K. Jeong

Year: 2010

Summary:

The authors propose the HYRED algorithm, which combines CF using the modified Pearson's binary correlation coefficients with CB filtering using the generalized distance-to-boundary-based rating.

First, HYRED [32] uses the concept of neighbourhood in CF to analyse the transaction data efficiently. Using the nearest and farthest neighbours of a target customer yields a reduced dataset of useful information, which addresses the scalability problem. The training dataset is restricted to the items purchased by a target user and his or her farthest neighbours, so that the amount of training data is reduced considerably. At the testing step, only the items purchased by the nearest neighbours are found and scored. Using less training and testing data not only lessens the computing effort but also improves the performance of recommendations.

Filtering irrelevant data with the neighbourhood concept of CF makes it possible to consider only the items that are likely to be purchased by a target user. Second, the authors propose a generalized rating system based on the distance of an item to the decision boundary of a classifier. In this concept, an item closer to the class of purchased items may have a higher probability of being sold. The experiment shows that the DTB (distance-to-boundary) based rating improves recommendation performance over both pure CB and pure CF. The algorithm calculates the distance from alternative items to the ones purchased by a target user, where the items are selected by statistical classifiers. This selection method uses neighbourhood information and delivers better performance than pure CF.

Finally, the authors propose a generalized hybrid recommendation algorithm that uses a weighted coefficient, of which the DTB and CF methods are special cases. The weighting scheme makes the algorithm suitable for general applications, and HYRED is flexible enough to be applied to any available dataset. Moreover, when the weighting is set properly, HYRED yields better results than pure DTB, pure CF, and a simple combined hybrid method.

13. Enhancing Accuracy of Recommender system through adapting the domain trends

Author: Fatih Aksel, Aysenur Birtürk

Year: 2010

Summary:

The authors propose an adaptive method for hybrid recommender systems, AdaRec, in which the combination of algorithms is learned and dynamically updated from the results of previous predictions.

AdaRec, an adaptive hybrid recommender system [33]:

AdaRec is a hybrid recommendation system that combines multiple algorithms and defines a switching behaviour (strategy) among them. This strategy decides which technique to choose under which circumstances for a given prediction request. AdaRec consists of two parts: the Recommendation Engine and the Learning Module. The Recommendation Engine is responsible for generating predictions for items based on the previous user profiles and item contents, using its attached prediction strategy. On each learning cycle, the Learning Module creates a new prediction strategy from the previous instances and the performance results of the prediction techniques.

The Learning Module first tests the accuracy of each predictor in the system. Then it redesigns the prediction strategy in order to improve the use of the predictors. The adaptive prediction strategy improves its prediction accuracy by learning better when to use which predictor. In this way, the Learning Module adapts the hybrid recommender system to the current characteristics of the domain.
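The idea of a switching strategy that is re-learned from predictor performance can be illustrated with a deliberately simple sketch. It is not AdaRec's Learning Module: the class name `SwitchingHybrid`, the sliding `window` of recent errors, and the rule "use the predictor with the lowest recent mean absolute error" are all assumptions chosen to keep the example short.

```python
class SwitchingHybrid:
    """Choose, per request, the predictor with the lowest recent error."""

    def __init__(self, predictors, window=50):
        self.predictors = dict(predictors)   # name -> callable(user, item)
        self.window = window                 # number of recent errors kept
        self.errors = {name: [] for name in self.predictors}

    def _recent_mae(self, name):
        errs = self.errors[name]
        return sum(errs) / len(errs) if errs else 0.0

    def predict(self, user, item):
        """Return (chosen_predictor_name, predicted_rating)."""
        best = min(self.predictors, key=self._recent_mae)
        return best, self.predictors[best](user, item)

    def learn(self, user, item, actual):
        """Learning cycle: score every predictor against user feedback."""
        for name, predictor in self.predictors.items():
            errs = self.errors[name]
            errs.append(abs(predictor(user, item) - actual))
            del errs[:-self.window]          # keep only the latest errors


# Two toy predictors standing in for a CF and a CB technique.
hybrid = SwitchingHybrid({"cf": lambda u, i: 4.0, "cb": lambda u, i: 2.0})
before = hybrid.predict("u1", "item1")
hybrid.learn("u1", "item1", actual=2.0)      # feedback favours "cb"
after = hybrid.predict("u1", "item1")
```

Because only recent errors are kept, the choice of predictor can change again when user preferences drift, which mirrors the motivation given for an adaptive strategy.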

The research study shows that traditional static hybrid recommender systems suffer from changing user preferences. To improve recommendation performance, the approach handles domain drift: the Learning Module redesigns its prediction (switching) strategy according to the performance of the prediction techniques, based on user feedback. As a result, the system adapts to the application domain, and recommendation performance increases as more data is accumulated.

In future work, the authors plan to test the Learning Module further with various heterogeneous datasets, and it would be interesting to examine different domains. In the experiments, the attributes used for domain monitoring were fixed. It would also be interesting to use dynamic attributes, which means using different attributes on different iterations.

14. A Content enhanced approach for cold-start problem in collaborative filtering

Author: Dongting Sun, Cong Li and Zhigang Luo

Year: 2011

Approach:

Summary:

The authors propose a hybrid algorithm that uses both rating and content information to overcome the cold-start problem (users and products without any ratings). The approach first clusters items based on the rating matrix and then uses the clustering results and item content information to build a decision tree that associates novel items with existing ones. In the cold-start problem, content information can help to bridge the gap between existing and new items, as well as between existing and new users, by building relationships among them. Content information can be combined with collaborative information in various ways.

The basis of the algorithm is to find the similarity among items accurately. There are various ways to compute the similarity; the most commonly used one is Pearson correlation.

IBCTAP algorithm [34]: IBCTAP includes four main procedures: item clustering, decision tree building, new item classification, and rating prediction.

Item clustering: Clustering reduces one large-dimensionality item-user space into a set of smaller-dimensionality spaces with fewer items, fewer ratings, and often fewer users. The most popular clustering method is K-means. The Pearson correlation coefficient is used to calculate the similarity between items. The output of this process is k clusters; the items in each cluster are liked by users with the same tastes.

Decision tree building: To obtain an optimized decision tree, the most commonly used criterion, information gain, decides which attribute is best to choose. The algorithm first calculates the entropy of the whole data set. It then calculates the information gain for every attribute and chooses the one with the highest information gain. After the root node has been decided, the algorithm creates two branches corresponding to true and false. For each branch, the algorithm determines whether the branch can be divided further. If it can, the same method is used to determine which attribute to split on; if not, the branch has reached a conclusion.
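The attribute-selection step can be made concrete with the standard entropy and information-gain definitions. In the sketch below the rows are hypothetical boolean content attributes of items and the labels are the cluster ids produced by item clustering; the attribute names and data are invented for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on a boolean attribute."""
    total = len(labels)
    remainder = 0.0
    for value in (True, False):
        subset = [lab for row, lab in zip(rows, labels)
                  if row[attribute] == value]
        if subset:
            remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Attribute with the highest information gain (the next tree node)."""
    return max(rows[0], key=lambda a: information_gain(rows, labels, a))


# Four items described by two boolean content attributes.
rows = [
    {"action": True, "long": True},
    {"action": True, "long": False},
    {"action": False, "long": True},
    {"action": False, "long": False},
]
clusters = [0, 0, 1, 1]          # cluster id of each item
root = best_attribute(rows, clusters)
```

Here `action` separates the two clusters perfectly while `long` carries no information, so `action` is chosen for the root; the same selection is then repeated inside each branch.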

New item classification: When a new item without any ratings enters the recommender system, the algorithm captures the item's content information immediately and follows it down the tree trained in the tree-building procedure. The decision tree answers each question in turn, and the new item eventually arrives at a cluster.

Rating prediction: In the traditional collaborative filtering approach, it is hard to recommend new items to users because a new item has no past rating data. This approach, however, can recommend new items based on the hypothesis that a new item will be preferred by the users who like the items in the cluster at which it arrived in the classification procedure. The MAE metric is used to evaluate the accuracy of the prediction method. MAE (mean absolute error) has been widely used to evaluate the accuracy of recommender systems by comparing predicted values with user-provided values.


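$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |p_i - r_i|$$

where $p_i$ is the predicted value, $r_i$ is the user-provided value, and $N$ is the number of predictions compared.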
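Assuming the usual definition of MAE as the mean absolute difference between predicted and user-provided ratings, the metric takes only a few lines to compute. The function name and the sample values are illustrative.

```python
def mean_absolute_error(predicted, actual):
    """Mean of |predicted - actual| over paired ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)


# Three predictions compared with the ratings users actually gave.
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])
```

A lower value means that predictions lie closer to the users' own ratings.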

In the experimental study, CF gives an unexpectedly high MAE of 3.1502 in the extreme cold-start situation, while the IBCTAP algorithm gives an MAE of 0.8251, an error roughly four times lower (3.1502 / 0.8251 ≈ 3.8).

Future work will explore a mathematical function to measure the microcosm variety of the result curves. The authors also find that items and users are symmetric from the viewpoint of the rating data: if the item-user matrix is inverted and the user content information is used to build the decision tree, the model may be applied to solve the user-side cold-start problem.

Table 3.1: Comparative study

Sr.

No

Year Title Work Future Work Author Publication

1 1997 Fab: Content-

based,

collaborative

recommendation

FAB

Meta-Level

Content

Based into

CF

Massive

Scale up

Dynamic

process

Balabanovic,

M., &

Shoham

Communicati

ons of the

ACM, 40(3),

66-72.

2 1998 Recommendatio

n as

classification:

using social and

content-based

information in

recommendation

Feature

Combination

Develope

unifying

Model

needed to

develop

instrument

s for

adapting a

general

recommen

der system

to a

specific

case

Adaption

could be

simplified

Chumki

Basu, Haym

Hirsh, and

William

Cohen

In

Proceedings

of the 1998

Workshop on

Recommender

Systems,

pages 11-15

3 1999 Combining

content-based

and

collaborative

filters in an

online

newspaper

Mixed

appraoch

Combining

Separate

Recommende

r

Linear

Combination

rating

Add

Demographics

technique

Accuracy in

prediction

Claypool,

M., Gokhale,

A., Miranda,

T.,

Murnikov,

P., Netes,

D., & Sartin,

M

ACM

SIGIR'99

Workshop on

Recommender

Systems:

Algorithms

and

Evaluation,

Berkeley, CA.

4 1999 Combining

collaborative

filtering with

Add content

based

charachteristi

Further

Combination

user and

Good, N.,

Schafer, J.

B., Konstan,

Proceedings

of the

Sixteenth


personal agents

for better

recommendation

cs to

collorative

methods

agents

Scalability

J. A.,

Borchers,

A., Sarwar,

B.,

Herlocker,

J., & Riedl,

J.

National

Conference on

Artificial

Intelligence,

Orlando, FL,

pp. 439-446.

5 2000 PTV: Intelligent

Personalised TV

Guides

Mixed

Approach

WAP

version of

PTV

launched

Paul Cotter

& Barry

Smyth

IAAI-00

Proceedings.

Copyright ©

2000, AAAI

(www.aaai.or

g)

6 2001 “Probabilistic

models for

unified

collaborative

and

contentbased

recommendation

in sparse-data

environments

Unified

Model based

approach

reduce the

need for

multiple

restarts

Used where

it is actually

followed by

user

A. Popescul,

L. H. Ungar,

D. M.

Pennock,

and S.

Lawrence

Proceedings

of the 17th

Conference in

Uncertainty in

Artificial

Intelligence

(UAI ’01), pp.

437–444

7 | 2002 | Content-boosted collaborative filtering for improved recommendations | Content within a collaborative framework | Improvements in collaborative filtering or content-based recommending | Work on accurate prediction | Melville, P., Mooney, R. J., & Nagarajan, R. | Proceedings of the Eighteenth National Conference on Artificial Intelligence (pp. 187-192). Menlo Park, CA / Cambridge, MA: AAAI Press / MIT Press.

8 | 2003 | Clustering Approach for Hybrid Recommender System | Work on cold start problem by clustering approach | ---- | Li, Q. & Kim, B. M. | Proc. of the IEEE/WIC International Conference on Web Intelligence, pp. 33-38

9 | 2006 | CinemaScreen Recommender Agent: Combining Collaborative and Content-Based Filtering | Content-based filtering on CF result | Reverse the order of algorithm | Apply on other domain | James Salter and Nick Antonopoulos | Intelligent Systems, IEEE, Volume: 21, Issue: 1

10 | 2007 | Hybrid collaborative filtering algorithms using a mixture of experts | Incorporating CF and content-based features | ----- | X. Su, R. Greiner, T. M. Khoshgoftaar, and X. Zhu | Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI ’07), pp. 645–649, Silicon Valley, Calif, USA

11 | 2009 | HYDRA - A hybrid recommendation system | Minimize runtime | Choose content accurate content features for accuracy | Apply on different attribute domain and check accuracy | Stephan S., Jerome K., Fang L. | CNIMK’09 ACM

12 | 2010 | A hybrid Recommendation Method with Reduced Data for large Scale | Accuracy is improved here over full data set | Apply for other recommendation domain | Sang hyun choi, young-seon, mayong | IEEE Transactions on Systems, Man and Cybernetics - Part C: Applications and Reviews, vol. 40, no. 5

13 | 2010 | Enhancing Accuracy of Recommender system through adapting the domain trends | Dynamically update the result | Effectiveness accuracy of other machine learning techniques, dynamic attributes | Faith a., Aysenur B. | PRSAT 2010 held in conjunction with RecSys 2010

14 | 2011 | A Content enhanced approach for cold-start problem in collaborative filtering | Solve cold start problem | Combine algorithm more accurately | Dongting S., Cong L., Zhigang L. | IEEE


3.2 RELATED WORK

Fab initially maintains user profiles of interest in web pages using content-based techniques, and uses CF techniques to identify profiles with similar tastes. It can then recommend documents across user profiles [35].

In the next approach, the CB and CF methods each produce a separate result, and their predictions are then combined [26].

Another approach [25] treats recommending as a classification task. It uses both user ratings and content features to produce recommendations. The authors use Ripper, a rule induction system, to learn a function that takes a user and a movie and predicts whether the movie will be liked or disliked. They combine collaborative and content information by creating features from both.

In another approach [36], the term-document matrix is multiplied by the user-ratings matrix to produce a content-profile matrix. Using Latent Semantic Indexing (LSI), a rank-k approximation of the content-profile matrix is computed. The term vectors of a user's relevant documents are averaged to produce that user's profile. New documents are then ranked against each user's profile in the LSI space.

In another approach [37], each user profile is represented by a vector of weighted words derived from positive training examples using the Winnow algorithm. Predictions are made by applying CF directly to the matrix of user profiles (as opposed to the user-ratings matrix).

Another approach [38] uses collaborative filtering along with a number of personalized information filtering agents. Predictions for a user are made by applying CF to the set of other users and the active user's personalized agents. Our method differs from this by also using CF on the personalized agents of the other users.

Another approach [39] implements a set of knowledge-based “filterbots” as artificial users that rate items using certain criteria. A straightforward example of a filterbot is a genrebot, which bases its opinion solely on the genre of the item. For example, a “jazzbot” would give a full mark to a CD simply because it is in the jazz category, while it would give a low score to any other CD in the database.
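The genrebot idea can be illustrated with a minimal sketch. The function name, the 1 to 5 rating scale and the CD identifiers below are assumptions made for illustration only; they are not taken from [39].

```python
# Hypothetical genrebot: an artificial user whose rating depends only on genre.
# The 1-5 scale and all names here are illustrative assumptions.
def genrebot_rating(item_genre, favourite_genre, full_mark=5, low_mark=1):
    """Return the full mark for the bot's own genre and a low mark otherwise."""
    return full_mark if item_genre == favourite_genre else low_mark


# Toy catalogue mapping a CD identifier to its genre.
cds = {"cd_a": "jazz", "cd_b": "rock", "cd_c": "jazz"}

# A "jazzbot" rates every CD in the catalogue.
jazzbot_ratings = {cd: genrebot_rating(genre, "jazz") for cd, genre in cds.items()}
```

The resulting ratings can be added to the user-ratings matrix as one more row, so the CF step can treat the bot like any other user.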

Another approach [40] uses the prediction from the CF system as the input to a content-based recommender.


Another work [41] proposes a Bayesian mixed-effects model that integrates user ratings, user features, and item features in a single unified framework.

Content-Boosted CF [42] uses naïve Bayes as the content classifier. It fills in the missing values of the rating matrix with the predictions of the content predictor to form a pseudo rating matrix, in which observed ratings are kept untouched and missing ratings are replaced by the predictions of the content predictor. It then makes predictions over the resulting pseudo ratings matrix using a weighted Pearson correlation-based CF algorithm, which gives a higher weight to items that more users rated and a higher weight to the active user.

Another approach [30] used TAN-ELR as the content predictor and applied the Pearson correlation-based CF directly, instead of a weighted one, to the pseudo rating matrix to make predictions. It achieved improved CF performance in terms of MAE.

Another approach [43] proposes a Bayesian preference model that statistically integrates several types of information useful for making recommendations, such as user preferences, user and item features, and expert evaluations. The authors use Markov chain Monte Carlo (MCMC) methods for sampling-based inference, which involves sampling parameter estimates from the full conditional distribution of the parameters. They achieved better performance than pure collaborative filtering.

3.3 HYBRID TECHNIQUE – CBCF

3.3.1 Introduction

Melville et al. proposed a content-boosted collaborative filtering algorithm (CBCF) to overcome the shortcomings of content-based and collaborative filtering algorithms when used individually [42]. With this approach, the authors addressed the sparsity and cold-start problems of collaborative filtering that were discussed in the previous chapter: they used content information to seed the user-ratings matrix. The shortcoming of content-based methods, that they do not find serendipitous recommendations, was also addressed using social information coming from the data set of a social network.

The CBCF algorithm was tested on movie recommendations. A pure content-based predictor, a naive Bayesian classifier, was used to learn each user's ratings and to predict ratings for the movies that user had not rated. The actual ratings of a user together with these predicted ratings are referred to as the pseudo user-ratings vector, and the vectors of all users form the pseudo user-ratings matrix. This matrix is passed to a pure collaborative filtering algorithm, which uses a neighbourhood-based method to find a subset of users who are similar to the active user. The neighbours chosen are those with the highest similarity to the active user, measured by the Pearson correlation coefficient. Finally, a prediction is computed from a weighted combination of the selected neighbours’ ratings.
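The pipeline described above can be sketched in a few lines. This is a simplified illustration, not the implementation of [42]: it uses a plain mean-centred, Pearson-weighted neighbourhood prediction and omits any additional weighting schemes, and all function names and the `None` convention for missing ratings are assumptions.

```python
from math import sqrt


def pseudo_ratings(actual, content_predictions):
    """Keep observed ratings; replace missing ones (None) with content-based predictions."""
    return [a if a is not None else c for a, c in zip(actual, content_predictions)]


def pearson(u, v):
    """Pearson correlation between two equally long rating vectors."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den = sqrt(sum((a - mean_u) ** 2 for a in u) * sum((b - mean_v) ** 2 for b in v))
    return num / den if den else 0.0


def predict(active, others, item, n_neighbours=2):
    """Predict the active user's rating for `item` from the most similar pseudo vectors."""
    mean_active = sum(active) / len(active)
    # Rank the other users by similarity and keep the top n as neighbours.
    scored = sorted(((pearson(active, v), v) for v in others),
                    key=lambda pair: pair[0], reverse=True)[:n_neighbours]
    # Weighted combination of the neighbours' deviations from their own mean.
    num = sum(sim * (v[item] - sum(v) / len(v)) for sim, v in scored)
    den = sum(abs(sim) for sim, _ in scored)
    return mean_active + num / den if den else mean_active
```

Because every pseudo vector is complete, `pearson` can be computed over all items rather than only over co-rated ones.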

3.3.2 Discussion

In this section we explain how content-boosted collaborative filtering overcomes some of the

shortcomings of pure CF[44].

Overcoming the First-Rater Problem

Pure CF suffers from the first-rater problem (the new-item problem): it cannot make a prediction for an item that has no past rating data. In CBCF such a prediction is possible with the help of the content-based predictor. If the neighbors of the active user are highly correlated with that user, then their CB predictions should also be very relevant to the user. This is particularly true if the neighbors have rated many more items than the active user, because their CB predictions are then likely to be more accurate than the active user’s. In this way, CBCF solves the first-rater problem and can produce even better predictions than the content-based predictor alone.

Tackling sparsity

In CBCF, since we use a pseudo ratings matrix, which is a full matrix, we eliminate the root

of the sparsity problem. Pseudo user-ratings vectors contain ratings for all items; and hence all

users will be considered as potential neighbors. This increases the chances of finding similar

users. Thus the sparsity of the user-ratings matrix affects CBCF to a smaller degree than CF.

Finding better neighbours

A crucial step in CF is the selection of a neighborhood. The neighbors of the active user entirely determine that user's predictions. It is therefore critical to select the neighbors who are most similar to the active user. In pure CF, the neighborhood comprises the users that have the best n correlations with the active user. The similarity between users is determined only by the ratings given to co-rated items, so items that have not been rated by both users are ignored. In CBCF, however, the similarity is based on the ratings contained in the pseudo user-ratings vectors, so users do not need to have a high overlap of co-rated items to be considered similar.


3.3.3 Comparison of Naïve Bayes with other Content Based technique

One technique adapted from IR is the assignment of weights to keywords. The most commonly used weighting scheme is term frequency-inverse document frequency (TF-IDF). Other techniques used for content-based recommendation include probabilistic models, such as Bayesian classifiers, and machine learning techniques such as artificial neural networks. These approaches generate predictions by learning an underlying model with statistical analysis and machine learning techniques [45]. This thesis uses a content-based method with Naïve Bayes.

Traditional (heuristic) methods are based on information retrieval, while the other, model-based methods calculate a utility prediction. This thesis requires rating predictions, so the naïve Bayes method is selected as the content-based predictor. Naïve Bayes classifiers are simple and effective, and they are often used in text classification applications and experiments. The Bayesian classifier is compared here with several standard machine learning algorithms, and the experimental evidence shows that it performs at least as well as these computationally more intensive alternatives [46].
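To make the choice of predictor concrete, the following is a minimal sketch of a naïve Bayes text classifier over a bag of words. It assumes a multinomial model with add-one (Laplace) smoothing and a single bag per item; the function names and the toy documents are hypothetical and are not the thesis implementation.

```python
from collections import Counter, defaultdict
from math import log


def train_nb(docs):
    """docs: list of (list_of_words, label). Returns the counts needed for prediction."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    return class_counts, word_counts, vocab


def predict_nb(model, words):
    """Return the label with the highest log posterior under the naive Bayes assumption."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = log(n_docs / total_docs)  # log prior
        for w in words:
            # Add-one smoothing keeps unseen words from zeroing the probability.
            score += log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: words describing movies a user liked or disliked.
model = train_nb([
    (["space", "alien", "laser"], "like"),
    (["alien", "robot"], "like"),
    (["romance", "wedding"], "dislike"),
])
```

Training is a single pass of counting and prediction is a sum of logarithms, which is why the method is fast for both learning and predicting.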

Nearest neighbor

The nearest neighbor algorithm operates by storing all examples in the training set. To

classify an unseen instance, it assigns it to the class of the most similar example. Since all of

the features we use are binary features, the most similar example is the one that has the most

feature values in common with a test example.
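A minimal sketch of this rule for binary features is shown below. The function name and the toy examples are assumptions for illustration.

```python
def nearest_neighbour(train, query):
    """train: list of (binary_feature_tuple, label).

    Returns the label of the stored example that shares the most
    feature values with `query`.
    """
    def matches(a, b):
        # Count the positions where both examples have the same value.
        return sum(x == y for x, y in zip(a, b))

    _, label = max(train, key=lambda example: matches(example[0], query))
    return label


# Hypothetical training set with four binary features per example.
train = [((1, 0, 1, 1), "hot"), ((0, 0, 0, 1), "cold")]
```

With this data, `nearest_neighbour(train, (1, 0, 1, 0))` compares the query with both stored examples and returns the label of the closer one.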

Decision trees

Decision tree learners such as ID3 build a decision tree by recursively partitioning examples

into subgroups until those subgroups contain examples of a single class. A partition is

formed by a test on some attribute (e.g., is the feature database equal to 0). ID3 selects the

test that provides the highest gain in information content.
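The selection criterion can be sketched as follows, assuming entropy-based information gain over binary attributes. Only the attribute-selection step is shown, not the recursive tree construction, and the attribute names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(examples, labels, attribute):
    """Reduction in entropy obtained by partitioning on `attribute`."""
    n = len(labels)
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [lab for ex, lab in zip(examples, labels) if ex[attribute] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(examples, labels, attributes):
    """Pick the test with the highest information gain, as ID3 does at each node."""
    return max(attributes, key=lambda a: information_gain(examples, labels, a))


# Hypothetical examples: does the word occur in the document (1) or not (0)?
examples = [
    {"database": 1, "network": 0},
    {"database": 1, "network": 1},
    {"database": 0, "network": 0},
    {"database": 0, "network": 1},
]
labels = ["hot", "hot", "cold", "cold"]
```

Here the test on `database` separates the classes perfectly, so it yields the full one bit of gain, while `network` yields none.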

Neural nets

We used two approaches to learning with neural nets. In the perceptron approach, there are no hidden units and the single output unit is trained with the delta rule. The perceptron is limited to learning linearly separable functions. We also used multi-layer networks trained with error backpropagation, with 12 hidden units in our experiments.
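The perceptron case can be sketched as below. This is an illustrative assumption rather than the setup of [46]: it uses a thresholded output unit with the error-driven update (weight change = learning rate × error × input), and the function names, epoch count and learning rate are hypothetical.

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """samples: list of (feature_tuple, target) with target in {0, 1}.

    One output unit, no hidden units; weights move in proportion
    to the output error on each example.
    """
    n_features = len(samples[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for x, target in samples:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if activation > 0 else 0
            error = target - output
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def classify(weights, bias, x):
    """Apply the trained unit to one feature tuple."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Logical AND is linearly separable, so a single unit can learn it.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
```

A function such as XOR is not linearly separable, so no setting of these weights classifies it correctly; that is the limitation the multi-layer network removes.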


Table 3.2 [46]: Average accuracy of the classification algorithm

The Bayesian classifier thus performs consistently well on most domains. It is also very fast for both learning and predicting.

3.3.4 Comparison of Pearson Correlation with other similarity measures in CF

A variety of similarity metrics is available [47]. Some of the most commonly used measures are the Pearson correlation coefficient, the cosine measure, the distance measure and the Jaccard coefficient.

The first measure that can be used is distance. The distance between two data objects combines the differences between each of their attributes; the Euclidean distance, for example, is the square root of the sum of the squared attribute differences.

Another way to determine the similarity between data objects is to measure how the attributes of both objects vary together around their mean values. This method of determining similarity is the Pearson correlation coefficient.

There may be cases in which the attributes of the data objects are not numbers but Boolean values. In that case a better metric is the ratio of the number of attributes the two objects share to the number of attributes present in either object, which is the Jaccard coefficient.

The cosine similarity may also be used; a typical example is document comparison. Using the word frequencies of each document, the normalized dot product of the frequency vectors serves as a measure of similarity.
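The four measures can be written down compactly. The sketch below uses only the standard library; the function names are illustrative, and the Jaccard coefficient is computed in its usual form for binary vectors (attributes present in both objects divided by attributes present in either).

```python
from math import sqrt


def euclidean_distance(a, b):
    """Square root of the sum of squared attribute differences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_correlation(a, b):
    """Covariance of the two vectors divided by the product of their spreads."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def jaccard_coefficient(a, b):
    """For binary vectors: attributes set in both, over attributes set in either."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def cosine_similarity(a, b):
    """Normalized dot product of two frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

For instance, `pearson_correlation((1, 2, 3), (2, 4, 6))` evaluates to 1 even though the second vector is on twice the scale, whereas the Euclidean distance between the same two vectors is not zero.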


Table 3.3: Various Similarity Measures

Similarity Measure | Used | Remarks
Euclidean Distance | To calculate the distance between two points | -
Cosine Similarity | To determine the similarity between two documents | Because two documents usually have many words in common, the other methods of calculating similarity (namely the Euclidean Distance and the Pearson Correlation Coefficient discussed earlier) are of little use here
Jaccard Coefficient | When each attribute is binary, such that each bit represents the absence or presence of a characteristic | -
Pearson Correlation Coefficient | To measure how highly correlated two variables are, on a scale from -1 to +1 | Used in user-based CF

Pearson correlation coefficient:

In CBCF, the full pseudo ratings matrix is provided as input to CF. It contains the ratings of movies given by users. In the CF step, a pure CF algorithm uses a neighborhood-based method to find a subset of users who are similar to the active user; the neighbors chosen are those with the highest similarity to the active user. To identify highly correlated users, the Pearson correlation coefficient is a suitable choice.

There are several benefits to using this type of metric. The first is that it remains accurate when the data are not normalized, because each user's ratings are centered on that user's own mean; it can therefore be used when the level of the quantities (i.e. scores) varies between users. Another benefit is that the Pearson correlation corrects for scaling within an attribute: multiplying one user's ratings by a positive constant leaves the coefficient unchanged.


3.4 EVALUATION MEASURES OF RS

Evaluation still presents several challenges and problems, summarized here[48]:

(1) Coverage

This corresponds to the percentage of items the system is able to recommend.

(2) Prediction accuracy

This measures the difference between the rating the system predicts and the real rating.

The most popular of this kind of metric is the mean absolute error (MAE).

$$\text{MAE} = \frac{1}{n} \sum_{i,j} \left| p_{i,j} - r_{i,j} \right|$$

where $n$ is the total number of ratings over all users, $p_{i,j}$ is the predicted rating for user $i$ on item $j$, and $r_{i,j}$ is the actual rating [49].

Other related metrics are the mean squared error (MSE) and the root mean squared error (RMSE):

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i,j} \left( p_{i,j} - r_{i,j} \right)^2}$$

where $n$ is again the total number of ratings over all users, $p_{i,j}$ is the predicted rating for user $i$ on item $j$, and $r_{i,j}$ is the actual rating. Because the errors are squared before averaging, RMSE amplifies the contribution of large absolute errors between the predictions and the true values [49].
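Both error metrics can be computed directly from paired lists of predicted and actual ratings. The sketch below is illustrative; the function names and the four sample ratings are assumptions.

```python
from math import sqrt


def mae(predicted, actual):
    """Mean absolute error over all (predicted, actual) rating pairs."""
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error; squaring weights large errors more heavily."""
    return sqrt(sum((p - r) ** 2 for p, r in zip(predicted, actual)) / len(actual))


# Hypothetical ratings: the absolute errors are 1, 0, 1 and 2.
predicted = [4, 3, 5, 1]
actual = [5, 3, 4, 3]
```

On this sample the MAE is 1.0, while the RMSE is the square root of 1.5 (about 1.22): the single error of 2 contributes four of the six squared-error units.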

(3) Classification accuracy and rank accuracy

This measures how well the system differentiates good items from bad ones. Examples of well-known metrics of this type are Precision, Recall and ROC [47]. These metrics are appropriate for the find-good-items task, especially when the preferences of the users are binary.

Table 3.4: Confusion matrix

 | Reality: Good | Reality: Bad
Predicted Good | True Positive | False Positive
Predicted Bad | False Negative | True Negative

All recommended items = True Positive + False Positive; all good items = True Positive + False Negative.


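Precision measures the accuracy of the recommendations produced by the system: the fraction of recommended items that are actually good.

$$\text{Precision} = \frac{\text{good items recommended}}{\text{all recommended items}} = \frac{TP}{TP + FP}$$

Recall measures how many of the good items the system manages to recommend: the fraction of all good items that appear among the recommendations.

$$\text{Recall} = \frac{\text{good items recommended}}{\text{all good items}} = \frac{TP}{TP + FN}$$

Here $TP$, $FP$ and $FN$ are the True Positive, False Positive and False Negative counts of the confusion matrix.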

ROC is a plot of the system's sensitivity against (1 - specificity), where sensitivity is the probability that a randomly selected good item is recommended by the system and specificity is the probability that a randomly selected bad item is rejected by the system.
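The classification metrics follow directly from the four confusion-matrix counts. The sketch below is illustrative; the function names and the sample counts are assumptions.

```python
def precision(tp, fp):
    """Share of recommended items that are actually good."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of good items that were recommended (also called sensitivity)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of bad items that were correctly rejected."""
    return tn / (tn + fp)


# Hypothetical counts: 8 items recommended (6 good, 2 bad), 12 rejected (4 good, 8 bad).
tp, fp, fn, tn = 6, 2, 4, 8
roc_point = (1 - specificity(tn, fp), recall(tp, fn))  # (false positive rate, sensitivity)
```

Each recommendation threshold yields one such point; sweeping the threshold traces out the ROC curve.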

Rank accuracy measures the ability of the system to sort the recommended items as the user would have done. In many cases, metrics of this kind are too sensitive, because they ask the system to recommend the best items when, in practice, it would suffice to recommend good items and not necessarily the best.

3.5 DATASET [44]

3.5.1 Overview

The experiments are evaluated on the EachMovie dataset, which is provided by the Compaq Systems Research Center. The subset used contains 7,893 randomly selected users and 1,461 movies for which content was available from the Internet Movie Database (IMDb). The reduced dataset has 299,997 ratings for 1,408 movies, and the average number of votes per user is approximately 38. The minimum rating value is zero and the maximum is 5.

The important point is whether the dataset is dense or sparse when the missing-data prediction process is carried out. The initial sparsity of the dataset is 97.4%.
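The sparsity figure can be checked from the counts given above. The sketch assumes that sparsity is defined as the fraction of empty cells in the user-by-movie matrix and that the matrix has 7,893 rows and 1,461 columns; the function name is illustrative.

```python
def sparsity(n_ratings, n_users, n_items):
    """Fraction of user-item cells that hold no rating."""
    return 1 - n_ratings / (n_users * n_items)


n_ratings, n_users, n_movies = 299_997, 7_893, 1_461

sparsity_percent = 100 * sparsity(n_ratings, n_users, n_movies)
average_votes_per_user = n_ratings / n_users
```

Under these assumptions `sparsity_percent` rounds to 97.4 and `average_votes_per_user` rounds to 38, which agrees with the figures reported for the dataset.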

To evaluate the prediction mechanism of the system, cross-validation was used; among the various cross-validation methods, the holdout method was preferred. Following this method, the data set was separated into two sets, called the training set and the testing set. The content information of every movie is represented as a set of slots (features). Each slot is represented simply as a bag of words. The slots used for the EachMovie dataset are: movie title, director, cast, genre, plot summary, plot keywords, user comments, external reviews, newsgroup reviews, and awards.

3.5.2 Result Analysis

This section discusses the results obtained with this experimental setup.

Four methods (pure CB, pure CF, naïve hybrid and CBCF) are applied to the dataset, and each yields its own MAE and ROC-4 values.

The naive hybrid approach takes the average of the ratings generated by the pure content-

based predictor and the pure CF predictor. Ten percent of the users were randomly selected to

be the test users. From each user in the test set, ratings for 25% of items were withheld.

Predictions were computed for the withheld items using each of the different predictors.
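The evaluation protocol and the naive hybrid can be sketched as follows. The data layout (a dictionary of per-user rating dictionaries), the function names, the fixed random seed and the toy data are assumptions made for illustration.

```python
import random


def holdout_split(user_ratings, test_user_fraction=0.10, withheld_fraction=0.25, seed=0):
    """Split {user: {item: rating}} into training ratings and withheld test ratings."""
    rng = random.Random(seed)
    users = sorted(user_ratings)
    n_test_users = max(1, round(len(users) * test_user_fraction))
    test_users = set(rng.sample(users, n_test_users))
    train, withheld = {}, {}
    for user in users:
        ratings = user_ratings[user]
        hidden = set()
        if user in test_users:
            # Withhold a fraction of this test user's rated items.
            items = sorted(ratings)
            n_hidden = max(1, round(len(items) * withheld_fraction))
            hidden = set(rng.sample(items, n_hidden))
            withheld[user] = {item: ratings[item] for item in hidden}
        train[user] = {item: r for item, r in ratings.items() if item not in hidden}
    return train, withheld


def naive_hybrid(cb_prediction, cf_prediction):
    """Average of the content-based and the collaborative prediction."""
    return (cb_prediction + cf_prediction) / 2


# Hypothetical toy data: 20 users who each rated the same 8 items on a 0-5 scale.
toy = {f"u{u}": {f"m{i}": (u + i) % 6 for i in range(8)} for u in range(20)}
train, withheld = holdout_split(toy)
```

Each predictor is then trained on `train` and scored on the ratings in `withheld`, for example with MAE.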

Table 3.5 [44]: Summary of Result

CBCF was significantly better than the other algorithms on both MAE (0.956) and ROC-4 (0.7716). On the MAE metric, CBCF performs 9.2% better than pure CB, 4% better than pure CF and 4.9% better than the naive hybrid.

On the ROC-4 metric, CBCF performs 5.4% better than pure CB, 4.6% better than pure CF and 9.7% better than the naive hybrid.

3.6 LIMITATION OF EXISTING SYSTEM

These systems can still suffer from scalability problems as the number of items and users increases exponentially [42]. Web services in particular struggle to produce recommendations of millions of items for millions of users. The time and computational power required can limit the performance of even the best hybrid systems.

Although CBCF performs consistently better than pure CF, the difference in performance is not very large (4%) [44].
