
http://www.iaeme.com/IJCET/index.asp

International Journal of Computer Engineering & Technology (IJCET) Volume 9, Issue 3, May-June 2018, pp. 115–127, Article ID: IJCET_09_03_013

Available online at

http://www.iaeme.com/IJCET/issues.asp?JType=IJCET&VType=9&IType=3

Journal Impact Factor (2016): 9.3590(Calculated by GISI) www.jifactor.com

ISSN Print: 0976-6367 and ISSN Online: 0976–6375

© IAEME Publication.

RECOMMENDATION OF MOVIES UTILIZING

REAL TIME USER INTEREST MODEL

Varsha

Department of Computer Science and Engineering,

Krishna Institute of Engineering & Technology Ghaziabad 201206, Uttar Pradesh

Seema Maitery

Professor, Department of Computer Science and Engineering,

Krishna Institute of Engineering & Technology Ghaziabad 201206, Uttar Pradesh

ABSTRACT

The large volume of information available today requires techniques or tools for efficient extraction of the required information. In this paper we propose a new technique for the recommendation of movies utilizing a real time user interest model. We also evaluate slope one and its variants, weighted slope one and bipolar slope one, which are currently popular recommendation algorithms used by most memory based recommendation systems. Limitations such as sparsity and cold start restrict the accuracy and performance of their predictions, and hence the quality of their recommendations. The algorithm proposed here improves on the existing slope one algorithm and increases its efficiency. It is also scalable and takes less memory space, as it reduces the item search scope by grouping users according to user similarities based on real time genre rating information. Results indicate that the R-slope one algorithm gives better performance than the other algorithms.

Keywords: Recommender System, Information extraction, weighted slope one and

bipolar slope one

Cite this Article: Varsha and Seema Maitery, Recommendation of Movies Utilizing Real

Time User Interest Model. International Journal of Computer Engineering & Technology,

9(3), 2018, pp. 115–127.

http://www.iaeme.com/IJCET/issues.asp?JType=IJCET&VType=9&IType=3

1. INTRODUCTION

Innovation in the internet and the use of ubiquitous devices have made it difficult to find the required useful information in the mass of information at hand. This large volume of information requires techniques or tools for efficient extraction of the required information. This is known as information filtering, which removes redundant and unwanted information from an information stream through automated or systematic methods before handing it to the users. Recommender systems are a subclass of information filtering systems that are used to predict the rating a user would give to an item.



1.1. Recommendation System

The recent advancement and development of the Internet has brought us more information than we are able to handle. We face many decision making problems in daily life, such as “Which movie should I watch? Which car should I buy? What is the best holiday place to go next with family? Which investment plan should I select for supporting the future education of my daughter? Which TV show should I follow? Which book should I buy next? Which degree and university should I apply for?” To resolve this information overload, various recommendation system algorithms have been devised to complement and guide the selection process, and multiple recommendation systems have been proposed to automate the operation of recommendation. Recommendation systems (RS) help to match users with items by easing access to relevant information amid information overload and by providing guidance similar to sales assistance. According to Xiao & Benbasat [13], these systems recommend web information, online items, and various types of entertainment media such as TV series and movies. Large-scale commercial applications of recommendation systems can be found on many e-commerce sites such as Amazon, Jabong, and Book my Show. As a result, the shift from visitor-to-customer communication to a peer-to-peer model has become a very important aspect of ubiquitous environments.

The factors that determine whether an RS is doing its job well are whether it can:

• Predict to what degree users like an item

• Give users a "good feeling" by guiding them in making a decision or selection

• Give users knowledge about the product domain

• Convince or persuade users by explaining why the selected product or service is recommended

• Increase "hit", "click", and "browsers to customers" rates

• Optimize sales margins and profit

1.2. Various Paradigms of Recommender System

1.2.1. Recommender systems reduce information overload by estimating relevance

Figure 1 Recommender systems reduce data overload



1.2.2. Personalized recommendations

Figure 2 Personalized Recommendation

1.2.3. Collaborative

"Tell me what's popular among my peers"

Figure 3 Collaborative Filtering

1.2.4. Content-based

"Show me more of the same that I've liked"

Figure 4 Content based Filtering



1.2.5. Knowledge-based

"Tell me what fits based on my needs"

Figure 5 Knowledge based Filtering

2. LITERATURE REVIEW

The authors of [1] presented an approach, named GRUPITO, to make recommendations for groups of people based on three important features: personality, social trust, and memory of previous recommendations [12, 13]. They created “Happy Movie”, a Facebook application for recommending movies to a group of application users. Although this application was initially developed for movie recommendation, the proposal is equally applicable to other domains. However, it does not take into account real time changes in user preferences.

As described in [15], researchers learn patterns of user interest in order to recommend information resources such as web articles and news. The authors explain the various kinds of data available for deciding whether a specific page should be recommended to a specific user or visitor. This data includes the contents of the web articles, the ratings given by the target visitor to other web pages and the contents of those pages, the ratings given to the page by other users, and the ratings of these other users on other pages. They illustrated how each available piece of data may be utilized individually, and introduced a novel method for merging the recommendations obtained from multiple different types of recommendation sources. The authors presented their approach in the context of recommending restaurants.

The work in [16] proposes a hybrid approach founded on content based filtering and CF, which is used to implement “Mo-Re”, a unique and efficient movie RS. The work further gives a comparative analysis of the hybrid approach against the core collaborative filtering and content based approaches.

Reference [17] presents a new, unified, and structured method that combines content based and collaborative filtering (CBCF) to rank items and recommend according to visitor interest. The architecture uses the complete set of available information by merging together various eLearning problems and employing a similarity approach between the input user-item key-value pairs.

CF algorithms are most prominent in the electronic commerce domain, where they deliver excellent customer experiences and assist customers in the purchase process by recommending products, including additional products similar to those the user is interested in. Purchasing goods over the internet is a popular trend, and various electronic commerce agents such as Amazon.in, Launch.com, Jabong.com, and Flipkart.com make intensive use of automated CF approaches. Paper [19] proposes an approach for comparing musical compositions, which gives an indication of how close two or more target musical parts are to each other. The work illustrates that a reasonable amount of similarity is found among musical compositions that fall into the same category or genre. In [20], the researchers provided an intelligent software component, called Traveller, which guides customers or browsers in the domain of travel and tourism. The application uses CF to recommend holiday and tour packages. The hybrid CF method takes advantage of the desirable features of each core CF technique and thus resolves the limitations that each of these core approaches exhibits when used individually.

3. PROPOSED WORK

3.1. Simple SLOPE ONE Approach

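The slope one method takes into account both information from other users who have rated the same movie and information from the other movies rated by the same user.

Given a training data set $\chi$, and any two movies $j$ and $i$ with ratings $u_j$ and $u_i$ respectively for some user $u$, let $S_{j,i}(\chi)$ be the set of users in $\chi$ who have rated both movies. We consider the average deviation of item $i$ with respect to item $j$ as:

$$\mathrm{dev}_{j,i} = \sum_{u \in S_{j,i}(\chi)} \frac{u_j - u_i}{\mathrm{card}(S_{j,i}(\chi))} \qquad (1)$$

Any user data set $u$ which does not contain both $u_j$ and $u_i$ is not included in the summation. The matrix defined by $\mathrm{dev}_{j,i}$, which satisfies $\mathrm{dev}_{i,j} = -\mathrm{dev}_{j,i}$, is computed only once and is easily updated when new data values are entered. Given that $\mathrm{dev}_{j,i} + u_i$ is a prediction for $u_j$ given $u_i$, the slope one predictor can be taken as the average of all these predictions [23]:

$$P(u)_j = \frac{1}{\mathrm{card}(R_j)} \sum_{i \in R_j} \left(\mathrm{dev}_{j,i} + u_i\right) \qquad (2)$$

where $R_j = \{\, i \mid i \in S(u),\ i \neq j,\ \mathrm{card}(S_{j,i}(\chi)) > 0 \,\}$ is the set of relevant movies and $S(u)$ is the set of movies rated by $u$. For a data set dense enough that almost every pair of movies has ratings, we may simplify the prediction formula for the SLOPE ONE method to:

$$P^{S1}(u)_j = \bar{u} + \frac{1}{\mathrm{card}(R_j)} \sum_{i \in R_j} \mathrm{dev}_{j,i} \qquad (3)$$

where $\bar{u}$ is the average of the ratings given by $u$. The weighted slope one variant weights each deviation by the number of users $c_{j,i} = \mathrm{card}(S_{j,i}(\chi))$ who have rated both movies:

$$P^{wS1}(u)_j = \frac{\sum_{i \in S(u) - \{j\}} \left(\mathrm{dev}_{j,i} + u_i\right) c_{j,i}}{\sum_{i \in S(u) - \{j\}} c_{j,i}} \qquad (4)$$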
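The slope one prediction step can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the implementation evaluated in this paper: ratings are held in a nested dictionary, and the function names and the three sample users are hypothetical. The `weighted` flag switches from the plain average of the per-movie predictions to the weighted slope one variant, in which each deviation is weighted by the number of users who rated both movies.

```python
from collections import defaultdict


def average_deviations(ratings):
    """Return (dev, count): dev[j][i] is the mean of (r_j - r_i) over the
    users who rated both j and i, and count[j][i] is how many such users exist."""
    total = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    total[j][i] += r_j - r_i
                    count[j][i] += 1
    dev = {j: {i: total[j][i] / count[j][i] for i in total[j]} for j in total}
    return dev, count


def predict_slope_one(user_ratings, j, dev, count, weighted=False):
    """Predict the rating of movie j from the movies the user already rated."""
    num = 0.0
    den = 0.0
    for i, r_i in user_ratings.items():
        # Skip the target itself and movies never co-rated with j.
        if i == j or count.get(j, {}).get(i, 0) == 0:
            continue
        weight = count[j][i] if weighted else 1
        num += (dev[j][i] + r_i) * weight
        den += weight
    return num / den if den else None


# Hypothetical sample data (not taken from the paper's data set).
ratings = {
    "u1": {"A": 5, "B": 3, "C": 2},
    "u2": {"A": 3, "B": 4},
    "u3": {"B": 2, "C": 5},
}
dev, count = average_deviations(ratings)
plain = predict_slope_one(ratings["u3"], "A", dev, count)
weighted = predict_slope_one(ratings["u3"], "A", dev, count, weighted=True)
print(plain)     # 5.25
print(weighted)  # about 4.33
```

In this sample the deviation of A against B is supported by two users and the deviation of A against C by only one, so the weighted variant pulls the prediction toward the better supported estimate.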

3.2. The BI-POLAR SLOPE ONE Approach

While weighting favors frequently occurring rating patterns over infrequent ones, the bi-polar approach favors another kind of relevant rating pattern by splitting the prediction process into two parts. Employing the weighted slope one algorithm, we first derive one prediction from the movies liked by a user and another prediction from the movies disliked by that user. With a rating scale from 0 to 10, we could take 5 as the threshold and assume that movies rated above 5 are liked and those rated below 5 are not liked by the user. However, more than 60% of the ratings in the IMDB movie ratings data are above the middle of the scale. We therefore take each user's average rating as the threshold between that user's liked and disliked movies. Under this rule even optimistic users, who like every movie they rate, are assumed to dislike the movies they rate below their own average.
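The per-user threshold can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the sample ratings are hypothetical, and movies rated exactly at the user's average are assumed to fall into neither group.

```python
def split_by_user_average(user_ratings):
    """Split one user's rated movies into liked and disliked sets,
    using that user's own mean rating as the threshold."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    liked = {m: r for m, r in user_ratings.items() if r > mean}
    disliked = {m: r for m, r in user_ratings.items() if r < mean}
    return mean, liked, disliked


# A hypothetical "optimistic" user: every rating is above the scale midpoint of 5.
optimist = {"M1": 8, "M2": 9, "M3": 7, "M4": 10}
mean, liked, disliked = split_by_user_average(optimist)
print(mean)              # 8.5
print(sorted(liked))     # ['M2', 'M4']
print(sorted(disliked))  # ['M1', 'M3']
```

With a fixed threshold of 5 all four movies would count as liked; with the user's own average, two of them fall on the disliked side, which gives the bi-polar scheme both kinds of rating pattern to work with.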

3.3. The R- SLOPE ONE Approach (Proposed Algorithm)

R-SLOPE ONE is a CF recommendation approach based on a real time user interest model that employs movie genre information. Recommendation systems such as GroupLens and Ringo were proposed using a synergic approach that drew on multiple resources such as news, music, jokes and humor, and movies [34, 35]. In real life, people routinely seek opinions or recommendations from colleagues, friends, and acquaintances before going to watch a new or unwatched movie. This is equivalent to the concept of the CF based recommendation approach. The system assumes that the nearest users give closely related ratings for the same movie, and uses this to generate predictions for prospective watchers. By considering the interest model of users and establishing similarities among them, the approach finds user groups who share similar profiles and interests, and makes predictions after taking the ratings of those user groups into account [36].

The steps in the conventional CF prediction approach can be summarized as follows:

3.3.1. Representation of users’ interest matrix

CF algorithms derive users' interests or preferences from their evaluations u of movies [37]. The approach takes a user-item rating matrix as input. The user evaluation matrix is an n x m matrix, where n and m represent the numbers of users and movies in the system respectively. Table 4.2.1 represents the user interest matrix, where Rij is the rating of user i for movie j.

          Movie 1   Movie 2   Movie 3   …..   Movie m
User 1    R11       R12       R13       …..   R1m
User 2    R21       R22       R23       …..   R2m
User 3    R31       R32       R33       …..   R3m
…..       …..       …..       …..       …..   …..
User n    Rn1       Rn2       Rn3       …..   Rnm

3.3.2. Selection of nearby users

Users who have interests and tastes similar to those of the target user are placed in the neighboring user group.
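Neighbor selection can be sketched as follows. The similarity measure is not specified at this point in the text, so the sketch assumes cosine similarity over per-user interest vectors; the function names, the three-genre vectors, and the choice of k are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length interest vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_users(target, profiles, k):
    """Return the k users whose interest vectors are most similar to the target's."""
    scored = [
        (cosine(profiles[target], vector), user)
        for user, vector in profiles.items()
        if user != target
    ]
    scored.sort(reverse=True)
    return [user for _, user in scored[:k]]


# Hypothetical interest vectors over three genres (assumed values).
profiles = {
    "Prateek": [0.7, 0.2, 0.1],
    "Aanchal": [0.6, 0.3, 0.1],
    "Pallavi": [0.1, 0.2, 0.7],
}
neighbors = nearest_users("Aanchal", profiles, k=1)
print(neighbors)  # ['Prateek']
```

Any other similarity measure could be substituted for `cosine` without changing the selection logic.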

3.3.3. Generation of Recommendations

Collaborative Filtering (CF) is used to develop the proposed recommender system. CF involves two different approaches for generating recommendations. For FRS, both of these approaches are applied in order to analyze the difference in results.



5

6

7

3.3.4. CF Recommendation Algorithm based on Real time user interest model

In this section we propose a new CF recommendation algorithm, R-CF, that employs a real-time interest model of users to capture each user's genre interest. This approach captures a user's interests through surfing behavior, eliminating the cold start limitation faced by other CF algorithms. The approach also does not need active scoring data for items, which effectively minimizes the dependency between the recommendation tool and the user's interaction.

The steps of this novel prediction algorithm are as follows:

Use watchers' genre interest data to build mappings to standard genre ratings using the method of standardization, representing each watcher's preferences in the real-time user interest model. This model is small and more relevant: standard genre interest information is smaller in volume and more accurate than other parameters. The matrix of watcher standard genre interest labeling is shown as follows:

Table 1 Matrix of watcher standard genre interest labeling

The measurement of a watcher's category interest degree is an objective approach. It differs from the watcher's rating, which has to account for the differences in how different watchers rate.



Use the user's interest matrix to establish a linear formula that computes the watcher's interest rating for each individual movie by the following equation, then generate recommendations for watchers by the Top-N recommendation method.

8
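The two steps above, standardizing genre interest and producing a Top-N list from a linear score, can be sketched in Python. The exact linear formula is not reproduced here, so this sketch assumes that a movie's interest rating is the mean of the watcher's standardized interest over the movie's genres. The function names, genre counts, and movie-to-genre mapping are hypothetical.

```python
def standardize(genre_counts):
    """Scale raw per-genre interaction counts so that they sum to 1."""
    total = sum(genre_counts.values())
    return {g: c / total for g, c in genre_counts.items()} if total else {}


def top_n(interest, movie_genres, n):
    """Score each movie as the mean standardized interest of its genres
    (an assumed linear form) and return the n best movie ids."""
    scores = {
        movie: sum(interest.get(g, 0.0) for g in genres) / len(genres)
        for movie, genres in movie_genres.items()
    }
    # Sort by descending score; ties are broken by movie id for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))[:n]


# Hypothetical browsing counts per genre for one watcher.
interest = standardize({"action": 6, "comedy": 3, "drama": 1})
movie_genres = {
    "M1": ["action"],
    "M2": ["comedy", "drama"],
    "M3": ["action", "comedy"],
}
recommended = top_n(interest, movie_genres, n=2)
print(recommended)  # ['M1', 'M3']
```

Because the score depends only on genre interest, a movie can be ranked for a watcher who has never rated anything, which is the property the text relies on to avoid cold start.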

The traditional slope one approach employs a simple and efficient pattern of processing. This is the reason the slope one CF approach is the most widely adopted for producing real-time predictions. However, the computational efficiency and accuracy of this method are limited, for the following reasons:

• Size of similar movies to be rated. Predicting the rating of movie j is a general process: user u's predicted score for the movie is computed from the deviations between other users' ratings of similar movies and of j. As the number of relevant movies to be considered increases, computing the rating deviations between movie j and the other movies becomes very costly. This adversely affects the accuracy of predictions and also restricts the computational speed of the program, rendering it unsuitable for ubiquitous recommendation.

• Less user similarity. Watcher u's predicted score for movie j draws on all watchers who have provided a rating for the movie. Dissimilar or noisy users are not removed by this approach, and the problem exists for the complete set of watchers. This limitation also influences the prediction outcomes.

As shown in the table below, consider predicting Aanchal's rating of Movie 2. Prateek and Aanchal have similar interests and their choices when rating movies are related, whereas Pallavi's interest profile is completely opposite to that of Prateek and Aanchal, and her intensity of liking for movies is also different. Based on Prateek's evaluation alone, the slope one deviation of Movie 2 from Movie 1 is 5 − 3 = 2, so Aanchal's predicted rating is 5 + 2 = 7, which matches her rating in the table. Based on the evaluations of both Prateek and Pallavi, the average deviation is ((5 − 3) + (1 − 5)) / 2 = −1 and the predicted rating drops to 4. Including the dissimilar user does not support our assumption, and the outcome is neither accurate nor closely related.

Table 2 User ratings for movies

           Movie 1   Movie 2
Prateek    3         5
Pallavi    5         1
Aanchal    5         7
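The effect of a dissimilar user can be checked by applying the basic slope one deviation to the Table 2 ratings. The sketch below treats Aanchal's Movie 2 rating as held out and predicts it from her Movie 1 rating, once using only the similar user and once using both other users. The function name is hypothetical.

```python
def mean_deviation(ratings, users, target, known):
    """Average of (rating of target - rating of known) over the given users."""
    diffs = [ratings[u][target] - ratings[u][known] for u in users]
    return sum(diffs) / len(diffs)


# Ratings from Table 2 for the two users other than Aanchal.
ratings = {
    "Prateek": {"Movie 1": 3, "Movie 2": 5},
    "Pallavi": {"Movie 1": 5, "Movie 2": 1},
}
aanchal_movie1 = 5

# Prediction using only the similar user.
pred_similar = aanchal_movie1 + mean_deviation(
    ratings, ["Prateek"], "Movie 2", "Movie 1")
# Prediction using every user who rated both movies.
pred_all = aanchal_movie1 + mean_deviation(
    ratings, ["Prateek", "Pallavi"], "Movie 2", "Movie 1")

print(pred_similar)  # 7.0
print(pred_all)      # 4.0
```

Restricting the deviation to the similar user reproduces the rating of 7 listed for Aanchal, while including the dissimilar user pulls the prediction down to 4.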

The proposed algorithm focuses on establishing similar user groups to eliminate the problems of simple slope one discussed above. Precision is increased by computing rating deviations between movies within the user's neighboring user group only. The number of movies used for computation is also greatly decreased, which further enhances prediction precision over the simple slope one approach. The proposed algorithm enhances performance over the core approach by employing a real time user interest model (rUIM), and is therefore known as R-Slope one. By using the rUIM to build related user groups for target movies, the algorithm drastically narrows the computational area for predicting scores of a user's unwatched movies. It also proposes an improved average deviation equation for movie rating on the basis of user similarities, so that related users have a greater influence on the weight of the average prediction deviation: a higher user similarity measure means a higher contribution of that user to the rating difference computation. The algorithm thus reduces the search space of movies to be computed and enhances the computational accuracy for relevant similar movies, and with it the prediction reliability of the proposed recommendation system [38].

The modified average deviation formula for movie rating based on users' similarities is:

9
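One way a similarity-weighted average deviation could look is sketched below. The exact modified formula is not reproduced here, so the form used in the sketch, a weighted mean of per-user rating differences with each user's similarity to the target user as the weight, is an assumption. The function name and the similarity values are hypothetical; the ratings are those of Table 2.

```python
def weighted_deviation(ratings, sims, j, i):
    """Similarity-weighted mean of (r_j - r_i) over users who rated both
    movies. sims maps each user to a similarity with the target user;
    users with zero or missing similarity are ignored."""
    num = 0.0
    den = 0.0
    for user, user_ratings in ratings.items():
        weight = sims.get(user, 0.0)
        if j in user_ratings and i in user_ratings and weight > 0:
            num += weight * (user_ratings[j] - user_ratings[i])
            den += weight
    return num / den if den else None


ratings = {
    "Prateek": {"Movie 1": 3, "Movie 2": 5},
    "Pallavi": {"Movie 1": 5, "Movie 2": 1},
}
# Assumed similarities to the target user Aanchal.
sims = {"Prateek": 0.9, "Pallavi": 0.1}

dev = weighted_deviation(ratings, sims, "Movie 2", "Movie 1")
prediction = 5 + dev  # Aanchal rated Movie 1 as 5
print(round(dev, 2), round(prediction, 2))  # 1.4 6.4
```

With these assumed weights the dissimilar user still contributes, but far less than in the unweighted average, so the prediction stays close to the rating suggested by the similar user.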

The recommendation algorithm of R-Slope one can be summarized as:

Step 1. Build the genre interest model by creating a two dimensional user-genre labeling matrix.

Step 2. Search for watchers who have rated related movies in similar user groups, and employ the improved average deviation formula to calculate the average rating differences.

Step 3. Use the following equation to compute the predicted rating for a movie, and employ the Top-N prediction approach to generate recommendations.

10

Method of Assessment: Evaluating recommendation performance is a critical part of this work. The method of evaluation used to measure the performance of a recommendation system depends on the approach used. The following sections describe various methods of assessing recommendation algorithms.

4. ROOT MEAN SQUARED ERROR (RMSE) METHOD

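The RMSE method measures the magnitude of the mean error. The smaller the numerical value of RMSE, the higher the reliability of the recommendation approach. The equation for computing RMSE is:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(p_i - r_i\right)^2} \qquad (11)$$

where $p_i$ is the predicted rating, $r_i$ is the actual rating, and $n$ is the number of predictions evaluated.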
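A minimal Python sketch of the RMSE computation follows. It uses the standard definition, the square root of the mean squared difference between predicted and actual ratings; the function name and sample values are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predictions and actual ratings: errors are 0, -1 and -2.
error = rmse([7, 4, 6], [7, 5, 8])
print(round(error, 4))  # 1.291
```

Because the errors are squared before averaging, a single large miss raises RMSE more than several small ones.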

4.1. Recall Method

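Recall is defined as the ratio of the number of movies correctly recommended to the size of the test data. The recall value can be calculated from the following formula:

$$\mathrm{Recall} = \frac{\text{number of movies correctly recommended}}{\text{size of the test data}} \qquad (12)$$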

4.2. Precision Method

The precision method gives the percentage of movies predicted correctly by the Top-N method. It may be calculated as follows:

$$\mathrm{Precision} = \frac{\text{number of movies correctly recommended}}{n \times N} \qquad (13)$$

Here n represents the size of the user data set and N represents the number of Top-N predicted movies.

4.3. F-measure method

The Precision and Recall methods described above conflict to some degree: in general, a higher recall rate comes with a lower precision ratio. To strike a balance between the two, the F-measure is now widely adopted. The F-measure can be represented as:

$$F = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (14)$$
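The three list-based measures can be computed together, as in the sketch below. It evaluates a single user's Top-N list against that user's held-out movies and uses the balanced F-measure, the harmonic mean of precision and recall. Aggregating over all users in the data set is omitted, and the function name and sample lists are hypothetical.

```python
def precision_recall_f(recommended, relevant):
    """Precision, recall and balanced F-measure for one recommendation list."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical Top-4 list and held-out test movies for one user.
precision, recall, f_measure = precision_recall_f(
    ["M1", "M2", "M3", "M4"], ["M2", "M4", "M7"])
print(precision)            # 0.5
print(round(recall, 3))     # 0.667
print(round(f_measure, 3))  # 0.571
```

Lengthening the list can only keep or raise recall, but it lowers precision unless the extra movies are hits, which is the tension the F-measure balances.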

5. RESULTS AND VALIDATIONS

We have used Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) to assess and validate the performance of the proposed recommendation algorithm.

The following table shows the Mean Absolute Error between the predicted and IMDB ratings for each of the traditional algorithms compared to the proposed algorithm. It is calculated on a data set consisting of 500 user ratings of 700 movies belonging to 36 movie genres.

Table 3 Mean Absolute Error calculated on a data set consisting of 500 user ratings of 700 movies belonging to 36 movie genres

Mean Absolute Error between predicted and IMDB ratings, obtained by running the program in Python.

The RMSE values obtained for the simple slope one, weighted slope one, bipolar slope one, and R-Slope one algorithms on data sets of size 1000, 10000, and 100000 are given below in tabular form.

Table 4 RMSE of the recommendation algorithms for data set sizes 1000, 10000, and 100000

Algorithm            1000     10000    100000
Simple slope one     1.6457   1.6273   1.5246
Weighted slope one   1.4253   1.3000   1.2878
Bipolar slope one    1.4371   1.3925   1.4222
R-Slope one          1.3000   1.3091   1.3000

Graphical representation for analysis of results:

6. CONCLUSION AND FUTURE SCOPE

In this paper we proposed a new technique for the recommendation of movies utilizing a real time user interest model. We also evaluated slope one and its variants (weighted slope one and bipolar slope one), which are currently popular recommendation algorithms used by most memory based recommendation systems. Limitations such as sparsity, cold start, the large size of user rating data, large search scope, and higher computational complexity restrict the accuracy and performance of their predictions, and hence the quality of their recommendations. The algorithm proposed here improves on the existing slope one algorithm and increases its efficiency. It is also scalable and takes less memory space, as it reduces the item search scope by grouping users according to user similarities based on real time genre rating information. Results indicate that the R-slope one algorithm gives better performance than the other algorithms and that its performance is affected very little by an increase in the size of the data set: its RMSE stays between 1.3000 and 1.3091 across the three data set sizes, and is the lowest of all the slope one algorithms at data set size 1000.

Though this work improves slope one performance considerably, there is scope for further improvement. In future research, context and location based filtering can be combined with this algorithm to make it a good fit for ubiquitous recommendation in the domain of tourism, and for suggesting investment industries depending on the economic position of a country. Since effectiveness suffers as complexity increases, an approach is needed to optimize this filtering when combining it with other CF algorithms, based on the domain of implementation. In the next section we list various application domains where this algorithm can be used effectively.



REFERENCES

[1] Quijano-Sánchez, L.; Recio-García, J.; and Díaz-Agudo, B. 2009. Social based

recommendations to groups. In Procs. of the 14th UK Workshop on Case-Based

Reasoning, 46–57. CMS Press, University of Greenwich.

[2] J. Bobadilla, F. Ortega, A. Hernando, A. Gutiérrez, Recommender systems survey,

Knowledge-Based Systems, 46, p.109-132, July, 2013

[doi>10.1016/j.knosys.2013.03.012]

[3] . Adae and M. Berthold. 2013. EVE: a framework for event detection. Evolving Syst. 4, 1

(2013), 61--70.

[4] Gediminas Adomavicius, Alexander Tuzhilin, Toward the Next Generation of

Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions, IEEE

Transactions on Knowledge and Data Engineering, v.17 n.6, p.734-749, June 2005

[doi>10.1109/TKDE.2005.99]

[5] Charu C. Aggarwal, On Change Diagnosis in Evolving Data Streams, IEEE Transactions

on Knowledge and Data Engineering, v.17 n.5, p.587-600, May 2005

[doi>10.1109/TKDE.2005.78]

[6] Charu C. Aggarwal, On biased reservoir sampling in the presence of stream evolution,

Proceedings of the 32nd international conference on Very large data bases, September 12-

15, 2006, Seoul, Korea

[7] Rakesh Agrawal, Sakti P. Ghosh, Tomasz Imielinski, Balakrishna R. Iyer, Arun N.

Swami, An Interval Classifier for Database Mining Applications, Proceedings of the 18th

International Conference on Very Large Data Bases, p.560-573, August 23-27, 1992

[8] R. Agrawal, T. Imielinski, A. Swami, Database Mining: A Performance Perspective,

IEEE Transactions on Knowledge and Data Engineering, v.5 n.6, p.914-925, December

1993 [doi>10.1109/69.250074]

[9] Mohammed Al-Kateb, Byung Suk Lee, X. Sean Wang, Adaptive-Size Reservoir

Sampling over Data Streams, Proceedings of the 19th International Conference on

Scientific and Statistical Database Management, p.22, July 09-11, 2007

[doi>10.1109/SSDBM.2007.29]

[10] D. Alberg, M. Last, and A. Kandel. 2012. Knowledge Discovery in Data Streams with

Regression Tree Methods. Wiley Interdisciplinary Reviews: Data Mining and Knowledge

Discovery 2, 1 (2012), 69--78.

[11] Hock Hee Ang, Vivekanand Gopalkrishnan, Indre Zliobaite, Mykola Pechenizkiy,

Steven C. H. Hoi, Predictive Handling of Asynchronous Concept Drifts in Distributed

Environments, IEEE Transactions on Knowledge and Data Engineering, v.25 n.10,

p.2343-2355, October 2013 [doi>10.1109/TKDE.2012.172]

[12] . Adae and M. Berthold. 2013. EVE: a framework for event detection. Evolving Syst. 4, 1

(2013), 61--70.

[13] Gediminas Adomavicius, Alexander Tuzhilin, Toward the Next Generation of

Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions, IEEE

Transactions on Knowledge and Data Engineering, v.17 n.6, p.734-749, June 2005

[doi>10.1109/TKDE.2005.99]

[14] Charu C. Aggarwal, On Change Diagnosis in Evolving Data Streams, IEEE Transactions

on Knowledge and Data Engineering, v.17 n.5, p.587-600, May 2005

[doi>10.1109/TKDE.2005.78]

[15] Charu C. Aggarwal, On biased reservoir sampling in the presence of stream evolution,

Proceedings of the 32nd international conference on Very large data bases, September 12-

15, 2006, Seoul, Korea



[16] Rakesh Agrawal, Sakti P. Ghosh, Tomasz Imielinski, Balakrishna R. Iyer, Arun N.

Swami, An Interval Classifier for Database Mining Applications, Proceedings of the 18th

International Conference on Very Large Data Bases, p.560-573, August 23-27, 1992

[17] R. Agrawal, T. Imielinski, A. Swami, Database Mining: A Performance Perspective,

IEEE Transactions on Knowledge and Data Engineering, v.5 n.6, p.914-925, December

1993 [doi>10.1109/69.250074]

[18] Mohammed Al-Kateb, Byung Suk Lee, X. Sean Wang, Adaptive-Size Reservoir

Sampling over Data Streams, Proceedings of the 19th International Conference on

Scientific and Statistical Database Management, p.22, July 09-11, 2007

[doi>10.1109/SSDBM.2007.29]

[19] D. Alberg, M. Last and A. Kandel. 2012. Knowledge Discovery in Data Streams with

Regression Tree Methods. Wiley Interdisciplinary Reviews: Data Mining and Knowledge

Discovery 2, 1 (2012), 69--78.
